Streaming Telemetry
Overview
Streaming telemetry allows users to monitor network health by efficiently streaming operational data of interest from OcNOS routers. This structured data is transmitted to remote management systems for proactive network monitoring and for troubleshooting, for example by tracking CPU and memory usage on managed devices.
A machine learning (ML) database can be created with telemetry data to establish a baseline for normal network operation and predict or mitigate network issues.
Feature Characteristics
OcNOS supports various gRPC Network Management Interface (gNMI) subscription modes, telemetry modes, and encoding types, providing efficient network management capabilities.
gNMI Subscription Modes
Streaming Telemetry Dial-In Mode: In this mode, the collector initiates a connection to the target device (OcNOS) and subscribes to receive telemetry data from OcNOS devices.
Streaming Telemetry Dial-Out Mode (Persistent Subscriptions): In this mode, the target (OcNOS) initiates the gRPC tunnel connection to the collector. Once the connection is established, the collector invokes the “Publish” RPC on the target. Subscriptions configured on the target are then streamed on that connection at the specified sample interval. These subscriptions remain active on OcNOS devices as long as the corresponding configuration on the target exists. If the gRPC tunnel connection is interrupted or the target reboots, the gNMI server on the target re-establishes the connection to the gNMI collector, ensuring continued streaming.
Streaming Telemetry Modes
Stream Mode: Enables continuous and real-time transmission of telemetry data from OcNOS devices to the monitoring system. The stream mode applies to both the dial-in and dial-out gNMI subscription modes.
Poll Mode: Poll mode subscriptions allow for on-demand data retrieval through a long-lived RPC. Subscribers initiate this mode by sending a Subscribe request message, followed by sending an empty Poll message to receive the desired data.
Note: The system supports Poll mode only in Dial-in subscription mode.
Once Mode: In Once mode subscription, the OcNOS device responds to a subscribe request with a one-time data retrieval, similar to a get request. Upon receiving the Once mode subscribe request, the device sends back the subscribe response for all subscriptions in the list and terminates the RPC.
Note: The system supports Once mode only in Dial-in subscription mode.
gNMI In-Band Support
gNMI In-Band support in OcNOS enables streaming telemetry data transmission across any one of the default, management, and user-defined VRFs. If no VRF is defined, streaming telemetry is automatically enabled within the default VRF. This enhancement allows network operators to utilize existing data interfaces for efficient in-band telemetry data transmission.
Encoding Types
Protocol Buffers (protobuf): Offers a compact binary serialization format for efficient encoding and transmission of structured telemetry data. Protobuf is optimized for performance and bandwidth efficiency.
JavaScript Object Notation (JSON): Provides a human-readable data interchange format commonly used for telemetry data representation. JSON encoding facilitates interoperability and ease of integration with various applications and tools. It adheres to the JSON specification outlined in RFC 7159, employing relevant quoting. Consequently, string values are quoted while number values remain unquoted.
JSON-IETF: This variant of JSON encoding adheres to the IETF standards, ensuring consistency and compatibility with industry specifications. JSON_IETF encoded data conforms to the rules outlined in RFC 7951 for JSON serialization.
OcNOS supports the protobuf, JSON, and JSON-IETF encoding types for both the dial-in and dial-out gNMI subscription modes.
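To make the difference between the JSON and JSON_IETF encodings concrete, the following Python sketch serializes the same values both ways. It is illustrative only: the leaf names, the set of 64-bit leaves, and the module name ipi-interface are hypothetical placeholders, not taken from the OcNOS data models. The sketch relies on two RFC 7951 rules: 64-bit integers are encoded as JSON strings, and top-level member names are qualified with the YANG module name.

```python
import json

# Hypothetical leaf values for one interface; the names are illustrative only.
counters = {"name": "eth0", "in-octets": 18446744073709551615, "mtu": 1500}

# Assumption: this leaf is a 64-bit integer in the (hypothetical) YANG model.
INT64_LEAVES = {"in-octets"}


def encode_json(values):
    """Plain JSON: string values are quoted, number values are unquoted."""
    return json.dumps(values, sort_keys=True)


def encode_json_ietf(values, module):
    """RFC 7951 style: 64-bit integers become strings and the top-level
    member name is qualified with the module name."""
    body = {k: (str(v) if k in INT64_LEAVES else v) for k, v in values.items()}
    return json.dumps({f"{module}:state": body}, sort_keys=True)


print(encode_json(counters))
print(encode_json_ietf(counters, "ipi-interface"))
```

A collector that consumes JSON_IETF data should therefore expect to convert 64-bit counters from strings back to integers before doing arithmetic on them.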
Support for IPI Native Data Models and OpenConfig Data Models
Streaming Telemetry Data Models: OcNOS supports IPI native data models and OpenConfig data models, providing standardized representations of network configurations and telemetry data. This support enhances interoperability and facilitates consistent management across heterogeneous network environments.
Scale and Minimum Sample Interval Supported
To limit the impact of telemetry on critical features of the OcNOS target device, certain limits have been implemented for different platform types.
High Range Platforms
A system is considered high range if it has eight or more CPU cores and is not based on an "Intel Atom" processor. Users can subscribe to a maximum of 100 sensor paths (including Dial-In and Dial-Out subscriptions) at any given time. The minimum supported sample interval is 10 seconds.
Standard Range Platforms
A system is considered standard range if it has fewer than eight CPU cores or is based on an "Intel Atom" processor. Users can subscribe to a maximum of 50 sensor paths (including Dial-In and Dial-Out subscriptions) at any given time. The minimum supported sample interval is 90 seconds.
Note:  
The total count of sensor paths includes the child paths of a subscribe request. For instance, if a subscribe request has four child paths, the total sensor paths count equals five (the given path plus four child paths). Use the show streaming-telemetry command to display the minimum sample interval and the maximum number of sensor paths supported for a platform.
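The platform rules and the path-counting rule above can be summarized in a short Python sketch. The function and class names are hypothetical and exist only for illustration; the thresholds (eight CPU cores, Intel Atom, 100/50 paths, 10/90 seconds) come from this section. On a real device, the show streaming-telemetry command remains the authoritative source for these limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformLimits:
    platform_type: str
    max_sensor_paths: int
    min_sample_interval_s: int


def classify_platform(cpu_cores: int, is_intel_atom: bool) -> PlatformLimits:
    """High range: eight or more CPU cores and not Intel Atom.
    Everything else is treated as standard range."""
    if cpu_cores >= 8 and not is_intel_atom:
        return PlatformLimits("high", 100, 10)
    return PlatformLimits("standard", 50, 90)


def sensor_path_count(child_paths: int) -> int:
    """A subscribed path counts once, plus once for each of its child paths."""
    return 1 + child_paths


print(classify_platform(8, False))   # eight cores, not Atom
print(classify_platform(16, True))   # Atom-based, regardless of core count
print(sensor_path_count(4))          # one given path plus four child paths
```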
Scale Scenarios
1. New Subscribe RPC Request Keeps Total Paths Below the Maximum Allowed: When the new paths are added to the paths already handled by the gNMI server, the total stays below the maximum limit of 50 paths for standard platforms and 100 paths for high-end platforms. The gNMI server accepts the subscribe request and proceeds with processing.
2. New Subscribe RPC Request Brings Total Paths to the Maximum Allowed: With the new Subscribe RPC request, the total paths handled equal exactly 50 for standard platforms or 100 for high-end platforms. The gNMI server accepts the new subscribe request but logs a warning that the maximum number of paths has been reached. No new Stream mode Subscribe RPC requests are handled until the number of currently handled paths drops below the platform maximum (50 or 100).
3. New Subscribe RPC Request Pushes Total Paths Above the Maximum Allowed: With the new Subscribe RPC request, the total paths handled would exceed 50 for standard platforms or 100 for high-end platforms. The gNMI server returns an error. The RPC request is not closed; it is accepted and answered once the total number of paths handled drops to a level that can accommodate it.
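The three scenarios above reduce to a comparison between the resulting path total and the platform maximum. The Python sketch below models that decision; the function name and the returned labels are hypothetical and do not correspond to actual gNMI server messages.

```python
def admit_subscribe(active_paths: int, new_paths: int, max_paths: int) -> str:
    """Model of the scale scenarios for a new Stream mode Subscribe RPC.

    active_paths: sensor paths the gNMI server already handles
    new_paths:    sensor paths (including child paths) in the new request
    max_paths:    platform maximum (50 standard, 100 high-end)
    """
    total = active_paths + new_paths
    if total < max_paths:
        return "accepted"                # scenario 1
    if total == max_paths:
        return "accepted-with-warning"   # scenario 2: limit reached, warning logged
    # Scenario 3: an error is returned; the RPC stays open and is served
    # later, once enough paths have been released.
    return "error"


# Standard platform (maximum of 50 paths)
print(admit_subscribe(40, 5, 50))
print(admit_subscribe(45, 5, 50))
print(admit_subscribe(48, 5, 50))
```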
Minimum Sample Interval: The minimum supported sample interval is 10 seconds for “High-end” platform and 90 seconds for “Standard” platform type. Any sampling mode request with a sample interval of less than the minimum allowed will result in an error. However, if a sample interval is 0, it defaults to the minimum sample interval supported by the gNMI server for that platform type.
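As a minimal sketch of the sample-interval rule, the following Python function returns the interval the server would effectively use. The function name is hypothetical, and the error is modeled as a Python exception rather than the actual gNMI error returned by the server.

```python
def effective_sample_interval(requested_s: int, minimum_s: int) -> int:
    """Apply the minimum sample interval rule for a platform.

    requested_s: sample interval from the subscribe request, in seconds
    minimum_s:   platform minimum (10 for high-end, 90 for standard)
    """
    if requested_s == 0:
        # An interval of 0 defaults to the platform minimum.
        return minimum_s
    if requested_s < minimum_s:
        raise ValueError(
            f"sample interval {requested_s}s is below the minimum of {minimum_s}s"
        )
    return requested_s


print(effective_sample_interval(0, 90))
print(effective_sample_interval(95, 90))
```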
gnmic Installation
gNMI Collector Tool
For dial-in subscription mode, except when using “proto” encoding, use the open-source gNMI collector tool (gnmic). Install it with the command:
bash -c "$(curl -sL https://get-gnmic.openconfig.net)"
For dial-out subscription mode or when “proto” encoding is needed, use the gnmic tool from the gNMI collector package. It is delivered with the OcNOS installer, named OcNOS-<SKU NAME>-<version>-telemetry-client-bin.tar, and includes the gNMI Client collector application (gnmic) and the IPI_OC.proto files.
Streaming Telemetry Commands
This section lists the telemetry commands.
debug cml
Use this command to enable or disable debugging information for CML streaming telemetry.
Command Syntax
debug cml enable telemetry
debug cml disable telemetry
Parameters
None
Default
By default, debugging information is disabled.
Command Mode
Exec Mode
Applicability
This command was introduced in OcNOS version 6.4.1.
Examples
The following example illustrates how to enable and disable the telemetry debugging information.
OcNOS#debug cml enable telemetry
OcNOS#debug cml disable telemetry
debug telemetry gnmi
Use this command to enable or disable gNMI server debugging logs with severity levels.
Command Syntax
debug telemetry gnmi (enable) (severity (debug|info|warning|error|fatal|panic|d-panic)|) (vrf (management|NAME)|)
debug telemetry gnmi (disable) (severity (debug|info|warning|error|fatal|panic|d-panic)|) (vrf (management|NAME)|)
Parameters
debug
Logs a message at debug level
info
Logs a message at info level
warning
Logs a message at warning level
error
Logs a message at error level
fatal
Logs a message and causes the program to exit with return code 1.
panic
Logs a message and triggers the program to generate a traceback.
d-panic
Logs at the Panic level
vrf management
(Optional) Enables or disables gNMI server debugging logs in the management VRF.
vrf NAME
(Optional) Enables or disables gNMI server debugging logs in a user-defined VRF.
Default
By default, this command is disabled, and the gNMI server debugging level in the disabled state is set to the Error level.
Command Mode
Configure Mode
Applicability
This command was introduced in OcNOS version 6.4.1. The vrf (NAME|management) parameter was added in OcNOS version 6.5.2.
Examples
The following example illustrates how to enable and disable the telemetry debug logs in the default VRF, along with the corresponding show output.
OcNOS(config)#feature streaming-telemetry
OcNOS(config)#debug telemetry gnmi enable severity warning
OcNOS(config)#commit
OcNOS(config)#show running-config streaming-telemetry
!
feature streaming-telemetry
debug telemetry gnmi enable severity warning
!
OcNOS(config)#debug telemetry gnmi disable severity warning
OcNOS(config)#commit
OcNOS(config)#show running-config streaming-telemetry
!
feature streaming-telemetry
!
feature streaming-telemetry
Use this command to enable streaming telemetry and start the gNMI server. Once started, the gNMI server listens for incoming gRPC connections on port 9339.
Note:  
Users can configure streaming telemetry on any one of the default, management, and user-defined VRFs. If no VRF is defined, streaming telemetry is automatically enabled in the default VRF.
If streaming telemetry is already configured in any VRF (default, management, or user-defined), attempting to configure it for another VRF will result in an error, as gNMI can only be enabled on one VRF at a time. To configure the streaming feature on a new VRF, disable the streaming telemetry on the current VRF first, commit the change, and then configure the feature on the required VRF.
Use the no parameter of this command to disable streaming telemetry. This stops the gNMI server.
Command Syntax
feature streaming-telemetry (vrf (NAME|management)|)
no feature streaming-telemetry (vrf (NAME|management)|)
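Parameters
vrf management
(Optional) Applies the command to the management VRF.
vrf NAME
(Optional) Applies the command to a user-defined VRF.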
Default
By default, the streaming-telemetry feature is disabled.
Command Mode
Configure mode
Applicability
This command was introduced in OcNOS version 6.4.1. The vrf (NAME|management) parameter was added in OcNOS version 6.5.2.
Examples
The following example illustrates how to enable the streaming telemetry on the default, management, and user-defined VRFs.
Default VRF
OcNOS#configure terminal
OcNOS(config)#feature streaming-telemetry
OcNOS(config)#commit
Management VRF
OcNOS#configure terminal
OcNOS(config)#feature streaming-telemetry vrf management
OcNOS(config)#commit
User-defined VRF
OcNOS#configure terminal
OcNOS(config)#ip vrf VRF1
OcNOS(config-vrf)#exit
OcNOS(config)#feature streaming-telemetry vrf VRF1
OcNOS(config)#commit
 
show streaming-telemetry
Use this command to display streaming telemetry details, including persistent (dial-out) and dynamic (dial-in) subscription connection details and POLL mode subscriptions.
The show streaming-telemetry command and all its sub-commands also show the maximum number of sensor paths and the minimum sample interval for the platform.
Command Syntax
show streaming-telemetry
Parameters
None
Command Mode
Exec mode
Applicability
This command was introduced in OcNOS version 6.5.2.
Examples
The following example displays the streaming telemetry details.
OcNOS#show streaming-telemetry
 
Feature streaming telemetry : Enabled
 
VRF : management
Platform type : Standard range
Maximum sensor-paths : 50
Minimum sample-interval : 90
Number of active sensor-paths : 6 (Dial-In : 2, Dial-out : 4)
Tunnel-server Retry-interval : Default-60 (seconds)
 
SI : Sampling Interval in seconds
Enc-Type : Encoding type
OriginPath : Sensor Path
 
Dial-In STREAM Mode Subscription Details:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ClientIP:Port ID SI Enc-Type Origin:Path
------------- ------ ---- -------- ------------
10.12.43.175:41592 4234 95 JSON_IETF ipi:/interfaces/interface[name=eth0]/state
ipi:/interfaces/interface[name=eth0]/state/counters
 
 
Dial-In POLL Mode Subscription Details:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ClientIP:Port ID Enc-Type Origin:Path
------------- ------ -------- ------------
10.12.43.175:41592 11448 JSON_IETF ipi:/components/component[name=PSU-1]/state/temperature
ipi:/components/component[name=PSU-1]/state/board-fru
ipi:/components/component[name=PSU-1]/state
ipi:/components/component[name=PSU-1]/state/memory
 
Dial-Out Subscription Details:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Subscription-name : SUB-1
Status : ACTIVE
Enc-Type : JSON
Tunnel-server details:
~~~~~~~~~~~~~~~~~~~~~~
Destination-group Status Tunnel-IP:Port
----------------- ------ ---------------
tunnel-1 IN-ACTIVE 10.12.66.160:11161
Sensor-group details:
~~~~~~~~~~~~~~~~~~~~~
Sensor-group SI Origin:Path
------------ ---- -----------
Platform 100 ipi:/components/component[name=CHASSIS]/state
[*]ipi:/components/component[name=CHASSIS]/state/memory
[*]ipi:/components/component[name=CHASSIS]/state/board-fru
[*]ipi:/components/component[name=CHASSIS]/state/temperature
[*]-> Indicates child path learnt from parent config, not configured by user
 
The following table explains the output fields.
 
show streaming-telemetry output details
Field
Description
Feature streaming telemetry
Shows if the streaming telemetry feature is enabled or disabled.
VRF
Specifies the VRF type.
Platform type
Displays whether the platform type is standard or high range.
Maximum sensor-paths
Shows the maximum number of sensor paths allowed. For more details, refer to the Scale Scenarios section.
Minimum sample-interval
Indicates the minimum sampling interval in seconds. For more details, refer to the Scale Scenarios section.
Number of active sensor-paths
Shows the total number of active sensor paths for Dial-In and Dial-Out subscriptions (Stream mode subscriptions).
Tunnel-server Retry-interval
Displays the duration between retry attempts in seconds.
Enc-Type
Indicates the encoding type used for each subscription.
SI
Denotes the sampling interval in seconds.
Origin:Path
Displays the origin and path of the data being monitored.
ClientIP:Port
Displays the IP address and port of the client in Dial-In Mode subscriptions.
ID
Shows the unique identifier for each subscription.
Subscription-name
Shows the name of the Dial-Out subscription.
Status
Indicates if the subscription is active or inactive.
Tunnel-server details
Provides details about the tunnel server, including destination group, status, and IP:Port.
Sensor-group details
Shows the details about the sensor group, including the sampling interval and origin:path.
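For monitoring scripts, the summary fields described above can be extracted from captured command output. The Python sketch below is one possible approach, not an OcNOS tool: it assumes the output has been saved as text and that each summary line uses the "field : value" layout shown in the example, which may differ in spacing on a real device.

```python
import re

# Summary lines copied from the example output in this section.
SAMPLE = """\
Feature streaming telemetry : Enabled
VRF : management
Platform type : Standard range
Maximum sensor-paths : 50
Minimum sample-interval : 90
Number of active sensor-paths : 6 (Dial-In : 2, Dial-out : 4)
"""


def parse_summary(text: str) -> dict:
    """Collect 'field : value' lines into a dictionary.
    Only the first ' : ' separator on each line is used for the split."""
    summary = {}
    for line in text.splitlines():
        parts = re.split(r"\s+:\s+", line.strip(), maxsplit=1)
        if len(parts) == 2:
            summary[parts[0]] = parts[1]
    return summary


def remaining_paths(summary: dict) -> int:
    """Sensor paths still available before the platform maximum is reached."""
    maximum = int(summary["Maximum sensor-paths"])
    active = int(re.match(r"\d+", summary["Number of active sensor-paths"]).group())
    return maximum - active


summary = parse_summary(SAMPLE)
print(summary["Platform type"])
print(remaining_paths(summary))
```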
 
 
show running-config streaming-telemetry
Use this command to display streaming telemetry status in the running configuration.
Command Syntax
show running-config streaming-telemetry
Parameters
None
Command Mode
Exec mode and Configuration Mode
Applicability
This command was introduced in OcNOS version 6.4.1.
Examples
The following example shows the streaming telemetry status in the show running-config output.
OcNOS#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
OcNOS(config)#feature streaming-telemetry
OcNOS(config)#commit
OcNOS(config)#show running-config streaming-telemetry
!
feature streaming-telemetry
!
OcNOS(config)#exit
OcNOS#show running-config streaming-telemetry
!
feature streaming-telemetry
!
Troubleshooting
Follow the troubleshooting steps below to debug telemetry-related issues:
Verify Collector (gnmic) Command Options: Verify the input parameters, such as the sensor path, prefix and origin “ipi:”.
Check the Encoding Method Compatibility: Check that the request conforms to the supported encoding methods.
Ensure Proper Connectivity: Validate the connectivity between the router and the remote management system. This involves verifying network settings, ports, firewalls, and any potential disruptions in communication.
Collector: If gnmic does not receive a response, or does not receive the expected response, rerun the request with the “--log” option. If more verbose debug output is needed, consider adding the “--debug” option as well. The gnmic tool displays the possible cause of any error, which helps in debugging the issue.
gNMI Server: If the issue is on the server side, follow the steps below to troubleshoot telemetry issues on the OcNOS target. Enable debug and verify the logs in the /var/log/messages file.
1. In configure mode, enable debug at the “info” or “debug” severity level using the debug telemetry gnmi command:
Note: To disable telemetry debugging, use the debug telemetry gnmi (disable) command.
2. In Exec mode, enable telemetry-related debugging using the “debug cml enable telemetry” command:
Note: To disable telemetry-related debugging, use the “debug cml disable telemetry” command.
3. To check the state of streaming telemetry, collect the output of the following commands based on the telemetry mode:
Note:  
For Dial-out mode, Subscription status could become inactive for the following reasons:
Sensor group(s) and destination group(s) are not configured
Destination group(s) are not configured
Sensor group(s) are not configured
Sensor-group(s) do not have any sensor-path(s) configured, and destination-group(s) do not have any tunnel-server(s) configured
Destination-group(s) do not have any tunnel-server(s) configured
Sensor-group(s) do not have any sensor-path(s) configured
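The inactive conditions listed above can be checked mechanically against a subscription's configuration. The following Python sketch uses a hypothetical in-memory representation (dictionaries mapping group names to their members) and flags every empty group. Whether OcNOS marks a subscription inactive when only some of its groups are empty is not stated in this guide, so treat the result as a checklist rather than a prediction of device behavior.

```python
def inactive_reasons(sensor_groups: dict, destination_groups: dict) -> list:
    """Return the reasons a dial-out subscription could be inactive.

    sensor_groups:      sensor-group name -> list of sensor paths
    destination_groups: destination-group name -> list of tunnel servers
    """
    reasons = []
    if not sensor_groups:
        reasons.append("no sensor group configured")
    if not destination_groups:
        reasons.append("no destination group configured")
    for name, paths in sensor_groups.items():
        if not paths:
            reasons.append(f"sensor group {name} has no sensor path")
    for name, servers in destination_groups.items():
        if not servers:
            reasons.append(f"destination group {name} has no tunnel server")
    return reasons


# Group names and values taken from the show streaming-telemetry example.
print(inactive_reasons(
    {"Platform": ["ipi:/components/component[name=CHASSIS]/state"]},
    {"tunnel-1": ["10.12.66.160:11161"]},
))
print(inactive_reasons({"Platform": []}, {}))
```

An empty list means none of the listed configuration gaps apply, in which case connectivity to the tunnel server is the next thing to check.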
Note: If telemetry is in the “disabled” state, the telemetry feature needs to be enabled.
4. Collect the output of the following command to gather diagnostic information and the logs in /var/log/messages file, to triage further.
show techsupport gnmi