OcNOS-SP : Troubleshooting Guide : System Management
System Management
This chapter contains steps to resolve system management issues.
 
Symptom/Cause
Solution
Non availability of telnet/ssh service
While the node is booting, all remote access is disabled. The xinetd service starts once hostpd has started.
Make sure that hostpd is running, or was started during the init sequence of board initialization, and that the xinetd service is running. Run the following command at the Linux prompt and verify the listening sockets:
#ip netns exec zebosfib1 netstat -tpln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:705 0.0.0.0:* LISTEN 30044/snmpd
tcp 0 0 0.0.0.0:199 0.0.0.0:* LISTEN 30044/snmpd
tcp6 0 0 :::22 :::* LISTEN 29997/xinetd
tcp6 0 0 :::23 :::* LISTEN 29997/xinetd
tcp6 0 0 :::830 :::* LISTEN 29997/xinetd
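The manual check of the netstat output can be sketched as a small filter. This is only an illustration: the function names port_listening and check_remote_access are hypothetical, and the sketch assumes the column layout shown in the sample output, with ssh on port 22 and telnet on port 23.

```shell
#!/bin/sh
# port_listening PORT: reads `netstat -tpln` output on stdin and succeeds
# if some socket in LISTEN state is bound to PORT.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # Local address looks like ":::22" or "0.0.0.0:199";
            # the port is the text after the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# check_remote_access: reads netstat output on stdin and reports the state
# of the ssh (22) and telnet (23) ports.
check_remote_access() (
    out=$(cat)
    for port in 22 23; do
        if printf '%s\n' "$out" | port_listening "$port"; then
            echo "port $port: listening"
        else
            echo "port $port: NOT listening"
        fi
    done
)

# Usage on the device (namespace name taken from the command above):
#   ip netns exec zebosfib1 netstat -tpln | check_remote_access
```

If either port is reported as not listening, verify hostpd and xinetd as described above.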
Failure to authenticate a user
If the basic Linux authentication files for a user are missing or corrupted, login to the node is denied. As the root user on the console, make sure the /etc/passwd file has an entry for the user trying to log in. For more information about such failures, look for authentication errors in /var/log/messages.
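The two checks above could be run from the console root shell roughly as follows. This is a sketch: the helper names are hypothetical, and the failure keywords in auth_errors are a guess, not a catalog of OcNOS log messages.

```shell
#!/bin/sh
# user_in_passwd USER [FILE]: succeeds if FILE (default /etc/passwd) has an
# entry whose first colon-separated field is exactly USER.
user_in_passwd() {
    awk -F: -v user="$1" '
        $1 == user { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/passwd}"
}

# auth_errors USER [LOG]: prints log lines that contain a failure keyword
# and mention USER. The keyword list is an assumption.
auth_errors() {
    grep -i -E 'fail|denied|invalid' "${2:-/var/log/messages}" | grep -F -- "$1"
}

# Usage (the user name "admin" is only an example):
#   user_in_passwd admin || echo "no /etc/passwd entry for admin"
#   auth_errors admin
```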
Remote access to the node via telnet/ssh hangs
The shell imish/cmlsh is configured for all OcNOS users, except for user root, which is accessible via console only. If the module imi or cmld is not responding, then there will be no imish/cmlsh prompt after successful login.
 
The system monitoring module (pservd) restarts such hung modules, recovering one or more modules from the hung state. Check the core directory (/var/log/crash/cores) and the syslog messages in /var/log/messages to find the actions taken by the system monitoring module.
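A quick way to look at both locations is sketched below. The helper names are hypothetical, and the sketch assumes that the monitoring module's syslog lines contain the string pservd, which may not hold on every release.

```shell
#!/bin/sh
# list_cores [DIR]: prints the core files in DIR, newest first.
# Prints nothing if DIR is empty or missing.
list_cores() {
    ls -1t "${1:-/var/log/crash/cores}" 2>/dev/null
}

# monitor_actions [LOG]: prints syslog lines that mention pservd
# (assumed tag for the system monitoring module).
monitor_actions() {
    grep -F 'pservd' "${1:-/var/log/messages}"
}

# Usage:
#   list_cores
#   monitor_actions | tail -n 20
```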
Continuous restart of any module
If a module restarts continuously, disable monitoring of that module with:
 
no software-watchdog <module name>
 
If the NSM/HSL module crashes or hangs, the system reboots.
 
However, if the previous two reboots were caused by HSL or NSM crashes within the first few minutes of board boot-up, the system does not reboot automatically. This prevents a continuous reboot loop caused by NSM/HSL crashes.
 
There is no mechanism to disable this behavior other than stopping the pservd service, that is, service pservd stop.
 
If the module pservd is hung, it will restart in 5 minutes.
Deleting ZebOS.conf loses management IP address
During ONIE installation, if you do not configure a static IP address, OcNOS boots and gets an IP address for eth0 (management port) through DHCP and updates the /etc/network/interfaces file. Once you configure a static IP address from the OcNOS command line and save the configuration, OcNOS updates /etc/network/interfaces and changes the method used to configure eth0 from dhcp to static.
In this scenario, if you delete ZebOS.conf, then the management IP address is lost and you can only recover management access by assigning an IP address via the console.
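Before deleting ZebOS.conf, it can help to confirm which method eth0 currently uses. The sketch below assumes /etc/network/interfaces follows the usual Debian ifupdown stanza form (iface eth0 inet dhcp or iface eth0 inet static); the function name iface_method is hypothetical.

```shell
#!/bin/sh
# iface_method IFACE [FILE]: prints the address method (for example dhcp or
# static) from the first "iface IFACE inet METHOD" line in FILE.
iface_method() {
    awk -v ifc="$1" '
        $1 == "iface" && $2 == ifc && $3 == "inet" { print $4; exit }
    ' "${2:-/etc/network/interfaces}"
}

# Usage:
#   if [ "$(iface_method eth0)" = static ]; then
#       echo "eth0 is static: keep console access ready before deleting ZebOS.conf"
#   fi
```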
sys-update install <installer> failure
No free space left on system. Minimum 1 GB space is needed: remove files so that more than 1 GB is available on the device.
Binaries not compatible with the board: use the proper installer file for the respective board.
Installer not downloaded properly, try again: the downloaded installer file is incomplete.
Source Interface not found.
OcNOS version you are trying to upgrade is already Installed: no need to upgrade again, you have the same version already installed.
File not found on board: installer file is not present on board for given path, provide valid path for installer file.
File not found on server: installer file is not present on the server provided in the link, provide valid link for installer file.
Server connection timed out: waited 60 seconds for server to respond.
Unsupported protocol: the ftp, http, tftp, and file protocols are supported.
Invalid installer: installer file is not valid.
Device license is not compatible with new software, upgrading will invalidate device license: the current software and the new software are not the same, so a new license matching the new image must be loaded.
% Source interface is not up: ensure the source interface is up.
Note: If the sys-update operation stops without any error, check that the server hosting the installer is reachable over IP.
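Two of the failures above can be checked from the Linux prompt before starting the upgrade. The sketch below makes assumptions that the list does not state: which filesystem needs the 1 GB (the path is left as a parameter), and that the four protocols appear as URL scheme prefixes such as http:// and file://. All function names are hypothetical.

```shell
#!/bin/sh
# free_kb PATH: available space, in KB, on the filesystem that holds PATH.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_space PATH [MIN_KB]: succeeds if at least MIN_KB is available.
# Default is 1048576 KB (1 GB), the minimum named in the list above.
has_free_space() {
    [ "$(free_kb "$1")" -ge "${2:-1048576}" ]
}

# supported_protocol URL: succeeds for the ftp, http, tftp, and file schemes.
supported_protocol() {
    case "$1" in
        ftp://*|http://*|tftp://*|file://*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (path and URL are placeholders):
#   has_free_space / || echo "free up space before running sys-update"
#   supported_protocol "$url" || echo "unsupported protocol in $url"
```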
sys-update install <deb package> failure
No free space left on system. Minimum 1 GB space is needed: remove files so that more than 1 GB is available on the device.
Unsupported protocol: the ftp, http, tftp, and file protocols are supported.
Unsupported OCNOS image format (need:*.deb): deb package name should be like <filename>.deb.
Kernel changes are present in this version, sysupdate not possible: upgrade using installer.
Binaries not compatible with the board: use proper installer file for the respective board.
OcNOS version you are trying to upgrade is already Installed: no need to upgrade again, you have the same version already installed.
Non-ZEBM to ZEBM upgrade using *.deb not allowed: use installer for non-ZEBM to ZEBM upgrade.
ZEBM to non-ZEBM upgrade using *.deb not allowed: use installer for ZEBM to non-ZEBM upgrade.
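The rules in this list about when a *.deb upgrade is allowed can be summarized as a small decision helper. This is a sketch only: how to find out whether a release carries kernel changes or is a ZEBM build is not covered here, so those facts are passed in as yes/no arguments, and both function names are hypothetical.

```shell
#!/bin/sh
# is_deb_name NAME: succeeds if NAME has the form <filename>.deb.
is_deb_name() {
    case "$1" in
        ?*.deb) return 0 ;;
        *) return 1 ;;
    esac
}

# upgrade_method KERNEL_CHANGED FROM_ZEBM TO_ZEBM: each argument is yes or no.
# Prints "installer" when the new version has kernel changes or when the
# upgrade crosses between ZEBM and non-ZEBM; otherwise prints "deb".
upgrade_method() {
    if [ "$1" = yes ] || [ "$2" != "$3" ]; then
        echo installer
    else
        echo deb
    fi
}

# Usage:
#   upgrade_method no yes yes    # same build type, no kernel changes
#   is_deb_name "$package" || echo "package name must end in .deb"
```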
License Troubleshooting
Note: If you install OcNOS SP 3.0 (or later) for the first time on a device, activate the license, and then install any version earlier than OcNOS SP 3.0, the activated license is deactivated. To recover, the license must be activated/installed again.
Symptom/Cause
Error
Solution
Failure: license get <url> / license refresh
License file (IPI-DEVICEID.bin) Not Found
The license is not present on the system. Use "license get <url>" to install the license.
License installation failed due to incorrect Device ID in the License file, please use the relevant device specific license file
The downloaded license is not for the current device and has been removed. Use "license get <url>" to install the correct license.
The allowed time to process the license is expired, please download the fresh license from the FNO portal
The lifetime of the license file has expired. This is not an actual license expiration error, and the lifetime value is not visible to the user. Download the license from the FNO portal again using "license get <url>".
 
Note: The Lifetime field is the lifetime of the capability response, in seconds, after which the response is considered “stale” and cannot be processed by the client or server. IPInfusion has the lifetime set to 3628800 seconds or 42 days. If a capability response is created and held without installing for more than the specified period (42 days), it turns stale and the target device would not be able to process this.
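The arithmetic in the note can be checked with a short sketch. Because the lifetime value is not visible to the user, the sketch assumes you recorded the time the capability response was generated yourself, as epoch seconds; the function name response_is_stale is hypothetical.

```shell
#!/bin/sh
# Lifetime from the note above: 42 days * 86400 seconds per day.
LIFETIME_SECONDS=3628800

# response_is_stale CREATED_EPOCH [NOW_EPOCH]: succeeds if the response is
# older than the lifetime. NOW_EPOCH defaults to the current time.
response_is_stale() (
    now=${2:-$(date +%s)}
    [ $((now - $1)) -gt "$LIFETIME_SECONDS" ]
)

# Usage ("$created" is the epoch time you noted when generating the license):
#   response_is_stale "$created" && echo "download a fresh license from the portal"
```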
Response is out of order with previous responses. Also, show license does not reflect the new license features.
A license that was downloaded more recently than the current license has already been installed. Once this error occurs, reinstalling either of these two licenses no longer helps. Download a fresh license from the portal and install it using the "license get <url>" command.
Failed to create trusted storage
Remove the contents of /cfg/license/ then install the license using "license get <url>" CLI.
Invalid license file
License file might be corrupt, so download and install the license from FNO license portal using "license get <url>". If it still fails, validate the checksum of the license file in /cfg/license/bin/ with the one downloaded from the FNO portal.
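The checksum comparison could look like the sketch below. The checksum tool is not specified here, so POSIX cksum is used as a stand-in; any digest tool available on the device works the same way. The function name and both file paths in the usage line are placeholders.

```shell
#!/bin/sh
# same_checksum FILE_A FILE_B: succeeds if both files are readable and
# produce the same cksum output (CRC and byte count).
same_checksum() {
    [ -r "$1" ] && [ -r "$2" ] &&
        [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Usage (replace both paths with the installed file and your reference copy):
#   same_checksum /cfg/license/bin/<license file> <copy from the FNO portal> \
#       || echo "license file differs from the portal copy"
```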
Start date for the license is in the future
Correct the system clock and issue the "license refresh" command to install the license.
Empty license file
Download the license from FNO license portal, and install using "license get <url>".
Failed to process capability response / Failed to process the license file
Remove the contents of /cfg/license/ then install the license using "license get <url>" CLI.
The "license get" command does not install the given license file; instead, it processes the old license and fails.
Correct the system clock and issue "license get <url>" to install the license again.
 
License is not matching with device software
The license file SKU is not compatible with the device software. Map the right SKU, then generate and install the license.
 
Empty license response received: license is not mapped with SKU or the license server exhausted its limit
Select a SKU when generating the license from the FNO license portal, or increase the license pool on the license server to accommodate more devices.