ECS – xDoctor: “One or more network interfaces are down or missing”

Article Number: 503814 Article Version: 5 Article Type: Break Fix



Elastic Cloud Storage, ECS Appliance, ECS Appliance Hardware



xDoctor reports the following warning:

admin@ecs1:~> sudo -i xdoctor --report --archive=2017-09-01_064438 -CEW
Displaying xDoctor Report (2017-09-01_064438) Filter:['CRITICAL', 'ERROR', 'WARNING'] ...
Timestamp = 2017-09-01_064438
Category = platform
Source = ip show
Severity = WARNING
Node = 169.254.1.1
Message = One or more network interfaces are down or missing
Extra = {'169.254.1.4': ['slave-0']}
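The Extra field names the affected node and interface (here node 169.254.1.4, interface slave-0). As a minimal sketch, assuming the field is always printed on one line in the dictionary-style layout shown in this report, the helper below extracts each node address together with its interfaces. The function name xdoctor_extra_pairs is hypothetical and not part of xDoctor.

```shell
# Hypothetical helper, not part of xDoctor: reads report text on stdin and
# prints one "<node-ip> <interface list>" line per entry of the Extra field.
# Assumes entries look like  '169.254.1.4': ['slave-0']  as in the report above.
xdoctor_extra_pairs() {
  grep -o "'[0-9.]*': *\[[^]]*\]" |           # isolate each 'ip': [...] entry
    sed -e "s/[][']//g" -e 's/: */ /'         # drop quotes/brackets, "ip: x" -> "ip x"
}

# Example (on a node where xDoctor is installed):
#   sudo -i xdoctor --report --archive=2017-09-01_064438 -CEW | xdoctor_extra_pairs
```

The output gives the private address of the node to connect to in the next step.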

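Connect to the affected node (169.254.1.4 in this example) and list its LLDP neighbors. Here only hare (via slave-1) and turtle (via private) are listed. There is no neighbor for slave-0, which means the connection to the rabbit switch is down: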

admin@ecs4:~> sudo lldpcli show neighbor
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: slave-1, via: LLDP, RID: 1, Time: 28 days, 16:42:58
  Chassis:
    ChassisID: mac 44:4c:a8:f5:63:ad
    SysName: hare
    SysDescr: Arista Networks EOS version 4.16.6M running on an Arista Networks DCS-7050SX-64
    MgmtIP: 192.168.219.253
    Capability: Bridge, on
    Capability: Router, off
  Port:
    PortID: ifname Ethernet12
    PortDescr: MLAG group 4
-------------------------------------------------------------------------------
Interface: private, via: LLDP, RID: 2, Time: 28 days, 16:42:44
  Chassis:
    ChassisID: mac 44:4c:a8:d1:77:b9
    SysName: turtle
    SysDescr: Arista Networks EOS version 4.16.6M running on an Arista Networks DCS-7010T-48
    MgmtIP: 192.168.219.251
    Capability: Bridge, on
    Capability: Router, off
  Port:
    PortID: ifname Ethernet4
    PortDescr: Nile Node04 (Data)
-------------------------------------------------------------------------------
admin@ecs4:~>
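Spotting a missing neighbor by eye is easy to get wrong on a busy node. The following sketch, which assumes the plain-text lldpcli layout shown above, prints every expected interface that has no LLDP neighbor entry. The function name missing_lldp_neighbors is hypothetical.

```shell
# Hypothetical helper: prints each expected interface that has no LLDP
# neighbor. Arguments are the interface names to check; stdin is the output
# of `lldpcli show neighbor` in the text layout shown above.
missing_lldp_neighbors() (
  # Collect the interface names that do have a neighbor entry.
  seen=$(sed -n 's/^ *Interface: *\([^,]*\),.*/\1/p')
  for ifc in "$@"; do
    printf '%s\n' "$seen" | grep -qxF -- "$ifc" || printf '%s\n' "$ifc"
  done
)

# Example on the affected node:
#   sudo lldpcli show neighbor | missing_lldp_neighbors slave-0 slave-1
```

For the output above, the check would print slave-0, because only slave-1 and private have neighbor entries.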

Check the configuration of the public bond interface and confirm that all nodes use the same bonding mode:

admin@ecs4:~> sudo cat /etc/sysconfig/network/ifcfg-public
BONDING_MASTER=yes
BONDING_MODULE_OPTS="miimon=100 mode=4 xmit_hash_policy=layer3+4"
BONDING_SLAVE0=slave-0
BONDING_SLAVE1=slave-1
BOOTPROTO=static
IPADDR=10.x.x.x/22
MTU=1500
STARTMODE=auto
admin@ecs4:~>

admin@ecs4:~> viprexec -i "grep Mode /proc/net/bonding/public"
Output from host : 192.168.219.1
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Output from host : 192.168.219.2
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Output from host : 192.168.219.3
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Output from host : 192.168.219.4
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
admin@ecs4:~>
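In the Linux bonding driver, mode=4 is IEEE 802.3ad, so the ifcfg-public setting and the /proc output agree. To confirm that every node reports the same mode without reading each line, a small filter such as the following could be used. It is only a sketch that assumes the viprexec output format shown above; the function name distinct_bonding_modes is hypothetical.

```shell
# Hypothetical helper: reads the combined `grep Mode /proc/net/bonding/public`
# output from all nodes on stdin and prints the distinct bonding modes found.
# Exactly one line of output means every node uses the same mode.
distinct_bonding_modes() {
  sed -n 's/^Bonding Mode: *//p' | sort -u
}

# Example:
#   viprexec -i "grep Mode /proc/net/bonding/public" | distinct_bonding_modes
```

More than one line of output would point to a node whose bond is configured differently from the rest.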

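Check the link status of the bond interfaces on all nodes, then query the affected interface with ethtool. In this example slave-0 on node 4 (192.168.219.4) reports NO-CARRIER and state DOWN, and ethtool reports no link: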

admin@ecs4:~> viprexec -i 'ip link show | egrep "slave-|public"'
Output from host : 192.168.219.1
bash: public: command not found
3: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
Output from host : 192.168.219.2
bash: public: command not found
3: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
Output from host : 192.168.219.3
bash: public: command not found
4: slave-0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
Output from host : 192.168.219.4
bash: public: command not found
2: slave-0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master public state DOWN mode DEFAULT group default qlen 1000
5: slave-1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master public state UP mode DEFAULT group default qlen 1000
10: public: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
admin@ecs4:~>

admin@ecs4:~> sudo ethtool slave-0
Settings for slave-0:
        Supported ports: [ FIBRE ]
        Supported link modes: 10000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: No
        Advertised link modes: 10000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Other
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no
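The two checks above can be condensed into small filters. This is a sketch only, assuming the ECS interface naming (slave-N) and the standard ip and ethtool text output shown above. The function names down_slaves and link_detected are hypothetical.

```shell
# Hypothetical helpers for the link checks above.

# Reads `ip link show` output on stdin and prints bond members (slave-N)
# that report NO-CARRIER or "state DOWN".
down_slaves() {
  awk '/^[0-9]+: slave-/ && (/NO-CARRIER/ || /state DOWN/) {
         name = $2              # second field, e.g. "slave-0:"
         sub(/:$/, "", name)    # strip the trailing colon
         print name
       }'
}

# Reads `ethtool <iface>` output on stdin; succeeds only if the link is up.
link_detected() {
  grep -q 'Link detected: yes'
}

# Example on a single node:
#   ip link show | down_slaves
#   sudo ethtool slave-0 | link_detected || echo "slave-0: no link"
```

A bond member listed by down_slaves that also fails link_detected indicates a physical-layer problem (SFP, cable, or switch port) rather than a bonding configuration issue.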

Refer to the ECS Hardware Guide to identify the switch port that the affected interface is cabled to.

The ECS Hardware Guide is available in SolVe as well as at support.emc.com:

https://support.emc.com/docu62946_ECS-3.1-Hardware-Guide.pdf?language=en_US

Port 12 on the rabbit switch is connected to the slave-0 interface of node 4.

Connect to rabbit with admin credentials from a different node and check the status of that port:

admin@ecs1:~> ssh rabbit
Password:
Last login: Tue Sep 5 11:13:30 2017 from 192.168.219.1
rabbit>show interfaces Ethernet12
Ethernet12 is down, line protocol is notpresent (notconnect)
  Hardware is Ethernet, address is 444c.a8de.8f83 (bia 444c.a8de.8f83)
  Description: MLAG group 4
  Member of Port-Channel4
  Ethernet MTU 9214 bytes , BW 10000000 kbit
  Full-duplex, 10Gb/s, auto negotiation: off, uni-link: n/a
  Loopback Mode : None
  0 link status changes since last clear
  Last clearing of "show interface" counters never
  5 minutes input rate 0 bps (0.0% with framing overhead), 0 packets/sec
  5 minutes output rate 0 bps (0.0% with framing overhead), 0 packets/sec
     0 packets input, 0 bytes
     Received 0 broadcasts, 0 multicast
     0 runts, 0 giants
     0 input errors, 0 CRC, 0 alignment, 0 symbol, 0 input discards
     0 PAUSE input
     0 packets output, 0 bytes
     Sent 0 broadcasts, 0 multicast
     0 output errors, 0 collisions
     0 late collision, 0 deferred, 0 output discards
     0 PAUSE output
rabbit>
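The packet counters are the key evidence here: with counters that were never cleared, zero input and output packets means the port has never carried traffic. A minimal sketch of that check, assuming the output of show interfaces has been saved to a file or is piped in as text, and using the hypothetical function name port_never_passed_traffic:

```shell
# Hypothetical helper: reads Arista `show interfaces EthernetN` output on
# stdin and succeeds when both packet counters are zero, i.e. the port has
# not passed traffic since the counters were last cleared.
port_never_passed_traffic() {
  awk '/packets input,/  { in_pkts  = $1; seen_in  = 1 }
       /packets output,/ { out_pkts = $1; seen_out = 1 }
       END { exit !(seen_in && seen_out && in_pkts == 0 && out_pkts == 0) }'
}

# Example, with the switch output saved to a local file (file name is illustrative):
#   port_never_passed_traffic < ethernet12.txt && echo "port has never passed traffic"
```

A port that shows zero counters together with "0 link status changes" points to a problem that has existed since installation rather than a recent failure.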

The switch side also reports the link as down (notconnect), and the counters show that no traffic has ever passed through this interface.



The SFP was not properly seated during the installation phase.



The customer re-seated the SFP module. The link was then detected automatically and came online.

If re-seating does not restore the link, a CE needs to go onsite to physically inspect the SFP module, the cable, and any other component connecting the slave-x interface on the node to the switch.

Extract from ECS Hardware Guide:

Network cabling

The network cabling diagrams apply to U-Series, D-Series, or C-Series ECS Appliance in a Dell EMC or customer-provided rack.

To distinguish between the three switches, each switch has a nickname:
  • Hare: 10 GbE public switch is at the top of the rack in a U- or D-Series or the top switch in a C-Series segment.
  • Rabbit: 10 GbE public switch is located just below the hare in the top of the rack in a U- or D-Series or below the hare switch in a C-Series segment.
  • Turtle: 1 GbE private switch that is located below rabbit in the top of the rack in a U-Series or below the hare switch in a C-Series segment.
U- and D-Series network cabling

The following figure shows a simplified network cabling diagram for an eight-node configuration for a U- or D-Series ECS Appliance as configured by Dell EMC or a customer in a supplied rack. Following this figure, other detailed figures and tables provide port, label, and cable color information.

Switches cabling

The ECS Hardware Guide labels the rabbit and hare switches as switch 1 and switch 2, which can cause confusion when verifying the cabling.

The mapping below matches the switches to the node ports. The picture also shows the corresponding switch port numbers.

Switch 1 = Rabbit = Bottom switch

Switch 2 = Hare = Top switch

Node ports:

Slave-0 = P01 = right port – connects to Switch 1 / Rabbit / Bottom switch

Slave-1 = P02 = left port – connects to Switch 2 / Hare / Top switch
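For scripted checks, the mapping above can be kept as a small lookup. The function below is a hypothetical illustration that encodes only the assignments listed in this article.

```shell
# Hypothetical lookup based on the mapping above: prints the switch and node
# port that a bond member interface is cabled to.
switch_for_slave() {
  case "$1" in
    slave-0) echo "rabbit (switch 1, bottom switch), node port P01 (right)" ;;
    slave-1) echo "hare (switch 2, top switch), node port P02 (left)" ;;
    *)       echo "unknown interface: $1" >&2; return 1 ;;
  esac
}

# Example:
#   switch_for_slave slave-0
```

In the case described in this article, the failed interface was slave-0, so the switch to inspect was rabbit.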

User-added image
