Large Dataset Design – Hardware Layout & Installation Considerations

In this next article in the series, we’ll take a look at some of the significant aspects of large cluster physical design and hardware installation.



Most Isilon nodes use a 35-inch-deep chassis and will fit in a standard-depth data center cabinet. However, high-capacity models such as the HD400 and A2000 have a 40-inch-deep chassis and require an extended-depth cabinet such as the APC 3350 or the Dell EMC Titan-HD rack.



hardware_1.png

Additional room must be provided for opening the FRU service trays at the rear of the nodes and, in Gen6 hardware, the disk sleds at the front of the chassis. Isilon nodes are either 2RU or 4RU in height (with the exception of the 1RU diskless accelerator and backup accelerator nodes).



Note that the Isilon A2000 nodes can also be purchased as a 7.2PB turnkey pre-racked solution.



Weight is another critical factor to keep in mind. An individual 4RU chassis can weigh up to around 300 lbs, so the floor tile capacity of each individual cabinet or rack must be considered. For the large archive node styles (HD400 and A2000), the considerable node weight may prevent racks from being fully populated with Isilon equipment. If the cluster uses a variety of node types, installing the larger, heavier nodes at the bottom of each rack and the lighter chassis at the top helps distribute weight evenly across the cluster racks' floor tiles.
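As a rough planning aid, the per-rack weight can be budgeted against the floor tile rating before any equipment arrives. The short Python sketch below is purely illustrative: the chassis weights, rack overhead and tile rating are hypothetical placeholders, and the real figures should come from the node specification sheets and the site survey.

    # Rack weight budgeting sketch (all figures hypothetical; use the node
    # spec sheets and site survey for real values).
    CHASSIS_WEIGHT_LBS = {
        "gen6_4ru": 300,        # fully populated 4RU chassis, per the ~300 lb figure above
        "accelerator_1ru": 30,  # hypothetical weight for a 1RU accelerator node
    }

    FLOOR_TILE_RATING_LBS = 2000   # hypothetical per-tile rating; confirm with facilities
    RACK_OVERHEAD_LBS = 400        # hypothetical allowance for rack frame, PDUs and cabling

    def rack_weight(chassis_types):
        """Return the estimated total weight of a populated rack in pounds."""
        return RACK_OVERHEAD_LBS + sum(CHASSIS_WEIGHT_LBS[c] for c in chassis_types)

    # Example: a rack with five fully populated 4RU chassis.
    total = rack_weight(["gen6_4ru"] * 5)
    status = "OK" if total <= FLOOR_TILE_RATING_LBS else "exceeds tile rating"
    print(f"Estimated rack weight: {total} lbs ({status})")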



There are no lift handles on a Gen6 chassis; however, the drive sleds can be removed to provide handling points if no lift is available. With all the drive sleds removed but the rear compute modules left inserted, the chassis weight drops to a more manageable 115 lbs. Using a lift is strongly recommended for installing Gen6 chassis and earlier-generation 4RU nodes.

Ensure that smaller Ethernet switches are drawing cool air from the front of the rack, not from inside the cabinet, since they are shorter than the InfiniBand switches; this can be achieved either with switch placement or by using rack shelving. Cluster back-end switches ship with the appropriate rails (or tray) for proper installation of the switch in the rack. These rail kits are adjustable to fit NEMA front-rail-to-rear-rail spacing ranging from 22 in to 34 in.

Note that the Celestica Ethernet switch rails are designed to overhang the rear NEMA rails to align the switch with the Generation 6 chassis at the rear of the rack. These require a minimum clearance of 36 in from the front NEMA rail to the rear of the rack, in order to ensure that the rack door can be closed.

Consider the following large cluster topology, for example:



hardware_2.png

This contiguous eleven-rack architecture is designed to scale up to ninety-six 4RU nodes as the environment grows, while keeping cable management simple and taking as much of the considerable weight of the InfiniBand cables off the connectors as possible.



Best practices include:



  • Pre-allocate and reserve adjacent racks in the same aisle to fully accommodate the anticipated future cluster expansion.
  • Reserve an empty 4RU ‘mailbox’ slot above the center of each rack for pass-through cable management.
  • Dedicate the central rack in the group for the back-end and front-end switches – in this case rack F (image below).



Below, the top two Ethernet switches provide front-end connectivity, and the lower two InfiniBand switches handle the cluster’s redundant back-end connections.



hardware_3.png



Image showing cluster Front and Back-end Switches (Rack F Above)



The 4RU “mailbox” space is utilized for cable pass-through between node racks and the central switch rack. This allows cabling runs to be kept as short and straight as possible.



hardware_4.png



Rear of Rack View Showing Mailbox Space and Backend Network Cabling (Rack E Above)



Excess cabling can be neatly stored in 12” service coils on a cable tray above the rack, if available, or at the side of the rack as illustrated below.



hardware_5.png

Rack Side View Detailing Excess Cable Coils (Rack E Above)



A successful large cluster infrastructure depends heavily on the proficiency of the installer and on how well the layout is optimized for maintenance and future expansion.



Note that for Hadoop workloads, Isilon is compatible with the HDFS rack awareness feature, which balances data placement; rack locality keeps the data flow internal to the rack.
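For reference, HDFS rack awareness is normally wired up by pointing the net.topology.script.file.name property in core-site.xml at a script that maps node addresses to rack paths. The sketch below is a minimal, hypothetical topology script; the addresses and rack names are placeholders and would need to mirror the actual rack layout described above.

    #!/usr/bin/env python3
    # Minimal HDFS topology script sketch. Hadoop invokes it with one or more
    # host names/IP addresses as arguments and expects one rack path per
    # argument on stdout. The mapping below is purely illustrative.
    import sys

    RACK_MAP = {
        "10.1.5.11": "/row1/rack-e",
        "10.1.5.12": "/row1/rack-e",
        "10.1.6.11": "/row1/rack-g",
    }

    DEFAULT_RACK = "/default-rack"

    for host in sys.argv[1:]:
        print(RACK_MAP.get(host, DEFAULT_RACK))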

Related:

Switch questions

Hi

The customer I am working with has an RPQ approved to use their own switches for Hare/Rabbit. They want some clarification on the points below. Can you please help?

– They want to use DAC cables, which means they won’t be using Dell EMC-provided SFPs. Do we foresee any issues – performance or otherwise?

– They are currently planning to get the install done with one switch and later provide 2 new switches (one to add and one to replace). There will be degradation when the old switch is being replaced. Do we foresee any issues?

– Must the hare/rabbit switches support vPC?

Also, if any of the above is possible and will have no major issues – do any of these need a fresh RPQ?

Thanks in advance for any help.

Regards, Ram.

Related:

Re: Brocade 48K Merge into Existing Fabrics

Hi All,

We are trying to re-deploy these switches, which were shipped from a different datacenter, and we are now targeting to bring them into (merge them with) the production/existing fabrics this weekend. As part of the process, we requested the network connections; at this point in time I have remote login (serial connection) via Cyclade.

I need this connection only once, to set the IP address with ipaddrset. Once that is done, I can telnet/SSH in and configure the switch up front: configupload from an existing switch, modify parameters (like a unique domain ID), and then configdownload to the switch to be merged. For this process, can you please give the high-level steps to be executed?

I have already cleared the zone configurations.

High-level steps to merge a Brocade 48K into production fabrics.

Thanks in Advance

Related:

Re: someone please tell me how to shutdown the Cisco MDS 9513 switch and restart it after

Before answering your question, it is important to ask what exactly you are trying to do.

You can turn the ‘power switch ****’ off to power the switch off and, conversely, turn the ‘power switch ****’ on to power it back on, if that is what you want to know. However, if the switch is in production and you are planning a move, etc., you need to follow additional steps to gracefully bring the fabric/switch down before powering off; otherwise the entire fabric will go down.

For your benefit, you may go through the previous topic in this forum for a deeper discussion of the same issue. Have a look at the power supply image below; it shows the ‘power switch ****’ as well.

PS9513.bmp

Related:

How to determine a switch’s role in a Clos network with MLAG’d leaves


A group of us have been experimenting with IP Fabrics recently and I’m curious to know if anyone has used the following approach or knows of work that accomplishes the same goal in a different way.  

When initializing an IP Fabric, one of the first decisions that needs to be made concerns the switch role (e.g., is this a leaf or a spine switch).  There are multiple approaches that could be used to aid with this decision:

  1. Rely on the factory to preset the personality of each switch before it is shipped. The challenge with this approach is that it requires additional information to be supplied by the customer and consumed by the technician in the factory.
  2. Rely on the customer to identify the switch roles (e.g., via MAC Address) as racks are being installed. This approach could be perceived as burdensome to the customer.
  3. Determine the switch role programmatically and allow the customer to override the switch roles if necessary.  The diagram below and the explanation following it provide an overview of how this might be possible if we assume MLAG’d leaves are going to be a part of the topology.

Picture1

 

Step 0: Every switch will provide the Network Controller with the list of switches it is connected to.  This information can be gathered via LLDP. 

Step 1: The Network Controller will separate the switches into two groups; the first group (i.e., A, B and C) will be connected to a common set of other switches (i.e., D, E, F, G, H, I), while the second group (i.e., D, E, F, G, H, I) will not be connected to a common set of switches.

Step 2: Label the common set of switches that the first group is connected to as set Ω and understand that anything connected to Ω is a spine switch and should be included in set Φ.

Step 3: For each member in Ω, remove the members of Φ from its list of connected members.

Step 4: For each member in Ω, use the remaining switch in the list of connected members as its MLAG partner.
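As a rough illustration only, the sketch below walks through Steps 1 through 4 in Python against a hard-coded adjacency map standing in for the LLDP data from Step 0. The switch names mirror the diagram above, and it assumes a two-tier topology with MLAG’d leaf pairs; a real Network Controller would gather the adjacency itself and allow manual overrides, as noted in approach 3.

    from collections import defaultdict

    # Step 0 stand-in: LLDP-derived adjacency (switch -> directly connected switches).
    # Spines A-C connect to every leaf; MLAG'd leaf pairs (D,E), (F,G), (H,I)
    # also connect to each other over their peer links.
    adjacency = {
        "A": {"D", "E", "F", "G", "H", "I"},
        "B": {"D", "E", "F", "G", "H", "I"},
        "C": {"D", "E", "F", "G", "H", "I"},
        "D": {"A", "B", "C", "E"}, "E": {"A", "B", "C", "D"},
        "F": {"A", "B", "C", "G"}, "G": {"A", "B", "C", "F"},
        "H": {"A", "B", "C", "I"}, "I": {"A", "B", "C", "H"},
    }

    # Steps 1-2: group switches by neighbour set. The switches that share an
    # identical neighbour set are the spines (set Φ); that shared neighbour
    # set is Ω, the leaves.
    groups = defaultdict(list)
    for switch, neighbours in adjacency.items():
        groups[frozenset(neighbours)].append(switch)

    spines, omega = [], set()
    for neighbour_set, members in groups.items():
        if len(members) > 1:
            spines, omega = sorted(members), set(neighbour_set)

    # Steps 3-4: for each leaf in Ω, removing the spines (Φ) from its
    # neighbour list leaves exactly one switch, its MLAG partner.
    mlag_partner = {
        leaf: (adjacency[leaf] - set(spines)).pop() for leaf in sorted(omega)
    }

    print("Spines (Φ):", spines)
    print("Leaves (Ω):", sorted(omega))
    print("MLAG partners:", mlag_partner)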

Comments welcome!



Related:

Re: CISCO Fabric Mgr display question

We are adding HP BladeFrame Switches to our environment via ISLs to our Core switches.

In FM, when we go to the “Physical Attributes” frame and select ISLs, we notice that the BFS shows up as the “From Switch” while the Core Switch shows up as the “To Switch”, but then the next switch shows up with the Core Switch as the “From Switch” and the BFS as the “To Switch”.

Is this a timing issue when the ports are initializing or is there a better reason?

Thanks

Related:

Re: Is there a way to use SRM workbench for multi-region config

Hi John.

I encounter this issue frequently and there is a straightforward way around it.

First, make sure the design is in automatic mode and add all the devices you want to discover, but make sure you add each site’s quantities separately. For example, if you have 2 Cisco switches in site A and 2 in site B, then add 2 lots of Cisco switches – 2 switches in each. You’ll get 2 Cisco switch collectors on screen. Repeat for all site devices, separating as required.

Then switch the design to manual mode and start moving the collectors around so that you consolidate a given site’s devices onto 1 or more Collecting VMs – these become that site’s local Collecting requirements. You may have to temporarily increase some of the Collecting VMs’ memory to give you headroom to move things around – remember to reset it after you have finished. You may also have to add additional Collecting VMs, but don’t do that until you really have to, as they can’t easily be deleted.

When done, you should be able to have everything in one SRM infrastructure, but easily identify separate site Collecting VMs. On some of the Collecting VM graphics, you can even edit the names to make more sense.

Hope this helps.

Related:

7019983: Reveal Switches

This document (7019983) is provided subject to the disclaimer at the end of this document.

Environment

Reveal (all builds)

Situation

What are the switches in Reveal.ini?

Resolution

Aside from the regular settings that are available in the Reveal interface, there are additional switches that can provide even more functionality to the software.

These switches can be added to the settings.ini file found in the C:\Program Files\GWAVA\GWAVA Reveal directory.
To enable a switch, simply type it in and set it = 1; to turn it off, change the value to 0. Here is a list of the known switches in Reveal and a description of each:
  • AllowTrustedAppInSecDom=0 This switch allows you to connect to a secondary domain and view mailboxes and messages within the secondary domain just as you would do in the primary domain. This is turned off by default.
  • UseMultiLoginAddressbookSupport=1 This switch provides extra debug logging within Reveal. By default it is turned on.
  • UseOutgoingSharedFolders=1 This switch will show the shared folders for a mailbox that has shared a folder with other mailboxes. If turned off, it will not show the shared folders within Reveal. By default it is turned on.
  • UseSambaFix=0 This turns on the option to use a SAMBA share to connect to a primary domain on Linux. This is turned off by default.
  • UseFullNamesInTree=1 This option shows you the full name within the mailbox tree list. This is turned on by default.
  • UseArchiveBrowsing=1 This switch allows you to search the archives within GroupWise using Reveal. When turned on it gives you new menu options to specify the locations of the archives. This option must be added in manually to the settings.ini file and by default is turned off.
  • UseSearchInTrash=1 This switch turns on the ability to search messages within the trash container. This option must be added in manually to the settings.ini file and by default is turned off.
  • UseExternalDomains=1 This switch allows you to connect Reveal to external domains. This switch must be added manually into the settings.ini and by default is turned off.
  • CanDelete=1 This switch allows you to delete messages from the live GroupWise system using Reveal. This option must be added in manually to the settings.ini file and by default is turned off. Note: To delete a message right click and click Delete Message.
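For illustration, a settings.ini that enables archive browsing and trash searching while leaving the Samba fix and trusted-application access off might contain entries like the following. This is only a sketch combining switches from the list above; keep any existing entries in the file intact.

    AllowTrustedAppInSecDom=0
    UseSambaFix=0
    UseArchiveBrowsing=1
    UseSearchInTrash=1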

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 1438.

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented “AS IS” WITHOUT WARRANTY OF ANY KIND.

Related: