Re: Maintenance mode during Rebuild / Rebalance?

Hello guys,

Can I put an SDS into Maintenance Mode while a Rebuild/Rebalance is running?

In the ScaleIO 2.0 User Guide I see the following:

To invoke maintenance mode, the following conditions are required:

– Only one Fault Unit (or standalone SDS) can be in maintenance mode at any given time.

– No other SDSs can be in degraded or failed state (force override can be used).

– There must be adequate space on other SDSs for the additional backup (force override can be used).

Note: Use of force override options when entering maintenance mode can lead to data unavailability while maintenance mode is activated.

The Managing Dell EMC ScaleIO Ready Nodes using Dell EMC OpenManage Essentials.pdf guide mentions:

3.2.1 Meet prerequisites for maintenance

The following prerequisites should be met:

• ESXi servers should be in a cluster with high availability (HA), Distributed Resource Scheduling (DRS) and vMotion enabled. This will allow the administrator to update and reboot the server without impacting the virtual machines’ availability.

• The administrator must verify in the ScaleIO GUI that:

• No SDC or SDS is disconnected.

• No other Fault Unit (standalone SDS) is in Maintenance Mode.

• There must be adequate space on other SDSs for additional backup.

• No rebuild or rebalance is running in the background.

• No degraded capacity exists.

• No SDS device is in error state.

Thank you in advance,

Kleanthis

Related:

Software Defined Storage Availability (Part 2): The Math Behind Availability


As we covered in our previous post, ScaleIO can easily be configured to deliver 6-9’s of availability or higher using only 2 replicas, which saves 33% of the cost compared to other solutions while providing very high performance. In this blog we will discuss the facts of availability using math and demystify the myths around ScaleIO’s high availability.

For data loss or data unavailability to occur in a system with two replicas of data (such as ScaleIO), there must be two concurrent failures, or a second failure must occur before the system recovers from the first failure. Therefore, one of the following four scenarios must occur:

  1. Two drive failures in a storage pool OR
  2. Two node failures in a storage pool OR
  3. A node failed followed by a drive failure OR
  4. A drive failed followed by a node failure

Let us choose two popular ScaleIO configurations and derive the availability of each.

  1. 20 x ScaleIO servers deployed on Dell EMC PowerEdge R740xd servers with 24 SSD drives each (1.92TB per SSD), using a 4 x 10GbE network. In this configuration we will assume that the rebuild time is network bound.
  2. 20 x ScaleIO servers deployed on Dell EMC PowerEdge R640 servers with 10 SSD drives each (1.92TB per SSD), using a 2 x 25GbE network. In this configuration we will assume that the rebuild time is SSD bound.

Note: ScaleIO best practices recommend a maximum of 300 drives in a storage pool; therefore, for the first configuration we will configure two storage pools with 240 drives in each pool.

To calculate the availability of a ScaleIO system we will leverage a couple of well-known academic publications:

  1. RAID: High-Performance, Reliable Secondary Storage (from UC Berkeley), and
  2. A Case for Redundant Array of Inexpensive Disks (RAID).

We will adjust the formulas in these papers to the ScaleIO architecture and model the different failures.

Two Drive Failures

We will use the following formula to calculate the MTBF of a ScaleIO system for the two-drive failure scenario:
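A sketch of this formula, adapted (as our assumption) from the MTTDL analysis in the RAID papers cited above so that a second drive failure on the same server is not counted (see the note below), is:

    \mathrm{MTBF}_{\text{drive-after-drive}}\;[\text{years}] \;\approx\; \frac{\mathrm{MTBF}_{drive}^{2}}{N \cdot (G - M) \cdot \mathrm{MTTR}_{drive} \cdot K}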

Where:

  • N = Number of drives in a system
  • G = Number of drives in a storage pool
  • M = Number of drives per server
  • K = 8,760 hours (1 year)
  • MTBFdrive = MTBF of a single drive
  • MTTRdrive = Mean Time to Repair – repair/rebuild time of a failed drive

Note: This formula assumes that two drives that fail in the same ScaleIO SDS (server) will not cause DU/DL as the ScaleIO architecture guarantees that replicas of the same data will NEVER reside on the same physical node.

Let’s assume two scenarios: in the first, the rebuild process is constrained by network bandwidth; in the second, it is constrained by drive bandwidth.

Network Bound

In this case we assume that the rebuild time/performance is limited by the availability of network bandwidth. This will be the case if you deploy a dense configuration such as the Dell R740xd servers with a large number of SSDs in a single server. In this case, the MTTR function is:
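A sketch of such an MTTR function, using the variables defined below plus C_drive for the capacity of the failed drive (a symbol we introduce; the exact form is an assumption), is:

    \mathrm{MTTR}_{network} \;\approx\; \frac{C_{drive}}{S \cdot \mathrm{Network\_Speed}} \cdot \mathrm{Conservative\_Factor}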

Where:

  • S – Number of servers in a ScaleIO cluster
  • Network Speed – Bandwidth in GB/s available for rebuild traffic (excluding application traffic)
  • Conservative_Factor = a factor that adds additional time to complete the rebuild (to be conservative).

Plugging the relevant values into the formula above, we get an MTTR of ~1.5 minutes for the 20 x R740, 24 SSDs @ 1.92TB w/ 4 x 10GbE network connections configuration (two storage pools w/ 240 drives per pool). The 20 x R640, 10 SSDs @ 1.92TB w/ 2 x 25GbE network connections config provides an MTTR of ~2 minutes. These MTTR values reflect the superiority of ScaleIO’s declustered RAID architecture, which results in a very fast rebuild time. In a later post we will show how these MTTR values are critical and how they impact system availability and operational efficiency.

SSD Drive Bound

In this case, the rebuild time/performance is bound by the number of SSD drives and the rebuild time is a function of the number of drives available in the system. This will be the case if you deploy less dense configurations such as the 1U Dell EMC PowerEdge R640 servers. In this case, the MTTR function is:
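A sketch of such an MTTR function, again with C_drive denoting the capacity of the failed drive and the exact form treated as an assumption, is:

    \mathrm{MTTR}_{SSD} \;\approx\; \frac{C_{drive}}{G \cdot \mathrm{Drive\_Speed}} \cdot \mathrm{Conservative\_Factor}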

Where:

  • G – Number of drives in a storage pool
  • Drive_Speed – Drive speed available for rebuild
  • Conservative_Factor = a factor that adds additional time to complete the rebuild (to be conservative).

System availability is calculated by dividing the time that the system is available and running, by the total time the system was running added to the restore time. For availability we will use the following formula:
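Written out, the description above corresponds to the following (with MTBF and RTO expressed in the same time units):

    \mathrm{Availability} \;=\; \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{RTO}}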

Where:

  • RTO – Recovery Time Objective, or the amount of time it takes to recover a system after a data loss event (for example, if two drives fail in a single pool), where data needs to be recovered from a backup system. We will be highly conservative and will consider Data Unavailability (DU) scenarios as bad as Data Loss (DL) scenarios; therefore we will use RTO in the availability formula.

Note: the only purpose of RTO is to translate MTBF to availability.

Node and Device Failure

Next, let’s discuss the system’s MTBF when a node failure is followed by a drive failure. For this scenario we will use the following model:
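A sketch of such a model, analogous to the two-drive formula above (the exact form is our assumption), is:

    \mathrm{MTBF}_{\text{drive-after-node}}\;[\text{years}] \;\approx\; \frac{\mathrm{MTBF}_{server} \cdot \mathrm{MTBF}_{drive}}{S \cdot (G - M) \cdot \mathrm{MTTR}_{server} \cdot K}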

Where:

  • M = Number of drives per node
  • G = Number of drives in the pool
  • S = Number of servers in the system
  • K = Number of hours in 1 year i.e. 8,760 hours
  • MTBFdrive = MTBF of a single drive
  • MTBFserver = MTBF of a single node
  • MTTRserver = repair/rebuild time of failed server

In a similar way, one can develop the formulas for the other failure sequences, such as a node failure after a drive failure and a second node failure after a first node failure.

Network Bound Rebuild Process

In this case we assume that rebuild time/performance is constrained by network bandwidth. We will make similar assumptions as for drive failure. In this case, the MTTR function is:
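A sketch of such an MTTR function, where C_drive again denotes per-drive capacity so that M x C_drive approximates the data to be rebuilt after a node failure (the exact form is an assumption), is:

    \mathrm{MTTR}_{server,\;network} \;\approx\; \frac{M \cdot C_{drive}}{S \cdot \mathrm{Network\_Speed}} \cdot \mathrm{Conservative\_Factor}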

Where:

  • M – Number of drives per server
  • S – Number of servers in a ScaleIO cluster
  • Network Speed – Bandwidth in GB/s available for rebuild traffic (excluding application traffic)
  • Conservative_Factor = a factor that adds additional time to complete the rebuild (to be conservative).

Plugging the relevant values into the formula above, we get an MTTR of ~30 minutes for the 20 x R740, 24 SSDs @ 1.92TB w/ 4 x 10GbE network connections configuration (two storage pools w/ 240 drives per pool). The 20 x R640, 10 SSDs @ 1.92TB w/ 2 x 25GbE network config provides an MTTR of ~20 minutes. During system recovery ScaleIO rebuilt about 48TB of data for the first configuration and about 21TB for the second configuration.

SSD Drive Bound

In this case we assume that the rebuild time/performance is SSD drive bound and the rebuild time is a function of the number of drives available in the system. Using the same assumptions as for drive failures, the MTTR function is:
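A sketch of such an MTTR function, assuming the surviving G - M drives in the pool share the rebuild of the failed server's M x C_drive of data (the exact form is an assumption), is:

    \mathrm{MTTR}_{server,\;SSD} \;\approx\; \frac{M \cdot C_{drive}}{(G - M) \cdot \mathrm{Drive\_Speed}} \cdot \mathrm{Conservative\_Factor}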

Where:

  • G – Number of drives in a storage pool
  • M – Number of drives per server
  • Drive_Speed – Drive speed available for rebuild
  • Conservative_Factor = a factor that adds additional time to complete the rebuild (to be conservative).

Based on the formulas above, let’s calculate the availability of the ScaleIO system for the two different configurations:

20 x R740, 24 SSDs @ 1.92TB w/ 4 x 10GbE network (deploying 2 storage pools w/ 240 drives per pool):

  Failure scenario     Reliability (MTBF)    Availability
  Drive After Drive    43,986 [Years]        0.999999955
  Drive After Node     6,404 [Years]         0.999999691
  Node After Drive     138,325 [Years]       0.999999985
  Node After Node      38,424 [Years]        0.999999897
  Overall System       4,714 [Years]         0.99999952 or 6-9’s

20 x R640, 10 SSDs @ 1.92TB w/ 2 x 25GbE:

  Failure scenario     Reliability (MTBF)    Availability
  Drive After Drive    105,655 [Years]       0.999999983
  Drive After Node     27,665 [Years]        0.999999937
  Node After Drive     276,650 [Years]       0.999999993
  Node After Node      69,163 [Years]        0.999999975
  Overall System       15,702 [Years]        0.99999989 or 6-9’s
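To illustrate how the MTBF column maps to the availability column, here is a short Python sketch using the availability formula described earlier. The 20-hour RTO is our own illustrative assumption, not a figure from the post; with it, the Overall System row of the first table comes out at roughly the quoted six nines:

    # Availability = MTBF / (MTBF + RTO), with both expressed in hours.
    HOURS_PER_YEAR = 8760

    def availability(mtbf_years: float, rto_hours: float) -> float:
        """Translate an MTBF (in years) into availability for a given RTO."""
        mtbf_hours = mtbf_years * HOURS_PER_YEAR
        return mtbf_hours / (mtbf_hours + rto_hours)

    ASSUMED_RTO_HOURS = 20  # assumption: time to restore from backup after a DU/DL event

    # Overall System MTBF of the first (R740) configuration, taken from the table above.
    print(f"{availability(4714, ASSUMED_RTO_HOURS):.8f}")  # ~0.99999952, i.e. 6-9's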

Since these calculations are complex, ScaleIO provides its customers with FREE online tools to build HW configurations and obtain availability numbers that include all possible failure scenarios. We advise customers to use this tool, rather than crunch complex mathematics, to build system configurations based on desired system availability targets.

As you can see, yet again, we have shown that the ScaleIO system easily exceeds 6-9’s of availability with just 2 replicas of the data. Unlike with other vendors, neither additional data replicas nor erasure coding is required! So do you have to deploy three replica copies to achieve enterprise availability? No, you do not! The myth is BUSTED.




Related:


Software Defined Storage Availability (Part 1): Why Do With Three What You Can Do With Two?


Must an enterprise deploy 3 or more replicas of data, or some form of erasure coding, to achieve enterprise class availability (of 99.9999% uptime)? Several software-defined storage (SDS) vendors seem to insist so. But is this true? We will methodically discuss this topic in a series of three blog posts and equip you with better knowledge on this complex and very important topic.

When it comes to safety, aircraft manufacturers are a great example of an industry that goes to great lengths to ensure maximum passenger safety. When you boarded your last flight, did you board an aircraft powered by two jet engines or four? Do you generally feel unsafe when you fly aircraft with two jet engines? Passengers, airlines and airline industry experts all agree that today’s two-engine aircraft provide the same level of safety as aircraft with four jet engines. Why is this so? It is because innovations in engine and aircraft technologies have made a single engine very powerful and efficient – powerful enough to take off, cruise or land using just a single engine. And once safety is ensured, superior economics determines the winner. As two engines are more economical than four, they are hugely popular with airlines (lower costs) and customers (lower ticket prices).

The aircraft analogy carries over to enterprise storage – enterprise customers can enjoy high levels of availability on well-architected systems that require fewer copies of data and no form of erasure coding, costing a lot less. Dell EMC’s industry-leading ScaleIO is one such system. Designed for demanding enterprises, ScaleIO is a data center grade software-defined storage solution that regularly meets and exceeds six or even seven nines of availability.

5-9’s or 6-9’s? Definition of Enterprise Availability SLAs

Availability of a system is defined in terms of how much time the system is up and running throughout the year. A popular measure of availability is in terms of percentages:

Availability Downtime per year
90% (“one nine”) 36.5 days
99% (“two nines”) 3.65 days
99.9% (“three nines”) 8.76 hours
99.99% (“four nines”) 52.56 minutes
99.999% (“five nines”) 5.26 minutes
99.9999% (“six nines”) 31.5 seconds
99.99999% (“seven nines”) 3.15 seconds

For example, 6-9’s of availability means that a system is unavailable for 31.5 seconds in a year.
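As a quick cross-check of the table above, here is a small Python sketch (our own illustration) that converts an availability fraction into downtime per year, assuming a 365-day year:

    # Convert an availability fraction into downtime per year (365-day year).
    SECONDS_PER_YEAR = 365 * 24 * 3600

    def downtime_per_year(availability: float) -> float:
        """Return yearly downtime in seconds for a given availability fraction."""
        return (1.0 - availability) * SECONDS_PER_YEAR

    for label, a in [("one nine", 0.90), ("two nines", 0.99), ("three nines", 0.999),
                     ("four nines", 0.9999), ("five nines", 0.99999),
                     ("six nines", 0.999999), ("seven nines", 0.9999999)]:
        print(f"{a * 100:.5f}% ({label}): {downtime_per_year(a):,.2f} seconds of downtime per year")

For six nines this prints roughly 31.54 seconds per year, matching the table.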

The ScaleIO Magic

Using ScaleIO you can easily build a ScaleIO cluster with 6-9’s of availability or more. ScaleIO’s unique declustered RAID technology and its ability to quickly detect failures and perform fast rebuilds allow our customers to get predictable and consistent performance and achieve 6-9’s or more of availability with 33% lower storage capacity (and proportionately lower cost) than other vendors’ solutions that demand 3 copies of data.

The ScaleIO architecture utilizes multiple mechanisms to achieve enterprise grade availability:

  • The ScaleIO data protection scheme is based on full replicas deployed in a declustered raid scheme. This scheme offers superior recovery time with enterprise availability and provides consistent performance for customer’s applications even during a drive or node rebuild.
  • ScaleIO’s rebuild process utilizes an efficient many-to-many scheme. The rebuild is invoked in the event of a drive or node failure, and the many-to-many scheme allows for a very fast rebuild.
  • ScaleIO uses ALL SDS devices in the storage pool for rebuild operations. For example, if a pool has 200 drives and one drive fails, the other 199 drives will be utilized to rebuild the data of the failed drive. This results in extremely quick rebuilds with minimal impact on application performance (see the sketch after this list).
  • ScaleIO detects a disk failure in seconds. This time includes the time it takes the Operating System to detect the issue plus the time it takes ScaleIO to start the rebuild process. ScaleIO starts the rebuild immediately after detecting the disk failure while some software-defined storage solutions wait a considerable amount of time before starting the rebuild process (typically tens of minutes and some have a default of an hour). As will be shown later, the inability to detect failure quickly, and then start the rebuild process immediately, significantly reduces the availability of those solutions.
  • ScaleIO architectural innovations such as Protection Domains, Storage Pools and Fault Sets help customers manage failure domains very flexibly and in some cases improve availability.
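A rough back-of-the-envelope Python sketch of the many-to-many rebuild in the 200-drive example above; the 1.92TB drive size and the 100 MB/s per-drive rebuild rate are illustrative assumptions, not figures from this post:

    # Back-of-the-envelope estimate of a many-to-many rebuild after one drive failure.
    POOL_DRIVES = 200                    # drives in the storage pool (from the example above)
    SURVIVING_DRIVES = POOL_DRIVES - 1   # drives that participate in the rebuild
    FAILED_DRIVE_GB = 1920               # assumed 1.92TB drive
    REBUILD_MB_PER_SEC_PER_DRIVE = 100   # assumed rebuild bandwidth reserved per drive

    share_gb = FAILED_DRIVE_GB / SURVIVING_DRIVES                    # ~9.6 GB per surviving drive
    rebuild_seconds = share_gb * 1000 / REBUILD_MB_PER_SEC_PER_DRIVE

    print(f"Each surviving drive rebuilds ~{share_gb:.1f} GB, "
          f"for an estimated rebuild time of ~{rebuild_seconds:.0f} seconds")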

Read more on ScaleIO’s unique architecture here.

Since these calculations are complex, ScaleIO provides its customers with FREE online tools to build HW configurations based on ScaleIO Ready Nodes and get comprehensive availability numbers that include multiple possible failure scenarios. We advise customers to use this tool to build hardware configurations based on their desired system availability targets. If you are a ScaleIO customer, the tool can be accessed here.

Here are some sample configurations, each with availability of 6-9’s or higher:

  1. 30 x R640 servers, 10 x 3.84TB SAS SSDs, 2 x 25GbE network, 1PB raw, 1 storage pool: 99.9999179% (Full Sizer output: Click here)
  2. 49 x R740dx servers, 24 x 960GB SAS SSDs, 2 x 10GbE network, 1PB raw, 3 storage pools w/ 196 devices per pool: 99.9999669% per storage pool (Full Sizer output: Click here)
  3. 5 x R740dx servers, 24 x 960GB SAS SSDs, 2 x 10GbE, 100TB raw, 3 storage pools w/ 40 devices per pool: 99.9999968% per storage pool (Full Sizer output: Click here)
  4. 12 x R640 servers, 10 x 3.84TB SAS SSDs, 2 x 10GbE, 250TB raw, 1 storage pool: 99.9999846% (Full Sizer output: Click here)
  5. 72 x R640 servers, 10 x 3.84TB SAS SSDs, 2 x 25GbE, 2500TB raw, 1 storage pool: 99.9999460% (Full Sizer output: Click here)

As you can see, a variety of highly customizable configurations are possible, with varying performance and capacity to meet our customers’ needs, and every configuration is assured of 6-9’s of availability or higher. So, like twin-engine jets, where modern technology now provides the same safety as four engines, the ScaleIO architecture is capable of delivering 6-9’s of availability with just 2 replicas of data, resulting in 33% lower cost. So: why do with three what you can do with two?




Related:

New ECS HA Design white paper published

http://www.emc.com/collateral/whitepaper/h16344-elastic-cloud-storage-ha-design.pdf

Check out this new ECS HA Design white paper. It provides details about the technology behind how ECS provides enterprise availability such as:

• how the distributed infrastructure provides increased system availability

• advanced data protection methods that provide data durability such as erasure coding and triple mirroring

• how data is distributed for optimal availability

• automatic failure detection

• built-in self-healing methods

• disk, node and network failure resolution details

• disaster recovery – how ECS protects against site wide failures

• how consistency is maintained in an active-active multi-site configuration

• how site-wide failures are detected

• access options during a site outage

• how data durability is re-established after a permanent site-wide failure
