Dell EMC Unity: Storage Processor Panic during Pool Full Event (DELL EMC Correctable)

Article Number: 525029 Article Version: 2 Article Type: Break Fix



Dell EMC Unity 300,Dell EMC Unity All Flash,Dell EMC Unity Family,Dell EMC Unity Hybrid

A Storage Processor (SP) panicked when a Pool became full, leaving one or more file systems shown as offline and needing recovery. Any snapshots of a file system marked offline will also be offline and shown as needing recovery.

The sequence of events leading to the panic:

  1. The Pool became full and the system started invalidating the snapshots.
  2. The system started unmounting the snapshots.
  3. The system tried to update the metadata and the superblock and to allocate a slice (getSlice), which failed because the pool was full.
  4. The operation to unmount the snapshot (file system) timed out, causing an SP panic.

Storage Pool reached 100% full.

The file system needs free space in the pool before a recovery can be started. This is normally done by a Pool expansion, followed by recovery of both the file systems and their snapshots.

If the Pool cannot be expanded, free space must be made in the Storage Pool by deleting unneeded File Systems, Snapshots, or LUNs.
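Before attempting recovery, verify that the pool now has free capacity. The following is a minimal sketch, assuming the Unisphere Management REST API with a pool resource exposing the name, sizeFree, and sizeTotal fields; the array address and credentials are placeholders:

# check_pool_free.py - illustrative sketch; the endpoint and field names are
# assumptions based on the Unisphere Management REST API.
import requests

UNITY = "https://unity.example.com"   # placeholder array address
AUTH = ("admin", "changeme")          # placeholder credentials

resp = requests.get(
    UNITY + "/api/types/pool/instances",
    params={"fields": "name,sizeFree,sizeTotal"},
    headers={"X-EMC-REST-CLIENT": "true"},  # header required by the Unity REST API
    auth=AUTH,
    verify=False,  # lab only: skips certificate validation
)
resp.raise_for_status()

for entry in resp.json()["entries"]:
    pool = entry["content"]
    pct_free = 100.0 * pool["sizeFree"] / pool["sizeTotal"]
    print("%s: %.1f%% free" % (pool["name"], pct_free))

A pool that still reports 0% free will cause the recovery's metadata updates to fail again, so expand the pool or delete objects first.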

If help is needed for recovery, please contact support and reference this KB article.

Related:

RecoverPoint for Virtual Machines: Wrong Port Group may be used during installation if duplicate names exist

Article Number: 502846 Article Version: 3 Article Type: Break Fix



RecoverPoint for Virtual Machines,RecoverPoint for Virtual Machines 4.2,RecoverPoint for Virtual Machines 4.3,RecoverPoint for Virtual Machines 5.0,RecoverPoint for Virtual Machines 5.1

During initial installation, if there is a Port Group on a vSwitch and another Port Group on a dvSwitch (Distributed vSwitch) with the same name, the system shows only one of them in the DGUI network mapping options, with no way to tell whether it is the vSwitch Port Group or the distributed vSwitch Port Group.

The system then selects one of the port groups, and the selection may differ each time, since the mapping is built from a set of network names and both port groups share the same name.

The ESX splitter will then have communication issues.

The Port Groups have the same name on the vSwitch and the dvSwitch, and the installation code is not designed to distinguish between them; it matches by name only, which is not unique in this case.
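As an illustration of the underlying problem (not the actual RecoverPoint code), any mapping keyed by name alone collapses the two port groups into a single entry; the port group and switch names below are hypothetical:

# Illustrative only: a name-keyed lookup cannot represent two port groups
# that share a name.
portgroups = [
    {"name": "Prod-Net", "switch": "vSwitch0"},
    {"name": "Prod-Net", "switch": "dvSwitch-01"},  # duplicate name on a dvSwitch
]

by_name = {pg["name"]: pg for pg in portgroups}
print(by_name["Prod-Net"]["switch"])
# Only one entry survives, and which one depends on iteration order,
# so a different port group may be selected on each installation attempt.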

This occurs during installation of RecoverPoint for Virtual Machines when the customer has Port Groups configured with the same name on a vSwitch and a dvSwitch.

Workaround:

Change the name of one of the Port Groups so that all Port Groups have unique names.
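To find the offending duplicates before re-running the installation, the following minimal sketch, assuming pyVmomi, lists every network name that exists on both a standard and a distributed switch; the vCenter address and credentials are placeholders:

# find_duplicate_portgroups.py - illustrative sketch using pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only: skips certificate validation
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Network], recursive=True)
    kinds = {}
    for net in view.view:
        kind = ("dvSwitch" if isinstance(net, vim.dvs.DistributedVirtualPortgroup)
                else "vSwitch")
        kinds.setdefault(net.name, set()).add(kind)
    for name, where in sorted(kinds.items()):
        if len(where) > 1:
            print("Duplicate port group name %r on: %s" % (name, ", ".join(sorted(where))))
finally:
    Disconnect(si)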

Resolution:

Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.

Related:

RecoverPoint for Virtual Machines: How to properly Shutdown and Startup a cluster

Article Number: 502844 Article Version: 3 Article Type: How To



RecoverPoint for Virtual Machines,RecoverPoint for Virtual Machines 5.0,RecoverPoint for Virtual Machines 5.1

The following procedures have been tested and approved by RecoverPoint Engineering for Shutdown and Startup system operations for RP4VM 5.0.x and 5.1.x:

Shutdown procedure:

1. Shutdown protected VMs (or replica/shadow VMs if applicable); a scripting sketch for this step follows the startup procedure

2. Bookmark all CGs (can be done with a bookmark request on a groupset containing all CGs, without parallel bookmarking)

3. Pause Transfer on all affected CGs (can also be done using a groupset, as above)

4. Shutdown vRPAs

5. Shutdown ESXi hosts and vCenter

6. Shutdown Storage arrays (if applicable)

7. Shutdown SAN switches/directors (if applicable)

Startup procedure:

1. Startup SAN switches/directors (if applicable)

2. Startup Storage arrays (if applicable)

3. Startup ESXi hosts and vCenter

4. Startup vRPAs

5. Perform RP4VM validations for CG status, splitter connectivity, etc.

6. Startup protected VMs

7. Start transfer of paused CGs
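Step 1 of the shutdown procedure can be scripted. The following is a minimal sketch, assuming pyVmomi and VMware Tools running in each guest; the VM names, vCenter address, and credentials are placeholders:

# shutdown_protected_vms.py - illustrative sketch using pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

PROTECTED_VMS = {"app-vm-01", "db-vm-01"}  # placeholder VM names

ctx = ssl._create_unverified_context()  # lab only: skips certificate validation
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], recursive=True)
    for vm in view.view:
        if (vm.name in PROTECTED_VMS
                and vm.runtime.powerState == vim.VirtualMachinePowerState.poweredOn):
            print("Shutting down guest OS on " + vm.name)
            vm.ShutdownGuest()  # graceful guest shutdown; requires VMware Tools
finally:
    Disconnect(si)

Wait until every VM reports poweredOff before proceeding to the bookmark and pause-transfer steps.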

Related:

Re: How to force remove vrpa from the vrpa cluster (version 5.0 SP1)

Hi there,

In this specific case, it is an issue due to a failed upgrade (RP4VM cannot be upgraded directly to 5.2; it has to go through 5.1.1.4/5). We would need information on the VMs listed in the SR to troubleshoot this one further.

Hope that helps,

Idan Kentor

Sr. Principal Product Technology Engineer – RecoverPoint and RecoverPoint for VMs

@IdanKentor

idan.kentor@emc.com

Related:

Re: Re: Can not undo writes when enable access during recover production(RP4VM)

Hi,

Enable Direct Image Access is intentionally disabled in Test a Copy and Recover Production, as well as in failover, because these recovery operations cannot be performed while there is direct image access. Direct image access can be initiated only through the Test a Copy wizard. The same applies to Undo Writes: although it might be perceived as a legitimate operation for all recovery operations, this is a known issue. As a workaround, undo writes through Test a Copy and then use the current image when proceeding to any of the other wizards.

Regards,

Idan Kentor

RecoverPoint Corporate Systems Engineering

@IdanKentor

Related:


RecoverPoint for Virtual Machines: Install fails at 53% creating virtual repository

Article Number: 525663 Article Version: 2 Article Type: Break Fix



RecoverPoint for Virtual Machines 5.2,RecoverPoint for Virtual Machines 5.2 P1

Issue:

The RP4VM setup/cluster install fails at 53% in Deployer.

Symptoms:

clusterLogic.log shows:

2018/09/19 20:16:59.336 [pool-6-thread-189] (BaseInstallationServerAdapter.java:285) ERROR – errorType: OPERATION_FAILED , userMSG: Operation failed. Failed to create JAM profile.

2018/09/19 20:16:59.337 [pool-6-thread-189] (BaseInstallationServerAdapter.java:292) ERROR – Transaction failed. transactionId=318, timeoutInSeconds=900, errorMSG=Operation failed. Failed to create JAM profile. , errorType=OPERATION_FAILED, value=null

2018/09/19 20:16:59.337 [pool-6-thread-189] (BaseRpaCall.java:52) ERROR – CreateVirtualRepositoryCall failed.

java.lang.RuntimeException: Operation failed. Failed to create JAM profile.

server.log shows:

2018-09-19 20:11:53,631 [CommandWorker-14] (CreateJirafPolicyAction.java:47) ERROR – Failed creating new JCD policy: java.lang.RuntimeException: RP Filter metadata not found, please verify that RP Filter is installed.

2018-09-19 20:11:53,631 [CommandWorker-14] (BaseAction.java:46) ERROR – CreateJirafPolicyAction Failed.

java.lang.RuntimeException: RP Filter metadata not found, please verify that RP Filter is installed.

at com.emc.recoverpoint.connectors.vi.infra.pbm.PbmCreatePolicyCommand.waitAndGetRPFilterMetadata(PbmCreatePolicyCommand.java:129) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.vi.infra.pbm.PbmCreatePolicyCommand.addRPRules(PbmCreatePolicyCommand.java:74) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.vi.infra.pbm.PbmCreatePolicyCommand.perform(PbmCreatePolicyCommand.java:66) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.vi.infra.pbm.PbmCreatePolicyCommand.perform(PbmCreatePolicyCommand.java:18) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.vi.infra.pbm.BasePBMCommand.call(BasePBMCommand.java:27) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.vi.infra.PBMProxy.pbmCreateCreateDummyJCDPolicy(PBMProxy.java:90) ~[vi_connector_commons.jar:?]

at com.emc.recoverpoint.connectors.actions.create.CreateJirafPolicyAction.createPolicy(CreateJirafPolicyAction.java:44) ~[vsphere_actions.jar:?]

at com.emc.recoverpoint.connectors.actions.create.CreateRPPolicyAction.perform(CreateRPPolicyAction.java:35) ~[vsphere_actions.jar:?]

at com.emc.recoverpoint.connectors.actions.create.CreateRPPolicyAction.perform(CreateRPPolicyAction.java:14) ~[vsphere_actions.jar:?]

at com.emc.recoverpoint.connectors.actions.infra.BaseAction.call(BaseAction.java:30) [vsphere_actions.jar:?]

at com.kashya.installation.server.commands.vsphere.CreateVirtualRepositoryCommand.createJirafPolicyId(CreateVirtualRepositoryCommand.java:61) [com.kashya.recoverpoint.installation.server.jar:?]

at com.kashya.installation.server.commands.vsphere.CreateVirtualRepositoryCommand.preExecute(CreateVirtualRepositoryCommand.java:39) [com.kashya.recoverpoint.installation.server.jar:?]

at com.kashya.installation.server.commands.Command.runNormal(Command.java:108) [com.kashya.recoverpoint.installation.server.jar:?]

at com.kashya.installation.server.commands.Command.run(Command.java:48) [com.kashya.recoverpoint.installation.server.jar:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_172]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_172]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_172]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_172]

at com.kashya.installation.server.ThreadPoolFactory$1.run(ThreadPoolFactory.java:43) [com.kashya.recoverpoint.installation.server.jar:?]

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]

2018-09-19 20:11:53,632 [CommandWorker-14] (CreateVirtualRepositoryCommand.java:41) WARN – Failed to create JAM profile, calling troubleshoot action…

com.kashya.installation.server.exceptions.CommandFailedException: Operation failed. Failed to create JAM profile.

There are VASA (Storage) IOFilter providers that are seen as offline/disconnected by the VC.

RP4VM Install

Workaround:

1. In vSphere web client:

a. For VC 6.0: select the relevant VC, then choose the “Manage” tab followed by the “Storage Providers” tab.

b. For VC 6.5: select the relevant VC, then choose the “Configure” tab and select “Storage Providers” from the submenu on the left side.

2. Unregister the IOFilter providers whose status is not “online”.

3. After all of the providers from step 2 are gone from the list, resync the providers. After the resync has finished, all of the IOFilter providers should appear as “online”.

Resolution:

Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.

Related:

Re: Warning: RecoverPoint copy contains volumes from more than one VPLEX CG

Dear all,

I have this warning in my RP configuration:

RecoverPoint copy contains volumes from more than one VPLEX consistency group. It is recommended a RecoverPoint copy contain volumes from only one VPLEX consistency group. VPLEX consistency group: [No VPLEX CG,produzione] ; VX-DCSIT-RPA



I checked all the CGs, and I did not find volumes replicated in more than one CG.

Can someone help me understand this warning?

Regards,

Jacopo

Related:

VPLEX: Cluster shutdown procedure with MetroPoint configuration will have IO disruption

Article Number: 499498 Article Version: 3 Article Type: Break Fix



VPLEX VS1,VPLEX VS2,VPLEX VS6,VPLEX Metro,VPLEX GeoSynchrony 5.4 Service Pack 1,VPLEX GeoSynchrony 5.4 Service Pack 1 Patch 1,VPLEX GeoSynchrony 5.4 Service Pack 1 Patch 3,VPLEX GeoSynchrony 5.4 Service Pack 1 Patch 4

The official SolVe Desktop procedure to shutdown or restart a VPLEX cluster will result in IO disruption on RecoverPoint-enabled distributed consistency-groups for MetroPoint configurations until IO is manually resumed in the applicable consistency-groups.

Normal RecoverPoint configurations (non-MetroPoint) will not have IO disruption.

Phase 1, Task 16 of the procedure asks you to identify and take note of the RecoverPoint-enabled distributed consistency-groups (if applicable) whose detach rule is configured as winner for the cluster that is going to be shut down or restarted (i.e. cluster-1-winner if cluster-1 is going to be shut down or restarted).

If RecoverPoint is enabled on a distributed consistency-group, you cannot modify the group’s detach-rule. Hence, if you shut down or restart the cluster that is configured as the winner for those consistency-groups, the distributed devices within them will be suspended in MetroPoint configurations until IO is manually resumed on the cluster that remains up. Phase 1, Task 21 of the procedure asks you to manually resume these suspended RecoverPoint-enabled consistency-groups on the cluster that remains online with the “choose-winner” CLI command once you have shut down the winning cluster.

This means it is expected to experience IO disruption on the distributed devices within the RecoverPoint-enabled distributed consistency-groups for the time frame between shutting down the winning cluster and manually resuming them on the cluster that remains online.

Depending on the host type, this time frame might be long enough for hosts to consider that they have permanently lost access to those volumes, entering a state where a reboot is necessary to recover access to them even after IO is manually resumed on the VPLEX cluster that remains online.
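As an illustration of the Phase 1, Task 21 resume step, the following sketch generates the VPlexcli commands for a list of suspended consistency-groups. The choose-winner syntax shown is an assumption and should be verified against the SolVe Desktop procedure for your GeoSynchrony version; the cluster and consistency-group names are placeholders:

# generate_resume_commands.py - illustrative helper; the VPlexcli syntax
# printed below is an assumption, verify it against the SolVe procedure.
SUSPENDED_CGS = ["dr_cg_01", "dr_cg_02"]  # placeholder CG names
SURVIVING_CLUSTER = "cluster-2"           # the cluster that remains online

for cg in SUSPENDED_CGS:
    print("consistency-group choose-winner "
          "--cluster %s --consistency-group /clusters/%s/consistency-groups/%s"
          % (SURVIVING_CLUSTER, SURVIVING_CLUSTER, cg))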

Follow the official SolVe Desktop procedure.

For MetroPoint configurations, you should currently plan your maintenance considering that there will be IO disruption on all volumes within RecoverPoint-enabled distributed consistency-groups, and act accordingly on the host side to prevent unplanned outages or issues.

Engineering is working on alternative steps to prevent IO disruption on MetroPoint, which will be added to this KB article and the official VPLEX cluster shutdown or restart SolVe Desktop procedure when available.

Engineering will also add further clarification and clearer steps for the VPLEX cluster shutdown or restart procedure with RecoverPoint to the official SolVe Desktop procedure.

Related:

VxRail: iSCSI datastore causing host not responding and all VMs on it disconnected

Article Number: 501456 Article Version: 3 Article Type: Break Fix



VxRail Appliance Family,RecoverPoint

- A host connection and power state dial home, or the customer reports the host as not responding in the vCenter view with all the VMs on that host not responding, yet the host itself is reachable on the network and all the VMs on it are accessible.


- One or more VMs are connected to an iSCSI datastore, while some of the hosts have SW iSCSI configured with port binding and vmknics in different subnets.


For example:

Adapter Vmknic IPv4           IPv4 Subnet Mask
------- ------ -------------- ----------------
vmhba40 vmk4   XXX.XXX.241.XX 255.255.255.0
vmhba40 vmk5   XXX.XXX.244.XX 255.255.255.0

This deployment is usually required by RecoverPoint.

Remove the port binding; port binding is not needed when the iSCSI target and the vmkernel ports are on different subnets.

Action Plan:

To remove the vmknics from port binding, follow the procedure below:

1) Migrate all VMs to other servers

2) Get the host into Maintenance mode

3) Unmount all iSCSI datastores (unmount, DO NOT delete them)

4) Remove the vmkernel ports from port binding (see the scripted sketch after this procedure); if you are prompted that you are about to remove the last vmkernel port from the setup, click YES

The steps to remove the vmkernel ports from port binding are documented here:

https://pubs.vmware.com/vsphere-60/index.jsp?topic=%2Fcom.vmware.vsphere.html.hostclient.doc%2FGUID-94CBA46B-B8B8-4377-B936-CC71C1880E02.html

5) Perform a storage rescan

6) Reboot the server

7) Exit maintenance mode

8) Migrate VMs from another host to the updated host, and repeat the process
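For step 4, the port binding can also be inspected and removed programmatically. The following is a minimal sketch, assuming pyVmomi and the vSphere API's HostIscsiManager methods QueryBoundVnics and UnbindVnic; the host name, adapter, and vmknic are placeholders taken from the example above:

# unbind_iscsi_vmk.py - illustrative sketch using pyVmomi; the iscsiManager
# method names are assumptions based on the vSphere API (HostIscsiManager).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

HOST_NAME = "esx01.example.com"  # placeholder ESXi host
HBA, VMK = "vmhba40", "vmk4"     # placeholder adapter and vmknic

ctx = ssl._create_unverified_context()  # lab only: skips certificate validation
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], recursive=True)
    host = next(h for h in view.view if h.name == HOST_NAME)
    iscsi = host.configManager.iscsiManager

    # Show what is currently bound before changing anything.
    for port in iscsi.QueryBoundVnics(iScsiHbaName=HBA):
        print("bound:", port.vnicDevice)

    # Remove the binding, then rescan and reboot per steps 5 and 6 above.
    iscsi.UnbindVnic(iScsiHbaName=HBA, vnicDevice=VMK, force=False)
finally:
    Disconnect(si)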

Related: