VPLEX: Health-check --full reports Call Home "Error" state post NDU

Article Number: 523118 Article Version: 3 Article Type: Break Fix



VPLEX GeoSynchrony,VPLEX Local,VPLEX Metro,VPLEX Series,VPLEX VS2,VPLEX VS6

An error is reported by the health-check --full command post upgrade, but Call Home functions properly.

  • Pre NDU, health-check --full does not report an error.

  • Post NDU, health-check --full reports "Checking Call Home Status" as Error.

  • The ConnectEMC_config.xml file is identical pre NDU and post NDU.

  • No issues are seen in the ConnectEMC-related logs.

  • The SMTP service is reachable and not blocked.

  • Call Home works correctly for every triggered call-home test.

  • The SYR/CLM system shows that call home alerts have been correctly received, confirming that ConnectHome is working.

Comparing PRE & POST Non-Disruptive Upgrade (NDU)

PRE NDU

VPlexcli:/> health-check --full

Configuration (CONF):

Checking VPlexCli connectivity to directors……………….. OK

Checking Directors Commission……………………………. OK

Checking Directors Communication Status…………………… OK

Checking Directors Operation Status………………………. OK

Checking Inter-director management connectivity……………. OK

Checking ports status…………………………………… OK

Checking Call Home……………………………………… OK

Checking Connectivity…………………………………… OK

POST NDU

VPlexcli:/> health-check --full

Configuration (CONF):

Checking VPlexCli connectivity to directors……………….. OK

Checking Directors Commission……………………………. OK

Checking Directors Communication Status…………………… OK

Checking Directors Operation Status………………………. OK

Checking Inter-director management connectivity……………. OK

Checking ports status…………………………………… OK

Checking Call Home Status……………………………….. Error

service@vplexMM:/var/log/VPlex/cli> more health_check_full_scan.log

Configuration (CONF):

Checking VPlexCli connectivity to directors……………….. OK

Checking Directors Commission……………………………. OK

Checking Directors Communication Status…………………… OK

Checking Directors Operation Status………………………. OK

Checking Inter-director management connectivity……………. OK

Checking ports status…………………………………… OK

Checking Call Home Status……………………………….. Error

Email Server under Notification type: 'onSuccess/onFailure' is either not reachable or invalid.

Check if Email Server IP address: '10.1.111.100' is reachable and valid.

Email Server under Notification type: 'Primary' and 'Failover' are either not reachable or invalid.

Check if Email Server IP address: '10.1.111.100' and '10.1.111.100' are reachable and valid.

service@vplexMM:/opt/emc/connectemc> cat ConnectEMC_config.xml

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<ConnectEMCConfig SchemaVersion="1.1.0">

<ConnectConfig Type="Email">

<Retries>7</Retries>

<Notification>Primary</Notification>

<Timeout>700</Timeout>

<Description></Description>

<BsafeEncrypt>no</BsafeEncrypt>

<IPProtocol>IPV4</IPProtocol>

<EmailServer>10.1.111.100</EmailServer>

<EmailAddress>emailalert@EMC.com</EmailAddress>

<EmailSender>VPlex_CKM00000000999@EMC.com</EmailSender>

<EmailFormat>ASCII</EmailFormat>

<EmailSubject>Call Home</EmailSubject>

<STARTTLS>no</STARTTLS>

<IncludeCallHomeData>no</IncludeCallHomeData>

<InsertBefore></InsertBefore>

<PreProcess></PreProcess>

<PostProcess></PostProcess>

<HeloParameter></HeloParameter>

</ConnectConfig>

<ConnectConfig Type="Email">

<Retries>7</Retries>

<Notification>Failover</Notification>

<Timeout>700</Timeout>

<Description></Description>

<BsafeEncrypt>no</BsafeEncrypt>

<IPProtocol>IPV4</IPProtocol>

<EmailServer>10.1.111.100</EmailServer>

<EmailAddress>emailalert@EMC.com</EmailAddress>

<EmailSender> VPlex_CKM00000000999@EMC.com</EmailSender>

<EmailFormat>ASCII</EmailFormat>

<EmailSubject>Call Home</EmailSubject>

<STARTTLS>no</STARTTLS>

<IncludeCallHomeData>no</IncludeCallHomeData>

<InsertBefore></InsertBefore>

<PreProcess></PreProcess>

<PostProcess></PostProcess>

<HeloParameter></HeloParameter>

</ConnectConfig>

<ConnectConfig Type="Email">

<Retries>7</Retries>

<Notification>onSuccess/onFailure</Notification>

<Timeout>700</Timeout>

<Description></Description>

<BsafeEncrypt>no</BsafeEncrypt>

<IPProtocol>IPV4</IPProtocol>

<EmailServer>10.1.111.100</EmailServer>

<EmailAddress>customer@genericemailaddress.com</EmailAddress>

<EmailSender>VPlex_CKM00000000999@EMC.com</EmailSender>

<EmailFormat>ASCII</EmailFormat>

<EmailSubject>Call Home</EmailSubject>

<STARTTLS>no</STARTTLS>

<IncludeCallHomeData>yes</IncludeCallHomeData>

<InsertBefore></InsertBefore>

<PreProcess></PreProcess>

<PostProcess></PostProcess>

<HeloParameter></HeloParameter>

</ConnectConfig>

</ConnectEMCConfig>

service@vplexMM:/var/log/ConnectEMC/logs> ping 10.1.111.100

PING 10.1.111.100 (10.1.111.100) 56(84) bytes of data.

--- 10.1.111.100 ping statistics ---

6 packets transmitted, 0 received, 100% packet loss, time 5010ms

service@vplexMM:~> telnet 10.1.111.100 25

Trying 10.1.111.100…

Connected to 10.1.111.100

Escape character is '^]'.

220 emc.com

helo localhost

250 emc.com

mail from: VPlex_CKM00000000999@EMC.com

250 2.1.0 Ok

rcpt to:customer@genericemailaddress.com

250 2.1.0 Ok

VPlexcli:/notifications/call-home> test

call-home test was successful.


The above information shows that the customer is allowing only the SMTP service on port 25 and is blocking ICMP (ping).

This error is expected and can be ignored once you verify that a test call home works and appears under /opt/emc/connectemc/archive:

service@vplexMM:/opt/emc/connectemc/archive> ll

-rw-r----- 1 service users 2814 Jun 25 13:17 RSC_CKM00000000999_062518_011656000.xml

-rw-r----- 1 service users 2814 Jun 25 10:54 RSC_CKM00000000999_062518_105401000.xml

-rw-r----- 1 service users 2814 Jun 25 11:11 RSC_CKM00000000999_062518_111102000.xml

-rw-r----- 1 service users 2814 Jun 25 11:48 RSC_CKM00000000999_062518_114834000.xml
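As a quick check, the archive can also be inspected for recently generated call-home files. A minimal sketch; the path and the RSC_*.xml naming come from the listing above, while the 60-minute window is only an example:

# List call-home files written to the archive in the last 60 minutes
service@vplexMM:~> find /opt/emc/connectemc/archive -name 'RSC_*.xml' -mmin -60 -ls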

Checking call home status is part of the health-check --full script, which does the following:

1. Check the email server for each notification type in /opt/emc/connectemc/ConnectEMC_config.xml.

2. Ping that server. The ping can fail for several reasons: the server is not reachable over the network, the server is shut down, ICMP is blocked by a firewall, or the <EmailServer> value in ConnectEMC_config.xml is a DNS name rather than an IP address.

If the ping fails, the health-check --full script fails the check and shows the following error:

Checking Call Home Status……………………………….. Error
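A minimal shell sketch of what this check amounts to; it is an approximation of the behaviour described above, not the actual health-check code (the XML path is the one documented in this article):

# Extract every <EmailServer> entry from the ConnectEMC configuration and ping it once,
# which is essentially the reachability test the health check performs
for server in $(grep -o '<EmailServer>[^<]*' /opt/emc/connectemc/ConnectEMC_config.xml | sed 's/<EmailServer>//'); do
    if ping -c 1 -W 2 "$server" > /dev/null 2>&1; then
        echo "$server reachable via ICMP"
    else
        echo "$server NOT reachable via ICMP - health-check flags an Error"
    fi
done

Because only ICMP is tested here, an environment that permits SMTP on port 25 but blocks ping still produces the Error shown above.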

The current health-check script checks whether call home is enabled and reports a "Warning" state if it is disabled.

The health-check script also verifies that call home has been functioning properly through several checks, such as whether call homes have been generated, whether the call home emails were sent successfully, and whether the SMTP server responds to ping.

If any of these verifications fail, the script’s result will be flagged with an error as shown:

Checking Call Home Status……………………………….. Error

After enabling ICMP at the firewall level between the VPLEX management server and the selected email server (ESRS or the customer's email server), the Call Home "Error" status is cleared:

VPlexcli:/> health-check --full

Configuration (CONF):

Checking VPlexCli connectivity to directors……………….. OK

Checking Directors Commission……………………………. OK

Checking Directors Communication Status…………………… OK

Checking Directors Operation Status………………………. OK

Checking Inter-director management connectivity……………. OK

Checking ports status…………………………………… OK

Checking Call Home Status……………………………….. OK

Checking Connectivity…………………………………… OK

Checking COM Port Power Level……………………………. OK

Checking Meta Data Backup……………………………….. OK

Checking Meta Data Slot Usage……………………………. OK

Related:

  • No Related Posts

Re: RecoverPoint for VMs Alerts

Yes, do this first, and then configure your alerts/events as per your original request.

Regards,

Rich Forshaw

Consultant Corporate Systems Engineer – RecoverPoint & VPLEX (EMEA)

Data Protection and Availability Solutions

EMC Europe Limited

Mobile: 44 (0) 7730 781169

E-mail: richard.forshaw@emc.com

Twitter: @rw4shaw

Related:

  • No Related Posts

VPLEX: How to create web vendor signed certificate for VPLEX performance monitor

Article Number: 504672 Article Version: 4 Article Type: How To



VPLEX Performance Monitor,VPLEX Series,VPLEX GeoSynchrony,VPLEX Local,VPLEX Metro,VPLEX Geo,VPLEX VS2,VPLEX VS6,VPLEX GeoSynchrony 6.0 Service Pack 1 Patch 5,VPLEX GeoSynchrony 6.0 Service Pack 1 Patch 4,VPLEX GeoSynchrony 6.0 Patch 1

Issue: If you launch the VPLEX Performance Monitor GUI with a self-signed certificate, the browser displays a certificate warning message.

Resolution:

1. Generate the certificate signing request (CSR) file from the VPLEX Performance Monitor VM. To generate the SSL files needed to import a vendor-signed certificate, log in to the VPLEX Performance Monitor VM using SSH and run the following command. It generates two files after you fill in the required attributes (see the example in the attachment):

localhost:~ # openssl req -new -newkey rsa:2048 -nodes -keyout vplex-monitor-key.pem -out vplex-monitor-csr.csr

2. The customer submits the certificate signing request (.csr) file to the vendor CA portal to generate the certificate file.

3. Upload the certificate and key files to the directory /home/appadmin/appcentric/keys/

4. Ensure that the certificate and key files have the following names:

vplex-monitor-cert.pem
vplex-monitor-key.pem

5. Run the following commands at the localhost:~ prompt to ensure that the certificates have the correct permissions:

localhost:~ # chown appadmin:users /home/appadmin/appcentric/keys/*.pem
localhost:~ # chmod 600 /home/appadmin/appcentric/keys/*.pem
localhost:~ # ll /home/appadmin/appcentric/keys/
total 8
-rw------- 1 appadmin users 1192 Oct 6 18:36 vplex-monitor-cert.pem
-rw------- 1 appadmin users 1675 Oct 6 18:36 vplex-monitor-key.pem

6. Run the following command to restart the VPLEX Performance Monitor server:

localhost:~ # sudo forever restartall
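Optionally, before restarting, confirm that the uploaded certificate matches the private key. A minimal sketch using standard openssl commands; the file names are those from step 4 above, and the check as written applies to RSA keys:

localhost:~ # openssl x509 -noout -modulus -in /home/appadmin/appcentric/keys/vplex-monitor-cert.pem | openssl md5
localhost:~ # openssl rsa -noout -modulus -in /home/appadmin/appcentric/keys/vplex-monitor-key.pem | openssl md5

The two digests should be identical if the certificate was issued for this key.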

  • The VPLEX Performance Monitor is a product independent from VPLEX.
  • To create a web vendor signed certificate for the VPLEX GUI, see KB 495191: https://support.emc.com/kb/495191

Related:

  • No Related Posts

VPLEX: VPLEX takes 3x-4x longer to free up space compared to Native SCSI UNMAP on supported arrays

Article Number: 502965 Article Version: 3 Article Type: Break Fix



VPLEX Series,VPLEX Local,VPLEX Metro,VPLEX GeoSynchrony,VPLEX GeoSynchrony 5.5 Service Pack 1,VPLEX GeoSynchrony 5.5 Service Pack 1 Patch 1,VPLEX GeoSynchrony 5.5 Service Pack 1 Patch 2,VPLEX GeoSynchrony 5.5 Service Pack 2

In the current VPLEX UNMAP SCSI command implementation, UNMAP SCSI command completion latency is expected to be 3x-4x longer than native XtremIO/VNX/Unity/VMAX array implementation of the SCSI UNMAP command. Storage reclamation using UNMAP SCSI commands is a maintenance activity. Users should consider the increased processing time required in order to reclaim storage through VPLEX and plan accordingly.

Typically, the average VPLEX latency for completing an individual SCSI UNMAP command is sub-second. VPLEX provides the following UNMAP SCSI command statistics:

  • The number of UNMAP SCSI commands per second seen at the target
  • The average latency in quarters per second of UNMAP SCSI command at the target

These statistics work with the following targets:

  • front-end port
  • front-end director
  • front-end logical unit
  • host initiator port

Users must create new monitors to read these statistics. Refer to the VPLEX documentation for details about creating new VPLEX monitors.
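As a hedged illustration only, a monitor is typically created and given a file sink from the VPlexcli. The statistic names below are placeholders; confirm the exact command options and the available UNMAP statistic names with the monitor stat-list command and the VPLEX CLI guide for your GeoSynchrony release:

VPlexcli:/> monitor stat-list
VPlexcli:/> monitor create --name unmap-mon --director director-1-1-A --stats <unmap-statistic-names>
VPlexcli:/> monitor add-file-sink --monitor director-1-1-A/unmap-mon --file /var/log/VPlex/cli/unmap-mon.csv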

The current VPLEX UNMAP implementation can issue a maximum of 1 MB per UNMAP request to an XtremIO storage volume, compared to the 4 MB UNMAP a host issues when the XtremIO storage volume is used directly by the host. This VPLEX implementation limitation is the main cause of the decreased VPLEX UNMAP performance. UNMAP is considered a maintenance activity to free unused blocks at a host-application-convenient time, and therefore UNMAP performance was a secondary goal in the VPLEX implementation.

This is an expected behavior. Customers should consider the increased processing time required in order to reclaim storage through VPLEX and plan accordingly.

Related:

  • No Related Posts

ViPR Controller: No volumes visible for catalog service Export Volume to a Host

Article Number: 526085 Article Version: 3 Article Type: Break Fix



ViPR Controller,ViPR Controller Controller 3.6 SP2,ViPR Controller Controller 3.6 SP1,ViPR Controller Controller 3.6,ViPR Controller Controller 3.5

When attempting to run an "Export Volume to a Host" order to export a VPLEX volume to a cluster, the volumes field does not populate with a list of available volumes to be exported.

ViPR Controller sasvc service log reports the following API call to check the VirtualArray association to the cluster:

vipr1 sasvc 2018-08-08 14:31:52,380 [qtp514587349-47 - /catalog/asset-options/vipr.unassignedBlockVolume] INFO LoggingFilter.java (line 238) 196 > GET https://<VIP/FQDN>:4443/vdc/varrays/search?cluster=urn:storageos:Cluster:d08eb16f-e018-4e14-bff4-bd209d1cf8e2:vdc1

ViPR Controller sasvc service log reports the response to the above API call:

vipr1 sasvc 2018-08-08 14:31:52,685 [qtp514587349-47 - /catalog/asset-options/vipr.unassignedBlockVolume] INFO LoggingFilter.java (line 125) 196 < 200 took 305 ms{"resource":[]}



As the API response is empty, this prevents ViPR Controller from showing volumes that are available to be exported to the cluster.

The cluster that the user is attempting to export to contains hosts that are associated with more than one VirtualArray.

The API call used in "Export Volume to a Host" expects the hosts in the cluster to be associated with just one VirtualArray.

This behaviour is by design.

The service catalog "Export Volume to a Host" is not designed to export a VPLEX volume to a cluster whose hosts are associated with more than one VirtualArray.

The service catalog "Export VPLEX Volume" should be used to export a volume to clusters of this configuration.

The API call used in "Export VPLEX Volume" is run against each host in the cluster, as opposed to the cluster itself:

vipr1 sasvc 2018-08-30 12:25:55,259 [qtp514587349-1211] INFO LoggingFilter.java (line 238) 1361 > GET https://<VIP/FQDN>:4443/vdc/varrays/search?host=urn:storageos:Host:019d835c-3998-4b81-b083-5e2202b0ba3f:vdc1
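For troubleshooting, the same association can be queried directly against the ViPR Controller REST API. A hedged sketch assuming the standard token-based authentication flow; the search endpoint is the one shown in the sasvc log above, while the login step, header name, and URN are illustrative and must be adapted to the environment:

# Authenticate; ViPR Controller returns the session token in the X-SDS-AUTH-TOKEN response header
curl -k -u admin -D - -o /dev/null https://<VIP/FQDN>:4443/login

# Check which VirtualArrays a given host is associated with
curl -k -H "X-SDS-AUTH-TOKEN: <token>" "https://<VIP/FQDN>:4443/vdc/varrays/search?host=urn:storageos:Host:<host-id>:vdc1"

An empty {"resource":[]} response for the cluster-level query is what prevents the volume list from populating in "Export Volume to a Host".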

As an example, take the following:

  • The user has added a VPLEX Metro to ViPR Controller.
    • No VPLEX cross-connect configured
  • The user has configured 2 VirtualArrays in ViPR Controller
    • The first represents VPLEX cluster-1 and its backend array
    • The second represents VPLEX cluster-2 and its backend array
  • The user has configured a stretched host cluster with 4 hosts
    • 2 of the hosts have connectivity back to VPLEX cluster-1 and VA_1 by association
    • 2 of the hosts have connectivity back to VPLEX cluster-2 and VA_2 by association

When the user attempts to run "Export Volume to a Host" against this cluster, the API expects all 4 hosts to be associated with the same VirtualArray.

When the user attempts to run "Export VPLEX Volume", this API call is not made and ViPR Controller is able to populate the volume field successfully.

Related:

  • No Related Posts

Data Migration from SVC into VPLEX

Hi there,

I would like to migrate LUNs of my bare metal hosts from SVC to VPLEX. If possible without downtime.

I have ESX boot LUNs in SVC, for example, which I need to move to my VPLEX. What's the best way to do this?

In SVC I was able to zone the "old" storage device into SVC and provide the old LUNs as so-called "Image Mode VDisks" to the same host. Migrating all these LUNs into a different pool made the old storage device obsolete after completion.

Storage is provided to my VPLEX from two Unitys and to my SVC from two V7000.

Can I zone the SVC to my VPLEX in some way?

Should I provide storage from my Unitys to my SVC (in its own pool), move the vdisks from the V7000 devices into that Unity pool, and then provide it to my VPLEX in some way?

Any hints or best practices?

Best regards

globber

Related:

  • No Related Posts

VPLEX Metro: How does the management of IOPS work?

This is a question from a client who is analyzing IOPS consumption between the storage arrays at the two ends of the VPLEX; both are similar.

My explanation is that, with distributed file systems, it is logical that the IOPS are similar, but the client states that, according to his analysis, VPLEX grants access to both ends at the same time based on the best availability.

According to what I previously knew and what I have been able to find out so far, this does not happen within the same consistency group; what can be done for an active/active configuration applies to different consistency groups.

Am I right, or am I wrong?

Thank you very much for sharing your knowledge.

Related:

  • No Related Posts

VPlex KCS Newsletter November 2018


KCS newsletters are available to view on the VPlex Remote Support Inside Dell Page: https://inside.dell.com/groups/vplex

  • KB article: https://support.emc.com/kb/526188 (Customers)

  • KB article: https://support.emc.com/kb/458412 (Customers)

  • KB article: https://support.emc.com/kb/332240 (Customers)

New KBs to address NDU issues

  • KB article: https://support.emc.com/kb/523838 (Customer)

  • KB article: https://support.emc.com/kb/523981 (Customer)

KBs for issues in 6.0 SP1 P5

  • VPLEX: For VS6 Hardware Possible Total Cluster Outage (TCO) during NDU I/O Forwarding Phase due to FCID differences - https://support.emc.com/kb/502886 (Customers)

Note:

Do not dispatch for replacement of any parts for this issue.

It must first be escalated to engineering for their review and recommendation of what action(s) need to be taken.

Current Target Codes

GeoSynchrony 5.5 Service Pack 2 Patch 4 (5.5.2.04.00.01) as of 6th April 2018

GeoSynchrony 6.0 Service Pack 1 Patch 7 (6.0.1.07.00.04) as of 15th May 2018

GeoSynchrony 6.1.x (Karst) was released in September 2018

  • Karst Q2’FY19 release of 6.1.0.00.00.23

“Nalanda” = VPLEX GeoSynchrony 6.2

End Of Service Life (EOSL)

  • VPLEX VE – EOSL Date is 28th February 2019
  • VPLEX VS2 Geo – EOL Date is 31st July 2015
  • VPLEX VS1 – EOSL Date 31st August 2017
  • 5.1.x EOSL on 30 April 2016
  • 5.2.x EOSL on 30 April 2017
  • 5.3.x EOSL on 30 April 2018
  • 5.4.x EOSL on 30 April 2019

  • ETA 501370 EMC VPLEX: Potential silent data inconsistency if the target leg of a FULL Thin Rebuild or migration is a thin device in GeoSynchrony 6.0 Service Pack 1, and later - https://support.emc.com/kb/501370 (Customers)

NOTE:

This issue is specific to FULL Thin Rebuilds and migrations where the target is a thin device

DR1 log rebuilds (including Thin log rebuilds) are unaffected by this issue.

All rebuilds on thick devices are unaffected by this issue.

All rebuilds on thin devices where VPLEX rebuilds are set to thick are unaffected by the issue.

Latest Top 10 Viewed KB Articles:

1. 323253: VPLEX: How to collect logs from a VPLEX Instance - https://support.emc.com/kb/323253 (Customers)

2. 334964: VPLEX: How to restart VPlexManagementConsole to refresh VPLEX CLI/GUI - https://support.emc.com/kb/334964 (Customers)

3. 324094: VPLEX: How to manually configure call home - https://support.emc.com/kb/324094 (Customers)

4. 330110: VPLEX: How to reset the admin password - https://support.emc.com/kb/330110 (Customers)

5. 457313: VPLEX: 0x8a4a91fa / 0x8a4a31fb Management Server partitions full or exceeded xx percent threshold limit - https://support.emc.com/kb/457313 (Customers)

6. 323313: VPLEX: How to perform a Basic Health check using VPlexcli - https://support.emc.com/kb/323313 (Customers)

7. 472497: VPLEX: Logical-units and Array Connectivity show as degraded - https://support.emc.com/kb/472497 (Customers)

8. 336564: VPLEX: How to identify ports on a VPLEX director with low Rx and/or Tx power (0x8a54600f, 0x8a36303a, 0x8a363039) - https://support.emc.com/kb/336564 (Customers)

9. 335182: VPLEX: Slow performance on VPLEX with a workload consisting of medium to large outstanding queue depth of large block reads - https://support.emc.com/kb/335182 (Customers)

10. 463942: VPLEX: Random temporary loss of connection to storage devices and/or performance degradation on ESXi hosts from version 5.5 u2 - https://support.emc.com/kb/463942 (Customers)

Useful to know Articles:

  • KB# 484453- VPLEX: Best Practice Guidelines – Master Article [https://support.emc.com/kb/484453]
    • This master KB article provides links to up to date VPlex Best Practices documents

With swarm requests increasing, the following KB articles will assist in initiating and engaging in a successful collaboration:

  • KB#484448- VPLEX: How to collaborate with VPlex Support [https://support.emc.com/kb/484448]
    • This KB article provides a template for the case owner to fill out
    • The article also requests VPlex collect-diagnostics logs and provides the link to the KB article on how to collect these

  • KB#336056- How To Collaborate with the XtremIO Team [https://support.emc.com/kb/336056]
    • This KBA outlines the information needed in order to effectively collaborate with the XtremIO support team

  • KB#335164- Connectrix: How to collaborate with the Connectivity Team [https://support.emc.com/kb/335164]
    • This KB article outlines how to collaborate with Connectivity
    • This will enable the Connectivity team to efficiently assist in swarm requests

The following articles can be provided to the customer for gathering Host Grabs to avoid unnecessary collaborations with the Host teams:

  1. How To Create Solaris Grabs: https://support.emc.com/kb/468243
  2. How To Run EMC Grabs On An AIX Host: https://support.emc.com/kb/335706
  3. How To Create Linux Grabs: https://support.emc.com/kb/468251
  4. How To Run EMC Grabs on HP-UX: https://support.emc.com/kb/335700
  5. What logs need to be collected for Microsoft Windows-based EMC software?: https://support.emc.com/kb/304457
  6. How do I run an ESX or ESXi EMCGrab?: https://support.emc.com/kb/323232

** Recovery Procedures **

Recovery procedures for data loss events in underlying back-end storage arrays, explaining what VPlex can and cannot do to recover data.

VPlex symptom: dead storage-volume

VNX Uncorrectables:

https://support.emc.com/kb/488035: VNX1/2: Uncorrectables/Coherency reported on VNX/VNX2 provisioned to a VPLEX (where device is marked as Dead due to SCSI check Condition 3/11/0) (DellEMC correctable)

Raid-1 device recovery:

https://support.emc.com/kb/470385 Recovering a VPLEX Metro Geo leg for distributed devices after storage failure or Data Loss on that leg occurred.

Additional info:

  • SolVe procedure generator: VPlex > VPlex procedures > administration procedures > manage > Recovering a raid-1 leg after experiencing data loss due to backend array failure. This is a more detailed version of KB 470385 above.
  • KB 488035 above also links to many supporting KB articles and documents.

Related:

  • No Related Posts

ViPR Controller: Change virtual pool order failing when migrating multiple VPLEX volumes

Article Number: 525383 Article Version: 3 Article Type: Break Fix



ViPR Controller Controller 3.0,ViPR Controller Controller 3.5,ViPR Controller Controller 3.6,ViPR Controller Controller 3.6 SP1,ViPR Controller Controller 3.6 SP2,VPLEX GeoSynchrony 6.0 Service Pack 1 Patch 6

Issue 1: A Change Virtual Pool order fails for multiple VPLEX local/distributed volumes. Attempts to migrate multiple VPLEX volumes through ViPR Controller in one Change Virtual Pool order result in the error 'The controller references resource: [xxxxxxxx4], whereas the hardware reported the actual resource as: [[xxxxxxxx5]]' occurring randomly during LUN WWN validation.

Issue 2: After the "cancel" action of the original VPLEX migration, the rollback takes a long time to complete. The ViPR Controller rollback of a VPLEX migration initiated by cancelling takes around 90 minutes to complete.

Impacted versions:

VPLEX: prior to code 6.0.1 P7

ViPR: prior to code 3.6.2.2

Impacted ViPR order name:

Change Virtual Pool (VPLEX data migration)

For issue 1, the root cause is that VPLEX returned incorrect information when ViPR sent multiple LUN validation requests to VPLEX:

Volume name: CVP_1

Native ID: device_VCKMxxxxx8-xxxxxxx4

Volume name:CVP_2

Native ID: device_VCKMxxxxx8-xxxxxxx5

ViPR sent the LUN query request:

vipr1 vipr1 controllersvc-vplex-api 2018-06-22 08:40:52,660 [22815|changeVirtualPool|f5af1f18-dbd4-4548-84c9-ebd49b5bf1ae] INFO VPlexApiDiscoveryManager.java (line 4017) Drill-down command POST data is {"args":"-r device_VCKMxxxxx8-xxxxxx4"}

VPLEX returned the information:

vipr1 vipr1 controllersvc-vplex-api 2018-06-22 08:40:54,084 [22815|changeVirtualPool|f5af1f18-dbd4-4548-84c9-ebd49b5bf1ae] INFO VPlexApiDiscoveryManager.java (line 4021) Drill-down command response is {

"response": {

"context": null,

"message": null,

"exception": null,

"custom-data": "local-device: device_VCKMxxxxx8-xxxxxxx5 (cluster-1)\n extent: extent_VCKMxxxxx8-xxxxxxx5_1 \n storage-volume: VCKMxxxxx8-xxxxxxx5

This produces the error message 'The controller references resource: [xxxxxxxx4], whereas the hardware reported the actual resource as: [[xxxxxxxx5]]'.

From the ViPR Controller logs for validating device device_VCKMxxxxx8-xxxxxxx4, ViPR Controller sent a request to VPLEX for device device_VCKMxxxxx8-xxxxxxx4, but VPLEX responded with device device_VCKMxxxxx8-xxxxxxx5. ViPR Controller compared device_VCKMxxxxx8-xxxxxxx4 (in the database) with device_VCKMxxxxx8-xxxxxxx5 (returned from VPLEX); the comparison failed, prompting the error.

This issue will not occur when migrating a single VPLEX volume (as opposed to multiple VPLEX volumes) in each Change virtual pool order.

For issue 2, the root cause is a code bug. In the current code, when a volume migration job hits an exception (for any reason, for example a cancellation triggered by ViPR Controller), ViPR keeps retrying for one hour (see log entries like "Updating status IN_PROGRESS") before updating the status to FAILED. This delay in updating the migration job status is the reason ViPR Controller takes a long time to complete the rollback.

2018-07-05 07:33:49,464 pool-58-thread-1 ERROR VPlexMigrationJob.java (line 185) Unexpected error getting status of migration urn:storageos:Migration:a593219b-c4af-4ec9-bbc5-2c72e9bc482a:vdc1 on VPlex urn:storageos:StorageSystem:09be660c-9aa8-42d7-bdba-a527d39246b1:vdc1: null

com.emc.storageos.vplex.api.VPlexApiException: Could not find the migration with name M_xxxxxxx

2018-07-05 07:33:49,465 pool-58-thread-1 DEBUG VPlexMigrationJob.java (line 195) Updating status IN_PROGRESS

2018-07-05 07:33:49,465 pool-58-thread-1 DEBUG VPlexMigrationJob.java (line 82) Polled migration job

2018-07-05 08:41:26,219 pool-58-thread-1 ERROR VPlexMigrationJob.java (line 185) Unexpected error getting status of migration urn:storageos:Migration:a593219b-c4af-4ec9-bbc5-2c72e9bc482a:vdc1 on VPlex urn:storageos:StorageSystem:09be660c-9aa8-42d7-bdba-a527d39246b1:vdc1: null

com.emc.storageos.vplex.api.VPlexApiException: Could not find the migration with name M_xxxxxxx

2018-07-05 08:41:26,220 pool-58-thread-1 DEBUG VPlexMigrationJob.java (line 195) Updating status FAILED

Workaround for issue 1:

1. Execute the first Change Virtual Pool order with only one VPLEX volume.

2. Confirm that the validation of the first order has completed and that the migration has started on VPLEX.

3. Run the second Change Virtual Pool order with only one VPLEX volume.

4. Repeat steps 1, 2, and 3 above when multiple VPLEX volumes need to be migrated.

Permanent fix for issue 1:

1. Upgrade the VPLEX code to 6.0.1 P7, which will be released around 9/17/18.

Workaround for issue 2:

None.

Permanent fix for issue 2:

1. Upgrade ViPR Controller to version 3.6.2.2, which will be released around the end of Q3 2018. After upgrading to 3.6.2.2, the rollback should complete in a few minutes.

Related:

  • No Related Posts

ViPR SRM: missing LVM Physical Volumes in reports for Linux hosts

Article Number: 525094 Article Version: 3 Article Type: Break Fix



ViPR SRM,ViPR SRM 3.7,ViPR SRM 4.0,ViPR SRM 4.1,ViPR SRM 4.2

The RSH collector does not correctly collect LVM information on some Linux hosts. As a result, there are inconsistencies in LVM reports regarding LV and PV numbers and names.

The "/sbin/lvs -o lv_name,vg_name,devices" command displays multiple PVs for some LVs if the LVM is configured that way, for example:

LV VG Devices

root rpool /dev/sda2(0)

root rpool /dev/sda2(1920)

swap rpool /dev/sda2(768)



data_LV host_VG /dev/mapper/VPLEX.09fc(0),/dev/mapper/VPLEX.09fd(0)

Notice that LV data_LV has 2 devices/PVs listed. Currently, the LunMappingDetection.pl script used to do discovery on hosts cannot handle multiple PVs per LV.
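As a hedged illustration of the parsing problem (this is not the actual LunMappingDetection.pl logic), a collector that assumes one device per LV breaks on the comma-separated Devices column; splitting that column per PV is the kind of handling required:

# Emit one "LV VG PV" line per physical volume by splitting the comma-separated Devices column
/sbin/lvs --noheadings -o lv_name,vg_name,devices | while read lv vg devs; do
    for dev in $(echo "$devs" | tr ',' ' '); do
        # Strip the trailing extent offset, e.g. /dev/sda2(0) -> /dev/sda2
        echo "$lv $vg ${dev%(*}"
    done
done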

This problem is fixed in ViPR SRM 4.3 (Jira SRS-38372).

Related:

  • No Related Posts