RecoverPoint: ScaleIO initialization not progressing; host crashes or hangs under high load

Article Number: 503004 Article Version: 3 Article Type: Break Fix

RecoverPoint, RecoverPoint CL, RecoverPoint EX, ScaleIO Product Family, ScaleIO Software

I/Os to offsets greater than 1T cause a short initialization on the consistency group (CG). The host crashes or hangs under high load.

Symptoms found in the logs:

ScaleIO host logs:

sdw1 kernel: attempt to access beyond end of device

sdw1 kernel: sdr: rw=33, want=2147483736, limit=1967000000


sdw4 kernel: NMI watchdog: BUG: soft lockup - CPU#40 stuck for 22s! [splDataPathExec:97896]

localhost kernel: INFO: task dd:3540 blocked for more than 120 seconds.

Splitter logs:

sdw1 kernel: 4967/4967: RPS:#0 - spl_kbox_end_io : offset = 2147483704, len = 16384, MajorMinor(65, 16), error status = -5

sdw1 kernel: 1351/1351: RPS:#1 - CommandIoSplit_KboxEndIo: Immediate MOH is true. Moving to Tracking. vol guid=0xe102395df73e4b67

RPA (Storage):

st_handle_write_atio: huge write !!! len=1048576 max_chunk_len=524288 (in bytes), ox_id=0x16, cd_remote_entity_id=0x6bca7c6343a4ccca, vlun=0x2c2d8
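
The numbers in the log lines above can be cross-checked with a little arithmetic. The following is a minimal sketch, not a support tool. It assumes that the kernel "want" and "limit" values are in 512-byte sectors (the usual Linux block-layer unit), that "1T" means 2**40 bytes, and that the splitter "offset" is in sectors while "len" is in bytes. The last assumption is an inference: under it, the splitter I/O ends exactly at the "want" sector reported by the kernel. The function names are hypothetical.

```python
import re

SECTOR_BYTES = 512   # assumption: Linux block-layer sector size
ONE_TIB = 1 << 40    # assumption: the article's "1T" means 2**40 bytes

KERNEL_RE = re.compile(r"want=(\d+), limit=(\d+)")
SPLITTER_RE = re.compile(r"offset = (\d+), len = (\d+)")
RPA_RE = re.compile(r"len=(\d+) max_chunk_len=(\d+)")


def kernel_access(line):
    """Return (end_of_io_bytes, device_size_bytes) from a want/limit line."""
    m = KERNEL_RE.search(line)
    if m is None:
        return None
    want, limit = (int(g) for g in m.groups())
    return want * SECTOR_BYTES, limit * SECTOR_BYTES


def splitter_end_sector(line):
    """Return the sector at which a splitter I/O ends (offset assumed in sectors)."""
    m = SPLITTER_RE.search(line)
    if m is None:
        return None
    offset, length = (int(g) for g in m.groups())
    return offset + length // SECTOR_BYTES


def rpa_chunk_ratio(line):
    """Return how many times the write length exceeds max_chunk_len."""
    m = RPA_RE.search(line)
    if m is None:
        return None
    length, max_chunk = (int(g) for g in m.groups())
    return length / max_chunk


# Fragments of the log lines quoted in this article.
kernel_line = "sdr: rw=33, want=2147483736, limit=1967000000"
splitter_line = "spl_kbox_end_io : offset = 2147483704, len = 16384"
rpa_line = "huge write !!! len=1048576 max_chunk_len=524288 (in bytes)"

end_bytes, device_bytes = kernel_access(kernel_line)
print(end_bytes - ONE_TIB)                 # 45056 bytes past the 2**40 boundary
print(end_bytes > device_bytes)            # True: the I/O ends beyond the device
print(splitter_end_sector(splitter_line))  # 2147483736, same as the kernel "want"
print(rpa_chunk_ratio(rpa_line))           # 2.0
```

Under these assumptions, the failing I/O ends 88 sectors past the 2**40-byte mark, which is consistent with the issue being triggered by I/Os to offsets greater than 1T.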

Splitter type(s): ScaleIO Splitter

Affected versions: 5.0.1

The splitter places no limit on the number of in-flight I/Os. The splitter's RPA I/O timeout flow is handled incorrectly. Historically, the RPA has supported I/Os only to addresses up to 1T.




Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.

