Re: DD Cloud Tier chunk size – what is it?

Hi David,

In general, most of the traffic going from a DDR to the cloud will be file data. This is stored in ~64KB ‘compression regions’ (chunks), and this size is not configurable. Other types of data (e.g. metadata) are also written to the cloud, but this is likely to be a much smaller proportion of what is uploaded.

Note, however, that it’s not possible to take a file on the DDR and divide its size by 64KB to work out how many PUT requests you are likely to see, as all data in the cloud is de-duplicated and compressed.

For example, let’s say you have a 10MB file on the active tier of your DDR which you are going to migrate to the cloud. You might think you can calculate 10MB / 64KB = 160 PUT requests. This wouldn’t be correct, however, for the following reasons:

– The data being written to the cloud will be de-duplicated against data already in the cloud. For example, if 95% of the data in your 10MB file already exists within the cloud unit you are migrating to (as it’s referenced by other files which have already been migrated), the DDR will only need to upload the 5% of unique data (i.e. 512KB / 8 PUT requests). Working out how much of a file on the active tier is ‘unique’ when compared with existing data in a cloud unit is very complex and certainly not something that customers can do themselves, so you cannot gain any insight into how much data a file will upload during migration without actually uploading it.

– The data being written to the cloud will be compressed prior to upload. Again, let’s consider that 95% of the file’s data already exists in the cloud unit, so only 512KB needs to be physically uploaded. If this is compressed via lz before being uploaded it might get 2x compression, so now only 256KB of physical data needs to be uploaded (i.e. 4 PUT requests). Again, compression ratios depend on a number of factors and it’s pretty much impossible to say how ‘compressible’ some data is without actually compressing it during migration.
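
To make the arithmetic in the example above concrete, here is a rough Python sketch. It is only an illustration of the sums, not anything the DDR exposes: the function name and its parameters are hypothetical, it ignores metadata uploads, and the two key inputs (the unique fraction and the compression ratio) are exactly the values you cannot know before migration, so treat the result as a what-if figure rather than a prediction.

```python
import math

# ~64KB compression region size quoted above (treated as exactly 64 KiB here)
CHUNK_BYTES = 64 * 1024


def estimate_put_requests(file_bytes, unique_fraction, compression_ratio):
    """Hypothetical what-if estimate of PUT requests for one migrated file.

    unique_fraction   -- assumed share of the file not already in the cloud unit (0..1)
    compression_ratio -- assumed compression factor, e.g. 2.0 for 2x
    """
    if not 0.0 <= unique_fraction <= 1.0:
        raise ValueError("unique_fraction must be between 0 and 1")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")

    # Data left after de-duplication against the cloud unit
    unique_bytes = round(file_bytes * unique_fraction)
    # Data left after compression
    physical_bytes = unique_bytes / compression_ratio
    # One PUT per (partial) chunk, following the arithmetic in the example
    return math.ceil(physical_bytes / CHUNK_BYTES)


ten_mb = 10 * 1024 * 1024
print(estimate_put_requests(ten_mb, 1.0, 1.0))   # naive figure: 160
print(estimate_put_requests(ten_mb, 0.05, 1.0))  # 5% unique: 8
print(estimate_put_requests(ten_mb, 0.05, 2.0))  # 5% unique, 2x compression: 4
```

Plugging in a range of guesses for the two unknowns at least gives you upper and lower bounds to work with when budgeting.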

Basically, customers don’t get any insight into this process, which can make it hard to estimate exact costs. That being said, DD LTR (long term retention to cloud) has been designed so that it writes to and reads from the cloud as little as possible to minimise costs.

Sorry I don’t have a better answer but I hope this helps to some extent.

Thanks, James

