How can I find the cause of a one-time file content corruption?

A few days ago I copied a large (56 GB) file from a workstation to a file server. When I checked the copy, I found that a few bytes differed from the original.

Details:

  • source system:
    • Medion Akoya P5350 D
    • Windows 8.1 Pro 64 bit
    • SATA HDD (NTFS)
  • destination system:
    • HP ProLiant MicroServer N36L, ECC RAM
    • Windows Server 2012 R2 Standard
    • ReFS on Storage Spaces 2-way mirror

The file was copied by drag-and-drop from the local disk to the network shared folder. The file size is 56886041991 bytes.

A second copy made the same way one day later was OK (checked with md5sum). Comparing the corrupted copy with the original reveals 97 bytes that differ (see below).
The only pattern I see is that the broken bytes are clustered in three groups, and within each group every 128th byte is changed.
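
For anyone who wants to reproduce this kind of comparison without cmp, here is a minimal Python sketch that lists the offsets of differing bytes, groups them into clusters, and reports the spacing inside each cluster. The function names, the 1 MiB read size, and the 4096-byte clustering gap are my own assumptions for illustration, not anything prescribed by a tool.

```python
CHUNK = 1 << 20  # read size per step (1 MiB, an arbitrary choice)


def diff_offsets(path_a, path_b, chunk=CHUNK):
    """Yield the 0-based offset of every byte that differs between two files.

    Note: cmp -l prints 1-based offsets, so its numbers are one higher.
    """
    base = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if not a and not b:
                break
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield base + i
                # If one file is shorter, count the surplus bytes as differing.
                for i in range(min(len(a), len(b)), max(len(a), len(b))):
                    yield base + i
            base += max(len(a), len(b))


def cluster(offsets, max_gap=4096):
    """Group ascending offsets into runs whose neighbours are <= max_gap apart."""
    groups = []
    for off in offsets:
        if groups and off - groups[-1][-1] <= max_gap:
            groups[-1].append(off)
        else:
            groups.append([off])
    return groups


def strides(group):
    """Return the sorted set of distances between consecutive offsets."""
    return sorted({b - a for a, b in zip(group, group[1:])})


def report(path_a, path_b):
    """Return one summary line per cluster of differing bytes."""
    return [
        f"{len(g)} bytes differ from offset {g[0]} to {g[-1]}, strides {strides(g)}"
        for g in cluster(diff_offsets(path_a, path_b))
    ]
```

Calling `report("original.bin", "copy.bin")` (hypothetical file names) and printing the lines should show whether each cluster really has a single stride of 128, and where each cluster starts relative to sector or page boundaries.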

What can I do? Where should I start looking for the cause?
It cannot be the disks: they would report a read error in case of corruption, and even if they did not, ReFS would notice the bad checksum and read the sector from the other disk; if that copy were corrupted too, it would (should) report a read error. SATA has CRC. The server's RAM has ECC. The network has two layers of checksums (the Ethernet frame CRC and the TCP checksum). The workstation, however, has no ECC memory. Could it be a network driver bug?

Output of cmp -l
