RAID 5 write hole

The RAID write hole is a known data corruption issue in older and low-end RAID arrays, caused by interrupted destaging of writes to disk.[1]

Issue

Consider the following configuration with two data disks and a dedicated parity disk, as in RAID 3 or RAID 4 (RAID 5 rotates parity across the disks, but the failure mode is the same):

Device 1   Device 2   Parity device
1          0*         1 (stripe sum is odd)
0*         0          0 (stripe sum is even)
1          1          0 (stripe sum is even)
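The parity column in the table above can be reproduced by XOR-ing the data bits of each stripe. The following is a minimal sketch; the function name `stripe_parity` and the list layout are illustrative assumptions, not part of any RAID implementation.

```python
from functools import reduce
from operator import xor


def stripe_parity(data_bits):
    """Return the parity bit of one stripe: the XOR of its data bits.

    The result is 1 when the sum of the data bits is odd, 0 when it is even.
    """
    return reduce(xor, data_bits, 0)


# The three stripes from the table, as (Device 1, Device 2) pairs.
stripes = [(1, 0), (0, 0), (1, 1)]
parities = [stripe_parity(s) for s in stripes]
print(parities)  # [1, 0, 0]
```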

The array is being written to when an adverse event occurs, such as a power outage or a sudden disk failure. Suppose there are outstanding writes to the areas of disk marked with an asterisk (*) above. RAID requires that every data write be accompanied by a matching parity update, but the two are separate writes to separate devices and are not atomic. The following changes would be made:

Device 1   Device 2   Parity device
1          0 → 1      1 (stripe sum is odd) → 0 (stripe sum is even)
0 → 1      0          0 (stripe sum is even) → 1 (stripe sum is odd)
1          1          0 (stripe sum is even)
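The parity changes in the table above can be derived without reading the whole stripe: XOR the old parity with the old and new values of the changed bit. This read-modify-write rule is one common approach and is shown here only as a sketch; `updated_parity` is a hypothetical helper name.

```python
def updated_parity(old_parity, old_data, new_data):
    """Return the new parity bit after one data bit in the stripe changes.

    XOR-ing out the old data bit and XOR-ing in the new one keeps the
    parity equal to the XOR of all data bits in the stripe.
    """
    return old_parity ^ old_data ^ new_data


# Stripe 1: Device 2 changes 0 -> 1, parity was 1.
print(updated_parity(1, 0, 1))  # 0
# Stripe 2: Device 1 changes 0 -> 1, parity was 0.
print(updated_parity(0, 0, 1))  # 1
```

Each update touches two devices (the data device and the parity device), which is why an interruption between the two writes can leave the stripe inconsistent.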

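However, the writing process is interrupted partway through, so data and parity no longer agree. Depending on the implementation of the RAID driver and the underlying hardware (for example, out-of-order caching), this may affect a single stripe or several. In the example below, the first stripe received its data write but not its parity update, and the second received its parity update but not its data write: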

Device 1   Device 2   Parity device
1          1          1 (stripe sum is odd - WRONG)
0          0          1 (stripe sum is odd - WRONG)
1          1          0 (stripe sum is even)
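A consistency check over the state above would flag the first two stripes. The sketch below assumes each stripe is stored as a (Device 1, Device 2, parity) tuple; `inconsistent_stripes` is a hypothetical name and does not correspond to any particular RAID tool.

```python
def inconsistent_stripes(array):
    """Return the indices of stripes whose stored parity does not match
    the XOR of their data bits."""
    bad = []
    for index, (d1, d2, parity) in enumerate(array):
        if d1 ^ d2 != parity:
            bad.append(index)
    return bad


# State of the array after the interrupted writes.
after_crash = [(1, 1, 1), (0, 0, 1), (1, 1, 0)]
print(inconsistent_stripes(after_crash))  # [0, 1]
```

Such a check can only report that data and parity disagree. It cannot tell which of the two is stale, because in this example the first stripe has new data with old parity and the second has old data with new parity.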

The error may remain undetected indefinitely, for example because application-layer redundancy or other practices mask it. The main problem arises when a disk in the array fails: each lost block in an affected stripe is reconstructed from the remaining blocks, so inconsistent parity yields incorrect data. The corrupted bits can be anywhere on the virtual RAID device, including in a completely unrelated file or in the filesystem metadata. This is known as the "RAID 5 write hole".
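To illustrate the rebuild problem, the sketch below takes the post-interruption state from the example, assumes Device 1 then fails, and reconstructs it from Device 2 and the parity device. The tuple layout and the name `rebuild_device1` are illustrative assumptions.

```python
def rebuild_device1(array):
    """Reconstruct Device 1 for every stripe as Device 2 XOR parity,
    ignoring the (lost) Device 1 value."""
    return [d2 ^ parity for (_lost, d2, parity) in array]


# (Device 1, Device 2, parity) after the interrupted writes.
after_crash = [(1, 1, 1), (0, 0, 1), (1, 1, 0)]

on_disk_before_failure = [d1 for (d1, _d2, _p) in after_crash]
rebuilt = rebuild_device1(after_crash)

print(on_disk_before_failure)  # [1, 0, 1]
print(rebuilt)                 # [0, 1, 1]
```

In the first stripe the pending write targeted Device 2, yet it is the Device 1 bit that comes back wrong after the rebuild, which shows how the corruption can land in data unrelated to the interrupted write.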

Potential causes

Mitigation

References