What is a "read disturb" and are flash vendors doing anything to mitigate this issue at the controller level?
A read disturb occurs when reading a NAND cell causes data to change in bordering or nearby cells within the same flash block. All of the cells on a NAND flash chip sit on the same die, so there can be cross-coupling between bordering cells in a block. Reading a cell requires applying voltage to other cells in the block, not just the one being read. Repeated often enough, that voltage can shift the charge on a neighboring cell far enough to push it past its threshold, and the result is a change in that cell's state. What's curious about this flash problem is that the cell being read is not the one that changes; it is the read cell's neighbors that are at risk. Unsolicited data change is never a good thing.
In general, read disturb is not all that common today because it takes a lot of reads to a specific cell for it to happen. The number of reads required to cause a read disturb is usually measured in the hundreds of thousands. Flash storage vendors are also quite cognizant of the read disturb problem, and that awareness has led them to build a series of measures to mitigate or eliminate it.
The first is a per-block read counter that tracks reads since the block's last erase cycle. When the count reaches a set threshold, the controller copies the block's contents to another unused or erased block. That fully refreshes the data, and the count starts over. The second is more sensitive error correction code (ECC) that detects a read disturb error and corrects it. If the flash controller does not copy the block before read disturb errors occur, and there are too many errors for the ECC to correct, data loss will occur.
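To make the first measure concrete, here is a minimal sketch in Python of a read counter with block refresh. It is a toy model under stated assumptions, not any vendor's firmware: the names `ToyController`, `Block` and `READ_THRESHOLD` are hypothetical, the threshold value is illustrative only, and real controllers also handle wear leveling, ECC and bad blocks, which are left out here.

```python
from dataclasses import dataclass, field

# Assumed, illustrative threshold; real values are vendor-specific.
READ_THRESHOLD = 100_000


@dataclass
class Block:
    data: list = field(default_factory=list)
    reads_since_erase: int = 0


class ToyController:
    """Hypothetical controller that refreshes a block after too many reads."""

    def __init__(self, num_blocks, read_threshold=READ_THRESHOLD):
        self.blocks = [Block() for _ in range(num_blocks)]
        self.free = list(range(num_blocks))  # erased physical blocks
        self.mapping = {}                    # logical block -> physical block
        self.read_threshold = read_threshold
        self.refreshes = 0

    def write(self, logical, pages):
        # Place the data in the next erased block and record the mapping.
        physical = self.free.pop(0)
        self.blocks[physical].data = list(pages)
        self.mapping[logical] = physical

    def read(self, logical, page):
        physical = self.mapping[logical]
        block = self.blocks[physical]
        value = block.data[page]
        # Every read of any page counts against the whole block.
        block.reads_since_erase += 1
        if block.reads_since_erase >= self.read_threshold:
            self._refresh(logical)
        return value

    def _refresh(self, logical):
        old = self.mapping[logical]
        new = self.free.pop(0)
        # Copy the contents to an erased block and remap to it.
        self.blocks[new].data = list(self.blocks[old].data)
        self.mapping[logical] = new
        # Erase the old block; its read count starts over.
        self.blocks[old].data = []
        self.blocks[old].reads_since_erase = 0
        self.free.append(old)
        self.refreshes += 1


ctrl = ToyController(num_blocks=2, read_threshold=3)
ctrl.write(0, ["a", "b"])
for _ in range(3):
    ctrl.read(0, 0)
print(ctrl.refreshes, ctrl.mapping[0])
```

The count is kept per block rather than per cell because a read of any page stresses the rest of the block. In this sketch the copy happens as soon as the threshold is reached, which is the point of the measure: the data moves before errors build up beyond what ECC could correct.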
This was first published in July 2014