2020-05-23

<mru> ram bitflips would go undetected

2016-11-21

<hramrach> yes, I can see how doing integrity checks when you have RAM bitflips can cause failure whereas not doing the checks just potentially stores the flipped bit somewhere

2016-10-25

<ssvb> again, end users don't mind the risk of having occasional bitflip problems

2016-07-21

<jonkerj> bitflips all over

2016-06-14

<enrico_> (apart from bitflips problems etc... of course)
<bbrezillon> the other problem is that, on some NANDs, when you partially write an eraseblock, you get tons of bitflips in some pages, and at some point those bitflips turn into uncorrectable errors (because of read-disturb)

2016-06-12

<bbrezillon> and AFAICT, it's the same IP with some extra features like 'bitflips in erased page detection', or 'dedicated DMA interface'

2015-12-07

<jemk> it failed at bitflip each time, all bits 1 but a single one flipped to 0

2015-10-26

<asmir> ssvb: first it was a memory corruption (bitflip test I think), then deadlock

2015-06-15

<bbrezillon> actually the problem is not when you read it, but you might flash a u-boot image which might contain too many bitflips from the beginning
<bbrezillon> which means you might be more prone to bitflips

2015-05-27

<bbrezillon> and I'm not even mentioning that such an image might contain bitflips that should be corrected before even considering flashing it with raw accessors

2014-10-10

<hno> slapin, what do you mean by bitflip code?
<slapin> bbrezillon: these are common, and are the current source of FUD for libnand/ext3 vs ubifs. When you read page, some bitflip might occur in this and other pages
<bbrezillon> slapin: haven't read the whole discussion yet, but yes libnand is correctly dealing with bitflips thanks to the ECC engine and the read retry mechanism
<slapin> hno, hramrach, bbrezillon: btw, is there any bitflip code in libnand? can't see any :(
<slapin> UBIFS is prone to bitflip errors, but it is still more reliable than AW's block device.

2014-08-28

<petrosagg> bbrezillon1: but that number can't be higher than the ECC strength, since by definition this is the maximum amount of bitflips that can be corrected
<quitte> well - except for fixable bitflips detected
<bbrezill1> quitte: not for every read => for those that generate ECC errors (uncorrectable bitflips) or that exceed the bitflip threshold
<bbrezill1> quitte: we have to rework the read_retry to test all available internal bit levels and choose the bit level that generates the fewest bitflips, instead of sticking with the first one that can be corrected by ECC
<bbrezill1> quitte: this patch has a drawback => it prevents UBI from detecting pages where bitflips are almost exceeding the ECC strength
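
The rework described above (try every read-retry level and keep the one with the fewest bitflips) could look roughly like the sketch below. It is plain C, not driver code: the `flips` array and the `READ_UNCORRECTABLE` marker are hypothetical stand-ins for "set the level, read the page, ask the ECC engine how many bits it corrected".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker: ECC could not correct the page at this level. */
#define READ_UNCORRECTABLE (-1)

/*
 * flips[i] is the number of bitflips ECC corrected when the page was read
 * at retry level i, or READ_UNCORRECTABLE. Returns the level with the
 * fewest corrected bitflips, or -1 if no level gave a correctable read.
 * Ties go to the lowest level index.
 */
int best_retry_level(const int *flips, size_t levels)
{
    int best = -1;

    assert(flips != NULL);
    for (size_t i = 0; i < levels; i++) {
        if (flips[i] == READ_UNCORRECTABLE)
            continue;
        if (best < 0 || flips[i] < flips[best])
            best = (int)i;
    }
    return best;
}
```

With per-level results of uncorrectable, 7, 2 and 5 bitflips, this picks level 2, whereas stopping at the first correctable read would settle for level 1 and report 7 bitflips.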

2014-08-24

<bbrezillon> quitte_: well, actually it should work most of the time, but this is definitely not recommended (and the bitflips you're seeing when you try to load a big kernel, could possibly be fixed with randomization, even if read_retry will most likely fix those)
<bbrezillon> quitte_: randomization is here to have a good distribution of data over a NAND block in order to limit bitflips
<bbrezillon> quitte_: your bootloader will at most take a few (say < 40) pages, but kernel and UBI will fill entire blocks and the data stored there might contain similar patterns and thus generate more bitflips
<bbrezillon> the more pages you write in a block the more chance you have to get bitflips, and this is even worse when you write the same pattern on several pages
<bbrezillon> quitte: mtd->bitflip_threshold = mtd->ecc_strength + 1;
<quitte> bbrezillon: http://code.bulix.org/qp31kt-86809 is that actually what you meant to give me to increase the bitflip threshold?
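
The randomization mentioned in this exchange amounts to XORing page data with a pseudo-random stream, so that identical patterns do not end up in many pages of a block. The sketch below illustrates the idea only: the 16-bit LFSR, its taps and the seed handling are assumptions chosen for the example, not the polynomial or seeds any real NAND controller uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11; illustrative). */
static uint16_t lfsr_step(uint16_t s)
{
    uint16_t bit = (uint16_t)((s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u);

    return (uint16_t)((s >> 1) | (bit << 15));
}

/*
 * XOR buf with the LFSR stream started from seed. Running it twice with
 * the same seed restores the original data, so the same routine serves
 * for writing (scramble) and reading (descramble).
 */
void scramble(uint8_t *buf, size_t len, uint16_t seed)
{
    uint16_t s = seed;

    assert(buf != NULL);
    assert(seed != 0); /* an all-zero LFSR state never changes */
    for (size_t i = 0; i < len; i++) {
        uint8_t key = 0;

        for (int b = 0; b < 8; b++) {
            s = lfsr_step(s);
            key = (uint8_t)((key << 1) | (s & 1u));
        }
        buf[i] ^= key;
    }
}
```

Using a different seed per page (for instance derived from the page index) would keep two pages holding the same payload from storing the same bit pattern.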

2014-08-22

<quitte> bbrezillon: is that to increase the bitflip threshold?
<bbrezillon> quitte: AFAIU the messages, it says it's missing 1 LEB, and this might be caused by wear leveling trying to copy a block with too many bitflips to another block
<bbrezillon> quitte: you should somehow change the bitflip_threshold before attaching the ubi device
<bbrezillon> anthony_emtrion: and when there are too many bitflips (i.e. above the bitflip_threshold) some wear leveling layers (like UBI) "torture" the block to test its reliability
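
How the threshold discussed above interacts with the ECC strength can be shown with a small classifier. It is a sketch with hypothetical names (`classify_read`, `enum read_status`), modelled on the convention that a read is flagged for attention once the corrected bitflip count reaches the threshold, and fails once it exceeds what ECC can correct.

```c
#include <assert.h>

enum read_status {
    READ_OK,          /* data fine, nothing to do */
    READ_NEEDS_SCRUB, /* data corrected, but the block should be moved */
    READ_FAILED       /* more bitflips than ECC can correct */
};

/*
 * max_bitflips: highest number of bitflips corrected in any ECC step of
 * the read, or a negative value if ECC reported an uncorrectable error.
 */
enum read_status classify_read(int max_bitflips, unsigned ecc_strength,
                               unsigned threshold)
{
    assert(threshold >= 1); /* a threshold of 0 would flag every read */

    if (max_bitflips < 0 || (unsigned)max_bitflips > ecc_strength)
        return READ_FAILED;
    if ((unsigned)max_bitflips >= threshold)
        return READ_NEEDS_SCRUB;
    return READ_OK;
}
```

With `threshold` set to `ecc_strength + 1`, as in the line quoted on 2014-08-24, the `READ_NEEDS_SCRUB` branch can never be taken: every correctable read is reported as clean, so the upper layer stops moving blocks around but also loses its early warning.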

2014-08-20

<bbrezillon> quitte: read retry is just a way to change reference levels (see Fig. 1) between each read, and find the reference level where you hit the least amount of bitflips
<bbrezillon> quitte: the problem with that approach is that we may quickly hit the bitflip threshold, and thus UBI will keep moving data from one block to another ...
<bbrezillon> quitte: the current implementation retries the page read until there's a valid read (all the bitflips found on a page are successfully corrected)
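
The "retry until there's a valid read" behaviour described above can be sketched as follows. This is not the actual driver code: the `flips` array and `READ_UNCORRECTABLE` are hypothetical stand-ins for reading the page at each reference level and asking the ECC engine for the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker: ECC could not correct the page at this level. */
#define READ_UNCORRECTABLE (-1)

/*
 * flips[i] is the number of bitflips ECC corrected at reference level i,
 * or READ_UNCORRECTABLE. Returns the first level that yields a valid
 * read, or -1 if every level fails. The bitflip count at that level is
 * not compared with later levels.
 */
int first_valid_level(const int *flips, size_t levels)
{
    assert(flips != NULL);
    for (size_t i = 0; i < levels; i++) {
        if (flips[i] != READ_UNCORRECTABLE)
            return (int)i;
    }
    return -1;
}
```

Because the loop stops at the first correctable level, it may return a read with many corrected bitflips even when a later level would have produced fewer, which is how the bitflip threshold can be hit quickly.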

2014-08-19

<bbrezillon> petrosagg: it can't work without ECC (there are too many bitflips on MLC NANDs)

2014-08-13

<quitte> bbrezillon: nanddump gives me loads of "uncorrectable bitflip(s)" on the first partition. is nanddump what you used to read boot0 ?

2014-01-15

<bbrezillon> (I mean too many bitflips for ECC to gracefully correct them)
<bbrezillon> could you test writing on the first 5 pages of a block, and see if you get bitflips on the first page ?
<bbrezillon> I'm still experiencing weird things: when I write on the 4th page of a block I get a lot of bitflips on the 1st one (too many to be corrected by the ECC hw or soft)

2014-01-10

<oliv3r> yeah i think bitflips is one of the things ubifs is 'WIP'
<slapin_> mripard: bitflips problem is yet another thing which plagues flash interfaces (and ubifs for that matter)
<mripard> oliv3r: even reads can trigger bitflips to neighbour pages