N900 microSD card I/O errors and corruption

Paul Hartman

2011-03-29 15:32 UTC
Hi,

I've got three microSD cards. They work fine on my PCs, I've done
read/write tests and data is not corrupted. But, in my N900, two of
the three are not stable, leading to corruption.

Transcend 8GB class 6 - bad
Adata 16GB class 10 - bad
Sandisk 16GB class 2 - good

I suspect maybe the N900 isn't providing enough voltage to the SD card
and some cards are less tolerant of low-voltage situations than
others. Does anyone know if it's possible to tell what voltage it is
using or change the voltage of the SD card in N900?

Or if there's some other explanation... maybe I have bad luck, maybe
the cards are bad but only the N900 can expose it.

dmesg shows things like this:

[33713.501464] mmcblk1: error -110 sending read/write command,
response 0x900, card status 0xe00
[33713.501495] mmcblk1: error -110 transferring data, sector 27271168,
nr 8, card status 0xc00
[33713.570129] end_request: I/O error, dev mmcblk1, sector 27271169
[33713.570159] Buffer I/O error on device mmcblk1p4, logical block 0
[33713.570159] lost page write due to I/O error on mmcblk1p4
[33754.895355] mmcblk1: error -110 transferring data, sector 30941184,
nr 16, card status 0xc00
[33754.895690] end_request: I/O error, dev mmcblk1, sector 30941185
[33754.895721] Buffer I/O error on device mmcblk1p4, logical block 458752
[33754.895751] lost page write due to I/O error on mmcblk1p4
[33754.895812] end_request: I/O error, dev mmcblk1, sector 30941192
[33754.895843] Buffer I/O error on device mmcblk1p4, logical block 458753
[33754.895843] lost page write due to I/O error on mmcblk1p4
[33755.504272] mmcblk1: error -110 transferring data, sector 31203328,
nr 16, card status 0xc00
[33755.504638] end_request: I/O error, dev mmcblk1, sector 31203329
[33755.504669] Buffer I/O error on device mmcblk1p4, logical block 491520
[33755.504699] lost page write due to I/O error on mmcblk1p4
[33755.504760] end_request: I/O error, dev mmcblk1, sector 31203336
[33755.504760] Buffer I/O error on device mmcblk1p4, logical block 491521
[33755.504791] lost page write due to I/O error on mmcblk1p4
[33756.204315] mmcblk1: error -110 sending read/write command,
response 0x900, card status 0xe00
[33756.204345] mmcblk1: error -110 transferring data, sector 31465472,
nr 16, card status 0xc00
[33756.268493] end_request: I/O error, dev mmcblk1, sector 31465473
[33756.268524] Buffer I/O error on device mmcblk1p4, logical block 524288
[33756.268554] lost page write due to I/O error on mmcblk1p4
[33756.268585] end_request: I/O error, dev mmcblk1, sector 31465480
[33756.268615] Buffer I/O error on device mmcblk1p4, logical block 524289
[33756.268615] lost page write due to I/O error on mmcblk1p4
[33756.968139] mmcblk1: error -110 sending read/write command,
response 0x900, card status 0xe00
[33756.968200] mmcblk1: error -110 transferring data, sector 31727616,
nr 16, card status 0xc00
[33757.027191] end_request: I/O error, dev mmcblk1, sector 31727617
[33757.027221] Buffer I/O error on device mmcblk1p4, logical block 557056
[33757.027252] lost page write due to I/O error on mmcblk1p4
[33757.027313] end_request: I/O error, dev mmcblk1, sector 31727624
[33757.027313] Buffer I/O error on device mmcblk1p4, logical block 557057
[33757.027343] lost page write due to I/O error on mmcblk1p4
[33757.727172] mmcblk1: error -110 sending read/write command,
response 0x900, card status 0xe00
[33757.727203] mmcblk1: error -110 transferring data, sector 31989760,
nr 16, card status 0xc00
[33757.786773] end_request: I/O error, dev mmcblk1, sector 31989761
[33757.786804] Buffer I/O error on device mmcblk1p4, logical block 589824
[33757.786834] lost page write due to I/O error on mmcblk1p4
[33757.786865] end_request: I/O error, dev mmcblk1, sector 31989768
[33758.486755] mmcblk1: error -110 sending read/write command,
response 0x900, card status 0xe00
[33758.486816] mmcblk1: error -110 transferring data, sector 32251904,
nr 16, card status 0xc00
[33758.549682] end_request: I/O error, dev mmcblk1, sector 32251905
[33758.549774] end_request: I/O error, dev mmcblk1, sector 32251912
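For reference, error -110 is ETIMEDOUT on Linux, so the controller gave up waiting for the card. The sector and logical block numbers in the log can also be cross-checked against each other. The following is only a rough sketch: it assumes 512-byte sectors and 4 KiB filesystem blocks (neither is stated in the log), and the helper names are made up for illustration. The two records are copied from the log above with the wrapped lines rejoined.

```python
import re

# Two records copied from the dmesg output above (wrapped lines rejoined).
LOG = """\
[33713.501495] mmcblk1: error -110 transferring data, sector 27271168, nr 8, card status 0xc00
[33713.570159] Buffer I/O error on device mmcblk1p4, logical block 0
[33754.895355] mmcblk1: error -110 transferring data, sector 30941184, nr 16, card status 0xc00
[33754.895721] Buffer I/O error on device mmcblk1p4, logical block 458752
"""

# Assumption: 4 KiB filesystem blocks on 512-byte sectors.
SECTORS_PER_BLOCK = 4096 // 512


def failing_sectors(log):
    """Whole-device sector numbers from the 'transferring data' lines."""
    return [int(s) for s in re.findall(r"transferring data, sector (\d+)", log)]


def logical_blocks(log):
    """Partition-relative block numbers from the 'Buffer I/O error' lines."""
    return [int(b) for b in re.findall(r"logical block (\d+)", log)]


def partition_start(sector, block):
    """Device sector at which the partition would have to begin."""
    return sector - block * SECTORS_PER_BLOCK


sectors = failing_sectors(LOG)
blocks = logical_blocks(LOG)
starts = {partition_start(s, b) for s, b in zip(sectors, blocks)}
print(starts)
```

If the assumptions hold, every pair should yield the same partition start, which would suggest the block layer is reporting consistent positions and that the timeouts are not scattered at random.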

Re: N900 microSD card I/O errors and corruption

Eero Tamminen
2011-03-30 16:24 UTC
Hi,

On 03/29/2011 06:32 PM, ext Paul Hartman wrote:
> I've got three microSD cards. They work fine on my PCs, I've done
> read/write tests and data is not corrupted. But, in my N900, two of
> the three are not stable, leading to corruption.

Does it afterwards show as corrupted on the PC too?

> Transcend 8GB class 6 - bad
> Adata 16GB class 10 - bad
> Sandisk 16GB class 2 - good
>
> I suspect maybe the N900 isn't providing enough voltage to the SD card
> and some cards are less tolerant of low-voltage situations than
> others. Does anyone know if it's possible to tell what voltage it is
> using or change the voltage of the SD card in N900?

You aren't by any chance changing the cards by taking the back cover
out without powering off your device first?

Opening the back cover does an emergency shutdown on disks in case
user rips battery out next (that's apparently a common way to get
"phone not reachable" message back to your boss/wife/dog when they
call you, at least in some parts of the world).

If there were writes being done to the card at that time, it may
corrupt. Power off your device first if you want to be sure you
can switch the card safely.

Also, the back cover has a magnetic latch that's used for detecting
when it's opened. If you have something magnetic next to your phone,
it may cause phone to think that back cover is being opened. See:
https://bugs.maemo.org/show_bug.cgi?id=8235#c15


- Eero

> Or if there's some other explanation... maybe I have bad luck, maybe
> the cards are bad but only the N900 can expose it.
>
> dmesg shows things like this:
>
> [33713.501464] mmcblk1: error -110 sending read/write command,
> response 0x900, card status 0xe00
> [33713.501495] mmcblk1: error -110 transferring data, sector 27271168,
> nr 8, card status 0xc00
> [33713.570129] end_request: I/O error, dev mmcblk1, sector 27271169
> [33713.570159] Buffer I/O error on device mmcblk1p4, logical block 0
> [33713.570159] lost page write due to I/O error on mmcblk1p4
> [33754.895355] mmcblk1: error -110 transferring data, sector 30941184,
> nr 16, card status 0xc00
> [33754.895690] end_request: I/O error, dev mmcblk1, sector 30941185
> [33754.895721] Buffer I/O error on device mmcblk1p4, logical block 458752
> [33754.895751] lost page write due to I/O error on mmcblk1p4
> [33754.895812] end_request: I/O error, dev mmcblk1, sector 30941192
> [33754.895843] Buffer I/O error on device mmcblk1p4, logical block 458753
> [33754.895843] lost page write due to I/O error on mmcblk1p4
> [33755.504272] mmcblk1: error -110 transferring data, sector 31203328,
> nr 16, card status 0xc00
> [33755.504638] end_request: I/O error, dev mmcblk1, sector 31203329
> [33755.504669] Buffer I/O error on device mmcblk1p4, logical block 491520
> [33755.504699] lost page write due to I/O error on mmcblk1p4
> [33755.504760] end_request: I/O error, dev mmcblk1, sector 31203336
> [33755.504760] Buffer I/O error on device mmcblk1p4, logical block 491521
> [33755.504791] lost page write due to I/O error on mmcblk1p4
> [33756.204315] mmcblk1: error -110 sending read/write command,
> response 0x900, card status 0xe00
> [33756.204345] mmcblk1: error -110 transferring data, sector 31465472,
> nr 16, card status 0xc00
> [33756.268493] end_request: I/O error, dev mmcblk1, sector 31465473
> [33756.268524] Buffer I/O error on device mmcblk1p4, logical block 524288
> [33756.268554] lost page write due to I/O error on mmcblk1p4
> [33756.268585] end_request: I/O error, dev mmcblk1, sector 31465480
> [33756.268615] Buffer I/O error on device mmcblk1p4, logical block 524289
> [33756.268615] lost page write due to I/O error on mmcblk1p4
> [33756.968139] mmcblk1: error -110 sending read/write command,
> response 0x900, card status 0xe00
> [33756.968200] mmcblk1: error -110 transferring data, sector 31727616,
> nr 16, card status 0xc00
> [33757.027191] end_request: I/O error, dev mmcblk1, sector 31727617
> [33757.027221] Buffer I/O error on device mmcblk1p4, logical block 557056
> [33757.027252] lost page write due to I/O error on mmcblk1p4
> [33757.027313] end_request: I/O error, dev mmcblk1, sector 31727624
> [33757.027313] Buffer I/O error on device mmcblk1p4, logical block 557057
> [33757.027343] lost page write due to I/O error on mmcblk1p4
> [33757.727172] mmcblk1: error -110 sending read/write command,
> response 0x900, card status 0xe00
> [33757.727203] mmcblk1: error -110 transferring data, sector 31989760,
> nr 16, card status 0xc00
> [33757.786773] end_request: I/O error, dev mmcblk1, sector 31989761
> [33757.786804] Buffer I/O error on device mmcblk1p4, logical block 589824
> [33757.786834] lost page write due to I/O error on mmcblk1p4
> [33757.786865] end_request: I/O error, dev mmcblk1, sector 31989768
> [33758.486755] mmcblk1: error -110 sending read/write command,
> response 0x900, card status 0xe00
> [33758.486816] mmcblk1: error -110 transferring data, sector 32251904,
> nr 16, card status 0xc00
> [33758.549682] end_request: I/O error, dev mmcblk1, sector 32251905
> [33758.549774] end_request: I/O error, dev mmcblk1, sector 32251912
>


Re: N900 microSD card I/O errors and corruption

Paul Hartman

2011-04-02 23:43 UTC
On Wed, Mar 30, 2011 at 11:24 AM, Eero Tamminen <eero.tamminen@nokia.com> wrote:
> Hi,

Hi Eero, thanks for your response.

> On 03/29/2011 06:32 PM, ext Paul Hartman wrote:
>>
>> I've got three microSD cards. They work fine on my PCs, I've done
>> read/write tests and data is not corrupted. But, in my N900, two of
>> the three are not stable, leading to corruption.
>
> Does it afterwards show as corrupted on the PC too?

Yes.

> You aren't by any chance changing the cards by taking the back cover
> out without powering off your device first?

Definitely not that. The only time I ever remove my cover is to change
the card when I buy a new one, and I fully shut down the phone first
before doing so. My USB port still works, so I don't need to remove
the battery on a regular basis.

> Opening the back cover does an emergency shutdown on disks in case
> user rips battery out next (that's apparently a common way to get
> "phone not reachable" message back to your boss/wife/dog when they
> call you, at least in some parts of the world).

I would never do that. I rarely make or receive phone calls. I just
checked call logs on my N900 and I have 6 calls since 1st of January.
Those "pull the battery" people should learn about offline mode, hey.
:)

> Also, the back cover has a magnetic latch that's used for detecting
> when it's opened. If you have something magnetic next to your phone,
> it may cause phone to think that back cover is being opened. See:
>        https://bugs.maemo.org/show_bug.cgi?id=8235#c15

I'm aware of the magnetic switch, but I don't think it's that. I keep my
N900 on my desk and in my pocket, and I don't wear magnets or use cases
for my N900. The magnets on the back cover appear to be in place. I don't
have any of the dmesg lines about cover opened/closed.


But I have some new information!

I've experimented more with the Adata card, and now I notice that the
errors are 100% reproducible. If I mkfs it, the result is exactly the
same errors in dmesg (same blocks) every single time! So this makes me
really think it's a problem with the kernel drivers (or maybe the SD
controller itself) on the N900. I think if the card truly had
bad blocks, the wear leveling would cause the errors not to be the
same every time, and I'd see them also on my PC.

I formatted the cards using the SD Association's official formatting
tool on a MS Windows box, with full write erase over the whole device.
No errors and all went normally. Testing on my PC with 2 different
microSDHC card-readers I was not able to reproduce any corruption
after filling up the card, unmounting & removing, reinserting and
reading md5sum of the contents.
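The fill-and-verify test described above could be scripted roughly as follows. This is a minimal sketch, not the exact procedure used: the file names and sizes are arbitrary, and the demo at the bottom writes to a temporary directory rather than a real card. For a meaningful test the target would be the card's mount point, and the card should be unmounted and reinserted between writing and verifying so that reads come from the card and not the page cache.

```python
import hashlib
import os
import tempfile


def md5_of(path, chunk=1 << 20):
    """MD5 of a file, read in chunks so large files need little memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def write_test_files(directory, count, size):
    """Write `count` files of random data; return {path: expected md5}."""
    sums = {}
    for i in range(count):
        path = os.path.join(directory, f"test_{i:04d}.bin")
        data = os.urandom(size)
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data towards the device
        sums[path] = hashlib.md5(data).hexdigest()
    return sums


def verify(sums):
    """Return the paths whose current contents no longer match."""
    return [p for p, expected in sums.items() if md5_of(p) != expected]


# Demo on a temporary directory (stand-in for the card's mount point).
with tempfile.TemporaryDirectory() as demo_dir:
    expected = write_test_files(demo_dir, count=4, size=64 * 1024)
    print("corrupted files:", verify(expected))
```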

A simple test, on my PC:

$ sudo mkfs.ext3 -v -Ldebian -m0 /dev/sdg4
mke2fs 1.41.14 (22-Dec-2010)
fs_types for mke2fs.conf resolution: 'ext3'
Calling BLKDISCARD from 0 to 2612002816 failed.
Filesystem label=debian
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
159680 inodes, 637696 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=654311424
20 block groups
32768 blocks per group, 32768 fragments per group
7984 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 30 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

$ sudo fsck -v -f /dev/sdg4
fsck from util-linux 2.19
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

11 inodes used (0.01%)
0 non-contiguous files (0.0%)
0 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
27369 blocks used (4.29%)
0 bad blocks
1 large file

0 regular files
2 directories
0 character device files
0 block device files
0 fifos
0 links
0 symbolic links (0 fast symbolic links)
0 sockets
--------
2 files

So there are no errors found, and nothing shows up in dmesg. The new
partition is intact and works normally if I copy files, flush caches,
read back and checksum them on my PC.

However, when I perform the same thing on my N900, dmesg is full of
"-110" errors (that I posted in my first message), and fsck
immediately following mkfs finds errors in the new filesystem! That's
not good...
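One detail that may or may not mean anything: the logical block numbers in the dmesg output from my first message all sit at, or one block past, a multiple of 32768, which is the "blocks per group" value mke2fs printed above. A small sketch of that arithmetic follows. It assumes the filesystem on the N900 used the same group size as the PC run, which I have not verified.

```python
# Blocks-per-group value printed by mke2fs in the PC run above.
# Assumption: the filesystem made on the N900 used the same value.
BLOCKS_PER_GROUP = 32768

# "logical block" numbers copied from the dmesg output in the first message.
FAILED_BLOCKS = [0, 458752, 458753, 491520, 491521,
                 524288, 524289, 557056, 557057, 589824]


def group_and_offset(block):
    """Split a block number into (block group index, offset within group)."""
    return divmod(block, BLOCKS_PER_GROUP)


for block in FAILED_BLOCKS:
    group, offset = group_and_offset(block)
    print(f"block {block:>6}: group {group:>2}, offset {offset}")
```

So the failed writes land at the very start of block groups 0 and 14 through 18. That might fit mkfs writing per-group metadata, but it is only a guess.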

With my Sandisk Class 2 card, the test works successfully on the N900.
I'm using it even for a swap partition, so there's a constant and heavy
I/O load on the card and I never had any problems with that. So I
don't think it's a hardware problem with the cover or magnets.

When I googled about it, I found some very similar reports with the
same dmesg lines, about similar hardware (such as Pandora) and
discussions about SD card problems with voltages, that's why I thought
it might be relevant. But I have no idea how to determine the voltage
in use at any given time. Maybe it's a red herring. Maybe it's a
timing issue, where the sdhci driver cannot properly identify some
cards' capabilities (or maybe the card misrepresents its own
attributes) and doesn't set the proper clock speed/bandwidth/whatever
it has. Based on the error messages I got in dmesg I think something
like this might be responsible.

N900 kernel version is about 2 years old? So there have probably been
a lot of patches and glitch-workarounds added to the sdhci driver in
the kernel. Maybe they can be brought back to the 2.6.28 kernel.
However, if the SD controller itself in the N900 is unable to handle
these cards then I don't know if any driver will help.

MeeGo has a recent kernel (in fact, the reason I bought this fast card
was for testing MeeGo), so maybe someday, if I can install MeeGo to the
internal memory of the N900, I can test the SD card with it. But for
now I don't think I am able to do that.

And then, to confuse the matter even more, there is this person's story:
he had two N900s, and one specific SD card works fine in one device but
doesn't work in the other, which might indicate a hardware problem (or
a different revision?):
http://forums.internettablettalk.com/showpost.php?p=964462&postcount=25

My N900 is the USA version, purchased in December 2009, made in Korea,
and has revision 2101. In case IMEI can reveal anything useful about
the manufacturing batch I can tell you in private email.

Thanks for your suggestions.

Re: N900 microSD card I/O errors and corruption

Paul Hartman

2011-04-03 15:23 UTC
On Sat, Apr 2, 2011 at 11:43 PM, Paul Hartman
<paul.hartman+maemo@gmail.com> wrote:
> So there are no errors found, and nothing shows up in dmesg. The new
> partition is intact and works normally if I copy files, flush caches,
> read back and checksum them on my PC.
>
> However, when I perform the same thing on my N900, dmesg is full of
> "-110" errors (that I posted in my first message), and fsck
> immediately following mkfs finds errors in the new filesystem! That's
> not good...

I thought of another test: I attached the troublesome microSD card to
my N900 with a USB card reader, using the power kernel and USB Hostmode
Enabler. This way, it works perfectly, all tests finish with no
errors! The card reader is mounted with the USB mass storage driver.

So now I really think it might be some problem with the SD/MMC driver
in N900. Or a physical defect with the card slot hardware.

In N900 service manual it says the SD card slot operates at 48MHz but
I've read that high-speed SD cards should use 50MHz instead. I don't
know how to tell what's actually in use.

Re: N900 microSD card I/O errors and corruption

Paul Hartman

2011-04-24 06:57 UTC
On Tue, Mar 29, 2011 at 10:32 AM, Paul Hartman
<paul.hartman+maemo@gmail.com> wrote:
> I've got three microSD cards. They work fine on my PCs, I've done
> read/write tests and data is not corrupted. But, in my N900, two of
> the three are not stable, leading to corruption.

I've made a simple modification to the omap_hsmmc module and now the
errors and corruption seem to be gone completely. I will test some
more and see if it lasts. So far so good. Fingers crossed. Knock on
wood. :)

Re: N900 microSD card I/O errors and corruption

doctor watson
2011-04-24 22:10 UTC
Hi,
i just bought a brand new hyperspeed 16 GB SDHC card (officially class 10, does 18 MB/s read and 14 MB/s write).
the card is not a famous brand name. it's called "flashraptor".
i started to download maps for my vacation, which is in two days, when i noticed the same errors in dmesg as in post 1 of this thread.
tried to reproduce: it fails every time.
did mkfs again, and fsck afterwards always reports errors. connected the N900 to the pc in mass storage mode, ran the same tests, and they fail as well.
plugged the card into the pc in an sd card reader: all works fine.

i'm so annoyed since i needed this card for vacation.
@paul: if your modded module works for you, would you let me test it?
btw, i'm using kernel-power
Regards,
Doc

Re: N900 microSD card I/O errors and corruption

Jan Martinec
2011-05-03 07:23 UTC
I have similar problems with a 16GB Kingston class 10 SD card; a Kingston class 2 card works fine. Can you publish the modification in omap_hsmmc for testing?

> On Tue, Mar 29, 2011 at 10:32 AM, Paul Hartman
> <paul.hartman+maemo@gmail.com> wrote:
> > I've got three microSD cards. They work fine on my PCs, I've done
> > read/write tests and data is not corrupted. But, in my N900, two of
> > the three are not stable, leading to corruption.
>
> I've made a simple modification to the omap_hsmmc module and now the
> errors and corruption seem to be gone completely. I will test some
> more and see if it lasts. So far so good. Fingers crossed. Knock on
> wood. :)
>
>

Re: N900 microSD card I/O errors and corruption

Alban Browaeys

2011-05-17 13:09 UTC
> On Tue, Mar 29, 2011 at 10:32 AM, Paul Hartman
> <paul.hartman+maemo at gmail.com> wrote:
> > I've got three microSD cards. They work fine on my PCs, I've done
> > read/write tests and data is not corrupted. But, in my N900, two of
> > the three are not stable, leading to corruption.
>
> I've made a simple modification to the omap_hsmmc module and now the
> errors and corruption seem to be gone completely. I will test some
> more and see if it lasts. So far so good. Fingers crossed. Knock on
> wood. :)


May I ask what this simple modification was? Even if it ended up not
fully working, it might provide clues.

Best regards,
Alban


Re: N900 microSD card I/O errors and corruption

Paul Hartman

2011-05-17 14:36 UTC
On Tue, May 17, 2011 at 8:09 AM, Alban Browaeys <prahal@yahoo.com> wrote:
>> On Tue, Mar 29, 2011 at 10:32 AM, Paul Hartman
>> <paul.hartman+maemo at gmail.com> wrote:
>> > I've got three microSD cards. They work fine on my PCs, I've done
>> > read/write tests and data is not corrupted. But, in my N900, two of
>> > the three are not stable, leading to corruption.
>>
>> I've made a simple modification to the omap_hsmmc module and now the
>> errors and corruption seem to be gone completely. I will test some
>> more and see if it lasts. So far so good. Fingers crossed. Knock on
>> wood. :)
>
>
> May I ask what this simple modification was? Even if it ended up not
> fully working, it might provide clues.

In linux kernel sources drivers/mmc/host/omap_hsmmc.c replace the
set_data_timeout function with this one:

static void set_data_timeout(struct omap_hsmmc_host *host,
                             unsigned int timeout_ns,
                             unsigned int timeout_clks)
{
        uint32_t reg;

        reg = OMAP_HSMMC_READ(host->base, SYSCTL);

        /* Ignore timeout_ns/timeout_clks: always program the default DTO. */
        reg &= ~DTO_MASK;
        reg |= DTO << DTO_SHIFT;
        OMAP_HSMMC_WRITE(host->base, SYSCTL, reg);
}

Basically, it changes it to use the default DTO value of 0xE rather
than trying to calculate dynamic DTO based on the particular SD card
timings and characteristics. I read this advice on the linux-omap
mailing list from someone having the same problem (but with different
hardware).

I don't know if it's a problem in the driver DTO calculation logic, or
bad metadata in the SD cards causing that logic to be flawed. The same
workaround is done in other drivers for standard SD cards, too. Some
question why bother to change the DTO at all, if 0xE works with all of
them.
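To give a feel for what the fixed value means, here is a small model of the timeout arithmetic. It is a simplified sketch, not the driver's code. It assumes the OMAP controller counts 2^(13 + DTO) clock cycles before declaring a data timeout, and it uses the 48 MHz figure from the service manual as the clock, ignoring any clock divider that may be in effect. The function names are invented for illustration.

```python
DEFAULT_DTO = 0xE  # the fixed value used by the workaround above


def timeout_seconds(dto, clock_hz):
    """Timeout for a DTO value, assuming 2**(13 + DTO) clock cycles."""
    return (1 << (dto + 13)) / clock_hz


def smallest_dto(timeout_ns, clock_hz, max_dto=DEFAULT_DTO):
    """Smallest DTO whose cycle count covers the requested timeout.

    A simplified stand-in for a dynamic calculation, capped at max_dto.
    """
    cycles_needed = timeout_ns * clock_hz / 1e9
    for dto in range(max_dto + 1):
        if (1 << (dto + 13)) >= cycles_needed:
            return dto
    return max_dto


CLOCK_HZ = 48_000_000  # assumption: figure quoted from the service manual

print(f"DTO=0xE at 48 MHz: {timeout_seconds(DEFAULT_DTO, CLOCK_HZ):.3f} s")
print("DTO needed for a 100 ms timeout:", smallest_dto(100_000_000, CLOCK_HZ))
```

Under these assumptions the fixed 0xE setting allows a little under three seconds per transfer, far longer than a dynamically chosen value would for a card that asks for a short timeout. That would be consistent with a card that is slower than its own metadata claims, though I can't confirm that is what happens here.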

After a month using it like this, I have no problems. My SD card works
fine, no corruption.