Corrupt eMMC card?

oddulf
Posts: 10
Joined: Mon Nov 12, 2018 8:42 pm
languages_spoken: english, swedish
ODROIDs: C1+, XU4Q
Has thanked: 0
Been thanked: 0

Corrupt eMMC card?

Unread post by oddulf » Mon Nov 12, 2018 11:13 pm

Hi all,

First post, first ODROID (a C1+). All went swimmingly with a preloaded eMMC running Ubuntu 18.04 MATE until last Friday.

I'm only using it headless, so I removed some MATE packages (LibreOffice etc.). I also did a major apt upgrade and tested a (probably dodgy) micro USB hub. A few hours later it died. I got it back with fsck and set up an fsck scan on every reboot, but it kept rebooting at irregular intervals, usually within 20 minutes.

After spending the weekend sorting out anything with low priority in the journalctl (missing modules, usb hub crashing etc) it still kept rebooting. The only thing left was:

Nov 11 21:09:10 kir1 org.freedesktop.Notifications[1046]: Unable to init server: Could not connect: Connection refused
Nov 11 21:09:10 kir1 mate-notificati[2224]: cannot open display:
Nov 11 21:13:13 kir1 kernel: set watch dog suspend timeout 6 seconds

Since I don't need MATE anyway, I decided to reflash the eMMC card with a minimal Ubuntu 18.04.1.
After a few failed flashing attempts, I got it going by deleting both partitions in GParted before flashing.

I left it with an SSH session open. After an hour and a half it crashed again; here is the UART log:

[  241.435380@1] INFO: task kworker/3:1:34 blocked for more than 120 seconds.
[  241.438658@1] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  241.460016@1] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/43
[  241.479978@1] Kernel panic - not syncing: hung_task: blocked tasks
[  241.480643@1] CPU: 1 PID: 43 Comm: khungtaskd Not tainted 3.10.107-11 #2
[  241.487349@1] [<c0014a40>] (unwind_backtrace+0x0/0xf4) from [<c0011a44>] (show_stack+0x10/0x14)
[  241.496053@1] [<c0011a44>] (show_stack+0x10/0x14) from [<c0676368>] (panic+0xa0/0x1f4)
[  241.503939@1] [<c0676368>] (panic+0xa0/0x1f4) from [<c0092740>] (watchdog+0x244/0x294)
[  241.511867@1] [<c0092740>] (watchdog+0x244/0x294) from [<c004aeb8>] (kthread+0xb0/0xb4)
[  241.519864@1] [<c004aeb8>] (kthread+0xb0/0xb4) from [<c000dcc0>] (ret_from_fork+0x14/0x34)
[  241.528091@2] CPU2: stopping
[  241.530857@2] CPU: 2 PID: 76 Comm: irq/110-sdhc Not tainted 3.10.107-11 #2
[  241.537791@2] [<c0014a40>] (unwind_backtrace+0x0/0xf4) from [<c0011a44>] (show_stack+0x10/0x14)
[  241.546500@2] [<c0011a44>] (show_stack+0x10/0x14) from [<c00132cc>] (handle_IPI+0xd4/0x17c)
[  241.554829@2] [<c00132cc>] (handle_IPI+0xd4/0x17c) from [<c0008470>] (gic_handle_irq+0x58/0x5c)
[  241.563520@2] [<c0008470>] (gic_handle_irq+0x58/0x5c) from [<c000d800>] (__irq_svc+0x40/0x70)
[  241.572022@2] Exception stack(0xebe5fe78 to 0xebe5fec0)
[  241.577177@2] fe60:                                                       0ec803ff 003ffffc
[  241.585568@2] fe80: 3ffff8e0 fe109000 c0a848e4 00000000 03f0003d 003ffffc c098e98c 00002000
[  241.593904@2] fea0: 600f0013 00000000 00000000 ebe5fec0 c023062c c001f5e0 a00f0013 ffffffff
[  241.602246@2] [<c000d800>] (__irq_svc+0x40/0x70) from [<c001f5e0>] (cycle_read_timerE1+0x10/0x14)
[  241.611115@2] [<c001f5e0>] (cycle_read_timerE1+0x10/0x14) from [<c023062c>] (__timer_delay+0x28/0x5c)
[  241.620327@2] [<c023062c>] (__timer_delay+0x28/0x5c) from [<c04a0a44>] (aml_sdhc_wait_ready+0x5c/0x9c)
[  241.629618@2] [<c04a0a44>] (aml_sdhc_wait_ready+0x5c/0x9c) from [<c04a1ccc>] (aml_sdhc_data_thread+0x450/0x900)
[  241.642855@2] [<c04a1ccc>] (aml_sdhc_data_thread+0x450/0x900) from [<c0093ef4>] (irq_thread+0xc0/0x128)
[  241.655649@2] [<c0093ef4>] (irq_thread+0xc0/0x128) from [<c004aeb8>] (kthread+0xb0/0xb4)
[  241.661587@2] [<c004aeb8>] (kthread+0xb0/0xb4) from [<c000dcc0>] (ret_from_fork+0x14/0x34)
[  241.669811@3] CPU3: stopping
[  241.673214@3] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.10.107-11 #2
[  241.679135@3] [<c0014a40>] (unwind_backtrace+0x0/0xf4) from [<c0011a44>] (show_stack+0x10/0x14)
[  241.687884@3] [<c0011a44>] (show_stack+0x10/0x14) from [<c00132cc>] (handle_IPI+0xd4/0x17c)
[  241.696200@3] [<c00132cc>] (handle_IPI+0xd4/0x17c) from [<c0008470>] (gic_handle_irq+0x58/0x5c)
[  241.704882@3] [<c0008470>] (gic_handle_irq+0x58/0x5c) from [<c000d800>] (__irq_svc+0x40/0x70)
[  241.713404@3] Exception stack(0xec6a3f98 to 0xec6a3fe0)
[  241.718577@3] 3f80:                                                       00000003 00000000
[  241.726957@3] 3fa0: 006fbb28 00000000 c0976590 c0686078 ec6a2000 c09d7b5e ec6a2000 c09d7b5e
[  241.735280@3] 3fc0: ec6a2000 ec6a2000 00000000 ec6a3fe0 c000efd4 c000efd8 60070013 ffffffff
[  241.743630@3] [<c000d800>] (__irq_svc+0x40/0x70) from [<c000efd8>] (arch_cpu_idle+0x28/0x2c)
[  241.752041@3] [<c000efd8>] (arch_cpu_idle+0x28/0x2c) from [<c006e540>] (cpu_startup_entry+0xf8/0x154)
[  241.764625@3] [<c006e540>] (cpu_startup_entry+0xf8/0x154) from [<00872ec4>] (0x872ec4)
[  241.770404@0] CPU0: stopping
[  241.773844@0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.107-11 #2
[  241.779748@0] [<c0014a40>] (unwind_backtrace+0x0/0xf4) from [<c0011a44>] (show_stack+0x10/0x14)
[  241.788480@0] [<c0011a44>] (show_stack+0x10/0x14) from [<c00132cc>] (handle_IPI+0xd4/0x17c)
[  241.796817@0] [<c00132cc>] (handle_IPI+0xd4/0x17c) from [<c0008470>] (gic_handle_irq+0x58/0x5c)
[  241.805484@0] [<c0008470>] (gic_handle_irq+0x58/0x5c) from [<c000d800>] (__irq_svc+0x40/0x70)
[  241.813993@0] Exception stack(0xc0963f68 to 0xc0963fb0)
[  241.819130@0] 3f60:                   00000000 00000000 00030364 00000000 c0976590 c0686078
[  241.827493@0] 3f80: c0962000 c09d7b5e c0962000 c09d7b5e c0962000 c0962000 00000000 c0963fb0
[  241.835779@0] 3fa0: c000efd4 c000efd8 600f0013 ffffffff
[  241.840903@0] [<c000d800>] (__irq_svc+0x40/0x70) from [<c000efd8>] (arch_cpu_idle+0x28/0x2c)
[  241.849306@1] SMP: failed to stop se_i24e+8x98/0@0] [<c000efd8>] (arch_cpu_idle+0x28/0x2c) from [<c006e540>] (cpu_startup_entry+0xf8/0x154)
[  241.865021@0] [<c006e540>] (cpu_startup_entry+0xf8/0x154) from [<c0927ac8>] (start_kernel+0x348/0x354)

Any ideas? Could the card be corrupted or should I persevere? It boots up fine again, but I don't know for how long. I am under a bit of time pressure so would need to order another card today if the current one is no good.

Many thanks for your help!!

odroid
Site Admin
Posts: 32121
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 119 times
Been thanked: 292 times

Re: Corrupt eMMC card?

Unread post by odroid » Tue Nov 13, 2018 9:14 am

Which power supply do you use?

Re: Corrupt eMMC card?

Unread post by oddulf » Tue Nov 13, 2018 4:57 pm

The Hardkernel 5V/2A power supply with the UK plug.

Re: Corrupt eMMC card?

Unread post by oddulf » Tue Nov 13, 2018 8:13 pm

My main concern is to know if anything in the UART log would indicate a hardware/eMMC fault. It's been running beautifully overnight so I'm less concerned than yesterday.
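
One way to get an overview of a panic log like the one in the first post is to list which task each CPU was running when the panic fired. This is a minimal sketch, not a diagnosis tool: the function name `tasks_per_cpu` is made up for illustration, and the sample lines are copied from the UART log above.

```python
import re

# Matches the per-CPU header lines the kernel prints before each backtrace.
CPU_LINE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")


def tasks_per_cpu(log_lines):
    """Map CPU number -> (pid, task name) from 'CPU: n PID: p Comm: name' lines."""
    tasks = {}
    for line in log_lines:
        match = CPU_LINE.search(line)
        if match:
            cpu, pid, comm = match.groups()
            tasks[int(cpu)] = (int(pid), comm)
    return tasks


# Lines taken verbatim from the UART log in the first post.
sample = [
    "[  241.480643@1] CPU: 1 PID: 43 Comm: khungtaskd Not tainted 3.10.107-11 #2",
    "[  241.530857@2] CPU: 2 PID: 76 Comm: irq/110-sdhc Not tainted 3.10.107-11 #2",
    "[  241.673214@3] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.10.107-11 #2",
]

for cpu, (pid, comm) in sorted(tasks_per_cpu(sample).items()):
    print(f"CPU{cpu}: pid {pid} {comm}")
```

In that log, CPU 2 was in the `irq/110-sdhc` thread inside `aml_sdhc_wait_ready` while the other CPUs were idle or reporting the panic. That points at the SD/eMMC host controller path being busy at the time, but on its own it does not prove a hardware fault.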

After posting this I found some oldish references elsewhere on this forum to random crashes caused by the C1+ sometimes being unable to cope with low-speed USB peripherals. Is this still the case? All three of my keyboards show up as 1.5Mbps in lsusb -t, but one of them (in fact, the one plugged in at the time of the UART-logged crash) has shown some strange behaviour at other times too. Although a keyboard is not needed once deployed, I will need to use a data cable to the UPS, which shows up as 1.5Mbps as well.

Reading some of the other posts, would changing the /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor, which currently has the value "interactive", make it more robust? Should I use a USB hub from the micro USB port?
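
For reference, the current governor and the alternatives the kernel offers can be read from the same sysfs directory. This is a minimal sketch, assuming the standard cpufreq files `scaling_governor` and `scaling_available_governors` are present on this kernel; the function name `read_governor_info` is made up for illustration.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for CPU 0 (the path quoted above).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor_info(cpufreq_dir=CPUFREQ_DIR):
    """Return (current_governor, available_governors).

    Returns (None, []) if the files are missing or unreadable,
    e.g. when run on a machine without cpufreq support.
    """
    def read(name):
        try:
            return (Path(cpufreq_dir) / name).read_text().strip()
        except OSError:
            return None

    current = read("scaling_governor")
    available = read("scaling_available_governors")
    return current, (available.split() if available else [])


current, available = read_governor_info()
print("current governor:", current)
print("available governors:", available)
```

This only reports the settings. Whether switching away from "interactive" makes the board more robust is not established anywhere in this thread.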

Many thanks!
Ulf

tobetter
Posts: 3902
Joined: Mon Feb 25, 2013 10:55 am
languages_spoken: Korean, English
ODROIDs: X, X2, U2, U3, XU3, C1
Location: Paju, South Korea
Has thanked: 38 times
Been thanked: 154 times

Re: Corrupt eMMC card?

Unread post by tobetter » Tue Nov 13, 2018 9:42 pm

If the crash happened while your keyboard was connected, I'd recommend connecting it to the USB OTG port. Obviously, you will need a USB OTG cable, as well as a USB hub if you need multiple input devices. If you suspect that your eMMC is corrupted, you can run fsck.ext4 on it from another Linux box, and you can also grep the kernel log for "mmcblk" or "ext4".
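
The log search can be sketched in a few lines of Python. This is only an illustration: the helper names `filter_log` and `filter_log_file` are made up, and the kernel log is assumed to have been saved to a text file first (for example with `dmesg > kernel.log`).

```python
import re

# Keywords suggested above; pass a different tuple to search for others.
DEFAULT_KEYWORDS = ("mmcblk", "ext4")


def filter_log(lines, keywords=DEFAULT_KEYWORDS):
    """Return the log lines that mention any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def filter_log_file(path, keywords=DEFAULT_KEYWORDS):
    """Same as filter_log, but reads the lines from a saved log file."""
    with open(path, errors="replace") as handle:
        return filter_log(handle, keywords)
```

For example, `filter_log_file("kernel.log")` returns every line mentioning the eMMC block device or the ext4 filesystem. Adding "sdhc" to the keywords (an assumption based on the `aml_sdhc_*` functions in the trace above) would also catch host-controller messages.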

rooted
Posts: 6610
Joined: Fri Dec 19, 2014 9:12 am
languages_spoken: english
Location: Gulf of Mexico, US
Has thanked: 104 times
Been thanked: 20 times

Re: Corrupt eMMC card?

Unread post by rooted » Tue Nov 13, 2018 10:16 pm

Between the khungtaskd and watchdog issues in the log it seems like perhaps watchdog needs to be disabled?

What are you running on the device? Seems to be something CPU intensive?
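
As far as I can tell, the "Kernel panic - not syncing: hung_task: blocked tasks" line comes from the kernel's hung-task detector (the khungtaskd thread), and the `watchdog` frame in that backtrace appears to be that thread's main function rather than a hardware watchdog. The detector's settings can be checked with a sketch like the one below. It assumes the standard `hung_task_timeout_secs` file (named in the log itself) and the `hung_task_panic` sysctl are exposed on this kernel; the function name `read_hung_task_settings` is made up for illustration.

```python
from pathlib import Path

# Directory holding the kernel.* sysctls.
SYSCTL_DIR = Path("/proc/sys/kernel")


def read_hung_task_settings(sysctl_dir=SYSCTL_DIR):
    """Return the hung-task detector settings as integers.

    A value of None means the file is missing or unreadable,
    e.g. on a kernel built without hung-task detection.
    """
    settings = {}
    for name in ("hung_task_timeout_secs", "hung_task_panic"):
        try:
            settings[name] = int((Path(sysctl_dir) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


print(read_hung_task_settings())
```

A non-zero `hung_task_panic` would mean a task blocked for longer than the timeout triggers a panic instead of just a warning. This does not say why the task was blocked in the first place.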

Re: Corrupt eMMC card?

Unread post by oddulf » Thu Nov 15, 2018 11:05 pm

Thank you both tobetter and rooted for your helpful answers!

Yes, I did run fsck on the root partition and it fixed a lot of things, so yes, there was some corruption. I have enabled fsck on every startup now, but it doesn't seem to have had to repair anything since. It's been running fine ever since Monday and the kernel logs are clean, so I'll stick with the card. Just got my OTG hub for the keyboard in the post today, so I will test it out tomorrow :-)

I'm running Ubuntu 18.04 minimal headless, to be deployed remotely. Got quite a lot of extra packages on there so maybe that's showing up in CPU activity? Don't know. I'm under the impression that watchdog is another safety mechanism especially useful for remote devices, so I'll stick with it until it causes further problems.

Many thanks again for your comments; they really confirmed that what I had been thinking is right and that I can have confidence in the card!
Ulf
