Syncing two UAS devices causes Ubuntu to hang

Andrew Sayers
Posts: 21
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 4 times
Been thanked: 1 time
Contact:

Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Fri May 17, 2019 6:18 pm

Edit: summary of this thread

As of the time of this edit, Hardkernel are working on this issue. It turns out the problem has something to do with reading data from multiple USB devices at once, and it happens more quickly the faster you read data from them. The issue causes Ubuntu to stop recognising the N2's USB ports (including the micro-USB port) until you power the device off and on again, but ethernet still works so it's still possible to log in over SSH.

If you just came here looking for a solution, you can safely skip to odroid's latest update as of the time of writing.

How to check if you have this issue

To check if you have this issue, do the following:
  1. if your N2 crashes too quickly to test, optionally detach all your hard disks before powering it on
  2. open one terminal for each hard drive attached to your N2. SSH sessions over the built-in ethernet socket are best, but you can press alt+F1, alt+F2 etc. on the console to get multiple sessions
    • this issue causes all USB ports to fail, so you won't be able to connect through a USB keyboard or wireless ethernet dongle once you've triggered the bug
  3. in one terminal, run lsusb -t and make a note of how many lines it displays
  4. in every open terminal, run dmesg -w & - this will print a huge amount of debugging information, which you can ignore for now
  5. once you've finished typing dmesg -w & in all the terminals, it's time to trigger the bug...
    • if your N2 crashes on its own just by plugging the disks in, plug them in now and wait
    • if your N2 is stable until you actually use the disks, run dd if=/dev/sd<letter> of=/dev/null status=progress in each terminal (change <letter> to a in the first terminal, b in the second, and so on)
  6. watch any open terminal for a little while - you should see a few cryptic error messages from dmesg, then after a few minutes you'll get a message that's several screen-lengths long - this means you've replicated the bug
  7. if you have an ethernet cable connected to your N2, SSH should still work. Run lsusb -t again - you should see all your USB ports are now missing
  8. shut your device down...
    • if you can connect with SSH, do shutdown -h now in the normal way
    • otherwise, you'll have to unplug the N2
If your dd processes all fail in about a minute and lsusb shows nothing attached afterwards, then you probably have the bug described in this thread. If you have the same behaviour but it takes a while longer before it happens (less than 30 minutes), you probably have the same issue but your disk might be partially immune for some reason. If the test runs happily for over half an hour, your issue is probably different to this one.
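
The interpretation rule above can be written down as a small helper. This is only a sketch: the function names usb_line_count and classify_run are made up for illustration, and the cut-offs (1 minute and 30 minutes) are one reading of "about a minute" and "less than 30 minutes" in the text, not values confirmed by Hardkernel.

```shell
#!/bin/sh
# Sketch only: helper names and thresholds are assumptions, not part of the thread.

# Count the lines printed by `lsusb -t` (steps 3 and 7 compare this number).
usb_line_count() {
    lsusb -t 2>/dev/null | wc -l
}

# classify_run MINUTES
#   MINUTES = whole minutes until the dd processes failed and USB vanished,
#   or "never" if the test was still running happily after half an hour.
classify_run() {
    case "$1" in
        never)
            echo "probably a different issue" ;;
        ''|*[!0-9]*)
            echo "usage: classify_run MINUTES|never" >&2
            return 2 ;;
        *)
            if [ "$1" -le 1 ]; then
                echo "probably this bug"
            elif [ "$1" -lt 30 ]; then
                echo "probably this bug, disk partially immune"
            else
                echo "probably a different issue"
            fi ;;
    esac
}
```

For example, run usb_line_count before and after the test, then call classify_run 12 if USB died after roughly twelve minutes.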

Possible workaround

The following workaround is based on a suggestion from mad_ady. Most people report it fixes the bug, but it also slows down data transfer by 20-50%.
  1. apply the workaround temporarily until the next reboot: echo 32 | tee /sys/class/block/?d?/queue/max_sectors_kb > /dev/null
  2. optionally rerun the test to make sure this fix works for you
  3. apply the workaround automatically at reboot: echo '@reboot root echo 32 | tee /sys/class/block/?d?/queue/max_sectors_kb > /dev/null' > /etc/cron.d/odroid-workaround
  4. whenever you attach devices to an already-booted N2, rerun the workaround: echo 32 | tee /sys/class/block/?d?/queue/max_sectors_kb > /dev/null
  5. optionally check the workaround was applied correctly: cat /sys/class/block/?d?/queue/max_sectors_kb
    • you should see one line for each attached disk drive, each of which should be 32 if the fix was applied correctly
  6. when odroid come out with a fix, remove this workaround: rm /etc/cron.d/odroid-workaround
    • an older version of this post recommended a different workaround. If you used that workaround, you should also rm /etc/modprobe.d/odroid-workaround.conf
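
If you prefer a script to the one-liners, the apply and check steps above could be wrapped up roughly as follows. This is a sketch, not the thread's method: set_max_sectors and show_max_sectors are hypothetical names, a loop is used instead of tee, and the optional second argument (a directory to use instead of /sys/class/block) exists only so the functions can be tried on a scratch directory. Writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Sketch only: function names and the optional ROOT argument are assumptions.

# set_max_sectors VALUE [ROOT]
# Write VALUE to every ROOT/?d?/queue/max_sectors_kb
# (ROOT defaults to /sys/class/block).
set_max_sectors() {
    value=$1
    root=${2:-/sys/class/block}
    for f in "$root"/?d?/queue/max_sectors_kb; do
        [ -e "$f" ] || continue        # the glob matched nothing
        echo "$value" > "$f" || return 1
    done
}

# show_max_sectors [ROOT]
# Print "<device> <value>" for each matching device.
show_max_sectors() {
    root=${1:-/sys/class/block}
    for f in "$root"/?d?/queue/max_sectors_kb; do
        [ -e "$f" ] || continue
        dev=${f#"$root"/}              # e.g. sda/queue/max_sectors_kb
        dev=${dev%%/*}                 # e.g. sda
        echo "$dev $(cat "$f")"
    done
}
```

Running set_max_sectors 32 followed by show_max_sectors as root should then list each attached disk with the value 32, matching step 5.
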
Note for people using RAID arrays

If you have configured a RAID array, you need to apply the workaround to your RAID devices as well as your raw disks. The pattern .../block/?d?/... in the workaround should match raw disks (e.g. sda) and conventionally-named RAID arrays (e.g. md0), so the workaround should work for you automatically. But if your RAID devices use some other naming scheme (like /dev/my-raid-disk), you will need to change .../block/?d?/... to some other pattern that matches your scheme.
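
To see whether a given device name is covered by the default pattern, the same shell glob can be tested with a case statement. This is a small sketch; matches_default_pattern is a hypothetical helper name. Note that ?d? matches exactly three characters, so names like md10 or sdaa would also need a different pattern.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.

# matches_default_pattern NAME
# Succeeds if NAME would be matched by the ?d? glob used in the workaround.
matches_default_pattern() {
    case "$1" in
        ?d?) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example, matches_default_pattern md0 succeeds, while matches_default_pattern my-raid-disk fails.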

Original post (historical interest only - please ignore)

I'm running Ubuntu on an Odroid N2, which crashes in under a minute when syncing a RAID array containing two USB drives. I think I'm supposed to report to you guys to rule out errors in Odroid, Ubuntu or my own reasoning. But I suspect this is a hard drive quirk that should be patched into the kernel.

Steps to reproduce the error
  1. get an Odroid N2
  2. install the official Ubuntu 20190329 image on a micro SD card
  3. attach a Western Digital Gaming drive (USB ID 1058:261e)
  4. attach a Toshiba External USB 3.0 drive (USB ID 0480:0900)
  5. create an MD RAID1 array with both devices
  6. wait a few moments while the devices resync
  7. expected: the device slowly resyncs
    observed: the system hangs and prints the attached messages to kern.log
Note 1: I can't replicate the error using dd with either device on its own. Doing so would allow me to rule out one device, so I'd appreciate any suggestions.

Note 2: I can provide exact instructions for creating a RAID array, but the instructions will be fairly long and boring so I'd rather confirm there's no easier way to replicate the problem first.

Workaround and suggested next steps

I've been able to work around the issue like so:

Code: Select all

echo 'options usb-storage quirks=0480:0900:g' > /etc/modprobe.d/quirk.conf # either of these will
echo 'options usb-storage quirks=1058:261e:g' > /etc/modprobe.d/quirk.conf # work around the issue
According to the kernel parameters file, this disables transferring more than 240 sectors at a time on UAS devices. Because I can only replicate the issue by doing a RAID sync between the two devices, I guess that disabling large transfers on either device effectively disables it for both?
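
The quirks option takes a comma-separated list of vendor:product:flags entries, so both drives can be covered by a single line. A minimal sketch for building that line is below; quirk_line is a hypothetical helper name, and the g flag is the one described above.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.

# quirk_line FLAGS ID...
# Print a modprobe options line applying FLAGS to each vendor:product ID.
quirk_line() {
    flag=$1
    shift
    list=
    for id in "$@"; do
        list="${list:+$list,}$id:$flag"
    done
    echo "options usb-storage quirks=$list"
}
```

For example, quirk_line g 0480:0900 1058:261e prints a single options line covering both drives, which can then be redirected into a .conf file under /etc/modprobe.d/.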

I'm still left with the following possibilities:
  • could the Odroid itself be the problem? Has anyone else successfully configured a pair of UAS devices in a RAID array?
  • is there a way to trigger this issue on a single device, so we can confirm which one has the issue?
  • if we can confirm which device has the issue, would I be right that we're supposed to submit a patch to unusual_uas.h?
Attachments
kern.log
(67.33 KiB) Downloaded 17 times
Last edited by Andrew Sayers on Sun Jun 16, 2019 4:50 am, edited 5 times in total.

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Fri May 17, 2019 6:25 pm

We will try to reproduce the UAS driver stability issue by the end of next week, after running another USB storage stability test.
viewtopic.php?f=181&t=34849

Unread post by Andrew Sayers » Fri May 17, 2019 6:51 pm

Thanks for the quick reply. I've now subscribed to the other thread in case this is related, but a few notes:
  • the logs in that thread don't look much like mine - the only similar line I can see is tag#16 CDB: Write(16) there kinda resembles tag#0 CDB: Write same(16) for me
  • most people in that thread seem to be describing errors that happen randomly, sometimes days apart, whereas I can replicate this issue reliably in under a minute (which might be useful for debugging!)
  • that thread seems to describe USB disconnecting but the system carrying on, which is the opposite of my problem - the kernel crashes, but the lights on the Odroid and the devices keep blinking (for at least 8 hours, after which I pulled the power out)
  • both of my devices are native USB, not connected through an SATA bridge board
Anyway, good luck and let me know if I can do anything to help.

Unread post by odroid » Mon May 20, 2019 9:18 am

I believe your HDD enclosures should have a SATA bridge inside.
Can you show me "lsusb -t" and "lsusb" outputs?

Also let us know your current kernel version.
The latest one should be 4.9.177

xabolcs
Posts: 31
Joined: Fri Jun 22, 2018 6:37 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 34 times
Been thanked: 1 time
Contact:

Unread post by xabolcs » Mon May 20, 2019 3:10 pm

From the logs:

Code: Select all

May 17 06:17:39 andrew kernel: [  280.335362@0] CPU: 0 PID: 2070 Comm: kworker/0:3 Not tainted 4.9.162-22 #1
May 17 06:17:39 andrew kernel: [  280.335363@0] Hardware name: Hardkernel ODROID-N2 (DT)
Andrew Sayers, you should update your Ubuntu and try to reproduce the issue with the updated kernel again!

Unread post by Andrew Sayers » Mon May 20, 2019 5:18 pm

odroid wrote:
Mon May 20, 2019 9:18 am
I believe your HDD enclosures should have a SATA bridge inside.
Can you show me "lsusb -t" and "lsusb" outputs?

Also let us know your current kernel version.
The latest one should be 4.9.177
Sorry about the kernel version - I've been wiping and reinstalling Ubuntu fairly regularly, and must have forgotten to upgrade that time.

Requested debug info

Output of lsusb:

Code: Select all

Bus 002 Device 004: ID 1058:261e Western Digital Technologies, Inc. 
Bus 002 Device 003: ID 0480:0900 Toshiba America Inc 
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:0402 Chicony Electronics Co., Ltd Genius LuxeMate i200 Keyboard
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Output of lsusb -t:

Code: Select all

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 4: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
    |__ Port 2: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 2: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
Replicating on the latest kernel

I've now reinstalled again and reproduced the issue with the latest kernel. Here's the complete bash history (every command since reinstalling from the latest Ubuntu image):

Commands from session 1 - this includes upgrading the kernel after waiting for ntp to fix the date:

Code: Select all

apt-get update
date
apt-get update
apt-get dist-upgrade
lsusb > lsusb1.txt
lsusb -t > lsusb2.txt
fdisk /dev/sda # will create a GPT partition table with a single Linux RAID partition spanning the whole disk
fdisk /dev/sdb # will create a GPT partition table with a single Linux RAID partition spanning the whole disk
reboot
Commands from session 2a - this is where I trigger the bug. The system found an old partition and mounted it, and I've left the umount command in for completeness. You probably wouldn't need to run it while replicating the bug:

Code: Select all

dmesg -w &
while sleep 60 ; do date ; lsusb ; lsusb -t ; done | tee lsusb3.txt
while sleep 60 ; do date ; lsusb ; lsusb -t ; done | tee lsusb3.txt &
apt-get install mdadm
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sd?1 | tee mdadm1.log
umount /media/*
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sd?1 | tee mdadm2.log
mdadm --detail /dev/md0
mdadm --detail /dev/md0 > mdadm1.log
while sleep 5 ; do mdadm --detail /dev/md0 ; done | tee mdadm2.log
Commands from session 2b - this is where I shut the system down from an SSH session after triggering the bug:

Code: Select all

shutdown -h now
Other notes

I previously thought the bug caused the system to crash completely. Actually it just takes down USB like the bug in the other thread, but that includes the micro-USB hub that my keyboard was plugged into :oops:

As mentioned above, I normally partition my disks with one small VFAT partition followed by a Linux RAID partition (GPT type 29) with an ext4 filesystem. For this test, I made a single Linux RAID partition with no filesystem. References in syslog and kern.log to a VFAT partition on those drives probably refer to the old filesystem.

I made a few mistakes in the commands above. Here are some notes in case you or I repeat the steps based on the above:
  1. when capturing logs, I forgot to add 2>&1, so STDERR isn't captured
  2. I accidentally overwrote the output of the successful mdadm --create command. It didn't say anything particularly interesting, but it's worth renaming the second mdadm2.log to mdadm3.log
  3. the rolling mdadm --detail might be more useful if it had a date like the other while commands
  4. several commands can be better automated (e.g. echo ... | fdisk /dev/sda), which would make replication easier
  5. as mentioned above, a precautionary umount /media/* may or may not be necessary
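
Notes 1 and 3 above (capture STDERR, and timestamp the rolling output) could be combined in a small wrapper along these lines. This is a sketch under assumptions: run_logged is a hypothetical name, and the log file name in the usage comment is only an example.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.

# run_logged LOGFILE COMMAND [ARGS...]
# Print the date, run COMMAND, and append both its STDOUT and STDERR
# to LOGFILE while still showing them on the terminal.
run_logged() {
    log=$1
    shift
    { date; "$@"; } 2>&1 | tee -a "$log"
}

# Possible rolling use while replicating the bug:
#   while sleep 5 ; do run_logged mdadm3.log mdadm --detail /dev/md0 ; done
```

Because tee -a appends, rerunning a command no longer overwrites an earlier log, which would also have avoided the mistake in note 2.
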
This almost certainly isn't related, but while reading syslog I noticed several messages like the following:

Code: Select all

Mar 28 17:18:11 odroid kernel: [    0.000000@0] **********************************************************
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** trace_printk() being used. Allocating extra memory.  **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** This means that this is a DEBUG kernel and it is     **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** unsafe for production use.                           **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** If you see this message and you are not debugging    **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** the kernel, report this immediately to your vendor!  **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **********************************************************
I'm happy to open another thread for this, but consider yourselves notified I guess :)

Please let me know by the end of the day if you want anything else. I'm not planning to do any more work on my Odroid today, but I'll reinstall the image overnight and continue configuring my actual setup tomorrow.
Attachments
syslog.log
Complete syslog of the device (renamed to 'syslog.log' so the forum software will allow me to upload it)
(1.19 MiB) Downloaded 10 times
mdadm2.log
Rolling output of mdadm --detail /dev/md0
(12.15 KiB) Downloaded 10 times
mdadm1.log
Output of failed mdadm --create
(897 Bytes) Downloaded 10 times
lsusb3.txt
Rolling output of lsusb while replicating the bug
(7.63 KiB) Downloaded 12 times
kern.log
Complete kern.log of the device
(876.99 KiB) Downloaded 11 times

Unread post by odroid » Mon May 20, 2019 5:28 pm

I think you are still running an old kernel.

Code: Select all

Feb 18 06:26:57 odroid kernel: [    0.000000@0] Linux version 4.9.156-14 (root@builder_n2) (gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) ) #1 SMP PREEMPT Sat Feb 16 02:15:44 -02 2019
These three lines of commands should be enough to get the latest kernel.
https://wiki.odroid.com/odroid-n2/os_im ... st-upgrade

Unread post by Andrew Sayers » Mon May 20, 2019 5:43 pm

odroid wrote:
Mon May 20, 2019 5:28 pm
I think you are still running an old kernel.

Code: Select all

Feb 18 06:26:57 odroid kernel: [    0.000000@0] Linux version 4.9.156-14 (root@builder_n2) (gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) ) #1 SMP PREEMPT Sat Feb 16 02:15:44 -02 2019
These three lines of commands should be enough to get the latest kernel.
https://wiki.odroid.com/odroid-n2/os_im ... st-upgrade
The attached log files start from the point I reflashed my SD card, so yes the first few boots use the old kernel. But line 11521 of the attached syslog.log says:

Code: Select all

May 20 06:55:22 odroid kernel: [    0.000000@0] Linux version 4.9.177-28 (root@builder_n2) (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) ) #1 SMP PREEMPT Thu May 16 23:10:54 -03 2019
That's the start of the final boot sequence, so I'm pretty sure the dist-upgrade was successful.
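
Since a syslog can span several boots, one way to confirm which kernel the most recent boot used is to take the last "Linux version" line in the file. The following is a sketch; last_boot_kernel is a hypothetical helper name, and it assumes the log contains banner lines in the format quoted above.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.

# last_boot_kernel LOGFILE
# Print the kernel version from the last "Linux version" banner in LOGFILE.
last_boot_kernel() {
    grep 'Linux version' "$1" | tail -n 1 |
        sed -n 's/.*Linux version \([^ ]*\).*/\1/p'
}
```

On a running system, uname -r gives the same information for the current boot without reading any logs.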

Unread post by Andrew Sayers » Mon May 20, 2019 5:53 pm

... but I just took another look through the logs, and the trace_printk notice is gone in the latest kernel, so at least that part was a false alarm :)

Unread post by odroid » Mon May 20, 2019 5:56 pm

I see. You are running the latest kernel and you still meet the crash issue with mdadm software driven RAID drivers.

According to your "lsusb3.txt" file, the disks and hub controller seem to have disappeared in the last stages.
But I think the file looks corrupted or accidentally overwritten. Please check whether any USB devices actually disappeared.

Another question.
Are those two USB storage devices self-powered, or bus-powered 2.5" HDDs?

Unread post by Andrew Sayers » Mon May 20, 2019 6:16 pm

Yes, the disks and hub disappear when the bug occurs. They're bus-powered 2.5" hard disks, with no external power:
[Image: IMG_20190520_100539.jpg]
I've included the wifi device in the picture above for reference, although I most recently replicated the bug without it connected.

Unread post by odroid » Mon May 20, 2019 6:22 pm

Ahh.. you are using two bus-powered HDD devices.
Can you estimate the maximum power consumption of the storage devices?
Do you use our official 12V/2A PSU?
A 24Watt PSU might not be enough to run two external HDDs in parallel.

Do you have a DMM to measure the system 5V voltage rail on the 40pin GPIO header?

Unread post by Andrew Sayers » Mon May 20, 2019 6:35 pm

Yes, I'm using the official 12V/2A PSU. If it makes any difference, I'm in the UK so I'm getting 240V input from the wall. I'm not much of a hardware guy I'm afraid, so I don't know how to estimate power consumption and don't have a DMM to help. But why would a loss of power to the USB 3.0 drives cause the USB 2.0 hub to fail?

I'm not sure if this is any help, but here's the text on the bottom of the drives (the blue drive in the picture below is the one with the blue cable in the picture above):
[Image: drives.jpg]

Unread post by odroid » Mon May 20, 2019 6:44 pm

Your input voltage should be okay since we are using 220V in Korea.

The rated power consumption of each storage device seems to be 5Watt(5Vx1A). So total USB device power consumption could be 10Watt.
As far as I remember, the ODROID N2 could draw up to 15Watt when we ran a full CPU/GPU load test.
So the PSU output power might not be enough, although I'm not sure at this moment.

Did you have any problems while testing a single disk?
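
The worst-case power budget estimated in this post can be worked through with shell arithmetic. This is a sketch only: psu_headroom is a hypothetical helper name, and the figures are the rated maximums quoted in the thread (12V/2A PSU, 15W board, 5W per disk), not measurements.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption; inputs are whole numbers.

# psu_headroom PSU_VOLTS PSU_AMPS BOARD_WATTS DEVICE_WATTS...
# Print the PSU's rated output minus the summed worst-case load, in watts.
psu_headroom() {
    budget=$(( $1 * $2 ))
    load=$3
    shift 3
    for w in "$@"; do
        load=$(( load + w ))
    done
    echo $(( budget - load ))
}

# psu_headroom 12 2 15 5 5   prints -1  (24W available, 25W worst case)
# psu_headroom 12 2 15 5     prints 4   (24W available, 20W worst case)
```

A negative result only means the rated maximums add up to more than the PSU's rating; the actual draw may be much lower.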

Unread post by Andrew Sayers » Mon May 20, 2019 6:57 pm

I've just run a test as follows on the same install as before:
  1. power the device on and watch USB for a minute or two - everything is fine
  2. attach one device, disable mdadm, attach the other device, wait a minute or two - still fine, bug not triggered while both devices are idle
  3. run dd if=/dev/sda of=/dev/sda to make the WD Gaming drive pull a lot of power, wait a few minutes - still fine
  4. run dd if=/dev/sdb of=/dev/sdb in a second terminal so both drives pull a lot of power, wait a few minutes - still fine
  5. kill both processes, then restart them but copying from/to each other instead of from/to themselves - triggers the bug
As a naive non-hardware guy, I would expect reading/writing to/from the same disk to pull a similar amount of power as reading/writing between disks, but I'll run some more tests and see what I can see.

Unread post by Andrew Sayers » Mon May 20, 2019 7:26 pm

Taking some inspiration from a post in the other thread, I have now tried:
  • dd if=/dev/zero of=/dev/sd? status=progress for both disks at the same time in different terminals - wrote 1GB or so to each drive without triggering the bug, but writes were surprisingly slow (~4MB/s for sda, ~8MB/s for sdb)
  • dd if=/dev/sd? of=/dev/null status=progress for both disks at the same time in different terminals - read 12GB or so from each drive at ~125MB/s, then triggered the bug
  • dd if=/dev/sd? of=/dev/null status=progress for one disk at a time in a single terminal - read 30GB or so each at ~110MB/s without triggering the bug
So the bug is triggered when you read from two disks at once, but not when you write to two disks at once. But it's possible that there might be an unrelated issue with write speeds that's hiding the bug in that case.

It seems odd that read speeds would be higher when both disks are reading than just one, but I've only tested once so that could be random.
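
The simultaneous-read test that triggers the bug can be run from a single terminal by putting each dd in the background. The following is a sketch; parallel_read is a hypothetical helper name, and dd's progress output is discarded here to keep the example short (the tests above used status=progress in separate terminals instead).

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.

# parallel_read DEVICE...
# Read every DEVICE to /dev/null at the same time and wait for all readers.
# Returns non-zero if any dd process failed.
parallel_read() {
    pids=
    for dev in "$@"; do
        dd if="$dev" of=/dev/null 2>/dev/null &
        pids="$pids $!"
    done
    rc=0
    for pid in $pids; do
        wait "$pid" || rc=1
    done
    return "$rc"
}
```

For example, parallel_read /dev/sda /dev/sdb run as root corresponds to the second test in the list above; only reads are performed, so the disks' contents are not changed.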

Unread post by odroid » Tue May 21, 2019 8:38 am

Single disk access seems not to cause any problem so far.
So I have a concern about the 24Watt PSU.
Do you have another PSU which can supply 30Watt?

Meanwhile we will try a multiple-threaded dd copying test with two USB storages to narrow down root causes.
Do you think the EXT4 file system is enough for this test?

Unread post by Andrew Sayers » Tue May 21, 2019 6:24 pm

I expect ext4 will be fine, but I've been copying directly from the devices in order to rule out any FS-specific issues. In fact, the write tests will have zeroed out the superblock and filesystem, so I'm effectively testing a blank drive at the moment.

I'll have a look round for a 30W PSU, but to be honest I doubt I'll find one. I have had a couple of ideas though, which should rule out power as a cause:

I've got a USB 3.1 memory stick, and I can replicate the bug with that and one hard disk. Obviously the memory stick is slower, and interestingly it takes much longer to replicate the bug - the hard disk copied 76GB before USB died, whereas it usually dies after about 12-24 GB. Here's lsusb and lsusb -t for the memory stick:

Code: Select all

# lsusb:
Bus 002 Device 003: ID 0781:5583 SanDisk Corp. 
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:0402 Chicony Electronics Co., Ltd Genius LuxeMate i200 Keyboard
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

# lsusb -t:
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 4: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 3: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
Also, it turns out my electricity meter can tell me the current wattage used by the whole house, with readings accurate to about ±1 watt. I can post the details if you like, but in short the power usage seems to look like this:
  • Odroid plus two inactive disks uses about 5W
  • Odroid plus two disks triggering the bug uses about 10-11W
  • Odroid plus one inactive disk and one inactive memory stick uses about 4W
  • Odroid plus a disk and a memory stick triggering the bug uses about 7-8W
Obviously a household power meter isn't as precise as a professional tool, but at the very least it suggests the memory stick is triggering the same bug despite using less power.

This feels more like some kind of USB timing issue to me, as the bug is triggered randomly but correlates with the speed of data transfer. I'm going to try replicating with multiple `dd`s from a single disk, but probably won't finish before close of business Korean time.

Unread post by Andrew Sayers » Tue May 21, 2019 8:26 pm

Last test results for the day:

I can replicate the bug with the memory stick and either of the HDDs attached, so it's unlikely to be a problem in any one drive.

I haven't been able to replicate the bug with two processes reading from a single device.

Trying to replicate the bug after setting USB quirks seems to trigger a less serious version of the bug. I did approximately the following:
  1. echo 'options usb-storage quirks=0480:0900:g,1058:261e:g' > /etc/modprobe.d/quirks.conf # set modprobe quirks
  2. reboot # so the changes take effect
  3. echo -n 0480:0900:g,1058:261e:g > /sys/module/usb_storage/parameters/quirks # in case modprobe didn't work for some reason
  4. unplug and replug both hard disks
  5. run dd if=/dev/sda of=/dev/null in one terminal, and dd if=/dev/sdb of=/dev/null in another - this would normally trigger the bug in under a minute
  6. waited several minutes
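
Before starting the reads in step 5, it may be worth confirming that the quirks from steps 1-3 are actually in effect. A minimal sketch follows; quirk_active is a hypothetical helper name, and the optional second argument exists only so the check can be tried against an ordinary file.

```shell
#!/bin/sh
# Sketch only: the helper name and optional FILE argument are assumptions.

# quirk_active VENDOR:PRODUCT [FILE]
# Succeeds if the quirks list in FILE has an entry for VENDOR:PRODUCT.
# FILE defaults to the usb_storage module parameter written in step 3.
quirk_active() {
    file=${2:-/sys/module/usb_storage/parameters/quirks}
    tr ',' '\n' < "$file" | grep -q "^$1:"
}
```

For example, quirk_active 0480:0900 && quirk_active 1058:261e should succeed once both quirks are set.
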
Transfer speed was about 125MB/s for both disks (the same as normal), and I got error messages in syslog similar to what I normally get:

Code: Select all

May 20 07:11:11 odroid systemd[1]: Started Getty on tty4.
May 20 07:11:12 odroid kernel: [  174.848928@0] xhci-hcd xhci-hcd.0.auto: WARN Successful completion on short TX
May 20 07:11:12 odroid kernel: [  174.848991@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.848995@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.848999@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.849003@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.849059@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 3
May 20 07:11:12 odroid kernel: [  174.854109@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c1050 trb-start 00000000cf7c1030 trb-end 00000000cf7c1030 seg-start 00000000cf7c1000 seg-end 00000000cf7c1ff0
May 20 07:11:14 odroid systemd[1]: Started Session 5 of user root.
May 20 07:11:42 odroid kernel: [  205.164685@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:12:22 odroid kernel: [  244.534568@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
May 20 07:12:22 odroid kernel: [  244.539709@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c4360 trb-start 00000000cf7c41f0 trb-end 00000000cf7c4340 seg-start 00000000cf7c4000 seg-end 00000000cf7c4ff0
May 20 07:12:22 odroid kernel: [  244.539714@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
May 20 07:12:22 odroid kernel: [  244.550363@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c4580 trb-start 00000000cf7c41f0 trb-end 00000000cf7c4340 seg-start 00000000cf7c4000 seg-end 00000000cf7c4ff0
May 20 07:12:52 odroid kernel: [  275.048709@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:13:29 odroid kernel: [  311.588731@0] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:16:07 odroid kernel: [  470.041615@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:16:07 odroid kernel: [  470.041624@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:16:38 odroid kernel: [  501.096766@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:17:01 odroid CRON[2428]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
May 20 07:18:43 odroid kernel: [  625.607748@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
May 20 07:18:43 odroid kernel: [  625.612885@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7d0b80 trb-start 00000000cf7d0980 trb-end 00000000cf7d0b70 seg-start 00000000cf7d0000 seg-end 00000000cf7d0ff0
May 20 07:18:43 odroid kernel: [  625.612890@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
May 20 07:18:43 odroid kernel: [  625.623539@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7d0db0 trb-start 00000000cf7d0980 trb-end 00000000cf7d0b70 seg-start 00000000cf7d0000 seg-end 00000000cf7d0ff0
But instead of USB dying altogether, dd ... sdb would just pause after each error, then carry on again when the "reset" message came in. dd ... sda acted normally throughout, the USB keyboard was unaffected, and I didn't need to restart dd ... sdb. It was just like a download that dropped to zero bandwidth for a few seconds.
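
For anyone who wants to compare runs, here is a minimal sketch that counts the three kinds of message seen in the log above. The function name summarise_xhci_log is made up for illustration, and the patterns are copied from this excerpt only, so other kernels may word these messages differently.

```shell
#!/bin/sh
# Count xHCI-related messages in a kernel log read from stdin.
# The three patterns are copied from the log excerpt in this post.
summarise_xhci_log() {
    awk '
        /ERROR Transfer event TRB DMA ptr not part of current TD/ { errors++ }
        /WARN Event TRB for slot/                                 { warns++ }
        /reset SuperSpeed USB device/                             { resets++ }
        END { printf "errors=%d warns=%d resets=%d\n", errors, warns, resets }
    '
}

# Example use (may need root to read the kernel log):
#   dmesg | summarise_xhci_log
```

A rising error count with matching resets would correspond to the "pause, then carry on" behaviour described above.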

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Wed May 22, 2019 9:58 am

Thank you for the varied and valuable test results.
It does not seem to be related to the power supply.
We will try to set up a similar test process.

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Wed May 22, 2019 4:27 pm

We could reproduce the issue with two storage devices finally.
Give us a few days to find any possible root causes.

Andrew Sayers
Posts: 21
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 4 times
Been thanked: 1 time
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Wed May 22, 2019 6:13 pm

Excellent! I've updated the first post with a quick summary, to make life easier for people who get here from a Google search or something. You're welcome to re-edit the summary as more information comes in.

Andrew Sayers
Posts: 21
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 4 times
Been thanked: 1 time
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Sun May 26, 2019 1:12 am

I did a longer test while writing up some more notes in the first post. It seems like the workaround described above doesn't completely fix the bug, just makes it about 10 times less common. Presumably I didn't detect it before because I stopped my tests before the bug triggered.

Soleil
Posts: 34
Joined: Tue Apr 30, 2019 9:20 am
languages_spoken: english
Has thanked: 1 time
Been thanked: 2 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Soleil » Mon May 27, 2019 10:40 pm

odroid wrote:
Wed May 22, 2019 4:27 pm
We could reproduce the issue with two storage devices finally.
Give us a few days to find any possible root causes.
Would you mind sharing your findings? In which area is the problem? We're dying here being curious beasts.

Thanks for all help!

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Tue May 28, 2019 1:38 pm

We think the USB host driver and DMA driver seem to cause the issue.
But we still have no firm solution yet. We need more time to analyze the root causes precisely.

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Fri May 31, 2019 5:39 pm

We tried many different approaches to find a solution,
but there is no firm solution yet. Sorry about that.

So we've reported the issue to Amlogic, and their engineers were able to reproduce it in their internal test lab too.
They have started investigating the issue, and I hope we can have a solution or workaround soon.

Soleil
Posts: 34
Joined: Tue Apr 30, 2019 9:20 am
languages_spoken: english
Has thanked: 1 time
Been thanked: 2 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Soleil » Sun Jun 02, 2019 6:49 pm

Many thanks for updates! Communication is the key and your continuous help and engagement in the process to fix it is top notch.

This makes the whole difference why people should choose Odroid and why you guys should be successful. Despite difficult issues you've managed to reproduce it and engaged chip vendor whilst others could choose the path to diminish the issue and blame users.

Thanks again, and we're all looking forward to a fix.

joshua.yang
Posts: 214
Joined: Fri Sep 22, 2017 5:54 pm
languages_spoken: Korean, English
ODROIDs: XU4, XU4Q + Cloudshell2, H2
Has thanked: 1 time
Been thanked: 17 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by joshua.yang » Wed Jun 12, 2019 11:36 am

Hi.

We have just found a temporary workaround for that.
For block devices, you can reduce the queue's max_sectors_kb value to improve USB stability. But it also reduces the throughput of the devices by about 20 to 30 percent.

Please refer to these commands.

Code: Select all

echo 32 > /sys/class/block/sda/queue/max_sectors_kb 
echo 32 > /sys/class/block/sdb/queue/max_sectors_kb 
while true; do dd if=/dev/sda1 of=/dev/null; done &
while true; do dd if=/dev/sdb1 of=/dev/null; done &
As far as I have tested, with the value set to 32 it runs stably without any warning or critical messages.
I think it is worth trying until a firm solution is found.

The original thread about this: viewtopic.php?f=181&t=35129&p=259181#p259027
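
The commands above set the limit for sda and sdb by hand. As a minimal sketch only (not part of the original workaround), the same write can be wrapped in a function that reports what it did. The names set_max_sectors_kb and SYSFS_BLOCK are hypothetical; SYSFS_BLOCK exists purely so the function can be tried on a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Write a max_sectors_kb limit for each named block device.
# SYSFS_BLOCK is an assumed override for testing; by default the
# real sysfs tree under /sys/class/block is used.
set_max_sectors_kb() {
    limit=$1
    shift
    for dev in "$@"; do
        knob="${SYSFS_BLOCK:-/sys/class/block}/$dev/queue/max_sectors_kb"
        if [ -w "$knob" ]; then
            echo "$limit" > "$knob"
            # Read the value back so the result is visible.
            echo "$dev: max_sectors_kb=$(cat "$knob")"
        else
            echo "$dev: cannot write $knob (not root, or no such device?)" >&2
        fi
    done
}

# Example (as root): set_max_sectors_kb 32 sda sdb
```

The value is not persistent: it has to be written again after a reboot or after a disk is re-plugged.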

gahabana
Posts: 26
Joined: Wed Sep 07, 2016 1:47 am
languages_spoken: english
ODROIDs: odroid-c2
Has thanked: 2 times
Been thanked: 0
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by gahabana » Thu Jun 13, 2019 5:05 pm

I tried all of the above suggestions, including reducing max_sectors_kb to 32.
2 drives connected. one HDD (WD 2TB passport), the other JMicron USB to PCIe/NVME SSD (2TB PNY CS3030, tried with Intel 660P 2TB ) ...
3 both drives are running usb-storage not UAS
4 latest kernel (4.9.180)

Copying from one drive to the other fails after less than a minute:
rsync -av /mnt/wd2tb/ /mnt/p/music/

The log shows the usual pattern when this happens: the WD USB HDD is the one where it fails first, a couple of seconds later the SSD gets disconnected as well, and both are unmounted afterwards.

Net: unless this is somehow fixed, the Odroid N2 cannot be used reliably with USB 3.0 HDDs at all. USB 2.0 HDDs may be OK (I haven't tried), but at 20-30 MB/s effective speed they are way too slow.

Code: Select all

[   12.732859] Key type cifs.idmap registered
[   17.869144] meson_uart ff803000.serial: ttyS0 use xtal(24M) 24000000 change 115200 to 115200
[   62.472505] fb: mem_free_work, free memory: addr:800000
[  611.313677] fb: osd[0] enable: 0 (kworker/0:3)
[ 1375.948481] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
[ 1375.948487] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host.
[ 1375.967214] xhci-hcd xhci-hcd.0.auto: Host not halted after 16000 microseconds.
[ 1375.967217] xhci-hcd xhci-hcd.0.auto: Non-responsive xHCI host is not halting.
[ 1375.967220] xhci-hcd xhci-hcd.0.auto: Completing active URBs anyway.
[ 1375.967329] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
[ 1375.967342] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
[ 1375.967344] xhci-hcd xhci-hcd.0.auto: Assuming host is dying, halting host.
[ 1375.972691] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
[ 1375.977188] usb 1-1: USB disconnect, device number 2
[ 1375.977757] usb 2-1: USB disconnect, device number 2
[ 1375.977765] usb 2-1.1: USB disconnect, device number 3
[ 1375.980487] blk_update_request: I/O error, dev sda, sector 127328512
[ 1375.980549] blk_update_request: I/O error, dev sda, sector 127328768
[ 1375.980568] blk_update_request: I/O error, dev sda, sector 127329024
[ 1375.980583] blk_update_request: I/O error, dev sda, sector 127329280
[ 1375.980595] blk_update_request: I/O error, dev sda, sector 127329536
[ 1375.980606] blk_update_request: I/O error, dev sda, sector 127329792
[ 1375.981177] blk_update_request: I/O error, dev sda, sector 127330048
[ 1375.981189] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 16777216 size 5246976 starting block 15916288)
[ 1375.981194] Buffer I/O error on device sda1, logical block 15915776
[ 1375.981204] Buffer I/O error on device sda1, logical block 15915777
[ 1375.981207] Buffer I/O error on device sda1, logical block 15915778
[ 1375.981210] Buffer I/O error on device sda1, logical block 15915779
[ 1375.981213] Buffer I/O error on device sda1, logical block 15915780
[ 1375.981216] Buffer I/O error on device sda1, logical block 15915781
[ 1375.981219] Buffer I/O error on device sda1, logical block 15915782
[ 1375.981222] Buffer I/O error on device sda1, logical block 15915783
[ 1375.981226] Buffer I/O error on device sda1, logical block 15915784
[ 1375.981229] Buffer I/O error on device sda1, logical block 15915785
[ 1375.981634] blk_update_request: I/O error, dev sda, sector 127330304
[ 1375.981666] blk_update_request: I/O error, dev sda, sector 127330560
[ 1375.981691] blk_update_request: I/O error, dev sda, sector 127330816
[ 1375.982216] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 16777216 size 6295552 starting block 15916544)
[ 1375.983182] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 16777216 size 7344128 starting block 15916800)
[ 1375.984184] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 16777216 size 8388608 starting block 15917056)
[ 1375.986452] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 16777216 size 8388608 starting block 15917312)
[ 1375.987821] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 25165824 size 2101248 starting block 15917568)
[ 1375.989193] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 25165824 size 3149824 starting block 15917824)
[ 1375.990745] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 25165824 size 4198400 starting block 15918080)
[ 1375.992268] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 25165824 size 5246976 starting block 15918336)
[ 1375.993559] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 48892006 (offset 25165824 size 6295552 starting block 15918592)
[ 1376.015889] JBD2: Detected IO errors while flushing file data on sda1-8
[ 1376.015910] Aborting journal on device sda1-8.
[ 1376.015951] Buffer I/O error on dev sda1, logical block 243826688, lost sync page write
[ 1376.015960] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[ 1376.015996] JBD2: Detected IO errors while flushing file data on sda1-8
[ 1376.016506] Buffer I/O error on dev sda1, logical block 0, lost sync page write
[ 1376.016514] EXT4-fs error (device sda1): ext4_journal_check_start:56: Detected aborted journal
[ 1376.021516] EXT4-fs (sda1): Remounting filesystem read-only
[ 1376.027241] EXT4-fs (sda1): previous I/O error to superblock detected
[ 1376.027254] Buffer I/O error on dev sda1, logical block 0, lost sync page write
[ 1376.027261] EXT4-fs (sda1): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 48892007; err -30
[ 1376.072493] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1376.072499] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 1e 28 4a 08 00 01 00 00
[ 1376.156477] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1376.156483] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 1e 28 4b 08 00 01 00 00
[ 1376.244476] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1376.244481] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 1e 28 4a 08 00 00 08 00
[ 1376.244485] Buffer I/O error on dev sdb1, logical block 63244353, async page read

Andrew Sayers
Posts: 21
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 4 times
Been thanked: 1 time
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Thu Jun 13, 2019 11:03 pm

@joshua.yang and @mad_ady - that's great! The fix worked for me in a 20-minute test, although it cut read performance in half. I've updated the first post to recommend the new workaround.

@gahabana - that definitely sounds like the same thing, so I've tried to make it clear in the first post that the fix doesn't work for everyone. Since the old workaround didn't work either, I've removed it from the first post. I can add it back in if anyone found the old fix worked but the new fix didn't?

@odroid - any updates from Amlogic about the permanent fix?

odroid
Site Admin
Posts: 30976
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 21 times
Been thanked: 138 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Fri Jun 14, 2019 12:17 pm

Not yet. They told me they've been working on it.
We keep trying to find a firm solution too.

rondi7
Posts: 1
Joined: Mon May 13, 2019 1:04 am
languages_spoken: english
ODROIDs: n2
Has thanked: 0
Been thanked: 0
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by rondi7 » Sat Jun 15, 2019 1:18 am

I'm concerned about buying an N2 if there is a potential hardware problem with the USB 3 ports running at full speed. I do understand that, until Hardkernel can definitively find and fix the problem, there is no way to know for sure.
If there is a hardware problem, how does it get fixed? Return it to the seller for an immediate replacement, return it to Hardkernel, or (hopefully) fix it by following instructions?
tia, Ron

FaeRhan
Posts: 1
Joined: Sun Jun 16, 2019 2:47 am
languages_spoken: english
ODROIDs: C2, N2
Has thanked: 0
Been thanked: 0
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by FaeRhan » Sun Jun 16, 2019 3:09 am

I tried using two of these with my Odroid N2 https://www.amazon.de/Elements-Desktop- ... 077RV4ZLY/ in an mdadm RAID 1 but had exactly the same issue, with lots of dmesg messages and all USB devices disappearing randomly after some time until the next reboot. My "test case" was my new RAID resyncing, which needs around a day. I'm running the ubuntu-18.04.2-4.9-minimal-odroid-n2-20190329.img.xz with all updates installed.

I added the max_sectors_kb workaround to my crontab, but in some cases the USB devices seem to have disappeared before cron could even execute it, so I had no devices when I connected over ssh immediately after boot. My RAID resync worked mostly fine within the 24h (if the boot with USB succeeded) until it got to 90%, but after that the USB devices disappeared 3 times. The workaround makes the bug rarer, but does not fix it completely. Do we have any news on when this will be fixed? Sadly USB is currently unusable for me, even with the workaround.

mad_ady
Posts: 5976
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1, H2, N2
Location: Bucharest, Romania
Has thanked: 102 times
Been thanked: 55 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by mad_ady » Sun Jun 16, 2019 4:20 am

You can add the workaround in a udev rule - so when the disk is detected (at boot or after a disconnect) the correct buffer size is immediately set. Not a fix, though - a better workaround.

Andrew Sayers
Posts: 21
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 4 times
Been thanked: 1 time
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Sun Jun 16, 2019 5:00 am

@FaeRhan - in further tests, I've found I need to add the workaround for the RAID device (/dev/md0) as well as the raw devices (/dev/sd{a,b}). I've updated the first post to recommend .../block/?d?/... instead of .../block/sd?/..., which should fix the problem so long as your RAID device is called something like /dev/md0.
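
To see which devices a pattern like ?d? actually picks up, here is a minimal sketch. The function name match_block_devs is made up for illustration, and the optional directory argument is only there so it can be tried on a scratch directory.

```shell
#!/bin/sh
# List the block-device names that the shell pattern ?d? selects
# under a directory (default: the real sysfs tree).
match_block_devs() {
    root=${1:-/sys/class/block}
    for path in "$root"/?d?; do
        [ -e "$path" ] || continue   # the pattern matched nothing
        basename "$path"
    done
}

# Example: match_block_devs
```

The pattern matches any three-character name with "d" in the middle, so sda, sdb and md0 are covered, but partitions such as sda1 and longer names such as md127 are not.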

@mad_ady - I don't suppose you could share an example rule? Always happy to update the workaround with something better :)

mad_ady
Posts: 5976
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1, H2, N2
Location: Bucharest, Romania
Has thanked: 102 times
Been thanked: 55 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by mad_ady » Sun Jun 16, 2019 6:01 am

Sorry, I don't have anything handy. It's just an idea...

mad_ady
Posts: 5976
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1, H2, N2
Location: Bucharest, Romania
Has thanked: 102 times
Been thanked: 55 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by mad_ady » Sun Jun 16, 2019 3:58 pm

Here's a udev example (not tested, but should work after you change the command used):

Code: Select all

$ cat /etc/udev/rules.d/90-disk.rules
ACTION=="add", ENV{DEVNAME}=="/dev/sd?", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="NA4TEVC6", RUN+="/sbin/hdparm -S 60 $env{DEVNAME}"
ACTION=="add", ENV{DEVNAME}=="/dev/sd?", SUBSYSTEM=="block", ATTR{size}=="5860533168", RUN+="/usr/local/sbin/hd-idle -a $env{DEVNAME} -i 630 -l /var/log/hd-idle.log"
I use it to put my disks to sleep.
The first entry runs a command for a disk with a specific serial number, while the second one runs for a disk of a specific size. You can omit the serial and size conditions and run it for any disk. There's a way to test it by running udev in debug mode to see how it goes.
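
Adapted to the max_sectors_kb workaround, a rule could look roughly like the sketch below. This is untested on an N2 and is only an assumption about how the idea might be applied: instead of RUN+=, it uses udev's ATTR{...}="value" assignment to write the sysfs attribute directly. The file name 90-usb-max-sectors.rules, the function name write_max_sectors_rule and the RULES_DIR override are all hypothetical.

```shell
#!/bin/sh
# Write a udev rule that sets max_sectors_kb when a whole disk appears.
# RULES_DIR is an assumed override so the file can be generated in a
# scratch directory first; the default is the usual local rules directory.
write_max_sectors_rule() {
    limit=${1:-32}
    dir=${RULES_DIR:-/etc/udev/rules.d}
    # Same match keys as the example above, minus serial and size.
    cat > "$dir/90-usb-max-sectors.rules" <<EOF
ACTION=="add", ENV{DEVNAME}=="/dev/sd?", SUBSYSTEM=="block", ATTR{queue/max_sectors_kb}="$limit"
EOF
}

# Example (as root): write_max_sectors_rule 32
```

After writing the file, udevadm control --reload-rules should make udev pick it up, and udevadm test /sys/class/block/sda prints which rules would apply to that disk. Note that this rule only matches /dev/sd? devices, so a RAID device such as /dev/md0 would still need its own handling.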
