Syncing two UAS devices causes Ubuntu to hang

Andrew Sayers
Posts: 14
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 1 time
Been thanked: 0
Contact:

Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Fri May 17, 2019 6:18 pm

Edit: summary of this thread

As of the time of this edit, Hardkernel are working on this issue. It turns out the problem has something to do with reading data from multiple USB devices at once, and it happens more quickly the faster you read data from them. The issue causes Ubuntu to stop recognising the N2's USB ports (including the micro-USB port) until you power the device off and on again, but ethernet still works so it's still possible to log in over SSH. I've been able to work around it temporarily by setting USB quirks for my devices, but obviously I can't guarantee it will fix anyone else's problems :)

The rest of this post (and most of this thread) just attempts to diagnose and replicate the problem. If you came here looking for a solution, you can safely skip to the point where Hardkernel replicated the problem.

Original post

I'm running Ubuntu on an Odroid N2, which crashes in under a minute when syncing a RAID array containing two USB drives. I think I'm supposed to report to you guys to rule out errors in Odroid, Ubuntu or my own reasoning. But I suspect this is a hard drive quirk that should be patched into the kernel.

Steps to reproduce the error
  1. get an Odroid N2
  2. install the official Ubuntu 20190329 image on a micro SD card
  3. attach a Western Digital Gaming drive (USB ID 1058:261e)
  4. attach a Toshiba External USB 3.0 drive (USB ID 0480:0900)
  5. create an MD RAID1 array with both devices
  6. wait a few moments while the devices resync
  7. expected: the device slowly resyncs
    observed: the system hangs and prints the attached messages to kern.log
Note 1: I can't replicate the error using dd with either device on its own. Doing so would allow me to rule out one device, so I'd appreciate any suggestions.

Note 2: I can provide exact instructions for creating a RAID array, but the instructions will be fairly long and boring so I'd rather confirm there's no easier way to replicate the problem first.

Workaround and suggested next steps

I've been able to work around the issue like so:

Code:

echo 'options usb-storage quirks=0480:0900:g' > /etc/modprobe.d/quirks.conf # either of these will
echo 'options usb-storage quirks=1058:261e:g' > /etc/modprobe.d/quirks.conf # work around the issue
According to the kernel parameters file, this disables transferring more than 240 sectors at a time on UAS devices. Because I can only replicate the issue by doing a RAID sync between the two devices, I guess that disabling large transfers on either device effectively disables it for both?
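
For anyone repeating the workaround, here is a minimal sketch of how the modprobe line could be generated for one or more devices. The helper name quirks_line is illustrative only, and the target path in the comments assumes the usual /etc/modprobe.d directory; the g flag and the quirks= syntax are the ones used above.

```shell
#!/bin/sh
# Build an "options usb-storage quirks=..." line that applies the
# 'g' flag (limit transfers to 240 sectors) to each VID:PID given.
# quirks_line is a hypothetical helper, not part of any package.
quirks_line() {
    list=""
    for id in "$@"; do
        # Join entries with commas: VID:PID:g,VID:PID:g,...
        list="${list:+$list,}$id:g"
    done
    printf 'options usb-storage quirks=%s\n' "$list"
}

# Example (needs root, so it is left commented out here):
#   quirks_line 0480:0900 1058:261e > /etc/modprobe.d/quirks.conf
# After a reboot, the active value can be read back with:
#   cat /sys/module/usb_storage/parameters/quirks
quirks_line 0480:0900 1058:261e
```

Listing both devices in one line avoids the second echo overwriting the first, although setting the quirk for either device alone was enough in the tests described here.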

I'm still left with the following possibilities:
  • could the Odroid itself be the problem? Has anyone else successfully configured a pair of UAS devices in a RAID array?
  • is there a way to trigger this issue on a single device, so we can confirm which one has the issue?
  • if we can confirm which device has the issue, would I be right that we're supposed to submit a patch to unusual_uas.h?
Attachments
kern.log (67.33 KiB)
Last edited by Andrew Sayers on Wed May 22, 2019 6:07 pm, edited 1 time in total.

odroid
Site Admin
Posts: 30667
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 13 times
Been thanked: 98 times
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by odroid » Fri May 17, 2019 6:25 pm

We will try to reproduce the UAS driver stability issue by the end of next week, after running another USB storage stability test.
viewtopic.php?f=181&t=34849

Unread post by Andrew Sayers » Fri May 17, 2019 6:51 pm

Thanks for the quick reply. I've now subscribed to the other thread in case this is related, but a few notes:
  • the logs in that thread don't look much like mine - the only similar line I can see is tag#16 CDB: Write(16) there kinda resembles tag#0 CDB: Write same(16) for me
  • most people in that thread seem to be describing errors that happen randomly, sometimes days apart, whereas I can replicate this issue reliably in under a minute (which might be useful for debugging!)
  • that thread seems to describe USB disconnecting but the system carrying on, which is the opposite of my problem - the kernel crashes, but the lights on the Odroid and the devices keep blinking (for at least 8 hours, after which I pulled the power out)
  • both of my devices are native USB, not connected through an SATA bridge board
Anyway, good luck and let me know if I can do anything to help.

Unread post by odroid » Mon May 20, 2019 9:18 am

I believe your HDD enclosures should have a SATA bridge inside.
Can you show me "lsusb -t" and "lsusb" outputs?

Also let us know your current kernel version.
The latest one should be 4.9.177

xabolcs
Posts: 28
Joined: Fri Jun 22, 2018 6:37 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 25 times
Been thanked: 1 time
Contact:

Unread post by xabolcs » Mon May 20, 2019 3:10 pm

From the logs:

Code:

May 17 06:17:39 andrew kernel: [  280.335362@0] CPU: 0 PID: 2070 Comm: kworker/0:3 Not tainted 4.9.162-22 #1
May 17 06:17:39 andrew kernel: [  280.335363@0] Hardware name: Hardkernel ODROID-N2 (DT)
Andrew Sayers, you should update your Ubuntu and try to reproduce the issue with the updated kernel again!

Unread post by Andrew Sayers » Mon May 20, 2019 5:18 pm

odroid wrote:
Mon May 20, 2019 9:18 am
I believe your HDD enclosures should have a SATA bridge inside.
Can you show me "lsusb -t" and "lsusb" outputs?

Also let us know your current kernel version.
The latest one should be 4.9.177
Sorry about the kernel version - I've been wiping and reinstalling Ubuntu fairly regularly, and must have forgotten to upgrade that time.

Requested debug info

Output of lsusb:

Code:

Bus 002 Device 004: ID 1058:261e Western Digital Technologies, Inc. 
Bus 002 Device 003: ID 0480:0900 Toshiba America Inc 
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:0402 Chicony Electronics Co., Ltd Genius LuxeMate i200 Keyboard
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Output of lsusb -t:

Code:

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 4: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
    |__ Port 2: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 2: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
Replicating on the latest kernel

I've now reinstalled again and reproduced the issue with the latest kernel. Here's the complete bash history (every command since reinstalling from the latest Ubuntu image):

Commands from session 1 - this includes upgrading the kernel after waiting for ntp to fix the date:

Code:

apt-get update
date
apt-get update
apt-get dist-upgrade
lsusb > lsusb1.txt
lsusb -t > lsusb2.txt
fdisk /dev/sda # will create a GPT partition table with a single Linux RAID partition spanning the whole disk
fdisk /dev/sdb # will create a GPT partition table with a single Linux RAID partition spanning the whole disk
reboot
Commands from session 2a - this is where I trigger the bug. The system found an old partition and mounted it, and I've left the umount command in for completeness. You probably wouldn't need to run it while replicating the bug:

Code:

dmesg -w &
while sleep 60 ; do date ; lsusb ; lsusb -t ; done | tee lsusb3.txt
while sleep 60 ; do date ; lsusb ; lsusb -t ; done | tee lsusb3.txt &
apt-get install mdadm
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sd?1 | tee mdadm1.log
umount /media/*
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sd?1 | tee mdadm2.log
mdadm --detail /dev/md0
mdadm --detail /dev/md0 > mdadm1.log
while sleep 5 ; do mdadm --detail /dev/md0 ; done | tee mdadm2.log
Commands from session 2b - this is where I shut the system down from an SSH session after triggering the bug:

Code:

shutdown -h now
Other notes

I previously thought the bug caused the system to crash completely. Actually it just takes down USB like the bug in the other thread, but that includes the micro-USB hub that my keyboard was plugged into :oops:

As mentioned above, I normally partition my disks with one small VFAT partition followed by a Linux RAID partition (GPT type 29) with an ext4 filesystem. For this test, I made a single Linux RAID partition with no filesystem. References in syslog and kern.log to a VFAT partition on those drives probably refer to the old filesystem.

I made a few mistakes in the commands above. Here are some notes in case you or I repeat the steps based on the above:
  1. when capturing logs, I forgot to add 2>&1, so STDERR isn't captured
  2. I accidentally overwrote the output of the successful mdadm --create command. It didn't say anything particularly interesting, but it's worth renaming the second mdadm2.log to mdadm3.log
  3. the rolling mdadm --detail might be more useful if it had a date like the other while commands
  4. several commands can be better automated (e.g. echo ... | fdisk /dev/sda), which would make replication easier
  5. as mentioned above, a precautionary umount /media/* may or may not be necessary
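
The first three notes above could be folded into a small wrapper. This is only a sketch: log_run is a hypothetical helper name and the log file name in the example is made up for illustration.

```shell
#!/bin/sh
# Run a command, print a timestamp before its output, and capture both
# STDOUT and STDERR in a log file (notes 1 and 3 above).
# log_run is a hypothetical helper name.
log_run() {
    logfile=$1
    shift
    {
        date
        "$@"
    } 2>&1 | tee -a "$logfile"
}

# Example: a rolling status loop with its own log file, so that no
# earlier log gets overwritten (note 2 above):
#   while sleep 5; do log_run mdadm-detail.log mdadm --detail /dev/md0; done
```

Because tee is called with -a, repeated runs append to the log rather than replacing it.
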
This almost certainly isn't related, but while reading syslog I noticed several messages like the following:

Code:

Mar 28 17:18:11 odroid kernel: [    0.000000@0] **********************************************************
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** trace_printk() being used. Allocating extra memory.  **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** This means that this is a DEBUG kernel and it is     **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** unsafe for production use.                           **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** If you see this message and you are not debugging    **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] ** the kernel, report this immediately to your vendor!  **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **                                                      **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
Mar 28 17:18:11 odroid kernel: [    0.000000@0] **********************************************************
I'm happy to open another thread for this, but consider yourselves notified I guess :)

Please let me know by the end of the day if you want anything else. I'm not planning to do any more work on my Odroid today, but I'll reinstall the image overnight and continue configuring my actual setup tomorrow.
Attachments
syslog.log (1.19 MiB): complete syslog of the device (renamed to 'syslog.log' so the forum software will allow me to upload it)
mdadm2.log (12.15 KiB): rolling output of mdadm --detail /dev/md0
mdadm1.log (897 Bytes): output of failed mdadm --create
lsusb3.txt (7.63 KiB): rolling output of lsusb while replicating the bug
kern.log (876.99 KiB): complete kern.log of the device

Unread post by odroid » Mon May 20, 2019 5:28 pm

I think you are still running an old kernel.

Code:

Feb 18 06:26:57 odroid kernel: [    0.000000@0] Linux version 4.9.156-14 (root@builder_n2) (gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) ) #1 SMP PREEMPT Sat Feb 16 02:15:44 -02 2019
These three lines of commands should be enough to get the latest kernel.
https://wiki.odroid.com/odroid-n2/os_im ... st-upgrade

Unread post by Andrew Sayers » Mon May 20, 2019 5:43 pm

odroid wrote:
Mon May 20, 2019 5:28 pm
I think you are still running an old kernel.

Code:

Feb 18 06:26:57 odroid kernel: [    0.000000@0] Linux version 4.9.156-14 (root@builder_n2) (gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) ) #1 SMP PREEMPT Sat Feb 16 02:15:44 -02 2019
These three lines of commands should be enough to get the latest kernel.
https://wiki.odroid.com/odroid-n2/os_im ... st-upgrade
The attached log files start from the point I reflashed my SD card, so yes the first few boots use the old kernel. But line 11521 of the attached syslog.log says:

Code:

May 20 06:55:22 odroid kernel: [    0.000000@0] Linux version 4.9.177-28 (root@builder_n2) (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04) ) #1 SMP PREEMPT Thu May 16 23:10:54 -03 2019
That's the start of the final boot sequence, so I'm pretty sure the dist-upgrade was successful.
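
Since the log spans several boots, one way to confirm which kernel the last boot used is to pull the version out of the most recent "Linux version" banner. This is a sketch only; last_booted_kernel is a hypothetical helper name and /var/log/syslog is assumed to be where the log lives.

```shell
#!/bin/sh
# Print the kernel version from the most recent "Linux version" boot
# banner in a log read from STDIN. last_booted_kernel is a hypothetical
# helper name.
last_booted_kernel() {
    grep 'Linux version' | tail -n 1 |
        sed 's/.*Linux version \([^ ]*\).*/\1/'
}

# Example:
#   last_booted_kernel < /var/log/syslog
#   uname -r    # the kernel that is running right now
```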

Unread post by Andrew Sayers » Mon May 20, 2019 5:53 pm

... but I just took another look through the logs, and the trace_printk notice is gone in the latest kernel, so at least that part was a false alarm :)

Unread post by odroid » Mon May 20, 2019 5:56 pm

I see. You are running the latest kernel and you still hit the crash with mdadm software RAID.

According to your "lsusb3.txt" file, the disks and hub controller seem to have disappeared in the last stages.
But I think the file looks corrupted or accidentally overwritten. Please check whether any USB devices disappeared.

Another question.
Are those two USB storage devices self-powered, or are they bus-powered 2.5" HDDs?

Unread post by Andrew Sayers » Mon May 20, 2019 6:16 pm

Yes, the disks and hub disappear when the bug occurs. They're bus-powered 2.5" hard disks, with no external power:
IMG_20190520_100539.jpg (293.53 KiB)
I've included the wifi device in the picture above for reference, although I most recently replicated the bug without it connected.
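
A quick way to check for disappearing devices is to compare lsusb output against the IDs that should be there. This is a sketch under the assumption that lsusb prints each device as "ID vvvv:pppp"; missing_usb_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Report which of the given USB IDs are missing from `lsusb` output
# read on STDIN. missing_usb_ids is a hypothetical helper name.
missing_usb_ids() {
    listing=$(cat)
    for id in "$@"; do
        # lsusb prints each device as "... ID vvvv:pppp ..."
        printf '%s\n' "$listing" | grep -q "ID $id" || echo "$id"
    done
}

# Example: check for both drives and the USB 3.0 hub:
#   lsusb | missing_usb_ids 1058:261e 0480:0900 05e3:0620
```

No output means every listed device is still visible.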

Unread post by odroid » Mon May 20, 2019 6:22 pm

Ahh.. you are using two bus-powered HDD devices.
Can you estimate the maximum power consumption of the storage devices?
Do you use our official 12V/2A PSU?
A 24W PSU might not be enough to run two external HDDs in parallel.

Do you have a DMM to measure the system 5V voltage rail on the 40pin GPIO header?

Unread post by Andrew Sayers » Mon May 20, 2019 6:35 pm

Yes, I'm using the official 12V/2A PSU. If it makes any difference, I'm in the UK so I'm getting 240V input from the wall. I'm not much of a hardware guy I'm afraid, so I don't know how to estimate power consumption and don't have a DMM to help. But why would a loss of power to the USB 3.0 drives cause the USB 2.0 hub to fail?

I'm not sure if this is any help, but here's the text on the bottom of the drives (the blue drive in the picture below is the one with the blue cable in the picture above):
drives.jpg (79.27 KiB)

Unread post by odroid » Mon May 20, 2019 6:44 pm

Your input voltage should be okay since we are using 220V in Korea.

The rated power consumption of each storage device seems to be 5W (5V x 1A), so total USB device power consumption could be 10W.
As far as I remember, the ODROID N2 could draw up to 15W when we ran a full CPU/GPU load test.
So the PSU output power might not be enough, though I'm not sure at this moment.

Did you see any problems while testing a single disk?

Unread post by Andrew Sayers » Mon May 20, 2019 6:57 pm

I've just run a test as follows on the same install as before:
  1. power the device on and watch USB for a minute or two - everything is fine
  2. attach one device, disable mdadm, attach the other device, wait a minute or two - still fine, bug not triggered while both devices are idle
  3. run dd if=/dev/sda of=/dev/sda to make the WD Gaming drive pull a lot of power, wait a few minutes - still fine
  4. run dd if=/dev/sdb of=/dev/sdb in a second terminal so both drives pull a lot of power, wait a few minutes - still fine
  5. kill both processes, then restart them but copying from/to each other instead of from/to themselves - triggers the bug
As a naive non-hardware guy, I would expect reading from and writing to the same disk to pull a similar amount of power to reading and writing between disks, but I'll run some more tests and see what I can see.

Andrew Sayers
Posts: 14
Joined: Fri May 17, 2019 5:06 pm
languages_spoken: english
ODROIDs: N2
Has thanked: 1 time
Been thanked: 0
Contact:

Re: Syncing two UAS devices causes Ubuntu to hang

Unread post by Andrew Sayers » Mon May 20, 2019 7:26 pm

Taking some inspiration from a post in the other thread, I have now tried:
  • dd if=/dev/zero of=/dev/sd? status=progress for both disks at the same time in different terminals - wrote 1GB or so to each drive without triggering the bug, but writes were surprisingly slow (~4MB/s for sda, ~8MB/s for sdb)
  • dd if=/dev/sd? of=/dev/null status=progress for both disks at the same time in different terminals - read 12GB or so from each drive at ~125MB/s, then triggered the bug
  • dd if=/dev/sd? of=/dev/null status=progress for one disk at a time in a single terminal - read 30GB or so each at ~110MB/s without triggering the bug
So the bug is triggered when you read from two disks at once, but not when you write to two disks at once. But it's possible that there might be an unrelated issue with write speeds that's hiding the bug in that case.

It seems odd that read speeds would be higher when both disks are reading than just one, but I've only tested once so that could be random.
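
The parallel read test can be wrapped in a single function so it does not need two terminals. This is a sketch only: parallel_read is a hypothetical helper name, the 1M block size is an assumption (the commands above used dd's default), and status=progress is left out to keep the output quiet.

```shell
#!/bin/sh
# Read two sources in parallel and discard the data, as in the second
# test above. parallel_read is a hypothetical helper name; the 1M block
# size is an assumption.
parallel_read() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null &
    pid1=$!
    dd if="$2" of=/dev/null bs=1M 2>/dev/null &
    pid2=$!
    # Wait for each reader separately so a failure in either is reported
    status=0
    wait "$pid1" || status=1
    wait "$pid2" || status=1
    return "$status"
}

# Example (needs root, and may take USB down on an affected system):
#   parallel_read /dev/sda /dev/sdb
```

A non-zero return means at least one of the two reads failed, which is what would be expected once the devices drop off the bus.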

Unread post by odroid » Tue May 21, 2019 8:38 am

Single disk access does not seem to cause any problem so far.
So I have a concern about the 24W PSU.
Do you have another PSU which can supply 30W?

Meanwhile we will try a multi-threaded dd copying test with two USB storage devices to narrow down the root cause.
Do you think the EXT4 file system is enough for this test?

Unread post by Andrew Sayers » Tue May 21, 2019 6:24 pm

I expect ext4 will be fine, but I've been copying directly from the devices in order to rule out any FS-specific issues. In fact, the write tests will have zeroed out the superblock and filesystem, so I'm effectively testing a blank drive at the moment.

I'll have a look round for a 30W PSU, but to be honest I doubt I'll find one. I have had a couple of ideas though, which should rule out power as a cause:

I've got a USB 3.1 memory stick, and I can replicate the bug with that and one hard disk. Obviously the memory stick is slower, and interestingly it takes much longer to replicate the bug - the hard disk copied 76GB before USB died, whereas it usually dies after about 12-24GB. Here's lsusb and lsusb -t for the memory stick:

Code:

# lsusb:
Bus 002 Device 003: ID 0781:5583 SanDisk Corp. 
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc. 
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:0402 Chicony Electronics Co., Ltd Genius LuxeMate i200 Keyboard
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

# lsusb -t:
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 4: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 3: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 3: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
Also, it turns out my electricity meter can tell me the current wattage used by the whole house, which can give readings accurate to about ±1 watt. I can post the details if you like, but in short the power usage seems to look like this:
  • Odroid plus two inactive disks uses about 5W
  • Odroid plus two disks triggering the bug uses about 10-11W
  • Odroid plus one inactive disk and one inactive memory stick uses about 4W
  • Odroid plus a disk and a memory stick triggering the bug uses about 7-8W
Obviously a household power meter isn't as precise as a professional tool, but at the very least it suggests the memory stick is triggering the same bug despite using less power.
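
To put those readings next to the figures quoted earlier in the thread, here is a small worked comparison. All numbers come from the posts above; the variable names are illustrative, and the meter reading is taken at the wall, so it includes PSU conversion losses.

```shell
#!/bin/sh
# Worst-case power budget versus the household meter reading.
psu_watts=$((12 * 2))        # official PSU: 12V x 2A
board_watts=15               # N2 under full CPU/GPU load (figure from the thread)
drive_watts=$((5 * 1))       # each drive is rated 5V x 1A
worst_case=$((board_watts + 2 * drive_watts))
measured_peak=11             # meter reading with two disks triggering the bug

echo "PSU: ${psu_watts}W, rated worst case: ${worst_case}W, measured: ${measured_peak}W"
```

The rated worst case only just exceeds the PSU rating, while the measured draw when the bug triggers is less than half of it.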

This feels more like some kind of USB timing issue to me, as the bug is triggered randomly but correlates with the speed of data transfer. I'm going to try replicating with multiple `dd`s from a single disk, but probably won't finish before close of business Korean time.

Unread post by Andrew Sayers » Tue May 21, 2019 8:26 pm

Last test results for the day:

I can replicate the bug with the memory stick and either of the HDDs attached, so it's unlikely to be a problem in any one drive.

I haven't been able to replicate the bug with two processes reading from a single device.

Trying to replicate the bug after setting USB quirks seems to trigger a less serious version of the bug. I did approximately the following:
  1. echo 'options usb-storage quirks=0480:0900:g,1058:261e:g' > /etc/modprobe.d/quirks.conf # set modprobe quirks
  2. reboot # so the changes take effect
  3. echo -n 0480:0900:g,1058:261e:g > /sys/module/usb_storage/parameters/quirks # in case modprobe didn't work for some reason
  4. unplug and replug both hard disks
  5. run dd if=/dev/sda of=/dev/null in one terminal, and dd if=/dev/sdb of=/dev/null in another - this would normally trigger the bug in under a minute
  6. wait several minutes
Transfer speed was about 125MB/s for both disks (the same as normal), and I got error messages in syslog similar to what I normally get:

Code:

May 20 07:11:11 odroid systemd[1]: Started Getty on tty4.
May 20 07:11:12 odroid kernel: [  174.848928@0] xhci-hcd xhci-hcd.0.auto: WARN Successful completion on short TX
May 20 07:11:12 odroid kernel: [  174.848991@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.848995@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.848999@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.849003@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:11:12 odroid kernel: [  174.849059@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 3
May 20 07:11:12 odroid kernel: [  174.854109@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c1050 trb-start 00000000cf7c1030 trb-end 00000000cf7c1030 seg-start 00000000cf7c1000 seg-end 00000000cf7c1ff0
May 20 07:11:14 odroid systemd[1]: Started Session 5 of user root.
May 20 07:11:42 odroid kernel: [  205.164685@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:12:22 odroid kernel: [  244.534568@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
May 20 07:12:22 odroid kernel: [  244.539709@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c4360 trb-start 00000000cf7c41f0 trb-end 00000000cf7c4340 seg-start 00000000cf7c4000 seg-end 00000000cf7c4ff0
May 20 07:12:22 odroid kernel: [  244.539714@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
May 20 07:12:22 odroid kernel: [  244.550363@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7c4580 trb-start 00000000cf7c41f0 trb-end 00000000cf7c4340 seg-start 00000000cf7c4000 seg-end 00000000cf7c4ff0
May 20 07:12:52 odroid kernel: [  275.048709@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:13:29 odroid kernel: [  311.588731@0] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:16:07 odroid kernel: [  470.041615@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:16:07 odroid kernel: [  470.041624@0] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 5 ep 2 with no TDs queued?
May 20 07:16:38 odroid kernel: [  501.096766@1] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
May 20 07:17:01 odroid CRON[2428]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
May 20 07:18:43 odroid kernel: [  625.607748@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
May 20 07:18:43 odroid kernel: [  625.612885@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7d0b80 trb-start 00000000cf7d0980 trb-end 00000000cf7d0b70 seg-start 00000000cf7d0000 seg-end 00000000cf7d0ff0
May 20 07:18:43 odroid kernel: [  625.612890@0] xhci-hcd xhci-hcd.0.auto: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
May 20 07:18:43 odroid kernel: [  625.623539@0] xhci-hcd xhci-hcd.0.auto: Looking for event-dma 00000000cf7d0db0 trb-start 00000000cf7d0980 trb-end 00000000cf7d0b70 seg-start 00000000cf7d0000 seg-end 00000000cf7d0ff0
But instead of USB dying altogether, dd ... sdb would just pause after each error, then carry on again when the "reset" message came in. dd ... sda acted normally throughout, the USB keyboard was unaffected, and I didn't need to restart dd .. sdb. It was just like a download that dropped to zero bandwidth for a few seconds.
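
To compare runs with and without the quirks, the errors and resets in a log can be counted. This is a sketch only: usb_event_summary is a hypothetical helper name, and the two patterns are taken from the log excerpt above.

```shell
#!/bin/sh
# Summarise xhci transfer errors and device resets in a kernel log read
# from STDIN. usb_event_summary is a hypothetical helper name.
usb_event_summary() {
    log=$(cat)
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    errors=$(printf '%s\n' "$log" | grep -c 'xhci-hcd.*ERROR Transfer event TRB' || true)
    resets=$(printf '%s\n' "$log" | grep -c 'reset SuperSpeed USB device' || true)
    echo "errors=$errors resets=$resets"
}

# Example:
#   usb_event_summary < /var/log/kern.log
```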

Unread post by odroid » Wed May 22, 2019 9:58 am

Thank you for the valuable and various test results.
It seems not to be related to the power supply.
We will try to make a similar test process.

Unread post by odroid » Wed May 22, 2019 4:27 pm

We were finally able to reproduce the issue with two storage devices.
Give us a few days to find the possible root causes.

Unread post by Andrew Sayers » Wed May 22, 2019 6:13 pm

Excellent! I've updated the first post with a quick summary, to make life easier for people who get here from a Google search or something. You're welcome to re-edit the summary as more information comes in.
