
Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Jul 26, 2019 4:06 pm
by tobetter
Nighti wrote:
Fri Jul 26, 2019 1:38 am
@elatllat: I would prefer to see it fixed for 18.04 LTS. Jumping from one unstable thing to another doesn't really help.

@tobetter:
Mac: "00:1e:06:42:23:cd"

I don't have such a device yet. Do you have more information on what is required to capture those logs?

Here are my hard drives currently attached; all are externally powered:
root@odroid[~]$ lsusb -t
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
|__ Port 2: Dev 6, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
|__ Port 3: Dev 8, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
|__ Port 4: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 5, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 3: Dev 7, If 0, Class=Hub, Driver=hub/4p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 1: Dev 7, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 3: Dev 8, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 4: Dev 10, If 0, Class=Mass Storage, Driver=usb-storage, 480M
|__ Port 4: Dev 9, If 0, Class=Mass Storage, Driver=usb-storage, 480M
root@odroid[~]$ lsusb
Bus 002 Device 007: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 005: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 003: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 008: ID 0bc2:3322 Seagate RSS LLC
Bus 002 Device 006: ID 1058:25ee Western Digital Technologies, Inc.
Bus 002 Device 004: ID 1058:25ee Western Digital Technologies, Inc.
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 009: ID 1058:1021 Western Digital Technologies, Inc. Elements Desktop (WDBAAU)
Bus 001 Device 010: ID 04fc:0c25 Sunplus Technology Co., Ltd SATALink SPIF225A
Bus 001 Device 008: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 007: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 006: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Can you run these commands and let me have their output?

Code: Select all

$ cat /sys/kernel/debug/aml_clkmsr/clkmsr | grep clk81
$ dmesg | grep sg_tablesize
$ cat /proc/cmdline

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Jul 26, 2019 5:05 pm
by Nighti
root@odroid[~]$ cat /sys/kernel/debug/aml_clkmsr/clkmsr | grep clk81
[ 7][ 222000000]clk81
root@odroid[~]$ dmesg | grep sg_tablesize
[ 1.953651] usb: xhci: determined sg_tablesize: 2
[ 1.964456] usb: xhci: determined sg_tablesize: 2
root@odroid[~]$ cat /proc/cmdline
root=UUID=e139ce78-9841-40fe-8823-96a304a09859 rootwait rw console=ttyS1,115200n8 no_console_suspend fsck.repair=yes net.ifnames=0 elevator=noop hdmimode=1080p60hz cvbsmode=576cvbs max_freq_a53=1896 max_freq_a73=1800 maxcpus=6 voutmode=hdmi disablehpd=false cvbscable=0 overscan=100 monitor_onoff=false usb-storage.quirks=0x0bc2:0x3322:u usb-xhci.tablesize=2
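
A minimal sketch for checking whether a given parameter is on the kernel command line, assuming a POSIX shell. The function name `has_kernel_param` is made up for illustration; the file is an argument so a saved copy of /proc/cmdline can be checked too.

```shell
# has_kernel_param FILE PARAM  (hypothetical helper name)
# Succeeds if PARAM appears as a whole whitespace-separated word in FILE.
# FILE would normally be /proc/cmdline.
has_kernel_param() {
    # one parameter per line, then an exact fixed-string line match
    tr -s ' \t' '\n\n' < "$1" | grep -Fxq -- "$2"
}

# Example (not run here):
#   has_kernel_param /proc/cmdline usb-xhci.tablesize=2 && echo "present"
```

Matching whole words avoids a false hit on a longer parameter that merely contains the string.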

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Jul 26, 2019 7:17 pm
by fonix232
elatllat wrote:
Fri Jul 26, 2019 8:33 am
fonix232 wrote:
Fri Jul 26, 2019 8:17 am
... I did use your 5.3 kernel image, and unfortunately it does not boot :( I don't have a UART adapter at hand either, so I'm going to re-image the eMMC tomorrow.
Only thing I can think of is I'm using ssh, so maybe you judged the "no boot" as an HDMI issue.
Not at all. I'm using my N2 as a headless server. It does not show up on the network, and the network interface itself does not show any activity. The HDDs spin up for a few minutes, then spin down.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Sun Jul 28, 2019 8:41 am
by nick793
fonix232 wrote:
Fri Jul 26, 2019 7:17 pm
elatllat wrote:
Fri Jul 26, 2019 8:33 am
fonix232 wrote:
Fri Jul 26, 2019 8:17 am
... I did use your 5.3 kernel image, and unfortunately it does not boot :( I don't have a UART adapter at hand either, so I'm going to re-image the eMMC tomorrow.
Only thing I can think of is I'm using ssh, so maybe you judged the "no boot" as an HDMI issue.
Not at all. I'm using my N2 as a headless server. It does not show up on the network, and the network interface itself does not show any activity. The HDDs spin up for a few minutes, then spin down.
I tried the 5.3-rc1 image as well and it didn't boot for me either. When I try to ssh in, I get "no route to host" errors. Are there any other steps that we need to do ahead of time (like update petitboot or something)?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Sun Jul 28, 2019 10:19 am
by elatllat
No. Without UART output I can't guess; there should be next to no ways to make it not work.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Sun Jul 28, 2019 7:09 pm
by fonix232
I've tried the gl12 unified Armbian image currently in development, and after a good two days' testing, I can say that the USB issue is gone.

However, that build is nowhere near stable at the moment: reboot takes a really long time, NetworkManager configs don't seem to be saved, and overall it feels like quite beta software. But it's the right direction.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Sun Jul 28, 2019 8:57 pm
by nick793
Ok so I found why I couldn't boot. The image you had posted is for eMMC but I'm booting from SD. Just had to change mmcblk0p2 to mmcblk1p2 and boots like a charm now :)
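
A rough sketch of that change, assuming it is a plain text substitution in whichever boot config file carries the root= argument. The file's name and location depend on the image and are not given here, so it is passed as an argument; `switch_root_dev` is a made-up name.

```shell
# switch_root_dev FILE  (hypothetical helper name)
# Prints FILE with the eMMC root partition (mmcblk0p2) replaced by the
# SD card one (mmcblk1p2). The original file is left untouched.
switch_root_dev() {
    sed 's/mmcblk0p2/mmcblk1p2/g' "$1"
}

# Example (not run here; "bootconfig" is a placeholder file name):
#   switch_root_dev bootconfig > bootconfig.new
```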

Still haven't tested USB because mdadm seems to not like something with the new kernel but I can't figure out what. Once I can get that running I'll do whatever I can to break it again.

I did move my 3x4TB enclosure from my N2 to my new Pi 4 as a test though. The enclosure didn't support UAS, but was only getting USB 2.0 speeds on my N2. Plugged into the Pi 4 and it was chugging along at USB3 without a problem.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Sun Jul 28, 2019 11:26 pm
by elatllat
nick793 wrote:
Sun Jul 28, 2019 8:57 pm
...mdadm...
I'll have to update my build to include mdadm support.
zcat /proc/config.gz | grep MD_RAID
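
A small sketch that turns the grep above into a readable summary, assuming the kernel exposes its config in the usual `CONFIG_X=y` / `CONFIG_X=m` / `# CONFIG_X is not set` form. The function name `md_raid_status` is made up for illustration.

```shell
# md_raid_status  (hypothetical helper name)
# Reads a plain-text kernel config on stdin and prints each CONFIG_MD_RAID*
# option as "built-in", "module" or "disabled".
md_raid_status() {
    awk '
        /^CONFIG_MD_RAID[0-9A-Z_]*=y$/ { sub(/=y$/, ""); print $0 ": built-in"; next }
        /^CONFIG_MD_RAID[0-9A-Z_]*=m$/ { sub(/=m$/, ""); print $0 ": module"; next }
        /^# CONFIG_MD_RAID[0-9A-Z_]* is not set$/ { print $2 ": disabled" }
    '
}

# Example (not run here):
#   zcat /proc/config.gz | md_raid_status
```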

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Mon Jul 29, 2019 4:07 am
by fonix232
Sad report: unfortunately even the 5.2 kernel of the Armbian guys has the XHCI bug present - after about a day's worth of downloads, I decided to stress test it, added a bunch of torrents, and bam, server died within 5 minutes with the usual device unavailable errors.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Mon Jul 29, 2019 5:48 am
by elatllat
fonix232 wrote:
Mon Jul 29, 2019 4:07 am
Sad...
But how far is the Armbian 5.2 kernel from mainline?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Mon Jul 29, 2019 9:50 am
by nick793
What confuses me is the inconsistency of it. The raid array that kills my N2 every time is the HDD enclosure with 2 spinners in it. It's self-powered, but when that goes through heavy I/O activity (200MB/s+) the N2 dies.

Meanwhile, the SSD array (2 separate enclosures, not self-powered) just completed a check with mdadm (read the entire 240GB array twice looking for mismatches). I would have thought that the mdadm check would have killed it (several minutes of 200MB/s reads on multiple ports) but the N2 is still chugging along happily.

This was all on the 4.9 kernel since I need to power the whole cluster back on tonight into a semi-stable mode.

Code: Select all

nick@odroid:~ $ lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=uas, 5000M
        |__ Port 3: Dev 4, If 0, Class=Mass Storage, Driver=uas, 5000M
        |__ Port 4: Dev 5, If 0, Class=Mass Storage, Driver=uas, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M

nick@odroid:~ $ lsusb
Bus 002 Device 005: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
Bus 002 Device 004: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
Bus 002 Device 003: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
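
For reference, a sketch of how an md consistency check like the one mentioned above can be started and read back through sysfs. It assumes the standard md interface files `sync_action` and `mismatch_cnt`; the function names and the md0 path in the example are placeholders, not taken from this setup.

```shell
# start_md_check MD_SYSFS_DIR  (hypothetical helper name)
# Requests a read-only consistency check on a redundant array by writing
# "check" to its sync_action file. Needs root on a real system.
start_md_check() {
    echo check > "$1/sync_action"
}

# md_mismatches MD_SYSFS_DIR  (hypothetical helper name)
# Prints the mismatch counter that the kernel updates during a check.
md_mismatches() {
    cat "$1/mismatch_cnt"
}

# Example (not run here; md0 is a placeholder array name):
#   start_md_check /sys/block/md0/md
#   md_mismatches /sys/block/md0/md
```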

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Mon Jul 29, 2019 10:18 am
by elatllat
nick793 wrote:
Mon Jul 29, 2019 9:50 am
What confuses me is the inconsistency of it. ....
To me that sounds perfectly consistent with an OOM: with spinning disks the software has to keep requests in memory longer, until the answer arrives, and that higher latency will cause problems with poorly designed software. I'm just speculating; I have no idea what the real issue is.
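
One way to test that speculation would be to look for OOM killer messages in the kernel log. This is only a sketch: it assumes the usual "invoked oom-killer" and "Out of memory" wording, and `count_oom_events` is a made-up name.

```shell
# count_oom_events  (hypothetical helper name)
# Reads kernel log text on stdin and prints how many lines mention the
# OOM killer. Prints 0 when there are none.
count_oom_events() {
    # grep -c exits non-zero on zero matches, so keep the function successful
    grep -c -i -E 'invoked oom-killer|out of memory' || true
}

# Example (not run here):
#   dmesg | count_oom_events
```

A count of 0 after a hang would argue against the OOM theory, although a hard hang may leave no log at all.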

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Mon Jul 29, 2019 11:24 am
by elatllat
elatllat wrote:
Sun Jul 28, 2019 11:26 pm
nick793 wrote:
Sun Jul 28, 2019 8:57 pm
...mdadm...
I'll have to update my build to include mdadm support.
zcat /proc/config.gz | grep MD_RAID
updated.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Tue Jul 30, 2019 3:39 pm
by odroid
We've started a more intensive USB 3.0 storage test today.
We connected four SATA SSDs via our USB-to-SATA bridge board https://www.hardkernel.com/shop/usb3-0- ... oard-plus/.
A 12V/2A PSU was not sufficient for four SSDs and we had to use a 15V/4A PSU.
We also replaced the USB cables with very thick ones to minimize the voltage drop.
The voltage measured with a DMM at the SSD power input is about 4.95 V now. It was around 4.68 V before we replaced the cables.

We've been running four "dd" commands in parallel and there has been no error so far after a couple of hours.
We will run this test for several more days and keep updating the status every day.
We used the latest kernel and bootini update packages, which set the usb-xhci.tablesize=2 kernel parameter by default.
[Attachment: s_20190730_144137.jpg]
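
A sketch of what a parallel "dd" write test like the one described above might look like. The exact dd options used in the real test are not stated, so the block size, `conv=fsync`, the file name and the mount points in the example are all assumptions, as is the function name `parallel_dd_write`.

```shell
# parallel_dd_write SIZE_MIB DIR...  (hypothetical helper name)
# Starts one dd writer per directory in the background, then waits for all
# of them. Returns non-zero if any writer failed.
parallel_dd_write() {
    size_mib=$1
    shift
    pids=""
    for dir in "$@"; do
        dd if=/dev/zero of="$dir/dd_test.bin" bs=1M count="$size_mib" conv=fsync 2>/dev/null &
        pids="$pids $!"
    done
    rc=0
    for pid in $pids; do
        wait "$pid" || rc=1
    done
    return "$rc"
}

# Example (not run here; the mount points are placeholders):
#   parallel_dd_write 4096 /mnt/ssd1 /mnt/ssd2 /mnt/ssd3 /mnt/ssd4
```

Waiting on each PID separately makes a single failed writer visible in the return code.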

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Tue Jul 30, 2019 3:47 pm
by Nighti
What is the kernel version number you're using for this test?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Tue Jul 30, 2019 3:51 pm
by odroid
The latest kernel package is 4.9.185-43.
https://github.com/hardkernel/linux/releases

The latest bootini package sets the sg_tablesize properly and assigns the USB IRQ handlers to a big A73 core.
https://github.com/mdrjr/n2_bootini/blo ... t.ini#L103
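
For readers who want to see what pinning an IRQ involves, here is a sketch using the standard /proc/irq/N/smp_affinity_list interface. It is not taken from the bootini package; the helper names, the IRQ name and the CPU number in the example are assumptions.

```shell
# irq_number NAME  (hypothetical helper name)
# Reads /proc/interrupts-style text on stdin and prints the IRQ number of
# the first line whose last field equals NAME.
irq_number() {
    awk -v name="$1" '$NF == name { sub(/:$/, "", $1); print $1; exit }'
}

# pin_irq IRQ CPU  (hypothetical helper name)
# Pins IRQ to a single CPU through procfs. Needs root.
pin_irq() {
    echo "$2" > "/proc/irq/$1/smp_affinity_list"
}

# Example (not run here; xhci-hcd:usb1 and CPU 4 are assumed values):
#   pin_irq "$(irq_number xhci-hcd:usb1 < /proc/interrupts)" 4
```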

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Tue Jul 30, 2019 4:04 pm
by Nighti
Thanks! I did the upgrade and will run the tests with 5 external HDDs tomorrow and report back.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Tue Jul 30, 2019 4:14 pm
by odroid
We really appreciate your testing!
Which external HDDs and external USB hub do you use?
Are all of them self-powered?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 12:15 am
by nick793
Thanks for doing this test! Are you also doing high-speed reads to test data flow the other way? Most of my failures have occurred when reading from devices. However, I can't find any rhyme or reason to it.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 7:00 am
by marmoset
I'm testing as well. viewtopic.php?f=181&t=35129 was fine, but I added another drive and it was failing pretty quickly under load, even with all the drives blacklisted with usb-storage.quirks. Right now I've updated boot.ini and removed the blacklists, and it's been going for a while without a problem. I'll update the kernel if it does die.

I do still have max_sectors_kb set to 32, if it *doesn't* fail, I'll remove that as well and try again.
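
A sketch for listing the current max_sectors_kb of each disk, assuming the standard sysfs location /sys/block/<dev>/queue/max_sectors_kb. The function name is made up, and the directory is a parameter only so the function can be tried against a copy.

```shell
# show_max_sectors_kb [SYSFS_BLOCK_DIR]  (hypothetical helper name)
# Prints "<device> <max_sectors_kb>" for every sd* disk found.
show_max_sectors_kb() {
    base=${1:-/sys/block}
    for f in "$base"/sd*/queue/max_sectors_kb; do
        [ -r "$f" ] || continue
        dev=${f#"$base"/}     # strip the leading directory
        dev=${dev%%/*}        # keep only the device name
        echo "$dev $(cat "$f")"
    done
}

# To apply the value mentioned above to one disk (as root; sda is a placeholder):
#   echo 32 > /sys/block/sda/queue/max_sectors_kb
```

The setting does not survive a reboot or a re-plug, so it is worth re-checking after a USB reset.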

FWIW what seems to trigger it pretty easily for me is doing rsync -c with a lot of content (-c does checksums instead of just timestamp/size, I need to use it to fix some unrelated corrupted files). It makes sense that rsync -c would be more likely to trigger it since there's a lot of both reading and writing.

It's not a high transfer rate causing it: the rsync is going over the internet and I'm only getting around 70Mbit/s (so only ~7MB/s) across 4 USB drives.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 7:10 am
by Nighti
I did my first tests and they are promising!
Here is my setup, which works fine with an Intel machine and failed with the N2 in the first place:
root@odroid[~]$ lsusb
Bus 002 Device 008: ID 1058:25ee Western Digital Technologies, Inc.
Bus 002 Device 007: ID 1058:25ee Western Digital Technologies, Inc.
Bus 002 Device 006: ID 0bc2:3322 Seagate RSS LLC
Bus 002 Device 005: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 004: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 003: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 002 Device 002: ID 05e3:0620 Genesys Logic, Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 007: ID 1058:1021 Western Digital Technologies, Inc. Elements Desktop (WDBAAU)
Bus 001 Device 006: ID 04fc:0c25 Sunplus Technology Co., Ltd SATALink SPIF225A
Bus 001 Device 005: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 004: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 003: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
root@odroid[~]$ lsusb -t
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 2: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 4, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/4p, 5000M
|__ Port 1: Dev 6, If 0, Class=Mass Storage, Driver=uas, 5000M
|__ Port 2: Dev 7, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
|__ Port 3: Dev 8, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/2p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 2: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 1: Dev 4, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 3: Dev 5, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 4: Dev 6, If 0, Class=Mass Storage, Driver=usb-storage, 480M
|__ Port 4: Dev 7, If 0, Class=Mass Storage, Driver=usb-storage, 480M
Five external hard drives with a pretty nice mix. Everything is self-powered to avoid any power source issues; only one USB 2.0 drive has no power supply of its own. That makes 3x USB 3.0 devices and 2x USB 2.0 devices, all hanging off one self-powered USB 3.0 hub which is connected to the N2.

Here are the results for some copy operations:
root@odroid[/mnt/extMrBig]$ dd if=./largefile of=/mnt/extMsBig/largefile bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 21.8804 s, 49.1 MB/s
root@odroid[/mnt/intHDD]$ dd if=/dev/zero of=./largefile bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 18.8013 s, 57.1 MB/s
root@odroid[/mnt/intHDD]$ dd if=./largefile of=/mnt/extHDD/largefile bs=1M count=1024
^C640+0 records in
640+0 records out
671088640 bytes (671 MB, 640 MiB) copied, 237.036 s, 2.8 MB/s
root@odroid[/mnt/extMsBig]$ dd if=./largefile of=/mnt/intHDD/largefile bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 46.2939 s, 23.2 MB/s
One older USB 2.0 drive causes USB resets most of the time, but I've seen the same thing with this drive on the Intel platform. I also got this dmesg output when I started the first copy. Maybe it's relevant at some point:
[42955.003999] IRQRatio___ERR.irq:22 ratio:35
[42955.004029] t_isr:354 t_total:1000, cnt:16493
I will do more tests and let you know the results.
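
As a cross-check of the dd figures above, the rate is simply bytes divided by seconds, in decimal megabytes (1 MB = 10^6 bytes), which is the unit dd reports. A small sketch; `mb_per_s` is a made-up name.

```shell
# mb_per_s BYTES SECONDS  (hypothetical helper name)
# Prints decimal megabytes per second, rounded to one decimal place.
mb_per_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

mb_per_s 1073741824 21.8804   # first copy above
mb_per_s 671088640 237.036    # the interrupted copy above
```

The two calls reproduce the 49.1 MB/s and 2.8 MB/s that dd printed.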

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 10:29 am
by Nighti
I ran a couple more real-world tests, like rsyncing all kinds of different data files from USB3 to USB2 and doing concurrent USB3/USB2 reads/writes on the same drive. I transferred a couple hundred gigabytes back and forth.

Results:
USB 2.0 devices are getting reset pretty often, especially if the USB bus is under heavy load.
[54380.482360] usb 1-1.2.3.4: reset high-speed USB device number 6 using xhci-hcd
All devices are in stable condition and none has died so far. That's a huge step forward! Thanks for that!
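
To put a number on "pretty often", a sketch that counts reset lines per device in the kernel log. It assumes the message format shown above; `count_usb_resets` is a made-up name.

```shell
# count_usb_resets  (hypothetical helper name)
# Reads kernel log text on stdin and prints "<count> <usb path>" for every
# device that logged a "reset ... USB device" line, most resets first.
count_usb_resets() {
    sed -n 's/.*usb \([0-9][0-9.-]*\): reset .* USB device number.*/\1/p' |
        sort | uniq -c | sort -k1,1nr -k2 | awk '{ print $1, $2 }'
}

# Example (not run here):
#   dmesg | count_usb_resets
```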

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 10:58 am
by odroid
Glad to hear your four USB 3.0 HDDs are still alive.
We also have no issue with four SSDs so far after starting the test about 17 hours ago. We will keep running the test a few more days.
The UAS mode was enabled by default. We didn't change "max_sectors_kb" either.
Just updated the Kernel and boot packages with apt upgrade.

Let's focus on USB 3.0 storage devices for a week.
If there is no critical issue, let's try to find what's wrong with the USB 2.0 devices.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 11:03 am
by Nighti
odroid wrote:
Wed Jul 31, 2019 10:58 am
We didn't change "max_sectors_kb" either.
Just updated the Kernel and boot packages with apt upgrade.
Correct. That's exactly what I've done as well. The different drives all still have their own different values.
odroid wrote:
Wed Jul 31, 2019 10:58 am
Let's focus on USB 3.0 storage devices for a week.
If there is no critical issue, let's try to find what's wrong with the USB 2.0 devices.
Agree!

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 12:06 pm
by elatllat
tobetter wrote:
Thu Jul 04, 2019 11:01 pm
...sd_fusing.sh...
That script could use a little TLC so I sent you a PR, but I notice there are open PRs from 4 years ago ... please clean those up.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 12:32 pm
by elatllat
On an ancient (externally powered) spinning laptop hard drive:

Code: Select all

apt install mdadm iozone3
# two partitions on the same disk, striped together as RAID 0
parted /dev/sda mktable msdos
parted /dev/sda mkpart primary 0% 50%
parted /dev/sda mkpart primary 50% 100%
mdadm --create /dev/md/test /dev/sda1 /dev/sda2 --level=0 --raid-devices=2
mkfs.ext4 /dev/md/test
# make sure the mount point exists before mounting
mkdir -p /media/test
mount /dev/md/test /media/test
iozone -e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2 /media/test/iozone >fail.log 2>&1 &

uname -r
4.9.185-43

grep clk81 /sys/kernel/debug/aml_clkmsr/clkmsr
[ 7][ 167000000]clk81                    

dmesg | grep sg_tablesize
[    4.849820] usb: xhci: determined sg_tablesize: 4294967295
[    4.921921] usb: xhci: determined sg_tablesize: 4294967295

cat /proc/cmdline
root=UUID=e139ce78-9841-40fe-8823-96a304a09859 rootwait rw console=ttyS0,115200n8  no_console_suspend fsck.repair=yes net.ifnames=0 elevator=noop hdmimode=1080p60hz cvbsmode=576cvbs max_freq_a53=1896 max_freq_a73=1800 maxcpus=6 voutmode=hdmi  disablehpd=false cvbscable= overscan=100  monitor_onoff=false

dmesg | grep "USB disconnect" -A 999
[ 1079.734781] usb 1-1.1: USB disconnect, device number 3
[ 1079.736335] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 1079.741204] blk_update_request: I/O error, dev sda, sector 0
[ 1079.741495] blk_update_request: I/O error, dev sda, sector 0
[ 1079.747307] blk_update_request: I/O error, dev sda, sector 27429120
[ 1079.756585] blk_update_request: I/O error, dev sda, sector 27429360
[ 1079.763184] blk_update_request: I/O error, dev sda, sector 77861872
[ 1079.766515] blk_update_request: I/O error, dev sda, sector 77861872
[ 1079.772962] Aborting journal on device md127-8.
[ 1079.777248] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 1079.777784] scsi 0:0:0:0: rejecting I/O to dead device
[ 1079.777792] blk_update_request: I/O error, dev sda, sector 27429600
[ 1079.789345] Buffer I/O error on dev md127, logical block 19431424, lost sync page write
[ 1079.797794] JBD2: Error -5 detected when updating journal superblock for md127-8.
[ 1079.813164] usb 2-1.1: new SuperSpeed USB device number 3 using xhci-hcd
[ 1079.834041] usb 2-1.1: New USB device found, idVendor=174c, idProduct=55aa
[ 1079.834049] usb 2-1.1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[ 1079.834055] usb 2-1.1: Product: 1053-3G Ext. HDD
[ 1079.834059] usb 2-1.1: Manufacturer: Asmedia Technology Inc.
[ 1079.834064] usb 2-1.1: SerialNumber: 000000000033
[ 1079.834703] usb-storage 2-1.1:1.0: USB Mass Storage device detected
[ 1079.836407] usb-storage 2-1.1:1.0: Quirks match for vid 174c pid 55aa: 400000
[ 1079.836520] scsi host1: usb-storage 2-1.1:1.0
[ 1080.842041] scsi 1:0:0:0: Direct-Access     Asmedia  1053-3G Ext. HDD 0    PQ: 0 ANSI: 6
[ 1080.849715] sd 1:0:0:0: Attached scsi generic sg0 type 0
[ 1080.854324] sd 1:0:0:0: [sdb] 312581808 512-byte logical blocks: (160 GB/149 GiB)
[ 1080.854937] sd 1:0:0:0: [sdb] Write Protect is off
[ 1080.854952] sd 1:0:0:0: [sdb] Mode Sense: 43 00 00 00
[ 1080.933197] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1080.969461] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1081.049219] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1081.561156] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1081.729216] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1081.893236] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.061163] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.229163] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.393132] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.414761] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1082.414782] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
[ 1082.414791] blk_update_request: I/O error, dev sdb, sector 0
[ 1082.414996] Buffer I/O error on dev sdb, logical block 0, async page read
[ 1082.561128] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.729111] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1082.893107] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.061098] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.229089] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.393096] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.414629] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1083.414650] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
[ 1083.414659] blk_update_request: I/O error, dev sdb, sector 0
[ 1083.414870] Buffer I/O error on dev sdb, logical block 0, async page read
[ 1083.421849]  sdb: unable to read partition table
[ 1083.442309] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 1083.524986] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.701081] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1083.913066] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1084.113041] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1084.313029] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1084.512975] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1084.712951] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1084.733456] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1084.733470] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 12 a1 9e 00 00 00 08 00
[ 1084.733477] blk_update_request: I/O error, dev sdb, sector 312581632
[ 1084.901239] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1115.966748] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1116.086830] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1146.896787] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1147.016844] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1147.560820] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1147.760814] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1147.960803] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1147.981535] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1147.981555] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 12 a1 9e 00 00 00 08 00
[ 1147.981565] blk_update_request: I/O error, dev sdb, sector 312581632
[ 1147.982464] Buffer I/O error on dev sdb, logical block 39072704, async page read
[ 1148.236827] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1148.436826] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1148.636852] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1148.836717] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1149.036716] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1149.236710] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1149.257446] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1149.257467] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 12 a1 9e 00 00 00 08 00
[ 1149.257476] blk_update_request: I/O error, dev sdb, sector 312581632
[ 1149.428710] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1149.636700] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1149.836701] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1150.036656] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1150.236622] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1150.436603] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[ 1150.457282] sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1150.457303] sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 12 a1 9e 00 00 00 08 00
[ 1150.457312] blk_update_request: I/O error, dev sdb, sector 312581632
[ 1150.458210] Buffer I/O error on dev sdb, logical block 39072704, async page read
[ 1171.863233] usb 2-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
However, using an upstream kernel with Ubuntu (and some old u-boot), there are no problems:

Code: Select all

uname -r
5.3.0-rc2-next-20190730

cat /proc/cmdline
root=/dev/mmcblk0p2 rootwait rw clk_ignore_unused console=ttyAML0,115200

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 12:39 pm
by nick793
marmoset wrote:
Wed Jul 31, 2019 7:00 am
...FWIW what seems to trigger it pretty easily for me is doing rsync -c with a lot of content (-c does checksums instead of just timestamp/size, I need to use it to fix some unrelated corrupted files). It makes sense that rsync -c would be more likely to trigger it since there's a lot of both reading and writing.

It's not high transfer rate causing it, the rsync is going over the internet and I'm only getting around 70Mbit (so only ~7MB/s) across 4 usb drives.
Just a thought...seems like computing checksums is a common failure mechanism.

As far as I'm aware, Transmission's "verify local data" function computes a ton of checksums, one for each piece of the file. I was also using glusterfs on this system, which does snapshots/checksums all of the time. That would explain the bulk of my failures.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 1:54 pm
by odroid
@elatllat,
sg_tablesize must be 2, and the IRQ handler must also be assigned to a big core.
https://github.com/mdrjr/n2_bootini/blo ... t.ini#L103
I think you did not update the bootini package.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 2:27 pm
by elatllat
odroid wrote:
Wed Jul 31, 2019 1:54 pm
@elatllat,
sg_tablesize must be 2, and the IRQ handler must also be assigned to a big core.
https://github.com/mdrjr/n2_bootini/blo ... t.ini#L103
I think you did not update the bootini package.
Looks updated to me:

Code: Select all

apt show bootini | grep -v @

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Package: bootini
Version: 20190729-13
Priority: extra
Section: multimedia
Installed-Size: 81.9 kB
Provides: bootini
Depends: fbset
Download-Size: 5568 B
APT-Manual-Installed: yes
APT-Sources: http://deb.odroid.in/n2 bionic/main arm64 Packages
Description: boot.ini and System Tweaks for ODROID-N2

apt install bootini
Reading package lists... Done
Building dependency tree       
Reading state information... Done
bootini is already the newest version (20190729-13).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
But I see the cmdline differs. Is that change still in testing (should I manually update it from git)?
(I also did a shutdown -h before running the test, and double-checked the history to make sure I remembered correctly.)

Oddly, though, that error is not repeating. Maybe it required 2 power cycles?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 2:35 pm
by odroid
@elatllat,
I have no idea what was wrong.
If the bootini package was updated properly, you should find usb-xhci.tablesize=2 among the kernel parameters in your boot.ini file.
The Ethernet IRQ handler should then run on CPU#5 and the USB 3.0 XHCI IRQ handler on CPU#4, something like this:

Code: Select all

cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
 21:          8          0          0          0          0       7504     GIC-0  40 Edge      eth0
 22:        115          0          0          0      19460          0     GIC-0  62 Level     xhci-hcd:usb1
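
A sketch for reading that table mechanically: it reports which CPU has handled the most interrupts for a named IRQ line. It assumes the /proc/interrupts layout shown above (a header row of CPU names, then one row per IRQ); `busiest_cpu` is a made-up name.

```shell
# busiest_cpu NAME  (hypothetical helper name)
# Reads /proc/interrupts-style text on stdin. Uses the header row to count
# the CPU columns, then prints the CPU with the highest count for the row
# whose last field is NAME.
busiest_cpu() {
    awk -v name="$1" '
        NR == 1 { ncpu = NF; next }
        $NF == name {
            best = 0
            # field 1 is the IRQ label, so CPU i sits in field i + 2
            for (i = 1; i < ncpu; i++)
                if ($(i + 2) + 0 > $(best + 2) + 0) best = i
            print "CPU" best
            exit
        }
    '
}

# Example (not run here):
#   busiest_cpu xhci-hcd:usb1 < /proc/interrupts
```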

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 2:41 pm
by elatllat
It seems apt did not update boot.ini properly (I even remember seeing that blue screen about moving the old one to boot.ini.old). Maybe a side issue.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 2:50 pm
by elatllat
nick793 wrote:
Wed Jul 31, 2019 12:39 pm
...checksums...
I was just able to break 4.9.185-43 like this;

Code: Select all

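apt install mktorrent
# write 1000 MiB of random data to the test array
dd if=/dev/urandom of=/media/test/random.dat bs=1M count=1000
# flush dirty pages first, then drop the page cache so mktorrent reads from disk
sync
echo 3 > /proc/sys/vm/drop_caches
mktorrent -a 127.0.0.1 /media/test/random.dat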
But as odroid noted, there was something wrong with my boot.ini,
and manually adding usb-xhci.tablesize=2 to boot.ini seems to correct the issue.

Code: Select all

cat /proc/interrupts  | grep -E "eth0|xhci-hcd|CPU5"
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
 21:         10          0          0          0          0       5462     GIC-0  40 Edge      eth0
 22:         60          0          0          0      65527          0     GIC-0  62 Level     xhci-hcd:usb1
 
The interrupts look OK.
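
To confirm after a reboot that the parameter actually took effect, one could check the sg_tablesize lines in dmesg. A minimal sketch, assuming the "determined sg_tablesize: N" message format seen earlier in the thread; `sg_tablesize_ok` is a made-up name.

```shell
# sg_tablesize_ok EXPECTED  (hypothetical helper name)
# Reads kernel log text on stdin. Succeeds only if at least one
# "determined sg_tablesize: N" line is present and every N equals EXPECTED.
sg_tablesize_ok() {
    awk -v want="$1" '
        /determined sg_tablesize: / { seen = 1; if ($NF != want) bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Example (not run here):
#   dmesg | sg_tablesize_ok 2 && echo "tablesize applied"
```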

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 3:08 pm
by odroid
@elatllat,
Thank you for the good news. I didn't know the mktorrent command could generate such heavy traffic.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 3:21 pm
by elatllat
I ran the mktorrent test on 5.3.0-rc2 a few times and it had no problems, so whatever tablesize=2 is doing is not required upstream.
Also interrupts differ on 5.3;

Code: Select all

cat /proc/interrupts  | grep -E "eth0|xhci-hcd|CPU5"
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
  6:       1206          0          0          0          0          0     GICv2  40 Level     eth0
 22:      12765          0          0          0          0          0     GICv2  62 Level     xhci-hcd:usb1
(keep in mind it's mktorrent on mdadm=0,n=2, on a single slow spinning disk)
So maybe the priorities should be
- make sure apt is adding tablesize=2 properly
- compare generic-xhci from 5.3 to HK and fix whatever tablesize=2 is mitigating (as I read it's not 100% and there is a speed impact)
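
On the first point, one way to check whether the running kernel actually received the parameter is to look for it in /proc/cmdline. A minimal sketch, where has_bootarg is a made-up helper name:

```shell
# has_bootarg ARG: succeed if ARG appears as a whole word in the kernel
# command line read from stdin. (Hypothetical helper for illustration.)
has_bootarg() {
    # Put each argument on its own line, then require an exact line match
    # so that e.g. tablesize=20 does not count as tablesize=2.
    tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}

# Example use on a running system:
# has_bootarg usb-xhci.tablesize=2 < /proc/cmdline && echo "tablesize=2 is active"
```

This only shows what the current boot used. It does not say whether apt rewrote boot.ini, so a grep of boot.ini itself is still worth doing.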

But as I'm using 5.3 this bug is not bothering me at the moment.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 3:39 pm
by odroid
@elatllat,
Thank you for the clarification.
But I want to talk about the upstream kernel separately.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Wed Jul 31, 2019 3:47 pm
by elatllat
I only bring up 5.3 here because comparing working and non-working versions of the code can sometimes help (though sometimes the diff is so big as to be useless). I'm trying to help confirm the fix and narrow down the issue by providing more data points, both to help others and to verify that it's currently not affecting me. I think it's good for most people to stick with the LTS.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 3:05 am
by fonix232
elatllat wrote:
Wed Jul 31, 2019 3:21 pm
I ran the mktorrent test on 5.3.0-rc2 a few times and it had no problems, so whatever tablesize=2 is doing is not required upstream.
Also interrupts differ on 5.3;

Code: Select all

cat /proc/interrupts  | grep -E "eth0|xhci-hcd|CPU5"
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
  6:       1206          0          0          0          0          0     GICv2  40 Level     eth0
 22:      12765          0          0          0          0          0     GICv2  62 Level     xhci-hcd:usb1
(keep in mind it's mktorrent on mdadm=0,n=2, on a single slow spinning disk)
So maybe the priorities should be
- make sure apt is adding tablesize=2 properly
- compare generic-xhci from 5.3 to HK and fix whatever tablesize=2 is mitigating (as I read it's not 100% and there is a speed impact)

But as I'm using 5.3 this bug is not bothering me at the moment.

It's quite interesting that 5.3 does not show the issue, while 5.2 does. Maybe it's even worth looking into the XHCI module differences between 5.2 and 5.3?

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 4:11 am
by elatllat
fonix232 wrote:
Thu Aug 01, 2019 3:05 am
...differences between 5.2 and 5.3?
I assume Armbian is quite different from mainline (Armbian likely includes some hardkernel and amlogic customizations).
git diff --shortstat, etc would let you know.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 8:10 am
by marmoset
marmoset wrote:
Wed Jul 31, 2019 7:00 am
I do still have max_sectors_kb set to 32, if it *doesn't* fail, I'll remove that as well and try again.
I did undo that and it's back to the default 512, and still chugging along with no problems.

So for me, it's the boot.ini changes that fixed it. It could be the sg_table change, the USB IRQ being moved to a different core, or a combination of the two. The performance isn't great currently (about 25MB/s doing a dd with a 1M block size), but it's stable, and that's more important to me.

I'm running Debian stretch: Linux odroid2 4.9.185+ #1 SMP PREEMPT Mon Jul 22 10:33:16 CEST 2019 aarch64 GNU/Linux

If it's useful, I could test with just the sg_table change or just the USB IRQ change to see if it's just one of them fixing (or hiding) the problem.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 9:08 am
by odroid
@marmoset,
Thank you for sharing the test result.
Can you remember the transfer rate before the changes?
When we connected a single USB 3.0 SSD, the transfer rate was around 280MB/s.
It is around 240MB/s now after the change. So we can say there is a performance drop of roughly 10~20% (about 14% in this test). But the stability seems to be improved a lot.
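
As a quick check of that figure, here is a small sketch that computes the percentage decrease between two transfer rates. pct_drop is a hypothetical helper name, not an existing tool.

```shell
# pct_drop BEFORE AFTER: print the percentage decrease from BEFORE to AFTER,
# rounded to one decimal place. (Hypothetical helper for illustration.)
pct_drop() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (b - a) / b * 100 }'
}

pct_drop 280 240   # prints 14.3
```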

We will try to check the USB transfer speed on kernel 5.3 in a couple of weeks, since we are very busy preparing new OS images these days.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 10:13 am
by marmoset
@odroid after this initial sync finishes I'll test speeds more thoroughly with both the boot.ini changes and without, might be a couple days though.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 1:16 pm
by odroid
@marmoset, Okay. Thanks,

We've been running the heavy UAS access test with four SSDs for over 45 hours, and there has been no problem so far.
I couldn't find any relevant XHCI message in the dmesg output yet.
We will keep the test bench running until early next week and share the results.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 5:07 pm
by fonix232
odroid wrote:
Thu Aug 01, 2019 1:16 pm
@marmoset, Okay. Thanks,

We've been running the four SSD heavy UAS accessing test over 45 hours there is no problem so far.
I couldn't find any relevant XHCI message in the dmesg output yet.
We keep running the test bench by early next week and share the result.
Could you replicate an environment that's known to fail? Mine is a RAID0 array using 2 disks with continuous multi-stream downloads (just grab a bunch of big torrent files, e.g. Linux distro collections, and download those as fast as your network allows). Both Deluge and Transmission are known to fail, so it might be worth using Docker images of these two.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 6:03 pm
by odroid
We will have a discussion on the forum about how to change the test method once we complete the current test early next week.

Meanwhile, please update the kernel & bootini packages via the apt upgrade command and start a RAID0 disk test if you can help us.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Thu Aug 01, 2019 9:22 pm
by fonix232
odroid wrote:
Thu Aug 01, 2019 6:03 pm
We will have a discussion on the forum how to change the test method once we complete the current test early next week.

Meanwhile, please update the kernel & bootini packages via apt upgrade command and start a RAID0 disks test if you can help us.
I will do that today, just need some time to back up my setup first - I'm running Armbian at the moment as a test.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Aug 02, 2019 10:15 am
by marmoset
odroid wrote:
Thu Aug 01, 2019 9:08 am
@marmoset,
Thank you for sharing the test result.
Can you remember the transfer rate before the changes?
When we connected a single USB 3.0 SSD, the transfer rate was around 280MB/s.
It is around 240MB/s now after changing. So we can say there must be 10~20% of the performance drop. But the stability seems to be improved a lot.

We will try to check the USB transfer speed in Kernel 5.3 a couple of weeks later since we are very busy to prepare new OS images these days.
@odroid

doing:

dd if=/dev/zero of=testfile bs=1M count=4000

after a fresh boot:

setenv bootargs "root=UUID=e139ce78-9841-40fe-8823-96a304a09859 rootwait rw ${condev} ${amlogic} no_console_suspend fsck.repair=yes net.ifnames=0 elevator=noop hdmimode=${hdmimode} cvbsmode=576cvbs max_freq_a53=${max_freq_a53} max_freq_a73=${max_freq_a73} maxcpus=${maxcpus} voutmode=${voutmode} disablehpd=${disablehpd} ${hid_quirks} ${cmode} overscan=${overscan} cvbscable=${cvbscable}"

gives:

4194304000 bytes (4.2 GB, 3.9 GiB) copied, 26.4916 s, 158 MB/s

and, with usb-xhci.tablesize=2 added to the bootargs:

setenv bootargs "root=UUID=e139ce78-9841-40fe-8823-96a304a09859 rootwait rw ${condev} ${amlogic} no_console_suspend fsck.repair=yes net.ifnames=0 elevator=noop hdmimode=${hdmimode} cvbsmode=576cvbs max_freq_a53=${max_freq_a53} max_freq_a73=${max_freq_a73} maxcpus=${maxcpus} voutmode=${voutmode} ${cmode} disablehpd=${disablehpd} cvbscable=${cvbscable} overscan=${overscan} ${hid_quirks} monitor_onoff=${monitor_onoff} usb-xhci.tablesize=2 net.ifnames=0"

4194304000 bytes (4.2 GB, 3.9 GiB) copied, 96.8921 s, 43.3 MB/s

This is on a RAID 5 of 4 spinning drives.

LMK if you want me to try anything else.

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Aug 02, 2019 10:55 am
by odroid
@marmoset,
That is too large a write performance drop for soft RAID mode with 4 HDDs.
Can you please try the following commands to measure write and read performance without cache side effects?
Write command
dd if=/dev/zero of=test oflag=direct bs=8M count=128
Read command
dd if=test of=/dev/null iflag=direct bs=8M

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Aug 02, 2019 12:25 pm
by marmoset
@odroid still worse, although not as bad:

1073741824 bytes (1.1 GB, 1.0 GiB) copied, 31.5123 s, 34.1 MB/s

vs

1073741824 bytes (1.1 GB, 1.0 GiB) copied, 17.1061 s, 62.8 MB/s
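
To compare runs like these without relying on dd's own rounding, the rate can be recomputed from the byte count and elapsed time in each summary line. A small sketch, assuming the GNU dd summary format shown above; dd_rate is a hypothetical helper name.

```shell
# dd_rate: read dd summary lines on stdin and print the transfer rate in
# decimal MB/s (10^6 bytes per second), one value per summary line.
# (Hypothetical helper; assumes the "N bytes (...) copied, T s, R MB/s" format.)
dd_rate() {
    awk '$2 == "bytes" {
        # The elapsed time in seconds is the field right after "copied,".
        for (i = 3; i < NF; i++)
            if ($i == "copied,") printf "%.1f\n", $1 / $(i + 1) / 1000000
    }'
}
```

dd writes its summary to stderr, so a live run would look like: dd if=/dev/zero of=test oflag=direct bs=8M count=128 2>&1 | dd_rate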

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Aug 02, 2019 1:32 pm
by odroid
That is much slower than I estimated.
We will try RAID5 with four SSDs as well as four HDDs to check the issue early next week.
We may ask you how to configure the system for soft RAID once we are ready to test, since we have no experience with it.

BTW, can you show me the source clock frequency for the internal peripherals? It should be about 222MHz if the boot blob was updated correctly.

Code: Select all

odroid@odroid:~$ sudo cat /sys/kernel/debug/clk/clk81/clk_rate                  
222222219

Re: Syncing two UAS devices causes Ubuntu to hang

Posted: Fri Aug 02, 2019 5:52 pm
by elatllat
odroid wrote:
Fri Aug 02, 2019 1:32 pm
...how to configure the system to use the soft-RAID...
I'd guess something like this was used;

Code: Select all

mdadm --create /dev/md/test /dev/sda /dev/sdb /dev/sdc /dev/sdd --level=5 --raid-devices=4
mkfs.ext4 /dev/md/test
mount /dev/md/test /media/test