SMART Crashes USB

Unread postby Kosmatik » Sun Jun 04, 2017 5:34 am

Hello,

I've narrowed down my issue a lot.

Here's what happens:

Code: Select all
root@odroid:~# smartctl -l devstat /dev/sda -d sat
smartctl 6.5 2016-05-07 r4318 [armv7l-linux-4.9.30-41] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics (SMART Log 0x04)
Page  Offset Size        Value Flags Description
ATA_SMART_READ_LOG failed: Input/output error
Read Device Statistics pages 0x00-0x07 failed


Code: Select all
[  108.648559] sd 0:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[  108.648568] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x85 85 08 0e 00 d5 00 08 00 04 00 4f 00 c2 00 b0 00
[  108.648651] scsi host0: uas_eh_bus_reset_handler start
[  108.728674] usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd
[  108.750796] scsi host0: uas_eh_bus_reset_handler success


This crashes USB in both Debian 8.8 and Ubuntu 16.04.2 LTS.
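
For reference, the CDB in the kernel log above can be read as an ATA PASS-THROUGH (16) command. The sketch below is only an illustration: the helper name decode_ata_pt16 is made up here, and the byte positions assume the standard SAT layout for opcode 0x85. It suggests the aborted command is SMART (0xb0) with feature 0xd5 (SMART READ LOG) for log address 0x04 and 8 sectors, which matches the "Read Device Statistics pages 0x00-0x07 failed" message from smartctl.

```shell
# decode_ata_pt16: hypothetical helper, not part of smartmontools.
# Takes the 16 CDB bytes (hex, without 0x) as separate arguments and
# prints the fields that identify the ATA command being tunnelled.
decode_ata_pt16() {
    [ "$#" -eq 16 ] || { echo "need 16 CDB bytes" >&2; return 1; }
    [ "$1" = "85" ] || { echo "not ATA PASS-THROUGH (16)" >&2; return 1; }
    # Byte 1 bits 4:1 hold the protocol; bytes 4, 6, 8, 10, 12 and 14 hold
    # the low halves of features, count, LBA low/mid/high and the command.
    # Positional parameters above 9 need braces.
    echo "protocol=$(( (0x$2 >> 1) & 0xf )) feature=0x$5 count=$((0x$7)) log_addr=0x$9 lba_mid=0x${11} lba_high=0x${13} command=0x${15}"
}
```

Called with the bytes from the log (decode_ata_pt16 85 08 0e 00 d5 00 08 00 04 00 4f 00 c2 00 b0 00), it prints protocol=4 feature=0xd5 count=8 log_addr=0x04 lba_mid=0x4f lba_high=0xc2 command=0xb0. Protocol 4 is PIO Data-In, and 0x4f/0xc2 is the usual SMART signature in the LBA mid/high fields.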

This only happens when /dev/sda is probed, not /dev/sdb, even when the hard drives have been switched.

More details here: viewtopic.php?f=146&t=26016&start=200#p191843

Does this mean I now have a faulty cloudshell2? Is there a firmware upgrade for JMS561?

Re: SMART Crashes USB

Unread postby phaseshifter » Sun Jun 04, 2017 11:06 am

Are you using RAID 0 or RAID 1? In those modes SMART does not see the second drive; it only sees the drives as "one", which may be why you get this error. You need to be in pri/slv to see the SMART stats of both drives. Also, I think the prerequisite is to have them mounted as described in the wiki.
No firmware upgrade as far as I know, not yet anyway, if at all.

Re: SMART Crashes USB

Unread postby odroid » Sun Jun 04, 2017 12:40 pm

I think Kosmatik is using PM (JBOD) mode, since he can access both /dev/sda and /dev/sdb.
We will try to reproduce the issue early next week.
It seems worth trying a SMART monitoring tool on Windows to confirm whether the issue is caused by the firmware or by smartmontools.
We will check that too.

Re: SMART Crashes USB

Unread postby Kosmatik » Tue Jun 06, 2017 12:40 pm

Yes, I am using JBOD.

I think the issue is due to a faulty cloudshell2. For example, ryecoaaron from the OMV forums is not having any resets:

http://forum.openmediavault.org/index.p ... post144749

Re: SMART Crashes USB

Unread postby odroid » Wed Jun 07, 2017 5:00 pm

We could reproduce the issue. :(
Once we accessed SMART on /dev/sda, the problem occurred, while /dev/sdb was fine.
We've reported this issue to JMicron and asked them to check for a firmware update.

We will also check what could be different from ryecoaaron's test result.

Re: SMART Crashes USB

Unread postby odroid » Thu Jun 08, 2017 8:32 am

We also tried smartctl on a Windows PC, but it shows the same result. :(
[Attachment: cs2_smartctl.jpg]


Is there any other native Windows software that can check for the issue?
Because JMicron doesn't officially support Linux, we have to find a way to reproduce the issue natively on Windows, since smartmontools could have a compatibility issue with the JMS561 bridge.

Re: SMART Crashes USB

Unread postby Kosmatik » Thu Jun 08, 2017 11:11 pm

Do they support Mac? Because smartmontools is available for Mac.

For Windows there's Passmark DiskCheckup, CrystalDiskInfo, HDDScan, Hard Disk Sentinel (not free), and HD Tune (not free). I have not tested these here, though I have used a few of them over the years.

There's a built in disk check in windows, but I don't think it uses SMART data.

Code: Select all
wmic diskdrive


Or, to get more readable output, in PowerShell:

Code: Select all
gwmi Win32_DiskDrive | select *

Re: SMART Crashes USB

Unread postby odroid » Fri Jun 09, 2017 9:58 am

Thank you for the reply.
We will try Passmark DiskCheckup, CrystalDiskInfo, HDDScan today.

Re: SMART Crashes USB

Unread postby lsc1117 » Fri Jun 09, 2017 5:13 pm

I tried running the programs (Passmark DiskCheckup, CrystalDiskInfo, HDDScan).

They seem to work well, except for HDDScan.

HDDScan has the same problem as smartctl.

So I think smartctl has some bugs.

[Attachments: hddscan_sdb.JPG, hddscan_sda.JPG, diskcheckup_sda.JPG, crystaldiskinfo_sda.JPG, crystaldiskinfo_sdb.JPG]

Re: SMART Crashes USB

Unread postby Kosmatik » Sat Jun 10, 2017 12:39 am

If smartctl is having issues and HDDScan is having issues with the JMS561, but not with other chipsets, then it's more logical to think the problem lies within JMS561 and not HDDScan/smartctl, especially when one hard drive works and the other fails.

Also with CrystalDiskInfo, you can see the transfer mode, standard and features are different even though the hard drives are the same. It seems that the JMS561 is reporting incorrect SMART information, which could be directly related to the issue of USB crashing in Linux.

It's worth pursuing a firmware update with jmicron.

Re: SMART Crashes USB

Unread postby odroid » Sat Jun 10, 2017 10:04 am

I agree. We will keep pushing Jmicron people.

Re: SMART Crashes USB

Unread postby Kosmatik » Thu Jun 15, 2017 5:58 am

WD My Book Duo uses the same JMS561 bridge and I cannot find anyone with issues, so there's a working firmware out there. It's probably WD branded though.

Re: SMART Crashes USB

Unread postby odroid » Mon Jun 19, 2017 5:53 pm

Jmicron confirmed they could reproduce the issue.
They told us they will release a firmware update within a week.

Re: SMART Crashes USB

Unread postby crashoverride » Mon Jun 19, 2017 6:57 pm

odroid wrote:They told us they will release a firmware update within a week.

It would be nice if they could also officially release a tool that we could use to flash it! :lol:

Re: SMART Crashes USB

Unread postby odroid » Wed Jun 21, 2017 9:04 pm

We've got the firmware update package (binary + official tool). But it runs on Windows only.
https://dn.odroid.com/cs2/JMMassProd2_v1_16_14_34.zip
There is a readme file for your reference.

Please see this link again for the firmware update process.
viewtopic.php?f=146&t=26016&start=150#p190507

After the firmware update, reading SMART on either disk no longer crashes the USB connection.
Please try it and let us know the result.

Re: SMART Crashes USB

Unread postby Kosmatik » Thu Jun 22, 2017 12:50 pm

So, the firmware flash was successful and I see some progress.

Running this, I still get a USB reset, but it no longer hangs completely.
Code: Select all
root@openmediavault:~# smartctl -l devstat /dev/sda
smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.9.30-odroidxu4] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics page 0 is invalid (page=1, nentries=8)



Code: Select all
[   68.551434] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[   68.701471] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd


I am able to turn on monitoring in OMV without a crash. I will see what shows up in dmesg once OMV runs its cron jobs.
Last edited by Kosmatik on Thu Jun 22, 2017 1:11 pm, edited 1 time in total.

Re: SMART Crashes USB

Unread postby odroid » Thu Jun 22, 2017 12:53 pm

We will try to reproduce the reset problem early next week since we have some other urgent issues.

Re: SMART Crashes USB

Unread postby lsc1117 » Mon Jun 26, 2017 11:51 am

Kosmatik wrote:So, the firmware flash was successful and I see some progress.

Running this, I still get a USB reset, but it no longer hangs completely.
Code: Select all
root@openmediavault:~# smartctl -l devstat /dev/sda
smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.9.30-odroidxu4] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics page 0 is invalid (page=1, nentries=8)



Code: Select all
[   68.551434] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[   68.701471] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd


I am able to turn on monitoring in OMV without a crash. I will see what shows up in dmesg once OMV runs its cron jobs.


We asked JMicron, who updated the F/W, about the issue.
We will upload the new F/W as soon as possible.

Re: SMART Crashes USB

Unread postby joaofl » Fri Jun 30, 2017 1:31 am

I actually have a similar problem with my external USB3 WD MyBook 8TB running on my XU4 with the odroid 4.9.29+ kernel. smartctl works fine, but whenever there is a high simultaneous load on ethernet+usb3 (like when streaming a high-res video from another machine), I keep getting numerous errors like the ones below (roughly one every second):
Code: Select all
[Thu Jun 29 02:51:36 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:51:36 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:51:37 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:51:38 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:51:38 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:51:40 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd

and sometimes this (which makes me worry about data corruption):
Code: Select all
[Thu Jun 29 02:22:39 2017] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[Thu Jun 29 02:22:39 2017] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[Thu Jun 29 02:22:39 2017] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x88 88 00 00 00 00 02 a6 40 0a 10 00 00 00 f0 00 00
[Thu Jun 29 02:22:39 2017] blk_update_request: I/O error, dev sda, sector 11379149328

The output of my smartctl follows. This number caught my attention:
Code: Select all
4  0x010  4            13605  Resets Between Cmd Acceptance and Completion

It seems to be caused by the error mentioned above.
Code: Select all
smartctl -l devstat /dev/sda -d sat
smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.9.29+] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, http://www.smartmontools.org

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              146  Lifetime Power-On Resets
  1  0x018  6       3927009809  Logical Sectors Written
  1  0x020  6         15009303  Number of Write Commands
  1  0x028  6       6022623334  Logical Sectors Read
  1  0x030  6         51119170  Number of Read Commands
  1  0x038  6       2015207150  Date and Time TimeStamp
  3  =====  =                =  == Rotating Media Statistics (rev 1) ==
  3  0x008  4              558  Spindle Motor Power-on Hours
  3  0x010  4              558  Head Flying Hours
  3  0x018  4              156  Head Load Events
  3  0x020  4                0  Number of Reallocated Logical Sectors
  3  0x028  4           631804  Read Recovery Attempts
  3  0x030  4                0  Number of Mechanical Start Failures
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4            13605  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1               55  Current Temperature
  5  0x010  1               53~ Average Short Term Temperature
  5  0x018  1               53~ Average Long Term Temperature
  5  0x020  1               61  Highest Temperature
  5  0x028  1               22  Lowest Temperature
  5  0x030  1               59~ Highest Average Short Term Temperature
  5  0x038  1               25~ Lowest Average Short Term Temperature
  5  0x040  1               53~ Highest Average Long Term Temperature
  5  0x048  1               25~ Lowest Average Long Term Temperature
  5  0x050  4              501  Time in Over-Temperature
  5  0x058  1               60  Specified Maximum Operating Temperature
  5  0x060  4                0  Time in Under-Temperature
  5  0x068  1                0  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4                0  Number of Hardware Resets
  6  0x010  4                0  Number of ASR Events
  6  0x018  4                0  Number of Interface CRC Errors
                              |_ ~ normalized value
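
To track a single counter such as "Resets Between Cmd Acceptance and Completion" over time, the value can be pulled out of the devstat listing with awk. This is a minimal sketch that assumes the smartctl 6.4 column layout shown above (Page, Offset, Size, Value, Description, with no Flags column); the function name devstat_value is made up for illustration.

```shell
# devstat_value <description>: hypothetical helper.
# Reads "smartctl -l devstat" output on stdin and prints the value of the
# row whose description matches exactly. A trailing "~" (normalized value
# marker) is stripped.
devstat_value() {
    awk -v want="$1" '
        $2 ~ /^0x/ {                      # data rows have a hex offset in column 2
            desc = $5
            for (i = 6; i <= NF; i++) desc = desc " " $i
            if (desc == want) {
                v = $4
                sub(/~$/, "", v)
                print v
                exit
            }
        }'
}

# Example on a live system (requires root and smartmontools):
#   smartctl -l devstat /dev/sda -d sat |
#       devstat_value "Resets Between Cmd Acceptance and Completion"
```

Logging this number before and after a heavy transfer would show whether the counter grows together with the reset messages in dmesg.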

Re: SMART Crashes USB

Unread postby Kosmatik » Fri Jun 30, 2017 5:53 am

Check with a different cable, try a different USB port, and try cleaning the USB port. Make sure the cable sits tight.

Re: SMART Crashes USB

Unread postby odroid » Fri Jun 30, 2017 11:39 am

Also try updating the kernel to 4.9.34-45 using the following guide.
http://odroid.com/dokuwiki/doku.php?id= ... st-upgrade

Re: SMART Crashes USB

Unread postby joaofl » Mon Jul 03, 2017 11:40 pm

Thanks. I'm going to give it a try and let you know.

Re: SMART Crashes USB

Unread postby joaofl » Thu Jul 06, 2017 11:37 pm

Unfortunately, no progress on fixing the external USB3 HD issue. I think it is related to the Hardkernel ODROID 4.9 kernel. I don't remember having this problem on the previous 3.x kernel, and with the same external HD I don't get the problem on my PC (USB3) or on my RPi (USB2, ARM, kernel 4).

I have tried replacing cables, among other things, and nothing helped. I actually believe this is a USB3 driver issue.

My test is simple: I simultaneously read and write a file from/to the board's external HD (on USB3) over SFTP from another Gigabit client on the network. This puts a very high load on both Ethernet and USB3. Note that the Gigabit Ethernet on the ODROID board is itself behind another USB3 port.

It seems to me that one crashes the other, since if I perform simultaneous r/w locally, everything runs fine. The Ethernet on its own also seems to work fine.
I wonder if this is only happening to me.

Thank you all for the support.

Re: SMART Crashes USB

Unread postby Kosmatik » Sat Jul 08, 2017 5:35 am

The 3.x kernel doesn't support UAS, so not seeing those issues there doesn't mean they're not present. Have you tried a different cable?

Re: SMART Crashes USB

Unread postby joaofl » Tue Jul 11, 2017 12:20 am

Yes, tested with 2 different cables that work on other equipment, as well as on a different USB port and on a different XU4 board (since I have 2 of them). In all cases the same error happens.

Re: SMART Crashes USB

Unread postby odroid » Tue Jul 11, 2017 10:06 am

We have seen a similar issue when an HDD requires a high in-rush current.
When the CPU load is high, the HDD & USB power connection could cause the intermittent bus reset event ("reset SuperSpeed USB device number 3 using xhci-hcd") due to the voltage drop.
Try lowering the CPU max clock from 2GHz to 1.2GHz to confirm this.
Code: Select all
echo 1200000 | sudo tee /sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq


If the issue goes away after lowering the CPU power consumption, consider changing the power supply or USB cables.
We will do further investigation.
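
Since the clock cap is only meant as a temporary test, it helps to record the previous limit so it can be put back afterwards. A minimal sketch, assuming the same sysfs path as the command above; the function name cap_cpu_max_freq and its optional arguments are made up for illustration, and it must run as root on a real system.

```shell
# cap_cpu_max_freq <khz> [cpu] [sysfs_root]: hypothetical helper.
# Prints the previous scaling_max_freq (in kHz) and then writes the new limit.
# The third argument exists only so the function can be tried against a
# scratch directory instead of the real /sys tree.
cap_cpu_max_freq() {
    khz=$1
    cpu=${2:-cpu4}
    root=${3:-/sys/devices/system/cpu}
    node="$root/$cpu/cpufreq/scaling_max_freq"
    [ -w "$node" ] || { echo "cannot write $node (run as root?)" >&2; return 1; }
    cat "$node"              # previous limit, so the caller can restore it
    echo "$khz" > "$node"
}

# Example:
#   old=$(cap_cpu_max_freq 1200000)   # cap at 1.2GHz, remember old value
#   ... run the transfer test and watch dmesg ...
#   cap_cpu_max_freq "$old" >/dev/null   # restore
```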

Re: SMART Crashes USB

Unread postby bronco » Tue Jul 11, 2017 3:33 pm

joaofl wrote:Yes, tested with 2 different cables that work on other equipment, as well as on a different USB port and on a different XU4 board (since I have 2 of them). In all cases the same error happens.


Did you also try different power bricks to power the XU4 when testing? And did you make sure that the cables were properly inserted?

https://forum.openmediavault.org/index. ... post148662 (see also the next post there, same error message but different cause)

Re: SMART Crashes USB

Unread postby crazyquark » Tue Jul 11, 2017 4:17 pm

odroid wrote:We could see a similar issue when an HDD requires high in-rush current.
When the CPU load is high, the HDD & USB power connection could cause the intermittent bus reset event ("reset SuperSpeed USB device number 3 using xhci-hcd") due to the voltage drop.
Try to lower the CPU max clock to 1.2GHz from 2GHz to confirm this phenomenon.
Code: Select all
echo 1200000 | sudo tee /sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq


If the issue is gone by lowering the CPU power consumption, you have to consider changing the power supply or USB cables.
We will do further investigation.


Given that this also happens on a Cloudshell 1 with a 6A PSU, the issue might be in the Cloudshell board itself. Maybe powering the HDDs separately would help? It's worth testing with an equivalent external PSU to check whether the Cloudshell board itself is at fault.

Re: SMART Crashes USB

Unread postby bronco » Thu Jul 13, 2017 10:22 pm

joaofl wrote:I dont remember having this problem on previous kernel 3


Well, out of curiosity I googled for 'usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd 2016' (to search for occurrences with kernel 3.x) and this was one of the first hits: https://github.com/Fourdee/DietPi/issues/487 (the solution there was to increase the voltage at the PSU, and someone else said that using another PSU with more amperage helped. Most probably only because PSUs with higher amperage ratings are more stable with respect to voltage drops under load?)

Re: SMART Crashes USB

Unread postby bronco » Thu Jul 13, 2017 10:24 pm

crazyquark wrote:maybe powering the HDDs separately would help?


viewtopic.php?f=99&t=25813#p180433

Re: SMART Crashes USB

Unread postby odroid » Fri Jul 14, 2017 4:19 pm

Official Ubuntu 16.04 & Kernel 4.9 on eMMC with the official 5V/4A PSU.
A 2TB HDD is connected to the old CloudShell.
Running "stress" to load all 8 cores.
Copying a 10GB file from/to a Windows PC continuously in parallel (two Samba instances).
Copying a big file between eMMC and HDD continuously in parallel.
We've run the above test for 3hrs 44min. There is no USB reset issue yet.
[Attachment: cloudshell1_ubuntu16.04_smbtest.png]

The voltage measured with a DMM on the DC jack is 5.1 V and the average load is 2.54 A.
The HDD SATA power pin shows 4.76 to 4.89 V.
We will keep running this test for 24 more hours.

We will perform the same test with the OMV image soon.

I think we can share the test result on Monday or Tuesday because it is already Friday PM 5:00 in Korea.

Re: SMART Crashes USB

Unread postby bronco » Sat Jul 15, 2017 5:14 pm

odroid wrote:The DMM measured voltage on the DC jack is 5.1Volt and average load is 2.54Amp.
HDD SATA power pin shows 4.76~ 4.89 Volt.


Thanks for the numbers. They're interesting. :)

So anyone concerned about these issues should now take the specifications of the disk used in their Cloudshell and look up the allowed voltage. Usually it will be 5V +/-5%, so the minimum voltage allowed is 4.75V. Given that you already measured 4.76V, it's obvious what will happen when a PSU provides only 5.0V: 4.66V available at the disk. But this only holds when the same cable between XU4 and PSU is used. If a cable with higher resistance is used (especially a long one), the voltage drops even lower (we're still talking about Ohm's law all the time).

Same when increasing overall consumption. An increase to eg. 3A and we're below 4.75V at the disk for sure.
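
The arithmetic behind these numbers can be sketched as follows, under a deliberately simple assumption: the whole path from DC jack to disk behaves like one fixed resistance, derived from the single measured point quoted above (5.1 V at the jack, 4.76 V at the disk, 2.54 A). The function name disk_voltage is made up here, and the model ignores that the PSU itself sags under load.

```shell
# disk_voltage <supply_volts> <load_amps>: hypothetical helper.
# Estimates the voltage left at the disk using V_disk = V_supply - I * R,
# with R taken from the measured point 5.1 V -> 4.76 V at 2.54 A.
disk_voltage() {
    LC_ALL=C awk -v vs="$1" -v i="$2" 'BEGIN {
        r = (5.1 - 4.76) / 2.54        # about 0.134 ohm effective resistance
        printf "%.2f\n", vs - i * r
    }'
}

# disk_voltage 5.0 2.54   -> 4.66  (same load, PSU at exactly 5.0 V)
# disk_voltage 5.1 3      -> 4.70  (same PSU, load raised to 3 A)
```

Both results fall below the 4.75V minimum of a 5V +/-5% rail.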

BTW: We're in the wrong thread anyway (this one was about firmware issues with Cloudshell 2 -- here underpowering shouldn't be an issue). Maybe all this mix-up adds to the confusion? Talking about Cloudshell disk powering in Cloudshell 2 firmware threads? Ohm's law being valid or not discussed in kernel 4.9 forum? All the time 'Disable UAS' mentioned for no reason?

Re: SMART Crashes USB

Unread postby bronco » Sat Jul 15, 2017 6:36 pm

BTW: I don't quite understand the testing methodology. What are you testing for? Things that happen mysteriously after some amount of time? Or a relation between USB resets and under-voltage?

The workload you're testing is pretty light (2.54A) given that you recommend a 6A PSU for your Cloudshell users. What about testing loads that get closer to the PSU ratings? Everyone with some electronics experience knows that those small power bricks themselves already decrease their voltage with increased load: the 5V/4A PSU that came with my UP board provides 5.1V at idle and 4.84V with a full 4A load.

You have users reporting that they run into trouble with Transmission. Transmission's IO pattern is totally different from what you're testing: it is pure random IO that keeps the HDD heads busy all the time, and your sequential Samba tests with huge files are lightweight compared to that. I used the iozone call from the OMV forum and measured it myself: iozone -e -I -a -s 1000M -r 4k -i 0 -i 1 -i 2 (as soon as the random IO tests start, consumption increases drastically).

Anyway: if the goal is to test for a relationship between under-voltage and these USB resets, then I would test for exactly that by reducing the voltage available to the disk/Cloudshell, either by decreasing the input voltage or by increasing consumption/load. It shouldn't really matter which, since finding the root cause means testing one possibility (under-voltage), and that just requires letting under-voltage happen. If the symptoms are reproducible each time the voltage available to the disk drops below, for example, 4.6V with heavy disk access patterns (iozone), then we have the root cause.

After that, it's a matter of explaining Ohm's law and demonstrating how different types of load generate different voltage drops here and there. And since a user in another thread said that the same disk that seems to fail when connected to the Cloudshell works without problems in another external enclosure, still powered by the XU4, checking for specific voltage drop sources on the Cloudshell or in between (the cable you ship) would make some sense.

Re: SMART Crashes USB

Unread postby odroid » Sun Jul 16, 2017 11:06 am

Ah.. right.
I should post the test result on this thread. (I just posted)
viewtopic.php?f=146&t=27548

We will do some iozone tests at various input voltages (in 50 mV steps) early next week.
Thank you for the valuable inputs.

Re: SMART Crashes USB

Unread postby Kosmatik » Tue Jul 25, 2017 1:51 am

Anything from jmicron on this? I still can't enable smart monitoring in OMV without USB crashes.

Re: SMART Crashes USB

Unread postby Kosmatik » Tue Jul 25, 2017 6:07 am

I don't believe this is an issue with the firmware now. I've applied the same patch noted here: https://forum.openmediavault.org/index. ... post149700

and I now get no crashes running

Code: Select all
smartctl -l devstat /dev/sda -d sat

Re: SMART Crashes USB

Unread postby odroid » Tue Jul 25, 2017 10:12 am

Is this tkaiser's patch? We already tried it on our kernel but it didn't help. We are trying to find other solutions.
http://sprunge.us/MOAA

Re: SMART Crashes USB

Unread postby Kosmatik » Tue Jul 25, 2017 10:50 am

Nevermind, it still happens

Re: SMART Crashes USB

Unread postby neal » Mon Aug 07, 2017 6:02 pm

Kosmatik wrote:Anything from jmicron on this? I still can't enable smart monitoring in OMV without USB crashes.


This is firmware version 003 from JMicron. It solves the problem where the system crashed when smartctl accessed /dev/sda alone at SATA connector 2.
However, we still can't get the SMART information when both drives are in use, and we are trying to get new firmware from JMicron that fixes it.

Please see this link again for the firmware update process.
viewtopic.php?f=146&t=26016&start=150#p190507
Attachment: jms561_Hardkernel_v158.001.000.003.zip

Re: SMART Crashes USB

Unread postby Kosmatik » Fri Aug 18, 2017 9:29 am

Sorry for the late reply, I was unable to get to my NAS until today.

I just flashed this firmware and I see no resets while pulling devstat. However, I do see resets when I pull device information under the SMART setting in OMV on both sda and sdb, but not when pulling extended information.

Code: Select all
root@openmediavault:/srv/dev-disk-by-label-Orly# smartctl -l devstat /dev/sda -d sat
smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.9.37-odroidxu4] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              111  Lifetime Power-On Resets
  1  0x010  4             2089  Power-on Hours
  1  0x018  6       2283803713  Logical Sectors Written
  1  0x020  6          5949184  Number of Write Commands
  1  0x028  6       2842576195  Logical Sectors Read
  1  0x030  6          5942522  Number of Read Commands
  2  =====  =                =  == Free-Fall Statistics (rev 1) ==
  2  0x010  4                2  Overlimit Shock Events
  3  =====  =                =  == Rotating Media Statistics (rev 1) ==
  3  0x008  4             2010  Spindle Motor Power-on Hours
  3  0x010  4             1978  Head Flying Hours
  3  0x018  4              411  Head Load Events
  3  0x020  4                0  Number of Reallocated Logical Sectors
  3  0x028  4                0  Read Recovery Attempts
  3  0x030  4                0  Number of Mechanical Start Failures
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4               90  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1               38  Current Temperature
  5  0x010  1               36~ Average Short Term Temperature
  5  0x018  1               36~ Average Long Term Temperature
  5  0x020  1               58  Highest Temperature
  5  0x028  1               22  Lowest Temperature
  5  0x030  1               49~ Highest Average Short Term Temperature
  5  0x038  1               29~ Lowest Average Short Term Temperature
  5  0x040  1               36~ Highest Average Long Term Temperature
  5  0x048  1               34~ Lowest Average Long Term Temperature
  5  0x050  4             1482  Time in Over-Temperature
  5  0x058  1               55  Specified Maximum Operating Temperature
  5  0x060  4                0  Time in Under-Temperature
  5  0x068  1                5  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4             1789  Number of Hardware Resets
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
                              |_ ~ normalized value

root@openmediavault:/srv/dev-disk-by-label-Orly# smartctl -l devstat /dev/sdb -d sat
smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.9.37-odroidxu4] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              111  Lifetime Power-On Resets
  1  0x010  4             2089  Power-on Hours
  1  0x018  6       2283803713  Logical Sectors Written
  1  0x020  6          5949184  Number of Write Commands
  1  0x028  6       2842576195  Logical Sectors Read
  1  0x030  6          5942522  Number of Read Commands
  2  =====  =                =  == Free-Fall Statistics (rev 1) ==
  2  0x010  4                2  Overlimit Shock Events
  3  =====  =                =  == Rotating Media Statistics (rev 1) ==
  3  0x008  4             2010  Spindle Motor Power-on Hours
  3  0x010  4             1978  Head Flying Hours
  3  0x018  4              411  Head Load Events
  3  0x020  4                0  Number of Reallocated Logical Sectors
  3  0x028  4                0  Read Recovery Attempts
  3  0x030  4                0  Number of Mechanical Start Failures
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4               90  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1               38  Current Temperature
  5  0x010  1               36~ Average Short Term Temperature
  5  0x018  1               36~ Average Long Term Temperature
  5  0x020  1               58  Highest Temperature
  5  0x028  1               22  Lowest Temperature
  5  0x030  1               49~ Highest Average Short Term Temperature
  5  0x038  1               29~ Lowest Average Short Term Temperature
  5  0x040  1               36~ Highest Average Long Term Temperature
  5  0x048  1               34~ Lowest Average Long Term Temperature
  5  0x050  4             1482  Time in Over-Temperature
  5  0x058  1               55  Specified Maximum Operating Temperature
  5  0x060  4                0  Time in Under-Temperature
  5  0x068  1                5  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4             1789  Number of Hardware Resets
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
                              |_ ~ normalized value


Code: Select all
[  873.497816] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd
[  878.597813] usb 4-1.1: reset SuperSpeed USB device number 3 using xhci-hcd


I've enabled monitoring and SMART check, will leave it running and see if I get more resets.
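
A rough way to tell whether a given SMART query triggers new resets is to count the reset lines in the kernel log before and after it. A minimal sketch, assuming the message text stays as in the logs above; the function names are made up for illustration, and a wrapped dmesg ring buffer would skew the counts.

```shell
# count_usb_resets: hypothetical helper. Reads kernel log text on stdin and
# prints how many "reset SuperSpeed USB device" lines it contains.
count_usb_resets() {
    grep -c 'reset SuperSpeed USB device' || true   # grep exits 1 on zero matches
}

# count_uas_aborts: same idea for UAS error-handler abort lines.
count_uas_aborts() {
    grep -c 'uas_eh_abort_handler' || true
}

# Example on a live system (needs permission to read the kernel log):
#   before=$(dmesg | count_usb_resets)
#   smartctl -l devstat /dev/sda -d sat
#   after=$(dmesg | count_usb_resets)
#   echo "new resets: $((after - before))"
```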

