odroid hc1 - node_exporter docker how to get temperature readings

thamvmk
Posts: 25
Joined: Fri Dec 27, 2013 11:07 am
languages_spoken: english
ODROIDs: odroid u3, eMMC 8GB-android, eMMC 16GB, archlinux
odroid c2, archlinux
odroid hc1, ubuntu 18.04
odroid c1, android 4.4
Has thanked: 2 times
Been thanked: 0
Contact:

Unread post by thamvmk » Thu May 16, 2019 12:10 am

Hi, I have an unstable HC1 running Ubuntu 18.04. It has an SSD attached and runs Nextcloud as cloud storage. However, I've noticed that it becomes unresponsive after a few days; the longest uptime so far is 7 days. To investigate, I've installed node_exporter and am scraping its metrics into Prometheus and Grafana. I've searched through many ODROID forum topics about stability and have made the changes listed below (starting with reducing the big-core CPU maximum frequency under the ondemand governor from 2.0 GHz to 1.5 GHz), and I'm now watching whether the stability issue persists. There are a couple of leads, so I'm trying to find out what exactly the root cause is.

Any help would be much appreciated.

Change #1:
Reduced the big-core CPU maximum frequency (ondemand governor) from 2.0 GHz to 1.5 GHz

Code:

root@hc1:/home/vincent# cpufreq-info -o
          minimum CPU frequency  -  maximum CPU frequency  -  governor
CPU  0       200000 kHz ( 13 %)  -    1500000 kHz (100 %)  -  ondemand
CPU  1       200000 kHz ( 13 %)  -    1500000 kHz (100 %)  -  ondemand
CPU  2       200000 kHz ( 13 %)  -    1500000 kHz (100 %)  -  ondemand
CPU  3       200000 kHz ( 13 %)  -    1500000 kHz (100 %)  -  ondemand
CPU  4       200000 kHz ( 10 %)  -    1536000 kHz ( 76 %)  -  ondemand
CPU  5       200000 kHz ( 10 %)  -    1536000 kHz ( 76 %)  -  ondemand
CPU  6       200000 kHz ( 10 %)  -    1536000 kHz ( 76 %)  -  ondemand
CPU  7       200000 kHz ( 10 %)  -    1536000 kHz ( 76 %)  -  ondemand

Change #2:
Set wifi.powersave in the NetworkManager configuration

Code:

/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf
[connection]
wifi.powersave = 3

I'm using the following Docker container:

Code:

vincent@hc1:~$ docker ps 
CONTAINER ID        IMAGE                          COMMAND                  CREATED             STATUS              PORTS               NAMES
db38467b13e2        carlosedp/node_exporter        "/bin/node_exporter …"   2 weeks ago         Up 2 days                               hc1_node

Code:

vincent@hc1:~$ uname -a
Linux hc1 4.14.107-157 #1 SMP PREEMPT Thu Mar 21 09:59:50 -03 2019 armv7l armv7l armv7l GNU/Linux
vincent@hc1:~$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"
Vincent

odroid
Site Admin
Posts: 31852
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Has thanked: 89 times
Been thanked: 255 times
Contact:

Re: odroid hc1 - node_exporter docker how to get temperature readings

Unread post by odroid » Thu May 16, 2019 9:59 am

Which power supply do you use?
Do you have a DMM to measure the system voltage rather than the PSU voltage?

Unread post by thamvmk » Thu May 16, 2019 9:01 pm

I use the 5V/4A power supply (EU plug).
No, I don't have a DMM to measure the system voltage.

Image

This is the power adapter from Hardkernel.
Vincent

Unread post by odroid » Fri May 17, 2019 8:47 am

There should be no power stability issue if you use our official PSU.

Was the blue LED on the HC1 board still flashing like a heartbeat when you couldn't access it?
What were the LEDs on the RJ-45 Ethernet jack doing?

I have no idea how to access the temperature sensors from Docker.
I hope other users can help you.

mad_ady
Posts: 6401
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1, H2, N2
Location: Bucharest, Romania
Has thanked: 150 times
Been thanked: 109 times
Contact:

Unread post by mad_ady » Fri May 17, 2019 1:06 pm

Regarding docker access - see if this helps: https://stackoverflow.com/questions/470 ... -host-proc
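
For checking whether a container can see the temperature sensors at all, here is a minimal Python sketch. It assumes the host's /sys is bind-mounted read-only into the container (for example with `-v /sys:/host/sys:ro`; the /host/sys mount point is a hypothetical name, not something from this thread) and that the board exposes its sensors as thermal zones under class/thermal, reporting millidegrees Celsius as the Linux thermal sysfs interface normally does. The function name is made up for illustration.

```python
from pathlib import Path


def read_thermal_zones(sys_root="/sys"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    readings = {}
    base = Path(sys_root) / "class" / "thermal"
    for zone in sorted(base.glob("thermal_zone*")):
        try:
            # The kernel reports millidegrees Celsius, e.g. 48000 for 48.0 C.
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable, or not reporting a plain number
        readings[zone.name] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    # "/host/sys" is an assumed mount point; pass "/sys" when running on the host.
    for name, celsius in read_thermal_zones("/host/sys").items():
        print(f"{name}: {celsius:.1f} C")
```

If this prints nothing inside the container but works on the host with "/sys", the container most likely has no view of the host's sysfs, which would also explain missing temperature metrics in the exporter.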

Unread post by thamvmk » Fri May 17, 2019 9:56 pm

odroid wrote: "Was the blue LED on the HC1 board still flashing like a heartbeat when you couldn't access it?"
If I'm not mistaken, the blue LED was not flashing, which indicates the kernel was down, right? But I looked at kern.log and there is nothing there. My setting for automatic restart after a kernel panic is currently 0.

odroid wrote: "How were the LEDs on the RJ-45 Ethernet jack?"
The Ethernet jack LEDs were blinking, but I couldn't ping the board or reach it over SSH.
Vincent

Unread post by odroid » Mon May 20, 2019 9:36 am

If the blue LED is solid on, the board is stuck at the bootloader stage.
If it is off, the OS is not running or has crashed.

Can you track the memory usage?
I've heard that some people ran into unexplained reboots when an out-of-memory condition occurred.

Unread post by thamvmk » Tue May 21, 2019 11:53 am

odroid wrote:
Mon May 20, 2019 9:36 am
If the blue LED is solid on, the board is stuck at the bootloader stage.
If it is off, the OS is not running or has crashed.

Can you track the memory usage?
I've heard that some people ran into unexplained reboots when an out-of-memory condition occurred.

I had another crash after 6+ days, just before the 7th day. This time I checked the blue LED and it was blinking, so the kernel and OS were running. I then troubleshot the connectivity from my router and observed the following:

1) I unplugged the CAT-6 cable from the HC1 and re-attached it, then checked whether my router could see this client. It could not. So I rebooted the router and checked again whether it detected the HC1. And no, the router still didn't register the HC1.

2) Then I added an IP address reservation on the router for the HC1, using the fixed IP that I had configured on the HC1, and rebooted the router. Voilà, the HC1 was back on the network.

I'm close to solving this issue, but the question remains: why does the Ethernet interface drop after around 5-7 days of operation? I will keep observing to find the root cause.

As for memory usage, I have it all in Grafana and Prometheus, so I can see the history of CPU, memory, and all the other metrics captured. I will add it to this thread later.
Vincent

Unread post by odroid » Tue May 21, 2019 12:04 pm

My personal home server, an XU4 (CloudShell), had a very similar issue last year when my kids were running very heavy BitTorrent traffic.
The problem happened to go away after I updated the firmware of the home router. As far as I remember, my XU4 server's uptime must be over 200 days now.
So it would be well worth checking your router's firmware version too.

Anyway, keep checking the connection status with the fixed IP settings as well as memory usage.

Unread post by thamvmk » Tue May 21, 2019 12:13 pm

Here are some memory usage figures from right before the network connection dropped:

6-May 21:05, Free Memory=42MB
13-May 22:10, Free Memory=746MB
21-May 23:20, Free Memory=46MB

Free memory was low in only 2 of the 3 occurrences, and the system did not go down completely; only the network interface was gone.

Currently, after the last remedy (the address reservation at the router, which brought the HC1 back without a reboot), it is still running with around 42 MB of free memory. The HC1 didn't need a reboot because it never stopped running; the only open question is why the network interface goes off after about a week of operation. I hope the address reservation at the router fixes this problem. Also, because the router itself tends to go down after about 2 weeks, I have a scheduled reboot of the router every Tuesday at 2 am, but that timing doesn't match the outages of the HC1 network interface either.
Vincent
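
One way to narrow down whether the link itself drops or only the higher layers stop working is to record the kernel's view of the interface over time. The following is a rough Python sketch, not a tested fix: the interface name eth0 is an assumption (check `ip link` for the real name), and the helper names are invented for illustration. It reads the standard operstate and carrier attributes from sysfs.

```python
import time
from pathlib import Path


def link_state(iface="eth0", sys_root="/sys"):
    """Return (operstate, carrier); either value is None if it cannot be read."""
    base = Path(sys_root) / "class" / "net" / iface

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            # Interface missing, or the attribute is unreadable right now
            # (reading carrier fails while the interface is administratively down).
            return None

    return read("operstate"), read("carrier")


def log_line(iface="eth0", sys_root="/sys"):
    """Format one timestamped status line suitable for appending to a log file."""
    operstate, carrier = link_state(iface, sys_root)
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    return f"{stamp} {iface} operstate={operstate} carrier={carrier}"
```

Printing `log_line()` from a cron job every minute and appending the output to a file on the SSD would leave a record that survives the outage. If the log keeps showing operstate=up and carrier=1 while the router cannot see the board, the problem is more likely above the link layer or on the router side.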

Unread post by mad_ady » Tue May 21, 2019 12:25 pm

You should also count caches as free memory, since they shrink dynamically under memory pressure. Maybe the OOM killer kills the DHCP client process and your lease doesn't get renewed...
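
To illustrate the point about caches, here is a small Python sketch that estimates usable memory from the text of /proc/meminfo. It prefers the kernel's own MemAvailable estimate (present since Linux 3.14) and otherwise falls back to the rough sum MemFree + Buffers + Cached. The function name is hypothetical, and the fallback is only an approximation because not all cache is reclaimable.

```python
def available_kib(meminfo_text):
    """Estimate memory available to applications, in KiB, from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    # Prefer the kernel's own estimate when the field exists.
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    # Rough fallback: free memory plus buffers and page cache.
    return fields.get("MemFree", 0) + fields.get("Buffers", 0) + fields.get("Cached", 0)
```

On the board it could be called as `available_kib(open("/proc/meminfo").read())`. A "free memory" figure of 42 MB may therefore look far less alarming once buffers and cache are counted.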

Unread post by thamvmk » Tue May 21, 2019 1:24 pm

Here is the memory usage graph from Grafana:

https://vincetham.duckdns.org:8080/inde ... gcJYCG8YWx

Image

The oom_reaper did kill some processes earlier, but nothing indicates that it killed the DHCP client. Besides, I've set up a fixed IP on the HC1 rather than DHCP. So this points to a possible router issue.

Code:

root@hc1:/var/log/hc1# rg -i "oom" 
kernel.log
6781:2019-05-17T21:59:51.610624+08:00 hc1 kernel: [340432.885370] dockerd invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null),  order=0, oom_score_adj=-500
6788:2019-05-17T21:59:51.848199+08:00 hc1 kernel: [340432.885484] [<c0223bb0>] (dump_header) from [<c0222da0>] (oom_kill_process+0x308/0x570)
6789:2019-05-17T21:59:51.848207+08:00 hc1 kernel: [340432.885495] [<c0222da0>] (oom_kill_process) from [<c02239ac>] (out_of_memory+0x214/0x320)
6825:2019-05-17T21:59:51.848523+08:00 hc1 kernel: [340432.885889] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
6904:2019-05-17T21:59:51.849204+08:00 hc1 kernel: [340433.363899] oom_reaper: reaped process 7526 (apache2), now anon-rss:0kB, file-rss:0kB, shmem-rss:84kB
Vincent
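
To see at a glance which process triggered the OOM killer and which one was reaped, the kernel log lines can be summarized with a short Python sketch. The two patterns below are written against the line formats visible in the excerpt above; other kernel versions may word these messages differently, and the function name is made up for illustration.

```python
import re

# Matches e.g. "oom_reaper: reaped process 7526 (apache2), now anon-rss:0kB, ..."
REAPED = re.compile(r"oom_reaper: reaped process (\d+) \(([^)]*)\)")
# Matches e.g. "dockerd invoked oom-killer: gfp_mask=..."
INVOKED = re.compile(r"(\S+) invoked oom-killer")


def summarize_oom(lines):
    """Return (triggers, victims): who invoked the OOM killer, and who was reaped."""
    triggers, victims = [], []
    for line in lines:
        m = INVOKED.search(line)
        if m:
            triggers.append(m.group(1))
        m = REAPED.search(line)
        if m:
            victims.append((int(m.group(1)), m.group(2)))
    return triggers, victims
```

Applied to the excerpt above, this reports dockerd as the process that invoked the OOM killer and apache2 (PID 7526) as the process that was reaped, which fits the observation that no DHCP client was involved.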
