[SOLVED] HC2 performance regression on kernel 5.*
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
[SOLVED] HC2 performance regression on kernel 5.*
I use my HC2 as a server for borg backups over SSH.
For several months it had OMV4 (Armbian (stretch) kernel 4.*), though I don't really use OMV features. Upon upgrading to Armbian (buster) kernel 5.4.y and OMV5, the backup process, specifically "borg check", takes over twice as long.
I don't think it's a memory issue since the max "used" memory is reported by rrdcached at 400MB of the total 2GB. Min "free" was around 40MB, but that's because max "page cache" went to 1.6GB. I don't think that should be a problem, right? I also don't think it's a network issue since this is on my local network which hasn't changed and the abnormally long time is spent on checking the backup archives, according to backup logs on the client machine being backed up.
I'm thinking the issue might be IO or, more likely, CPU related. IO seems OK for the 90% empty 10TB HDD, based on "hdparm -t /dev/sda" resulting in ~140MB/s. The CPU could be suspect because borg is a single-threaded application, but I don't notice anything wrong with the performance benchmarks done with "armbian-config": http://ix.io/2k57
Any ideas on how to track down the cause of this performance regression?
Last edited by zerodroid on Mon Jun 29, 2020 12:25 pm, edited 1 time in total.
- mad_ady
- Posts: 10596
- Joined: Wed Jul 15, 2015 5:00 pm
- languages_spoken: english
- ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
- Location: Bucharest, Romania
- Has thanked: 644 times
- Been thanked: 905 times
- Contact:
Re: HC2 performance regression
Check governor and maybe force the backup process to use the big cores.
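A minimal sketch of how that check could look from a shell. It assumes the usual cpufreq sysfs layout, where policy0 covers the LITTLE cores and policy4 the big cores (CPUs 4-7) on this SoC. The function name and the optional root argument are only illustrative.

```shell
#!/bin/sh
# Print the governor of every cpufreq policy.
# The optional first argument overrides the sysfs root (handy for dry runs).
show_cpu_governors() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        # Skip anything that is not a readable policy directory.
        [ -r "$p/scaling_governor" ] || continue
        printf '%s: %s\n' "$(basename "$p")" "$(cat "$p/scaling_governor")"
    done
}

show_cpu_governors

# To pin a command to the big cores, taskset can be used, for example:
#   taskset -c 4-7 borg check /path/to/repo    # the repository path is a placeholder
```

If the scheduler already keeps the process on cores 4-7, pinning is unlikely to change much, but it rules out migration to the LITTLE cluster as a factor.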
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression
The governor is ondemand, and according to htop the borg process is always on cores 4-7 (big cores).
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression
Benchmark results from sbc-bench show a significant regression in kernel 5.* for memory bandwidth and latency.
I don't know if this would explain the drop in borg check performance, since it runs on a big core, but it does show that something is really wrong on kernel 5.*. If memory bandwidth and latency really are several times worse, that could well account for the performance regression I've noticed.
It would be great for comparison if others would benchmark their HC2/HC1/XU4 with sbc-bench and post results here, especially those on kernel 4.*. Several others on kernel 5.* have confirmed my results above, so this is not an anomaly.
My HC2 with kernel 5.4.28-odroidxu4: http://ix.io/2k57
Code: Select all
Memory bandwidth tests on a little core:
standard memcpy : 84.8 MB/s
standard memset : 278.5 MB/s (0.3%)
Memory latency test:
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.2 ns / 7.5 ns
131072 : 6.5 ns / 10.8 ns
262144 : 7.8 ns / 12.3 ns
524288 : 13.2 ns / 19.9 ns
1048576 : 262.9 ns / 419.0 ns
2097152 : 394.6 ns / 544.5 ns
4194304 : 461.8 ns / 586.3 ns
8388608 : 498.4 ns / 605.4 ns
16777216 : 524.2 ns / 626.7 ns
33554432 : 547.8 ns / 661.2 ns
67108864 : 588.1 ns / 737.3 ns
An XU4 with kernel 4.14.55-146 (odroidxu4): http://ix.io/1iLy
Code: Select all
Memory bandwidth tests on a little core:
standard memcpy : 391.7 MB/s
standard memset : 800.5 MB/s
Memory latency test:
block size : single random read / dual random read
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 3.9 ns / 7.0 ns
131072 : 6.0 ns / 10.1 ns
262144 : 7.1 ns / 11.5 ns
524288 : 9.7 ns / 15.2 ns
1048576 : 76.7 ns / 119.2 ns
2097152 : 116.0 ns / 156.9 ns
4194304 : 136.5 ns / 170.7 ns
8388608 : 147.9 ns / 178.5 ns
16777216 : 156.4 ns / 186.2 ns
33554432 : 164.9 ns / 197.5 ns
67108864 : 176.2 ns / 217.2 ns
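To put numbers on "multiple times slower", the ratios can be worked out directly from the two result sets above. This is only a rough sketch: the two runs come from different boards (an HC2 and an XU4), so the ratios are indicative rather than a controlled comparison. The helper name ratio is made up for this example.

```shell
#!/bin/sh
# Divide two benchmark values and print the result with one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# LITTLE core memcpy bandwidth, kernel 4.14 vs 5.4 (MB/s)
ratio 391.7 84.8     # prints 4.6
# LITTLE core memset bandwidth, kernel 4.14 vs 5.4 (MB/s)
ratio 800.5 278.5    # prints 2.9
# 64 MiB single random read latency, kernel 5.4 vs 4.14 (ns)
ratio 588.1 176.2    # prints 3.3
```

So bandwidth on a little core is roughly 3x to 4.6x lower and large-block latency roughly 3.3x higher in the kernel 5.4 run.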
- odroid
- Site Admin
- Posts: 39117
- Joined: Fri Feb 22, 2013 11:14 pm
- languages_spoken: English, Korean
- ODROIDs: ODROID
- Has thanked: 2513 times
- Been thanked: 1382 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Nice information.
I think the DRAM controller configuration in Kernel 5.x could have some issues.
- lanefu
- Posts: 5
- Joined: Tue Jun 30, 2020 9:35 pm
- languages_spoken: english
- ODROIDs: N2, MC1-solo
- Has thanked: 2 times
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Issue seems to be related to https://lwn.net/Articles/787647/
I disabled kernel feature CONFIG_EXYNOS5422_DMC and performance was restored.
See https://armbian.atlassian.net/browse/AR-337
- lanefu
- Posts: 5
- Joined: Tue Jun 30, 2020 9:35 pm
- languages_spoken: english
- ODROIDs: N2, MC1-solo
- Has thanked: 2 times
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
I'd consider disabling DMC a workaround... According to https://cateee.net/lkddb/web-lkddb/EXYNOS5422_DMC.html, timings are set based on memory information provided in the device tree... which is a little over my head.
- lanefu
- Posts: 5
- Joined: Tue Jun 30, 2020 9:35 pm
- languages_spoken: english
- ODROIDs: N2, MC1-solo
- Has thanked: 2 times
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Opted to disable DMC for now
https://github.com/armbian/build/pull/2073
should be available in nightly kernel tomorrow-ish
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
I can confirm that using the nightly kernel 5.4.49-odroidxu4, "borg check" is back to kernel 4.x completion times. That's a 2-3X performance difference!
Many thanks to @lanefu for this workaround!
@odroid Any tips or efforts towards a proper fix to the DMC kernel feature?
- odroid
- Site Admin
- Posts: 39117
- Joined: Fri Feb 22, 2013 11:14 pm
- languages_spoken: English, Korean
- ODROIDs: ODROID
- Has thanked: 2513 times
- Been thanked: 1382 times
- Contact:
Re: HC2 performance regression on kernel 5.*
We will look into that in 4~5 weeks, when we start making Ubuntu 20.04 images for XU4/XU3/HC1/HC2.
We are too busy these days building Ubuntu 20.04 for C1/C2/N2.
- lanefu
- Posts: 5
- Joined: Tue Jun 30, 2020 9:35 pm
- languages_spoken: english
- ODROIDs: N2, MC1-solo
- Has thanked: 2 times
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
An alternate workaround is to change the DMC governor (memory, not CPU) to performance or userspace. The default is simple_ondemand.
https://github.com/armbian/build/pull/2 ... -653203633
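A minimal sketch of that governor change, assuming the DMC shows up as a devfreq device under /sys/class/devfreq. The node name differs between kernels (devfreq0 on some builds, 10c20000.memory-controller on others), so it is passed in explicitly. The function name is hypothetical, and the setting does not survive a reboot.

```shell
#!/bin/sh
# Set the governor of one devfreq device and print the value read back.
#   $1: devfreq device directory
#   $2: governor name, e.g. performance
set_devfreq_governor() {
    dev="$1"
    gov="$2"
    if [ ! -w "$dev/governor" ]; then
        echo "cannot write $dev/governor (wrong node, or not root?)" >&2
        return 1
    fi
    echo "$gov" > "$dev/governor"
    cat "$dev/governor"
}

# Example, run as root (the node name is an assumption, check `ls /sys/class/devfreq`):
#   set_devfreq_governor /sys/class/devfreq/10c20000.memory-controller performance
```

Passing the device directory instead of looping over every devfreq node keeps the change limited to the memory controller, since other devfreq devices may exist on the same system.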
-
- Posts: 1795
- Joined: Sun Jul 07, 2013 3:05 am
- languages_spoken: german, english
- ODROIDs: X2, U3, XU3, C2, HiFi Shield, XU4, XU4Q, N1, Go, VU5A, Show2, CloudShell2, H2, N2, VU7A, VuShell, Go2, C4
- Has thanked: 120 times
- Been thanked: 372 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Yes, both workarounds are OK. The simple_ondemand governor doesn't seem to do anything, at least it doesn't scale on demand, and it stays strictly at 165MHz.
RG
-
- Posts: 401
- Joined: Mon Aug 26, 2013 6:05 pm
- languages_spoken: english
- Has thanked: 44 times
- Been thanked: 62 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Ahhh thank you guys!
I was already wondering why my Fedora image was so slow on 5.4 compared to the legacy kernel.
This really adds the cherry on top.
The XU4 is still a beast with such good kernel/driver support.
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
I wondered how the exynos5422 DMC driver is implemented in kernel 5.x, so I briefly looked into the devfreq driver and ran a memory benchmark.
Based on the results, the basic operation of the driver looks normal.

I have some older data that I collected to check the DMC driver on kernel 3.10.y, and with the u-boot workaround on kernel 4.x (no DMC driver) as well,
so I compared the mbw benchmark output and the essential registers of the DMC and BPLL components.
1. Test Environment
I followed @joshua.yang's guide and used the XU4 Ubuntu MATE image (20190929) and memeka's GitHub branch odroidxu4-5.4.y.
viewtopic.php?p=273697#p273697
https://github.com/mihailescu2m/linux/t ... dxu4-5.4.y
Here is a patch to enable the exynos5422 DMC feature.
Code: Select all
diff --git a/arch/arm/configs/odroidxu4_defconfig b/arch/arm/configs/odroidxu4_defconfig
index 661a849..33aea5d 100644
--- a/arch/arm/configs/odroidxu4_defconfig
+++ b/arch/arm/configs/odroidxu4_defconfig
@@ -5413,10 +5413,11 @@ CONFIG_EXTCON=y
CONFIG_EXTCON_USB_GPIO=m
# CONFIG_EXTCON_USBC_CROS_EC is not set
CONFIG_MEMORY=y
+CONFIG_DDR=y
# CONFIG_ARM_PL172_MPMC is not set
CONFIG_PL353_SMC=y
CONFIG_SAMSUNG_MC=y
-# CONFIG_EXYNOS5422_DMC is not set
+CONFIG_EXYNOS5422_DMC=y
CONFIG_EXYNOS_SROM=y
CONFIG_IIO=y
CONFIG_IIO_BUFFER=y
@@ -6599,8 +6600,6 @@ CONFIG_KASAN_STACK=1
# end of Memory Debugging
CONFIG_ARCH_HAS_KCOV=y
-CONFIG_CC_HAS_SANCOV_TRACE_PC=y
-# CONFIG_KCOV is not set
#
# Debug Lockups and Hangs
Code: Select all
# uname -a
Linux odroid 5.4.3+ #2 SMP PREEMPT Fri Jul 31 22:11:19 KST 2020 armv7l armv7l armv7l GNU/Linux
2. Memory benchmark, 'mbw'
I used the benchmark utility 'mbw' with the 'performance' governor for both the CPU and the DMC.
Code: Select all
# apt install mbw
Code: Select all
# echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
# echo performance > /sys/devices/system/cpu/cpufreq/policy4/scaling_governor
# echo performance > /sys/class/devfreq/devfreq0/governor
# echo ${ddr_freq} > /sys/class/devfreq/devfreq0/max_freq
# cat /sys/class/devfreq/devfreq0/cur_freq
${ddr_freq}
Code: Select all
# mbw 100 | grep AVG
3. Essential register values
To change the DDR timing and PLL output, the following registers have to be adjusted for each case.
I checked the register status using 'devmem2'.
Code: Select all
# devmem2 0x10C20034 word
/dev/mem opened.
Memory mapped at address 0xb6fc8000.
Value at address 0x10C20034 (0xb6fc8034): 0x365A9713
The BPLL_LOCK values are different, but that can be adjusted because it is only related to the PLL lock time.
4. With the simple_ondemand governor
One more thing:
The default governor after boot is simple_ondemand.
In this case I got mbw results similar to the 825MHz case, regardless of min_freq/cur_freq.
I need to check whether this result is correct, because the related register values do change when I change cur_freq, yet the mbw result stays the same.
It is not a very low value like in the 165MHz case.
But this result shows that DMC devfreq with simple_ondemand doesn't work correctly.
Code: Select all
root@odroid:~# cat /sys/class/devfreq/devfreq0/governor
simple_ondemand
root@odroid:~# cat /sys/class/devfreq/devfreq0/cur_freq
165000000
root@odroid:~# mbw 100 | grep AVG
AVG Method: MEMCPY Elapsed: 0.05463 MiB: 100.00000 Copy: 1830.389 MiB/s
AVG Method: DUMB Elapsed: 0.06001 MiB: 100.00000 Copy: 1666.444 MiB/s
AVG Method: MCBLOCK Elapsed: 0.02231 MiB: 100.00000 Copy: 4482.757 MiB/s
As @AreaScout said, there should be no issue with performance, powersave or userspace.
But I'm confused by the simple_ondemand case,
and I wonder whether this slow performance issue is a problem in the DMC driver itself.
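The frequency pinning used for these mbw runs can be wrapped in a small helper. This is a sketch under assumptions: the devfreq0 default path is taken from the commands in this post, the function name is made up, and the write order is chosen defensively because some kernels may reject a write that would leave min_freq above max_freq.

```shell
#!/bin/sh
# Pin a devfreq device to a single frequency by setting min_freq and max_freq.
#   $1: frequency in Hz, e.g. 825000000
#   $2: devfreq device directory (defaults to the node used in this post)
pin_dmc_freq() {
    freq="$1"
    dir="${2:-/sys/class/devfreq/devfreq0}"
    cur_max="$(cat "$dir/max_freq")"
    if [ "$freq" -ge "$cur_max" ]; then
        # Raising: lift the ceiling first, then the floor.
        echo "$freq" > "$dir/max_freq"
        echo "$freq" > "$dir/min_freq"
    else
        # Lowering: drop the floor first, then the ceiling.
        echo "$freq" > "$dir/min_freq"
        echo "$freq" > "$dir/max_freq"
    fi
}

# Example sweep, run as root with mbw installed:
#   for f in 165000000 825000000; do
#       pin_dmc_freq "$f"
#       cat /sys/class/devfreq/devfreq0/cur_freq
#       mbw 100 | grep AVG
#   done
```

Reading cur_freq after each step, as in the example, helps confirm that the governor actually moved to the requested clock before the benchmark starts.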

@zerodroid, could you share your dtb file with me?
Last edited by joy on Sun Aug 02, 2020 1:25 pm, edited 1 time in total.
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
@joy which file would you like to see?
Thanks for investigating.
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
@zerodroid,
Do you use exynos5422-odroidxu4.dtb that is generated from arch/arm/boot/dts/?
Or other DTB file?
I mean the file.

-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
@joy Sorry, I'm not familiar with these details. Could you let me know how to check which .dtb file is being used?
A quick search shows there are many .dtb files in the "/boot/dtb-5.4.50-odroidxu4/" and "/usr/lib/linux-image-current-odroidxu4/" directories.
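One way to get a hint, sketched here under the assumption that the kernel exposes the booted device tree at /proc/device-tree: the model string identifies the board description the bootloader passed in, although it does not name the .dtb file itself. The function name is only illustrative.

```shell
#!/bin/sh
# Print the model string of the device tree the running kernel was booted with.
# The optional argument overrides the path (useful for testing).
dt_model() {
    f="${1:-/proc/device-tree/model}"
    if [ ! -r "$f" ]; then
        echo "no device tree model at $f" >&2
        return 1
    fi
    # The property is NUL-terminated, so strip the NUL and add a newline.
    tr -d '\0' < "$f"
    echo
}

# Example:
#   dt_model
```

The file name actually loaded is decided by the boot script (boot.ini on these boards), so that is the other place to look.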
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Hi @zerodroid,
That's OK. No problem.
I found exynos5422-odroidhc1.dtb, and there is no difference in the LPDDR timings.
Sorry for bothering you. I'm also not familiar with the Armbian configs.
Based on your updated test result here and @lanefu's comment, I'm sure it's related to the default simple_ondemand condition.
I'm going to check how the simple_ondemand condition works on the 3.10.y and 5.4.x kernels, and run sbc-bench and other memory benchmarks further.
On kernel 4.x, the default memory clock is 825MHz, and supporting other memory clocks on the XU4 depends on the u-boot 'dmc' command (there is no DMC devfreq driver on kernel 4.x).
Last edited by joy on Sat Aug 01, 2020 10:05 am, edited 1 time in total.
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
@joy Ok, good.
I don't know what the .dtb is for, but you may want to take a look at the one for exynos5422-odroidxu4 too because the HC1/HC2 are identified as an XU4.
The optimized board configurations in Armbian for HC1/HC2 currently don't work so it is using the XU4 profile. As for the significance of this, I don't know...
Not sure if this is relevant at all, but more information is better than less.
Just let me know if you'd like me to check anything.
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
@zerodroid,
Thanks for the information. OK I will.
It looks like this is a known issue, and there are some discussions about it:
https://lore.kernel.org/linux-pm/202006 ... ini.local/
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
I have an update for this issue.
I reproduced the low memory performance on an XU4 board and found patches that fix this issue.
1. Test summary using sbc-bench and tinymembench
When I tested on the XU4 with a 5.4.x kernel and the default simple_ondemand governor, there was no low memory performance issue.
Kernel 5.4.28 + XU4 DTB
http://ix.io/2sVv
Code: Select all
Memory bandwidth tests on little core
standard memcpy : 339.0 MB/s
standard memset : 796.3 MB/s (24.3%)
Memory bandwidth tests on big core
standard memcpy : 2235.5 MB/s
standard memset : 4904.9 MB/s (24.5%)
During some tests, I found a difference between (1) the Armbian XU4 DTB and (2) the Armbian HC1 DTB,
and I could reproduce the low memory performance pattern on the LITTLE cores.
Kernel 5.4.28 + HC1 DTB
http://ix.io/2sVP
Code: Select all
Memory bandwidth tests on little core
standard memcpy : 85.2 MB/s
standard memset : 278.7 MB/s (0.1%)
Memory bandwidth tests on big core
standard memcpy : 2426.4 MB/s
standard memset : 4892.2 MB/s (0.9%)
I also got the same results with the XU4 DTB on HK Ubuntu (kernel 5.4.3) when comparing (1) HDMI connected and (2) HDMI disconnected.
For this test, I used 'tinymembench' on an XU4 board with the XU4 DTB.
https://github.com/ssvb/tinymembench
Kernel 5.4.3 + XU4 DTB + HDMI connected
Code: Select all
Memory bandwidth tests on little core
standard memcpy : 322.6 MB/s
standard memset : 793.9 MB/s (25.2%)
Memory bandwidth tests on big core
standard memcpy : 2241.8 MB/s
standard memset : 4897.8 MB/s (24.7%)
Kernel 5.4.3 + XU4 DTB + HDMI removed
Code: Select all
# taskset -c 0 /home/odroid/tinymembench/tinymembench
standard memcpy : 81.1 MB/s
standard memset : 275.5 MB/s
# taskset -c 4 /home/odroid/tinymembench/tinymembench
standard memcpy : 2394.7 MB/s
standard memset : 4845.6 MB/s (0.5%)
Using 'mbw' with the simple_ondemand governor, I got the expected output and confirmed again that there is no basic devfreq logic issue in the exynos5422 DMC driver.
Please note that the conditions differ between the 'mbw' test (run after setting the frequency) and the 'tinymembench' test (run right after boot).
The devfreq sysfs node is also different from linux-stable:
/sys/class/devfreq/10c20000.memory-controller/
Code: Select all
# cat /sys/class/devfreq/devfreq0/governor
simple_ondemand
root@odroid:~# echo 825000000 > /sys/class/devfreq/devfreq0/max_freq
root@odroid:~# echo 825000000 > /sys/class/devfreq/devfreq0/min_freq
root@odroid:~# cat /sys/class/devfreq/devfreq0/cur_freq
825000000
root@odroid:~# mbw 100 | grep AVG
AVG Method: MEMCPY Elapsed: 0.05163 MiB: 100.00000 Copy: 1936.952 MiB/s
AVG Method: DUMB Elapsed: 0.05595 MiB: 100.00000 Copy: 1787.406 MiB/s
AVG Method: MCBLOCK Elapsed: 0.02208 MiB: 100.00000 Copy: 4529.765 MiB/s
root@odroid:~# echo 165000000 > /sys/class/devfreq/devfreq0/min_freq
root@odroid:~# echo 165000000 > /sys/class/devfreq/devfreq0/max_freq
root@odroid:~# cat /sys/class/devfreq/devfreq0/cur_freq
165000000
root@odroid:~# mbw 100 | grep AVG
AVG Method: MEMCPY Elapsed: 0.35779 MiB: 100.00000 Copy: 279.493 MiB/s
AVG Method: DUMB Elapsed: 0.35230 MiB: 100.00000 Copy: 283.851 MiB/s
AVG Method: MCBLOCK Elapsed: 0.20465 MiB: 100.00000 Copy: 488.636 MiB/s
I think this issue depends on the workload conditions during boot,
and I found some discussions and patches.
2. Related threads and patches
Please refer to these links.
(updated)
https://lore.kernel.org/linux-pm/202006 ... ini.local/
Code: Select all
memory: samsung: exynos5422-dmc: Adjust polling interval and uptreshold
memory: samsung: exynos5422-dmc: Add module param to control IRQ mode
https://lore.kernel.org/linux-pm/82080e ... arm.com/T/
https://lore.kernel.org/linux-pm/202007 ... a@arm.com/
https://kernel.googlesource.com/pub/scm ... ding-edge/
And here are my test results.
I got similar results regardless of the conditions.
Code: Select all
# taskset -c 0 /home/odroid/tinymembench/tinymembench
---
standard memcpy : 326.5 MB/s
standard memset : 793.3 MB/s
---
# taskset -c 4 /home/odroid/tinymembench/tinymembench
---
standard memcpy : 2309.7 MB/s
standard memset : 4843.3 MB/s (0.5%)
---
This patch was generated against kernel 5.4.3 just for my test,
and I think it should be similar (not exactly the same) for the kernel 5.4.28 tag of linux-stable that Armbian targets.
I haven't checked it on Armbian, linux-stable, or an HC1 board yet.
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Code: Select all
diff --git a/arch/arm/configs/odroidxu4_defconfig b/arch/arm/configs/odroidxu4_defconfig
index 661a849..33aea5d 100644
--- a/arch/arm/configs/odroidxu4_defconfig
+++ b/arch/arm/configs/odroidxu4_defconfig
@@ -5413,10 +5413,11 @@ CONFIG_EXTCON=y
CONFIG_EXTCON_USB_GPIO=m
# CONFIG_EXTCON_USBC_CROS_EC is not set
CONFIG_MEMORY=y
+CONFIG_DDR=y
# CONFIG_ARM_PL172_MPMC is not set
CONFIG_PL353_SMC=y
CONFIG_SAMSUNG_MC=y
-# CONFIG_EXYNOS5422_DMC is not set
+CONFIG_EXYNOS5422_DMC=y
CONFIG_EXYNOS_SROM=y
CONFIG_IIO=y
CONFIG_IIO_BUFFER=y
@@ -6599,8 +6600,6 @@ CONFIG_KASAN_STACK=1
# end of Memory Debugging
CONFIG_ARCH_HAS_KCOV=y
-CONFIG_CC_HAS_SANCOV_TRACE_PC=y
-# CONFIG_KCOV is not set
#
# Debug Lockups and Hangs
diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
index bdb264b..9fc3134 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -12,6 +12,7 @@
#include <linux/io.h>
#include <linux/mfd/syscon.h>
#include <linux/module.h>
+#include <linux/moduleparam.h>
#include <linux/of_device.h>
#include <linux/pm_opp.h>
#include <linux/platform_device.h>
@@ -21,6 +22,10 @@
#include "../jedec_ddr.h"
#include "../of_memory.h"
+static int irqmode;
+module_param(irqmode, int, 0644);
+MODULE_PARM_DESC(irqmode, "Enable IRQ mode (0=off [default], 1=on)");
+
#define EXYNOS5_DREXI_TIMINGAREF (0x0030)
#define EXYNOS5_DREXI_TIMINGROW0 (0x0034)
#define EXYNOS5_DREXI_TIMINGDATA0 (0x0038)
@@ -945,6 +950,7 @@ static int exynos5_dmc_get_cur_freq(struct device *dev, unsigned long *freq)
* It provides to the devfreq framework needed functions and polling period.
*/
static struct devfreq_dev_profile exynos5_dmc_df_profile = {
+ .timer = DEVFREQ_TIMER_DELAYED,
.target = exynos5_dmc_target,
.get_dev_status = exynos5_dmc_get_status,
.get_cur_freq = exynos5_dmc_get_cur_freq,
@@ -1432,7 +1438,7 @@ static int exynos5_dmc_probe(struct platform_device *pdev)
/* There is two modes in which the driver works: polling or IRQ */
irq[0] = platform_get_irq_byname(pdev, "drex_0");
irq[1] = platform_get_irq_byname(pdev, "drex_1");
- if (irq[0] > 0 && irq[1] > 0) {
+ if (irq[0] > 0 && irq[1] > 0 && irqmode) {
ret = devm_request_threaded_irq(dev, irq[0], NULL,
dmc_irq_thread, IRQF_ONESHOT,
dev_name(dev), dmc);
@@ -1470,10 +1476,10 @@ static int exynos5_dmc_probe(struct platform_device *pdev)
* Setup default thresholds for the devfreq governor.
* The values are chosen based on experiments.
*/
- dmc->gov_data.upthreshold = 30;
+ dmc->gov_data.upthreshold = 10;
dmc->gov_data.downdifferential = 5;
- exynos5_dmc_df_profile.polling_ms = 500;
+ exynos5_dmc_df_profile.polling_ms = 100;
}
@@ -1489,7 +1495,7 @@ static int exynos5_dmc_probe(struct platform_device *pdev)
if (dmc->in_irq_mode)
exynos5_dmc_start_perf_events(dmc, PERF_COUNTER_START_VALUE);
- dev_info(dev, "DMC initialized\n");
+ dev_info(dev, "DMC initialized, in irq mode: %d\n", dmc->in_irq_mode);
return 0;
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 2bae9ed..faf4148 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -30,6 +30,13 @@
#define DEVFREQ_PRECHANGE (0)
#define DEVFREQ_POSTCHANGE (1)
+/* DEVFREQ work timers */
+enum devfreq_timer {
+ DEVFREQ_TIMER_DEFERRABLE = 0,
+ DEVFREQ_TIMER_DELAYED,
+ DEVFREQ_TIMER_NUM,
+};
+
struct devfreq;
struct devfreq_governor;
@@ -69,6 +76,7 @@ struct devfreq_dev_status {
* @initial_freq: The operating frequency when devfreq_add_device() is
* called.
* @polling_ms: The polling interval in ms. 0 disables polling.
+ * @timer: Timer type is either deferrable or delayed timer.
* @target: The device should set its operating frequency at
* freq or lowest-upper-than-freq value. If freq is
* higher than any operable frequency, set maximum.
@@ -95,6 +103,7 @@ struct devfreq_dev_status {
struct devfreq_dev_profile {
unsigned long initial_freq;
unsigned int polling_ms;
+ enum devfreq_timer timer;
int (*target)(struct device *dev, unsigned long *freq, u32 flags);
int (*get_dev_status)(struct device *dev,
So, I hope somebody can confirm it on Armbian with the EXYNOS5422_DMC config enabled.
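For anyone repeating the check, the LITTLE core memcpy figure can be pulled out of the tinymembench output and compared against a threshold. This is a rough sketch: the helper names and the 200 MB/s cut-off are assumptions chosen for illustration (the slow runs in this thread sit around 81-85 MB/s and the healthy ones around 320-390 MB/s), and the tinymembench path depends on where it was built.

```shell
#!/bin/sh
# Read tinymembench output on stdin and print the "standard memcpy" value in MB/s.
memcpy_mbps() {
    awk -F: '/standard memcpy/ { split($2, f, " "); print f[1]; exit }'
}

# Succeed (exit 0) when value $1 is below threshold $2.
is_below() {
    awk -v v="$1" -v min="$2" 'BEGIN { exit !(v + 0 < min + 0) }'
}

# Example on a LITTLE core (CPU 0); the binary path is a placeholder:
#   rate="$(taskset -c 0 ./tinymembench | memcpy_mbps)"
#   if is_below "$rate" 200; then
#       echo "LITTLE core memcpy looks slow: $rate MB/s"
#   fi
```

Running the same check right after boot and again after some load would help separate a stuck governor from a one-off slow start.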
Last edited by joy on Mon Aug 10, 2020 6:34 pm, edited 1 time in total.
- lanefu
- Posts: 5
- Joined: Tue Jun 30, 2020 9:35 pm
- languages_spoken: english
- ODROIDs: N2, MC1-solo
- Has thanked: 2 times
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Hello, thanks for identifying and testing those patches.
I applied the following patches on the Armbian-dev kernel 5.7.13 on a mc1-solo
https://patchwork.kernel.org/patch/11657269/
https://patchwork.kernel.org/patch/11657275/
I'll merge this fix into dev immediately, but I'm going to wait on -current (5.4) until after the Armbian v20.08 release.
Code: Select all
taskset -c 0 /usr/local/src/tinymembench/tinymembench
---
standard memcpy : 360.4 MB/s
standard memset : 591.6 MB/s
---
taskset -c 4 /usr/local/src/tinymembench/tinymembench
---
standard memcpy : 2312.2 MB/s
standard memset : 4921.0 MB/s (1.0%)
---
-
- Posts: 1584
- Joined: Fri Oct 02, 2015 1:44 pm
- languages_spoken: english
- ODROIDs: .
- Has thanked: 179 times
- Been thanked: 210 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Hi @lanefu,
Thank you for confirming it on Armbian and sharing the status here!
Yes. The two patches you picked are correct ones.
lanefu wrote: ↑Thu Aug 06, 2020 7:58 am
I applied the following patches on the Armbian-dev kernel 5.7.13 on a mc1-solo
https://patchwork.kernel.org/patch/11657269/
https://patchwork.kernel.org/patch/11657275/
-
- Posts: 37
- Joined: Sun Aug 18, 2019 7:04 pm
- languages_spoken: english
- ODROIDs: C2, XU4, N2, N2+
- Has thanked: 3 times
- Been thanked: 3 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Can any of you guys share the contents of the boot.ini file for a 5.x kernel?
- odroid
- Site Admin
- Posts: 39117
- Joined: Fri Feb 22, 2013 11:14 pm
- languages_spoken: English, Korean
- ODROIDs: ODROID
- Has thanked: 2513 times
- Been thanked: 1382 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Here are the boot.ini and config.ini files in our Ubuntu 20.04 (kernel 5.4) image:
https://github.com/mdrjr/5422_bootini/tree/5.4
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Unfortunately it appears this regression is back in 5.4.142-odroidxu4 (ODROID-HC2) http://ix.io/3zgD
- odroid
- Site Admin
- Posts: 39117
- Joined: Fri Feb 22, 2013 11:14 pm
- languages_spoken: English, Korean
- ODROIDs: ODROID
- Has thanked: 2513 times
- Been thanked: 1382 times
- Contact:
Re: HC2 performance regression on kernel 5.*
zerodroid wrote: ↑Sun Sep 19, 2021 4:16 am
Unfortunately it appears this regression is back in 5.4.142-odroidxu4 (ODROID-HC2) http://ix.io/3zgD
The latest kernel update package 5.4.149-231 might fix the issue.
Please try it and let us know the results.
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
odroid wrote: ↑Tue Sep 28, 2021 2:25 pm
The latest kernel update package 5.4.149-231 might fix the issue.
Please try it and let us know results.
Thanks. I tried the nightly kernel 5.4.149-odroidxu4 but it didn't help: http://ix.io/3Agi
- odroid
- Site Admin
- Posts: 39117
- Joined: Fri Feb 22, 2013 11:14 pm
- languages_spoken: English, Korean
- ODROIDs: ODROID
- Has thanked: 2513 times
- Been thanked: 1382 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Try 5.4.150-233.
It should probably fix the issue.
standard memcpy : 2283.2 MB/s
standard memset : 4844.5 MB/s (1.0%)
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
Could you please show the results for the little cores? That is where the problem is.
Under normal circumstances, memcpy and memset on the little cores should be around 340 MB/s and 800 MB/s, respectively.
With the nightly kernel 5.4.149-odroidxu4 my test showed: http://ix.io/3Agi
For the big cores it is similar to what you found:
standard memcpy : 2299.0 MB/s (10.9%)
standard memset : 4213.0 MB/s (11.0%)
But the little cores are degraded:
standard memcpy : 84.6 MB/s
standard memset : 278.5 MB/s (0.2%)
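To make the "degraded" call less of a judgment by eye, the little-core numbers can be compared against the roughly normal figures mentioned in this post. The sketch below does that; the 0.8 cutoff is an arbitrary assumption for illustration, not a documented threshold, and the function name is hypothetical.

```python
# Approximate healthy little-core throughput quoted in this post, in MB/s.
BASELINE = {"memcpy": 340.0, "memset": 800.0}


def degraded(measured, baseline=BASELINE, min_ratio=0.8):
    """Return {test: measured/baseline} for tests below min_ratio of baseline."""
    flagged = {}
    for name, value in measured.items():
        if name in baseline and value < min_ratio * baseline[name]:
            flagged[name] = value / baseline[name]
    return flagged


# Little-core results from the 5.4.149 nightly reported in this post.
bad = degraded({"memcpy": 84.6, "memset": 278.5})
for name, ratio in bad.items():
    print(f"{name}: {ratio:.0%} of the expected throughput")
```

With those 5.4.149 figures both tests are flagged, at roughly a quarter and a third of the expected throughput.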
-
- Site Admin
- Posts: 11806
- Joined: Fri Feb 22, 2013 11:34 pm
- languages_spoken: english, portuguese
- ODROIDs: -
- Location: Brazil
- Has thanked: 1 time
- Been thanked: 48 times
- Contact:
Re: HC2 performance regression on kernel 5.*
With 5.4.150, on the little cores:
---
standard memcpy : 328.6 MB/s
standard memset : 795.5 MB/s
---
So it looks OK to me now.
Double-check that the place you get your kernel from has been updated from our GitHub.
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
That's great news. Currently the latest kernel available on armbian for the xu4/hc2 is the nightly 5.4.149. I'll try 5.4.150 once it becomes available and post my results.
-
- Posts: 14
- Joined: Thu Aug 29, 2019 4:53 am
- languages_spoken: english
- ODROIDs: ODROID-HC2
- Has thanked: 1 time
- Been thanked: 2 times
- Contact:
Re: HC2 performance regression on kernel 5.*
I can confirm kernel 5.4.150 fixes the issue. http://ix.io/3B3R
Small cores:
standard memcpy : 329.1 MB/s
standard memset : 799.0 MB/s
Big cores:
standard memcpy : 2378.0 MB/s (2.0%)
standard memset : 4866.0 MB/s (1.1%)
Thank you!