No performance difference between 1.5, 1.75 & 2GHz

Kwiboo
Posts: 15
Joined: Mon Aug 08, 2016 10:27 am
languages_spoken: english, swedish
ODROIDs: C1+, C2

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Kwiboo » Sun Sep 04, 2016 9:03 pm

cyrozap wrote:My plan is to finish analyzing BL30, then move on to writing a FOSS replacement for aml_encrypt_gxb (for the sake of OUR ESSENTIAL SOFTWARE FREEDOM... and so non-x86-64 users can sign binaries without qemu)
I created a small tool that can at least "sign" BL20, and I think the same principle is used to sign the rest of the BL blobs (they are not encrypted for the C2 and only have a checksum as far as I know); see the aml_chksum tool I added in https://github.com/Kwiboo/u-boot/commit ... f0024b718f
I suggest you also check https://github.com/Kwiboo/u-boot/commit ... 1-20160818 which has the full Amlogic git history up to 2016-07-01 and an import of the 2016-08-18 source release diff. At one point (September 2015) it had the complete BL20 source, before it became closed source.

tkaiser
Posts: 671
Joined: Mon Nov 09, 2015 12:30 am
languages_spoken: english
ODROIDs: C1+, C2, XU4, HC1

Unread post by tkaiser » Sun Sep 04, 2016 11:47 pm

odroid wrote:We are pushing Amlogic to release a new bl30.bin to try 1.75Ghz and higher clock.
To be honest, I personally would go in the other direction (reliability first, lowering consumption while retaining performance). But my use case is different: we (Armbian) support a wide variety of SBCs and try to optimize settings for server/headless use. ODROID-C2 is simply awesome even when combined with the slowest of your eMMC modules (database use case).

I've a setup here where I can pretty precisely measure consumption (between PSU and board using another SBC, its PMIC and average values) and wanted to try out the effects of completely disabling GPU/display and lowering DRAM frequency. Can you please point me in the right direction where to adjust what?

Apart from that I'm somewhat puzzled. Most people, and even HK, seem to trust sysbench numbers. On the C2, sysbench with --cpu-max-prime=20000 finishes very quickly. Too quickly if you look at 'execution time (avg/stddev)': if the standard deviation is not 0.00 I would already drop the entire test result. With a headless Armbian image I have to drop 3 out of 10 sysbench runs on average, so I don't want to imagine what happens when an OS image with a desktop is tested. With such an insanely quick test execution I would always run at least 10 tests and look at how the results differ (in this case slight background activity for a few seconds already leads to a result variation of 6.8053 vs 5.9884 seconds, and that's HUGE).

Code: Select all

tk@odroidc2:~$ for i in {1..10}; do sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=$(grep -c '^processor' /proc/cpuinfo) | awk -F" " '/execution time/ {print $4}' | cut -f1 -d/; done | datamash max 1 min 1 mean 1 median 1
6.8053	5.9884	6.21036	6.04305
The next problem with sysbench's cpu test is that it does not depend on memory bandwidth at all (it only calculates prime numbers and runs entirely out of the CPU caches; which other real-world workload is remotely comparable?). So when looking at sysbench numbers and increasing clockspeeds from 1536 MHz to 1680 MHz, these numbers look 9 percent better, while any real-world application that depends on memory bandwidth will show a performance gain of less than 5 percent (both laughable in my opinion). I would recommend also testing with e.g. cpuminer https://github.com/pooler/cpuminer with '-mfpu=neon'. It's heavier than sysbench (though not as heavy as cpuburn-a53), depends a lot on memory bandwidth and provides a benchmark mode which is great for tuning performance settings interactively.
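To make the "run it at least 10 times and look at the spread" advice concrete, here is a minimal Python sketch that computes the same max/min/mean/median as the datamash call, plus the relative spread. Only the max (6.8053) and min (5.9884) are taken from the output above; the other eight samples are invented placeholders, and the function name is hypothetical.

```python
import statistics

def summarize_runs(times):
    """Summarize repeated benchmark run times (in seconds)."""
    fastest = min(times)
    slowest = max(times)
    return {
        "max": slowest,
        "min": fastest,
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        # Spread relative to the fastest run; a large value means noisy results.
        "spread_pct": (slowest - fastest) / fastest * 100.0,
    }

# Only the first two values come from the post; the rest are placeholders.
runs = [6.8053, 5.9884, 6.04, 6.05, 6.03, 6.04, 6.06, 6.02, 6.05, 6.04]
summary = summarize_runs(runs)
print(f"spread: {summary['spread_pct']:.1f} %")
```

With the posted max and min the spread comes out at roughly 13.6 percent, which is larger than the 9 percent gain being discussed.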

I quoted the statement above regarding bl30.bin and asked where DRAM clockspeed is configured for a reason. With Allwinner SoCs, where we have full control over all these settings, it's pretty easy to fake benchmark results when a device is known to overheat and the target audience blindly believes in sysbench numbers: simply lower DRAM clockspeed. This reduces consumption and temperature (and real-world performance too, of course!) but improves sysbench numbers, since throttling starts later and cpufreq can remain higher. By configuring the device to run slower, benchmark numbers look higher. I wouldn't call sysbench crap, but one should at least know exactly what these numbers tell.

indium
Posts: 89
Joined: Thu May 28, 2015 2:27 pm
languages_spoken: english, ukrainian
Location: Ukraine

Unread post by indium » Mon Sep 05, 2016 12:11 am

Kwiboo wrote: I created a small tool that at least can "sign" BL20 ...
And it gets loaded by the rom code and can run in Secure state? Can handle Monitor exceptions?
If so, then Amlogic screwed up even more than with fake 2GHz.

Kwiboo
Posts: 15
Joined: Mon Aug 08, 2016 10:27 am
languages_spoken: english, swedish
ODROIDs: C1+, C2

Unread post by Kwiboo » Mon Sep 05, 2016 1:29 am

indium wrote:And it gets loaded by the rom code and can run in Secure state? Can handle Monitor exceptions?
If so, then Amlogic screwed up even more than with fake 2GHz.
The rom code loads BL2, which in turn loads all other blobs and u-boot and finally starts Linux (tested with LibreELEC using an Amlogic 20160701 kernel); I'm unsure whether that means it runs in Secure state and can handle Monitor exceptions.
This was with newer Amlogic blX.bin blobs from the 20160504 source release, but I suspect it works the same for the older blobs used by HK.

For the C2 the rom code only seems to check the SHA256 checksum before BL2 is loaded and run.
See https://github.com/Kwiboo/u-boot/blob/b ... cureboot.c for the code that helped me figure out how the checksum was calculated.
That code contains bugs (using wrong size/length variables) and does not work for SHA256-only signature checks. I suspect the C2 BL1 was compiled without the CONFIG_AML_SECURE_UBOOT flag but using a working closed-source version.

bl2.package is never actually used for anything on the C2 as it is overwritten by bl1.bin.hardkernel (BL2 mislabeled).
bl1.bin.hardkernel is a modified BL2 that has been padded by acs_tool.pyc and has custom HK code at 1024-1315 that moves the BL2 in memory by 512 bytes to make it possible to boot with both SD and eMMC.
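The general idea of a checksum-only check can be sketched as follows. This is an illustration of the principle only: the real header layout and the byte range that the rom code hashes are not described in this thread, so the "32-byte digest followed by payload" layout and both function names below are invented.

```python
import hashlib

def sha256_matches(payload: bytes, stored_digest: bytes) -> bool:
    """True if the SHA-256 of the payload equals the stored digest."""
    return hashlib.sha256(payload).digest() == stored_digest

def split_and_check(blob: bytes) -> bool:
    # Hypothetical layout: first 32 bytes are the digest, the rest is the payload.
    stored, payload = blob[:32], blob[32:]
    return sha256_matches(payload, stored)

payload = b"example BL2 payload"
blob = hashlib.sha256(payload).digest() + payload
print(split_and_check(blob))
```

Because nothing secret goes into a plain hash, anyone who modifies the payload can recompute a matching digest, which is why a checksum on its own does not prevent a modified BL2 from being loaded.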
Last edited by Kwiboo on Mon Sep 05, 2016 3:00 am, edited 1 time in total.

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO

Unread post by mlinuxguy » Mon Sep 05, 2016 2:50 am

tkaiser wrote:I would recommend to test (also) with eg cpuminer https://github.com/pooler/cpuminer with '-mfpu=neon'. It's more heavy than sysbench (but not that much as cpuburn-a53), depends a lot on memory bandwidth and provides a benchmark mode which is great to tune performance settings in interactive mode.
I pulled cpuminer down and ran the benchmark at 1.75ghz

Code: Select all

[2016-09-04 12:45:05] thread 0: 6646 hashes, 1.33 khash/s
[2016-09-04 12:45:10] thread 1: 6840 hashes, 1.37 khash/s
[2016-09-04 12:45:10] thread 2: 6810 hashes, 1.36 khash/s
[2016-09-04 12:45:10] thread 3: 6801 hashes, 1.36 khash/s
[2016-09-04 12:45:10] Total: 5.42 khash/s
[2016-09-04 12:45:10] thread 0: 6639 hashes, 1.33 khash/s
[2016-09-04 12:45:15] thread 2: 6818 hashes, 1.36 khash/s
[2016-09-04 12:45:15] thread 3: 6819 hashes, 1.36 khash/s
[2016-09-04 12:45:15] Total: 5.42 khash/s
To get it to compile on my system I needed to:
# apt-get install libcurlpp-dev libcurl4-openssl-dev
add this line into the top of Makefile.am: ACLOCAL_AMFLAGS = -I m4
./autogen.sh
./configure CFLAGS="-O3"
make
./minerd --benchmark

minerd with 2 threads at 1.896ghz
:~/cpuminer# ./minerd --benchmark -t 2

Code: Select all

[2016-09-04 12:54:18] thread 1: 4096 hashes, 1.49 khash/s
[2016-09-04 12:54:18] thread 0: 4096 hashes, 1.45 khash/s
[2016-09-04 12:54:20] thread 1: 2976 hashes, 1.49 khash/s
[2016-09-04 12:54:20] Total: 2.94 khash/s
[2016-09-04 12:54:20] thread 0: 2901 hashes, 1.46 khash/s
[2016-09-04 12:54:25] thread 1: 7474 hashes, 1.49 khash/s
[2016-09-04 12:54:25] Total: 2.95 khash/s
[2016-09-04 12:54:25] thread 0: 7306 hashes, 1.46 khash/s
[2016-09-04 12:54:30] thread 1: 7464 hashes, 1.49 khash/s
[2016-09-04 12:54:30] Total: 2.95 khash/s
[2016-09-04 12:54:30] thread 0: 7291 hashes, 1.46 khash/s
minerd with 1 thread at 1.92ghz

Code: Select all

root@odroid64-1:~/cpuminer# ./minerd --benchmark -t 1
[2016-09-04 13:01:14] 1 miner threads started, using 'scrypt' algorithm.
[2016-09-04 13:01:16] thread 0: 4096 hashes, 1.48 khash/s
[2016-09-04 13:01:16] Total: 1.48 khash/s
[2016-09-04 13:01:19] thread 0: 4443 hashes, 1.48 khash/s
[2016-09-04 13:01:19] Total: 1.48 khash/s
[2016-09-04 13:01:24] thread 0: 7403 hashes, 1.48 khash/s
[2016-09-04 13:01:24] Total: 1.48 khash/s
Note: If I show 1 thread or 2 threads, it's because that is the number of CPUs I can run stably at that frequency

For reference here is 1 thread of sysbench at 1.92ghz
19.47 / 4 = 4.87 (if we had 3 other CPUs)

Code: Select all

# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=1 run
sysbench 0.4.12:  multi-threaded system evaluation benchmark
Number of threads: 1
Test execution summary:
    total time:                          19.4762s
    total number of events:              10000
    total time taken by event execution: 19.4735
    per-request statistics:
         min:                                  1.94ms
         avg:                                  1.95ms
         max:                                  6.73ms
         approx.  95 percentile:               1.95ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   19.4735/0.00
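The 19.47 / 4 estimate assumes the sysbench cpu test splits perfectly across cores. A minimal sketch of that extrapolation (the function name is made up, and perfect scaling is an assumption, not a measurement):

```python
def ideal_parallel_time(single_thread_seconds, cores):
    """Best-case wall time if the work divides evenly across all cores."""
    return single_thread_seconds / cores

# 19.4762 s is the single-thread total time from the run above.
estimate = ideal_parallel_time(19.4762, 4)
print(f"{estimate:.2f} s")
```

This gives a lower bound for a 4-thread run at the same clock; a real 4-thread run can only match it if all four cores are stable at that frequency.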

tkaiser
Posts: 671
Joined: Mon Nov 09, 2015 12:30 am
languages_spoken: english
ODROIDs: C1+, C2, XU4, HC1

Unread post by tkaiser » Mon Sep 05, 2016 3:56 am

mlinuxguy wrote: I pulled cpuminer down and ran the benchmark at 1.75ghz

Code: Select all

[2016-09-04 12:45:10] Total: 5.42 khash/s
Thanks for the numbers (especially those with fewer cores; maybe we can come up with sane dvfs settings allowing the highest single-thread performance at up to 2GHz?). What do you get with 4 cores running at 1536 MHz? I got 4.76 khash/s, which would mean it scales linearly with cpufreq (and has no relationship to DRAM clockspeed, unless this is adjusted dynamically with S905?). Can you point me to where to adjust DRAM clockspeed and disable the GPU entirely?

BTW: Another great tool in a situation like this (testing through dvfs settings to come up with the best performance without affecting reliability through undervoltage) is Linpack/OpenBLAS with NEON optimizations (not the distro's standard hpcc package but this one). This tool led to undervolted RPi 3 dvfs settings being detected (not temperature: https://www.raspberrypi.org/forums/view ... &p=927615), we used it in March to develop sane dvfs settings for Pine64/A64, and it showed me recently that I need a new fan and PSU for NanoPi M3 (octa-core Samsung/Nexell Cortex-A53 that gets both pretty hot and power hungry when doing CPU intensive stuff). With Linpack I was not able to exceed 900 MHz since otherwise the board deadlocked, and with cpuminer I had to revert to 1300 MHz cpufreq since my fan was too weak and at 1400 MHz throttling started in an inefficient way, which led to lower numbers than at 1300 MHz (9.96 khash/s): http://forum.armbian.com/index.php/topi ... 5/?p=14697.

Anyway: how to lower DRAM clockspeed and disable GPU?

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO

Unread post by mlinuxguy » Mon Sep 05, 2016 4:17 am

tkaiser wrote:Can you point where to look into to adjust DRAM clockspeed and disable GPU entirely?
In my previous post, one page back, I showed that I cannot adjust DDR3 timings; Amlogic has them set up in the bl30.bin blob
ref: previous entry: --> http://forum.odroid.com/viewtopic.php?f ... 50#p158397
HK sent me an email that they are investigating DDR3 timings

GPU I turned off in boot.ini (though it appears to only save ram if you have no HDMI cable hooked up)
Not sure it really shuts down power to the GPU which is what we really want
Another question for HK: HOWTO disable GPU power for headless

1.536ghz

Code: Select all

root@odroid64-1:~# ./dumpfreq.sh
scaling_cur_freq: 100000
Scaling_available_frequencies: 100000 250000 500000 1000000 1296000 1536000
Scaling_max_freq: 1536000
Scaling_governor: ondemand
setting performance governor
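The contents of dumpfreq.sh are not shown, so the following is only a guess at what it reads: a minimal Python sketch that prints the standard Linux cpufreq sysfs attributes for cpu0. The sysfs file names are the usual kernel ones; the function name is made up, and not every kernel exposes every file.

```python
from pathlib import Path

CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")
FIELDS = (
    "scaling_cur_freq",
    "scaling_available_frequencies",
    "scaling_max_freq",
    "scaling_governor",
)

def read_cpufreq(base=CPUFREQ_DIR):
    """Return the cpufreq attributes found under `base` (values in kHz)."""
    values = {}
    for name in FIELDS:
        path = Path(base) / name
        # Missing files map to None instead of raising.
        values[name] = path.read_text().strip() if path.exists() else None
    return values

for name, value in read_cpufreq().items():
    print(f"{name}: {value}")
```

Switching the governor would additionally require writing to scaling_governor as root, which this sketch deliberately does not do.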
CPUMINER benchmark at 1.536ghz

Code: Select all

root@odroid64-1:~/cpuminer# ./minerd --benchmark
[2016-09-04 14:16:44] 4 miner threads started, using 'scrypt' algorithm.
[2016-09-04 14:16:44] Binding thread 0 to cpu 0
[2016-09-04 14:16:44] Binding thread 1 to cpu 1
[2016-09-04 14:16:44] Binding thread 2 to cpu 2
[2016-09-04 14:16:44] Binding thread 3 to cpu 3
[2016-09-04 14:16:48] thread 1: 4096 hashes, 1.19 khash/s
[2016-09-04 14:16:48] thread 2: 4096 hashes, 1.19 khash/s
[2016-09-04 14:16:48] thread 3: 4096 hashes, 1.18 khash/s
[2016-09-04 14:16:48] thread 0: 4096 hashes, 1.15 khash/s
[2016-09-04 14:16:49] thread 1: 1186 hashes, 1.19 khash/s
[2016-09-04 14:16:49] thread 2: 1186 hashes, 1.19 khash/s
[2016-09-04 14:16:49] thread 3: 1182 hashes, 1.19 khash/s
[2016-09-04 14:16:49] Total: 4.71 khash/s
[2016-09-04 14:16:49] thread 0: 1154 hashes, 1.16 khash/s
[2016-09-04 14:16:54] thread 1: 5939 hashes, 1.19 khash/s
[2016-09-04 14:16:54] thread 2: 5929 hashes, 1.19 khash/s
[2016-09-04 14:16:54] thread 3: 5937 hashes, 1.19 khash/s
[2016-09-04 14:16:54] Total: 4.72 khash/s
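Whether cpuminer really scales linearly with cpufreq can be checked against the totals posted in this thread. A rough sketch, assuming the earlier "1.75ghz" run (5.42 khash/s) was at the 1752000 kHz step that shows up in a later log; the function name is made up.

```python
def scaling_efficiency(rate_low, rate_high, freq_low_khz, freq_high_khz):
    """Measured speedup divided by clock speedup (1.0 means perfectly linear)."""
    measured = rate_high / rate_low
    ideal = freq_high_khz / freq_low_khz
    return measured / ideal

# 4.72 khash/s at 1536000 kHz (this run) vs 5.42 khash/s at an assumed 1752000 kHz.
eff = scaling_efficiency(4.72, 5.42, 1536000, 1752000)
print(f"{eff:.3f}")
```

A value very close to 1.0 means the hash rate tracks the CPU clock almost exactly for these two data points.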

crashoverride
Posts: 4223
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Unread post by crashoverride » Mon Sep 05, 2016 4:42 am

mlinuxguy wrote: HOWTO disable GPU power for headless
If you are not using it then it should be in a low power state. How low of a power state (off) depends on the power domain. The datasheet indicates what power domain Mali is connected to. Unlike PCs, ARM SoC blocks are designed to be turned on and off as needed. To be certain, just remove the Mali entry from the device tree in boot.ini. "nographics" currently does not do this.
tkaiser wrote:Anyway: how to lower DRAM clockspeed and disable GPU?
I think you guys are chasing a red herring on DRAM clock speed. The RPi post mentioned earlier is about less aggressive overclocking of DRAM. On RPi the DRAM controller is part of the VC4. It's entirely different on S905. Unfortunately, I am sure the only way this will be "let go" is for someone to change the frequency and still observe the same failures. Data can arrive from DRAM perfectly fine and still be corrupted in the ARM registers due to processor speed.

cyrozap
Posts: 2
Joined: Sat Sep 03, 2016 2:14 pm
languages_spoken: english
ODROIDs: C2

Unread post by cyrozap » Mon Sep 05, 2016 5:43 am

Kwiboo wrote:Suggest you also check https://github.com/Kwiboo/u-boot/commit ... 1-20160818 that have full amlogic git history up to 2016-07-01 and a import of the 2016-08-18 source release diff. At one point (September 2015) it had the complete BL20 source before it was closed-source.
Kwiboo wrote:The rom code loads BL2 that in turn loads all other blobs, u-boot and finally starts Linux (tested with LibreELEC using a Amlogic 20160701 kernel), unsure if that means it runs in Secure state and can handle Monitor exceptions.
Kwiboo wrote:For the C2 the rom code only seems to check the SHA256 checksum before BL2 is loaded and run-
See https://github.com/Kwiboo/u-boot/blob/b ... cureboot.c for the code that helped me figure out how the checksum was calculated.
Awesome, so that means we can potentially replace BL2 as well, leaving the true BL1 in ROM as the only absolutely non-replaceable component. Is that ROM available in memory for easy dumping? Even if we can't replace it, it would still be nice to know what it does.
Kwiboo wrote:bl2.package is never actually used for anything on the C2 as it is overwritten by bl1.bin.hardkernel (BL2 mislabeled).
That explains why a bunch of the strings I found in the bl1.bin.hardkernel binary mentioned BL2, and why bl2.package looks like random garbage data...
Kwiboo wrote:bl1.bin.hardkernel is a modified BL2 that has been padded by acs_tool.pyc and has custom HK code at 1024-1315 that moves the BL2 in memory by 512 bytes to make it possible to boot with both SD and eMMC.
I'm not sure if you're aware, but you can very easily decompile Python byte-code with a tool like uncompyle6, so acs_tool.pyc can be replaced with a Free version, too.

Thanks for your help!

Kwiboo
Posts: 15
Joined: Mon Aug 08, 2016 10:27 am
languages_spoken: english, swedish
ODROIDs: C1+, C2

Unread post by Kwiboo » Mon Sep 05, 2016 6:47 am

cyrozap wrote:Is that ROM available in memory for easy dumping? Even if we can't replace it, it would still be nice to know what it does.
This is way beyond my knowledge so do not expect an answer from me.

My interest was mostly to get IR/CEC wake-up from power-off mode to work, but the HK bl301.bin source code was not available at the time so I used the firmware code from the Amlogic source release instead.
I only researched how the BL2 was signed with the single goal of making it possible to use the same u-boot.bin for both SD and eMMC boot (the same way that can be done with bl1.bin.hardkernel).
Since I managed to modify BL2 and write working header/checksum I wanted to let you know my findings as it correlated with your research.
cyrozap wrote:I'm not sure if you're aware, but you can very easily decompile Python byte-code with a tool like uncompyle6, so acs_tool.pyc can be replaced with a Free version, too.
Cool, then I guess it should be rather easy to write a tool to replace/combine fip_create, acs_tool.pyc and aml_encrypt_gxb when creating the final u-boot.bin blob.

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia

Unread post by brad » Mon Sep 05, 2016 11:33 am

I have done some testing and experimenting and here are my notes; take them with a grain of salt, as I may be on the wrong track.

1.752 GHz x 4 cores crashes at times, usually when thermal throttling scales back or if I start a 5th process with the 4 cpuburn threads running. This only occurs with cpuburn running, although I suspect the other stress tests do not load the CPU this much. The crashes are generally "invalid instruction" and occur in random processes: top, sensors, cpuburn-a53. Based on what others reported, I suspected a cache / DDR issue or maybe a scheduler that cannot keep up.

I attempted to disable multi-core features and simplify memory and scheduler support. I found that many of the Amlogic drivers and ports require specific drivers. I disabled the HDMI port, Mali, VPU, display driver, audio codec, meson_timer, video codec and a few more kernel features requiring the hardkernel-specific memory setup, and after fixing a few typos I found in the kernel source I got it to almost compile using the generic ARM memory controller. It got all the way through the compile until it failed at the end, linking the memory page controller and the odroidc2 battery controller (it was complaining about no references to codecs). I plan to try to overcome the last 2 hurdles when I have time.

I found crashes were more likely to occur during frequency changes when thermal throttling (maybe it cannot handle the CPU delay when switching, or switching exaggerates the problem). I have played with the CPU voltage tables to give a more uniform and smoother transition of voltage, which appears to have slightly increased stability, but I still see crashes at times with cpuburn. I decreased the board thermal limit so it runs no hotter than approx 77C. The main changes were to set the core voltage at the high frequencies to 1.140v (up from 1.100v) and to raise the voltages at the lower frequencies so that they change by no more than 0.01v per step. No real benefits gained. I also attempted to reduce the core voltage to 1v at the higher frequencies, but it failed as soon as it ran.
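The "no more than 0.01v per step" rule can be checked mechanically. A minimal sketch, assuming the table is a list of (frequency in kHz, voltage in mV) pairs: the frequencies are the ones listed earlier in this thread, but the voltages are invented for illustration (only the 1140 mV top value is from the post), and the actual table format in dvfs_board.c is not shown here.

```python
def steps_too_large(table, max_step_mv=10):
    """Return (freq_low, freq_high, delta_mv) for adjacent entries whose
    voltage difference exceeds max_step_mv."""
    ordered = sorted(table)
    violations = []
    for (f_lo, v_lo), (f_hi, v_hi) in zip(ordered, ordered[1:]):
        delta = abs(v_hi - v_lo)
        if delta > max_step_mv:
            violations.append((f_lo, f_hi, delta))
    return violations

# Hypothetical voltages; integer millivolts avoid float rounding problems.
example = [
    (100000, 1090), (250000, 1100), (500000, 1110),
    (1000000, 1120), (1296000, 1130), (1536000, 1140),
]
print(steps_too_large(example))
print(steps_too_large([(1296000, 1100), (1536000, 1140)]))
```

The first call reports nothing because every step is exactly 10 mV; the second reports the single 40 mV jump.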

Theories:

- Reached a hardware limit somewhere, possibly thermal (yes captain obvious)
- L1 or L2 cache cannot keep up
- DDR cannot keep up
- A power regulator cannot supply the current or voltage required
- Scheduler cannot keep up, ie the M3 - Can we measure the load on the scheduler somehow?
- Bug in software settings or config.

Questions:

1) Can the cache voltages be modified? Arm specify that they should always be above the core voltages and I would love to know what they are, is there any adjustment in the Trusted Firmware to modify clock also?

2) Can the ddr voltage or frequency be modified? I see the ddr voltage is regulated at 1.5v by the MP2161GJ-C499 so cannot change voltage but can the clock be changed?

3) Can the M3 config be modified in the Trusted Firmware? Can clock speed or voltage be modified, or can we tell the CPU to wait for it when it needs new work somehow instead of using cache / DDR when it's not ready? Also how about the GIC, could an interrupt be taking too long and causing a bug in the code somewhere?

4) Smoothing out the voltage rails: I have access to the 5v and 3.3v rails via the GPIO pins, which I can attempt to graph under load when I have the chance; the 1.8v rail can also be measured to some degree on the GPIO pins as there is an ADC output from the CPU. Not sure about measuring the DDR 1.5v rail, but I might be able to find it somewhere on the board. I suspect I can place a capacitor on the GPIO pins for the 5v & 3.3v rails, but I doubt this would be of help.

5) I strongly recommend setting "Panic on Oops" in the kernel config when testing high frequencies. If a kernel process Oopses under heavy load and freezes while the 4 cores are still loaded (with cpuburn), they continue to run (or spin) and the kernel loses control of thermal management. The board gets way too hot to the touch; this is probably a good way to melt / destroy your board.
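Before starting a high-frequency stress run it is easy to verify that the setting is active. A minimal sketch that reads the standard kernel.panic_on_oops sysctl through /proc; the function name is made up.

```python
from pathlib import Path

def panic_on_oops_enabled(path="/proc/sys/kernel/panic_on_oops"):
    """Return True/False for the sysctl value, or None if the file is missing."""
    p = Path(path)
    if not p.exists():
        return None
    return p.read_text().strip() == "1"

print(panic_on_oops_enabled())
```

The same value can be set at build time with CONFIG_PANIC_ON_OOPS or at runtime by writing 1 to that file as root.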

Overall at the moment with current state of affairs C2 is limited to 1.68GHz with 4cpu's from my testing in cpuburn conditions, anything higher is prone to invalid instruction errors or kernel freezes even before chip temps get hot, so no amount of external cooling is going to help in my opinion for stability.

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO

Unread post by mlinuxguy » Mon Sep 05, 2016 11:47 am

brad wrote:Overall at the moment with current state of affairs C2 is limited to 1.68GHz with 4cpu's from my testing in cpuburn conditions, anything higher is prone to invalid instruction errors or kernel freezes even before chip temps get hot, so no amount of external cooling is going to help in my opinion for stability.
How long do you run cpuburn before seeing temperature transition points hit? With the cut-down northbridge cooler I never even come close to the thermal limit settings (1.75ghz) after an hour of cpuburn.

You are on the stock cooler correct?

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia

Unread post by brad » Mon Sep 05, 2016 11:51 am

Kwiboo wrote: I only researched how the BL2 was signed with the single goal of making it possible to use the same u-boot.bin for both SD and eMMC boot (the same way that can be done with bl1.bin.hardkernel).
Do you know what BL301 is in the Amlogic ARM Trusted Firmware? Is this the actual boot code which would normally find its home in a ROM somewhere on the board? The foundation sources do not list this, so I assume it only exists in the licenced version of the ATF? The ATF doco for the foundation module suggests (https://github.com/ARM-software/arm-tru ... -design.md)....
The cold boot path in this implementation of the ARM Trusted Firmware is divided into five steps (in order of execution):

Boot Loader stage 1 (BL1) AP Trusted ROM
Boot Loader stage 2 (BL2) Trusted Boot Firmware
Boot Loader stage 3-1 (BL31) EL3 Runtime Firmware
Boot Loader stage 3-2 (BL32) Secure-EL1 Payload (optional)
Boot Loader stage 3-3 (BL33) Non-trusted Firmware
We don't use BL32, which is the optional secure component.

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia

Unread post by brad » Mon Sep 05, 2016 11:56 am

mlinuxguy wrote:
brad wrote:Overall at the moment with current state of affairs C2 is limited to 1.68GHz with 4cpu's from my testing in cpuburn conditions, anything higher is prone to invalid instruction errors or kernel freezes even before chip temps get hot, so no amount of external cooling is going to help in my opinion for stability.
How long do you run cpuburn before seeing temperature transition points hit? With the cut-down northbridge cooler I never even come close to the thermal limit settings (1.75ghz) after an hour of cpuburn.

You are on the stock cooler correct?
Yes stock board and cooler.

- Approx 3.5 mins to get from 35C up to 85C with thermal limits from hardkernel
- Approx 1.5 mins to get from 35C to around 75C with new thermal limits I added increasing core voltage to 1.14
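Expressed as average heating rates, the two figures compare as follows. A small worked sketch using only the numbers from the two bullets; the function name is made up and a linear temperature rise is assumed.

```python
def heating_rate(start_c, end_c, minutes):
    """Average temperature rise in degrees C per minute."""
    return (end_c - start_c) / minutes

stock = heating_rate(35, 85, 3.5)    # hardkernel thermal limits
raised = heating_rate(35, 75, 1.5)   # core voltage raised to 1.14 V
print(f"stock: {stock:.1f} C/min, raised voltage: {raised:.1f} C/min")
print(f"ratio: {raised / stock:.2f}x")
```

That is roughly 14.3 C/min against 26.7 C/min, so on these figures the board heats up nearly twice as fast with the higher core voltage.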

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO

Unread post by mlinuxguy » Mon Sep 05, 2016 1:28 pm

brad wrote:- Approx 3.5 mins to get from 35C up to 85C with thermal limits from hardkernel
- Approx 1.5 mins to get from 35C to around 75C with new thermal limits I added increasing core voltage to 1.14
My 4-cores at 1.75ghz (northbridge cooler)

Code: Select all

seconds, temp, freq
1775,68,1752000
1780,68,1752000
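A log in this "seconds, temp, freq" format is easy to check for throttling. A minimal sketch that parses the two rows above and reports the peak temperature and whether the frequency ever dropped below the target; the function name is made up and the target of 1752000 kHz is taken from the log itself.

```python
import csv
import io

LOG = """seconds, temp, freq
1775,68,1752000
1780,68,1752000
"""

def parse_log(text):
    """Parse the CSV log into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{key: int(value) for key, value in row.items()} for row in reader]

rows = parse_log(LOG)
peak_temp = max(row["temp"] for row in rows)
throttled = any(row["freq"] < 1752000 for row in rows)
print(peak_temp, throttled)
```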
What are you modifying to get 1.14v ?
This file? /u-boot-hack/include/configs/odroidc2.h

Code: Select all

/* Platform power init config */
#define CONFIG_PLATFORM_POWER_INIT
#define CONFIG_VCCK_INIT_VOLTAGE    1100
#define CONFIG_VDDEE_INIT_VOLTAGE   1070    // voltage for power up
#define CONFIG_VDDEE_SLEEP_VOLTAGE  850 // voltage for suspend
My DDR3 changes in that file had no effect; it's handled in BL30.bin.
So I'm not sure whether changing the voltage there would work; if so, very interesting...

Or this file?
/u-boot-hack/board/hardkernel/odroidc2/firmware/scp_task/dvfs_board.c
Last edited by mlinuxguy on Mon Sep 05, 2016 1:39 pm, edited 1 time in total.

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO

Unread post by mlinuxguy » Mon Sep 05, 2016 1:36 pm

brad wrote:Attempted to disable multi core features, and simplify memory and scheduler support. I found many of the amlogic drivers and ports require specific drivers. I disabled HDMI port, Mali, VPU, display driver, audio codec, meson_timer, video codec and a few more features in the kernel requiring the hardkernel specific memory setup and managed to get it to almost compile after a few typos I found and fixed in the kernel source using the generic arm memory controller. Got all the way through the compile until it failed at the end linking the memory page controller and odroidc2 battery controller (was complaining about no references to codecs). Plan to attempt to try overcome the last 2 hurdles when I have time.
I successfully stripped down the DTS file to its minimum, compiled and built kernel + DTB
However, it hangs at "Starting kernel"
Not sure HK setup of the board expects GPU to be completely missing from the DTS tree (I also stripped out the codecs).

I've been trying to get the 4.8-rc4 kernel to use rootfs off NFS (eMMC doesn't work for me no matter how many patches I throw at it)
However with the 1gb nic freezing I doubt I'll be able to get NFS to work, will wait for that fix
The reasoning behind that is that not only can I test the L2 cache with all the L2 fixes starting at 3.18 for arm64, but it's also a bare minimum DTS tree with no GPU

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia

Unread post by brad » Mon Sep 05, 2016 1:53 pm

mlinuxguy wrote:
brad wrote:- Approx 3.5 mins to get from 35C up to 85C with thermal limits from hardkernel
- Approx 1.5 mins to get from 35C to around 75C with new thermal limits I added increasing core voltage to 1.14
My 4-cores at 1.75ghz (northbridge cooler)

Code: Select all

seconds, temp, freq
1775,68,1752000
1780,68,1752000
What are you modifying to get 1.14v ?
This file? /u-boot-hack/include/configs/odroidc2.h

Code: Select all

/* Platform power init config */
#define CONFIG_PLATFORM_POWER_INIT
#define CONFIG_VCCK_INIT_VOLTAGE    1100
#define CONFIG_VDDEE_INIT_VOLTAGE   1070    // voltage for power up
#define CONFIG_VDDEE_SLEEP_VOLTAGE  850 // voltage for suspend
My DDR3 changes in that file had no effect, its handled in BL30.bin.
So not sure if you changed voltage there it would work, if so very interesting...
CPU scaling core frequency and voltage tables are found in U-boot under board/hardkernel/odroidc2/firmware/scp_task/dvfs_board.c

DDR Frequency should be in the include/configs/odroidc2.h .....

Code: Select all

/* Clock range : 384~1200MHz, should be multiple of 24 */
#define CONFIG_DDR_CLK			912
#define CONFIG_DDR_TYPE			CONFIG_DDR_TYPE_DDR3
I don't think the DDR voltage of 1.5v can be controlled, as it is supplied by a fixed regulator on the board. I will try to modify the DDR frequency to see if there is an improvement.
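The constraint in the header comment (384 to 1200 MHz, multiple of 24) can be checked before rebuilding u-boot. A minimal sketch with made-up function names; whether the value takes effect at all is a separate question, since it was reported earlier in this thread that DDR changes in this file had no effect.

```python
def is_valid_ddr_clk(mhz, low=384, high=1200, step=24):
    """True if mhz is inside the documented range and a multiple of step."""
    return low <= mhz <= high and mhz % step == 0

def round_down_ddr_clk(mhz, step=24):
    """Nearest multiple of step that is not above mhz."""
    return (mhz // step) * step

print(is_valid_ddr_clk(912))     # current CONFIG_DDR_CLK value
print(is_valid_ddr_clk(900))     # not a multiple of 24
print(round_down_ddr_clk(900))   # closest lower candidate
```

The current value of 912 is 38 x 24, so it passes; 900 does not, and the closest lower candidate is 888.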

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia

Unread post by brad » Mon Sep 05, 2016 2:09 pm

mlinuxguy wrote:
brad wrote:Attempted to disable multi core features, and simplify memory and scheduler support. I found many of the amlogic drivers and ports require specific drivers. I disabled HDMI port, Mali, VPU, display driver, audio codec, meson_timer, video codec and a few more features in the kernel requiring the hardkernel specific memory setup and managed to get it to almost compile after a few typos I found and fixed in the kernel source using the generic arm memory controller. Got all the way through the compile until it failed at the end linking the memory page controller and odroidc2 battery controller (was complaining about no references to codecs). Plan to attempt to try overcome the last 2 hurdles when I have time.
I successfully stripped down the DTS file to its minimum, compiled and built kernel + DTB
However, it hangs at "Starting kernel"
Not sure HK setup of the board expects GPU to be completely missing from the DTS tree (I also stripped out the codecs).

I've been trying to get the 4.8-rc4 kernel to use rootfs off NFS (eMMC doesn't work for me no matter how many patches I throw at it)
However with the 1gb nic freezing I doubt I'll be able to get NFS to work, will wait for that fix
The reasoning behind that is not only can I test the L2 cache with all the L2 fixes starting at 3.18 for arm64, but its also a bare minimum DTS tree with no GPU
Did you strip out the AML TTY serial console driver? I believe it would still be required to get kernel output via the UART at boot. Is the blue light flashing? (You may have stripped out the LED driver as well.)

Regarding NFS on 4.x kernels: it works fine at 100Mbit if you have the NFS async parameter set; if you set it to sync it runs slowly but works fine as well. Gigabit still has issues. The eMMC driver is slow (but working for me) in 4.8 at the moment, but NFS at 100Mbit performs much better. On boot, 4.8 is locked to the frequency set in the Hardkernel u-boot include/configs/odroidc2.h, which is currently set to 1536 I believe. I need to find where it is set in the new mainline u-boot to be able to test different frequencies on 4.8, as there is no cpufreq support in the 4.8 kernel yet. Patches for this are currently in development, so hopefully a working setup where the frequency can be scaled in the kernel is not too far away. https://patchwork.kernel.org/project/li ... ogic/list/

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mlinuxguy » Mon Sep 05, 2016 2:15 pm

brad wrote:I will try to modify the DDR frequency to see if there is an improvement.
That will not work, refer to my previous post here: http://forum.odroid.com/viewtopic.php?f ... 50#p158397

I emailed HK, Joy replied they are looking into it

tkaiser
Posts: 671
Joined: Mon Nov 09, 2015 12:30 am
languages_spoken: english
ODROIDs: C1+, C2, XU4, HC1
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by tkaiser » Mon Sep 05, 2016 4:58 pm

mlinuxguy wrote:HK sent me an email that they are investigating DDR3 timings ... HOWTO disable GPU power for headless
Just to be clear: Currently I know nothing about how things work with S905 and I also don't care about RPi stuff (RPI 3 is a toy lacking IO and network bandwidth and pretty much uninteresting). I've some experiences with Allwinner SoCs where everything can be configured (though you need to know what you're doing but you're free to destroy the device or choose settings that are dumb and lead to both high consumption and low performance).

Recent research showed that on Allwinner SoCs disabling both HDMI/GPU reduces consumption/temperatures a little and also increases memory bandwidth a lot. This is what I'm interested in since more memory bandwidth helps also with the use cases in question (databases, the ODROID's eMMC is simply awesome!). Your reported 4.72 khash/s at 1536 MHz are less than my 4.76 khash/s (so given these numbers are correct I benefit from higher memory bandwidth since cpuminer is sensitive here -- my Armbian install uses settings that are a few weeks old, no new and shiny bl30.bin and so on).

Seems when dealing with new versions of bl30.bin it could be an idea to check quickly https://github.com/ssvb/tinymembench results?

tkaiser
Posts: 671
Joined: Mon Nov 09, 2015 12:30 am
languages_spoken: english
ODROIDs: C1+, C2, XU4, HC1
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by tkaiser » Mon Sep 05, 2016 5:16 pm

crashoverride wrote: That data can arrive from DRAM perfectly fine and still be corrupted in the ARM registers due to processor speed.
Sure, one has to test the whole process. With Allwinner SoCs we used lima-memtester in the past: https://github.com/ssvb/lima-memtester

The idea behind it: since CPU and GPU cores share DRAM access, let the GPU cores run something heavy and memory-dependent while running memtester on the CPU cores (the test has to be done by many users). Collect the results and use 24 MHz less than what has been reported as reliable by the weakest board (BTW: it's interesting how much heat even the dog slow and old Mali400MP2 is able to generate). As a result we use just 624 MHz in Armbian for all H3 boards since we can not trust in the vendor's 672 MHz any more :)
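The selection rule described above (take the weakest board's reliable clock and back off by one 24 MHz step) can be sketched in a few lines of Python. The function name and the list of reports are hypothetical and only illustrate the arithmetic; they are not real lima-memtester results.

```python
DRAM_STEP_MHZ = 24  # safety margin described in the post above


def safe_dram_clock(reliable_mhz_per_board: list[int]) -> int:
    """Pick one step (24 MHz) below the weakest board's reliable clock."""
    if not reliable_mhz_per_board:
        raise ValueError("need at least one test report")
    return min(reliable_mhz_per_board) - DRAM_STEP_MHZ


# Hypothetical user reports in MHz, not real measurements.
reports = [696, 648, 672, 720]
print(safe_dram_clock(reports))  # 624
```

The weakest board dictates the result, which is why the test needs many participants before the number can be trusted.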

The reason why I'm talking about this DRAM stuff at all here is since I'm interested in disabling display/GPU fully (in the hope I get then less consumption, less temperatures but more memory bandwidth -- to be confirmed if this works with Amlogic at all) and since I wanted to point out that the memory controller also might be responsible for generating heat and increasing consumption so that lowering memory clockspeeds might help with an overall cooling/consumption strategy (and since you guys have to deal with blobs and were already fooled by CPU clockspeeds I would have a closer look what these blobs do here)

Kwiboo
Posts: 15
Joined: Mon Aug 08, 2016 10:27 am
languages_spoken: english, swedish
ODROIDs: C1+, C2
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Kwiboo » Mon Sep 05, 2016 8:06 pm

brad wrote:Do you know what BL301 is? Is it the Amlogic ARM Trusted Firmware, the actual boot code which would normally find its home in a ROM somewhere on the board? The foundation sources do not list it, so I assume it only exists in the licensed version of the ATF? The ATF documentation for the foundation module suggests (https://github.com/ARM-software/arm-tru ... -design.md)....
BL301 is a small part of BL30/SCP_BL2 running on the Cortex-M3 that isn't closed-source; it mainly has a secure, high and low task process loop for dvfs and suspend handling.
mlinuxguy wrote:My DDR3 changes in that file had no effect, its handled in BL30.bin.
So not sure if you changed voltage there it would work, if so very interesting...
power_init in power.c is called from BL2; this part of BL2 has been moved to BL21 in newer Amlogic u-boot code.
The DDR settings and timings in timing.c are built into acs.bin, but that is never merged into BL2 for the C2 since bl1.bin.hardkernel is used as BL2.

crashoverride
Posts: 4223
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by crashoverride » Mon Sep 05, 2016 10:29 pm

tkaiser wrote:Recent research showed that on Allwinner SoCs disabling both HDMI/GPU reduces consumption/temperatures a little and also increases memory bandwidth a lot.
The GPU (Mali) should not be doing anything unless told to do so by GLES. The display controller, however, will constantly access DRAM to refresh the display. This does take bandwidth with a correlation to display size and depth. This is also the reason I do not think that DRAM speed/volt/etc has any relation to this. If the DRAM controller could not keep up or overheated, the display controller would also be receiving erroneous data. This would present as randomly changing graphical "garbage" across the display. If the DRAM controller works at its current speed with the CPU @ 1.5GHz, then it should work regardless of whether the CPU is @ 1Hz or 2GHz. A faster/slower CPU does not pull data any faster than the DRAM controller allows running at a given clock speed. A faster CPU simply has more wait cycles before data is available on the bus.

The point of this post is not to say changing the DRAM settings shouldn't be tested. Rather, the point is that people expecting a DRAM setting to solve the "2GHZ problem" will likely be disappointed by the test results.

[edit]
Another way to put it is:
Lowering the DRAM clock is effectively lowering the CPU clock because the CPU takes more wait cycles than it previously did. This can erroneously lead people to conclude that a lower DRAM clock allows the CPU to work at faster speeds. CPU stress testing needs to take place at the uboot level, not the linux userland level, and fit in L1 cache to eliminate DRAM from the equation.
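The wait-cycle argument can be put into numbers with a rough Python sketch. The 100 ns DRAM latency is an assumed round figure, not a measured value for the C2, and the function name is made up for illustration.

```python
def stall_cycles(dram_latency_ns: float, cpu_mhz: float) -> float:
    """CPU cycles spent waiting for one DRAM access of fixed latency."""
    return dram_latency_ns * cpu_mhz / 1000.0


# Assumed latency of 100 ns; the wall-clock wait is the same at both clocks,
# only the number of idle CPU cycles changes.
print(stall_cycles(100, 1536))  # 153.6
print(stall_cycles(100, 2016))  # 201.6
```

In this simple model the DRAM access takes the same time at either CPU clock, so a memory-bound test says little about whether the cores themselves are stable at the higher frequency.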

tkaiser
Posts: 671
Joined: Mon Nov 09, 2015 12:30 am
languages_spoken: english
ODROIDs: C1+, C2, XU4, HC1
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by tkaiser » Mon Sep 05, 2016 11:31 pm

crashoverride wrote:The point of this post is not to say changing the DRAM settings shouldn't be tested. Rather, the point is that people expecting a DRAM setting to solve the "2GHZ problem" will likely be disappointed by the test results.
It seems I hijacked the wrong thread ;)

For me there exists no "2GHZ problem" at all. ODROID-C2 is performing very well in all relevant areas (awesome high sequential and random IO performance of the eMMC, nice USB 2.0 performance especially given that an USB hub is in the way, CPU fast enough for any interesting use cases -- GPU/video is not interesting to me). And GbE maxes out in both directions after applying the usual tweaks [1].

It's only that relationships between DRAM clockspeed and temperature/consumption of other SoCs are known (and I would assume that applies to Amlogic as well) so in case Amlogic provides now new blobs tweaking settings one should have a look how memory bandwidth behaves (since if all tests here are only relying on sysbench then Amlogic might be able to provide settings with higher CPU clockspeeds while having to decrease DRAM clock -- then sysbench numbers look better while real performance decreased). In my eyes a reasonable attempt to tweak cpufreq/dvfs settings further would be to allow single-threaded tasks reach up to 2.0GHz while limiting multi-threaded loads on all CPU cores to the 1.5GHz we already have (since efficiency is an issue and throttling as well without an annoying fan)

Anyway, for my use case more memory bandwidth is way more important than further increasing CPU clock so disabling display controller is the way to go. Thanks!

Edit: Setting setenv nographics "1" in /boot/boot.ini works in the opposite direction. 4.76 khash/s before, 4.72 khash/s after. Funny :shock:

Edit 2: Quick check with tinymembench confirms that memory throughput benefits from disabling display (as expected): http://pastebin.com/MpnWiNcy So surprisingly cpuminer shows different numbers for different reasons and can not be used to identify memory bandwidth effects.

[1] 'Shot in the dark' try below increased the ~800 Mbits/sec in RX direction also to ~935 Mbits/sec like with TX already:

Code: Select all

echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
sysctl -w net.core.rmem_max=26214400
sysctl -w net.core.wmem_max=26214400
sysctl -w net.core.rmem_default=514400
sysctl -w net.core.wmem_default=514400
sysctl -w net.ipv4.tcp_rmem='10240 87380 26214400'
sysctl -w net.ipv4.tcp_wmem='10240 87380 26214400'
sysctl -w net.ipv4.udp_rmem_min=131072
sysctl -w net.ipv4.udp_wmem_min=131072
sysctl -w net.ipv4.tcp_timestamps=1
sysctl -w net.ipv4.tcp_window_scaling=1
sysctl -w net.ipv4.tcp_sack=1
sysctl -w net.core.optmem_max=65535
sysctl -w net.core.netdev_max_backlog=5000
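The `sysctl -w` settings above are lost on reboot. One way to keep them, sketched here in Python, is to convert the commands into `key = value` lines suitable for a file under /etc/sysctl.d/ (the file name would be your own choice). The helper name is hypothetical, and the two `echo` lines that write to /proc and /sys are not covered by it.

```python
import shlex


def to_sysctl_conf(commands: list[str]) -> list[str]:
    """Convert 'sysctl -w key=value' commands into 'key = value' lines."""
    lines = []
    for command in commands:
        tokens = shlex.split(command)  # also strips the shell quotes
        if tokens[:2] != ["sysctl", "-w"] or len(tokens) != 3:
            raise ValueError(f"unsupported command: {command!r}")
        key, _, value = tokens[2].partition("=")
        lines.append(f"{key} = {value}")
    return lines


commands = [
    "sysctl -w net.core.rmem_max=26214400",
    "sysctl -w net.ipv4.tcp_rmem='10240 87380 26214400'",
]
for line in to_sysctl_conf(commands):
    print(line)
```

Multi-value settings such as tcp_rmem lose their shell quotes in the output, which is the form the sysctl configuration file expects.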

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mlinuxguy » Tue Sep 06, 2016 3:25 am

crashoverride wrote:The point of this post is not to say changing the DRAM settings shouldn't be tested. Rather, the point is that people expecting a DRAM setting to solve the "2GHZ problem" will likely be disappointed by the test results.
DRAM clocking and power req's are at least worth poking a stick at, when exploring a black box all variables are holes sticks can go in.... :evil:

Right now I see the following areas we could investigate:
(1) Disable all unneeded power draws (GPU, I2C, etc.) and check if that changes any results at higher CPU clocks
(2) If clock skew (ref: https://en.wikipedia.org/wiki/Clock_skew) is the root cause, one option is to disable all clocks besides the CPU (similar to #1) OR adjust the clocks we have knobs for (including DDR3) to see if we can shift the timing into sync
(3) Investigate varying the power to the various PWM driven subsystems that we have control over (note: this is separate from the DVFS cpu clocks)
(4) Determine if upstream kernel with pared down DTS tree and working L2 cache is stable at any higher clocks (could indicate if the above steps are worth deep investigation)

Note: If the upstream kernel with L2 cache patches improves performance over our current L2 cache, then it "could" offset lowered DDR3 ram timings if we find DDR3 changes help.

Running at 1.75GHz (or greater) on a CPU with 28nm geometry is going to generate more heat. Using a cut-down Northbridge heatsink I don't see heat issues; instead I hit memory access errors. Clock skew, insufficient power, poor memory controller design, etc. I have no idea where the clock train derails... but it's a fun black box to poke at, and determining the root cause by exploring a system's limits tends to illuminate all kinds of dark corners in your system (look at what this thread has uncovered just regarding BL30.bin).

On the cache topic, the L2 cache flush patches I was looking at earlier this year, starting at the 3.18 kernel, fixed issues with the kernel improperly flushing the entire cache way too often. Are the extra cache flushes affecting our current performance? No idea, but a use case of headless database could really use a properly working L2 cache. If we end up running at higher clocks, an L2 cache that properly works will boost overall performance, especially if we adjust DDR3 RAM timings.

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Thu Sep 08, 2016 9:08 pm

Nothing new from the Amlogic or HK?

mad_ady
Posts: 5227
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1
Location: Bucharest, Romania
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mad_ady » Thu Sep 08, 2016 9:12 pm

They've just released a kernel + uboot update to enable the extra frequencies. 1.75 is stable for me on 4 cores, but going above (even on one core) fails to boot properly. Odroid said there will be more changes in the future with regard to DRAM clock speed: http://forum.odroid.com/viewtopic.php?f ... 19#p158919

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Thu Sep 08, 2016 10:01 pm

mad_ady wrote:They've just released a kernel + uboot update to enable the extra frequencies. 1.75 is stable for me on 4 cores, but going above (even on one core) fails to boot properly. Odroid said there will be more changes in the future with regard to DRAM clock speed: http://forum.odroid.com/viewtopic.php?f ... 19#p158919
Does this update correct the false 2GHz frequency?
If it is stable at 1.65GHz together with the increased memory frequency, the C2 will certainly be even better!

@odroid

Bl4ckD0g
Posts: 48
Joined: Sat Apr 09, 2016 3:18 am
languages_spoken: english
ODROIDs: C2
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Bl4ckD0g » Thu Sep 08, 2016 10:34 pm

mad_ady wrote:They've just released a kernel + uboot update to enable the extra frequencies. 1.75 is stable for me on 4 cores, but going above (even on one core) fails to boot properly. Odroid said there will be more changes in the future with regard to DRAM clock speed: http://forum.odroid.com/viewtopic.php?f ... 19#p158919
I have upgraded mine, but the system is hung after I rebooted it :roll:
Only when I arrive there I'll be able to power cycle it.

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Fri Sep 09, 2016 2:48 am

Unfortunately HK set the default frequency to only 1.54GHz.
I thought they would have tried to bring it to at least 1.65GHz, but apparently that was not possible. :(

mad_ady
Posts: 5227
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4, C1+, C2, N1
Location: Bucharest, Romania
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mad_ady » Fri Sep 09, 2016 4:02 am

Well, keeping it at 1.5GHz keeps the performance and heat at the previous levels. You're free to try new settings based on your needs, but you should treat the new frequencies as overclocking - there might be stability issues.

majorowe
Posts: 33
Joined: Fri Jul 29, 2016 12:51 am
languages_spoken: english français deutsch espanol
ODROIDs: C2
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by majorowe » Fri Sep 09, 2016 5:48 am

Don't get me wrong, I love my C2 so far, but it is nevertheless slightly disappointing to learn the 4x2GHz board I bought is actually only (stable at) 4x1.5GHz.

Hardkernel have updated their product descriptions, as has pollin.de, but you still see

Amlogic S905 (ARM® Cortex®-A53(ARMv8) 2Ghz quad core CPU)

on eu.diigiit.com and other sites.

Personally I'm happy to rest zen over the issue, but HK would want to be more careful in future.

odroid
Site Admin
Posts: 29651
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by odroid » Fri Sep 09, 2016 8:48 am

If you update kernel with sudo apt update && sudo apt upgrade && sudo apt dist-upgrade, you will have "3.14.77-80" or higher.
And there will be new items in the new boot.ini to control the CPU clock options.
https://github.com/mdrjr/c2_bootini/blo ... t.ini#L101

Refer to this wiki page too.
http://odroid.com/dokuwiki/doku.php?id= ... t_cpu_freq

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Fri Sep 09, 2016 9:43 am

odroid wrote:If you update kernel with sudo apt update && sudo apt upgrade && sudo apt dist-upgrade, you will have "3.14.77-80" or higher.
And there will be new items in the new boot.ini to control the CPU clock options.
https://github.com/mdrjr/c2_bootini/blo ... t.ini#L101

Refer to this wiki page too.
http://odroid.com/dokuwiki/doku.php?id= ... t_cpu_freq
I will try setting my maximum frequency to 1656MHz, as so far it has been quite stable.
Thank you, friend!
Another thing... I'm very much a noob.
How can I change the boot.ini?

odroid
Site Admin
Posts: 29651
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by odroid » Fri Sep 09, 2016 9:51 am

Connect your eMMC/SD card to your PC and use a text editor.

Or use vi or nano editor in command line on the board. The full path is /media/boot/boot.ini

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Fri Sep 09, 2016 10:06 am

odroid wrote:Connect your eMMC/SD card to your PC and use a text editor.

Or use vi or nano editor in command line on the board. The full path is /media/boot/boot.ini

Got it... Now I know why, when I tried to use LibreOffice, the boot just stopped with the red and blue lights lit. Haha

odroid
Site Admin
Posts: 29651
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by odroid » Fri Sep 09, 2016 3:05 pm

Please refer to this post for the DDR clock boosting.
http://forum.odroid.com/viewtopic.php?f=135&t=23540

@mlinuxguy,
lowering the DDR clock doesn't improve the CPU overclock.

odroid
Site Admin
Posts: 29651
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by odroid » Fri Sep 09, 2016 3:46 pm

@Snk,
Connect your SD card to your host PC and edit the boot.ini file to lower the clock frequency.

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mlinuxguy » Fri Sep 09, 2016 4:11 pm

@odroid yes I ran tests and posted results to this thread: http://forum.odroid.com/viewtopic.php?f ... 40#p159046
We lack voltage control over DDR3, but otherwise it does not help stability at higher clocks
Not sure voltage control would gain us much anyway, there is either a clock skew issue or power issue with multiple cores active at higher frequencies.

So far 1.75ghz, 4-cores and DDR3=1104 is stable for me... nice little boost in performance
I've built multiple kernels on it, ran benchmarks, and had no issues.

Next up for me will be the upstream kernel with a working L2 cache
That should be a big jump forward in performance...
ref: thread on the upstream kernel --> http://forum.odroid.com/viewtopic.php?f ... 17#p158716

Snk
Posts: 275
Joined: Sun Jul 31, 2016 6:43 am
languages_spoken: Portuguese
ODROIDs: XU4 + eMMC 32GB + UART
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Snk » Fri Sep 09, 2016 8:16 pm

odroid wrote:@Snk,
Connect your SD card to your host PC and edit the boot.ini file to lower the clock frequency.
After changing the frequency to 1656MHz using WordPad, my C2 lights up both LEDs and does not start.
I edited the last line and left it like this:
# MAX Frequency
# Setenv max_freq " 2016 " # 2.016GHz
# Setenv max_freq " 1944 " # 1.944GHz
# Setenv max_freq " 1944 " # 1.944GHz
# Setenv max_freq " 1920 " # 1.920GHz
# Setenv max_freq " 1896 " # 1.896GHz
# Setenv max_freq " 1752 " # 1.752GHz
# Setenv max_freq " 1680 " # 1.680GHz
# Setenv max_freq " 1656 " # 1.656GHz
setenv max_freq " 1656 " # 1.656GHz

I also tried without WordPad, but the board still sits there with the blue LED lit all the time.
After returning to the original boot.ini, it works again.

Edit: I did it! :)
Using vi is much better to edit.
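To double-check which max_freq value a boot.ini actually selects after editing, a small Python sketch like the following can help. It assumes the `setenv max_freq "<MHz>"` line format from the boot.ini excerpt above; the function name is hypothetical and not part of any Hardkernel tool.

```python
import re
from typing import Optional

# Matches an uncommented line such as: setenv max_freq "1656"  # 1.656GHz
_MAX_FREQ = re.compile(r'^\s*setenv\s+max_freq\s+"\s*(\d+)\s*"')


def active_max_freq(boot_ini_text: str) -> Optional[int]:
    """Return the last uncommented max_freq value in MHz, or None."""
    value = None
    for line in boot_ini_text.splitlines():
        match = _MAX_FREQ.match(line)
        if match:
            value = int(match.group(1))
    return value


sample = (
    '# setenv max_freq "1752"  # 1.752GHz\n'
    'setenv max_freq "1656"  # 1.656GHz\n'
)
print(active_max_freq(sample))  # 1656
```

Lines starting with `#` are ignored, so only the one uncommented entry counts. If the function returns None, no active max_freq line was recognised.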

DarkBahamut
Posts: 322
Joined: Tue Jan 19, 2016 10:19 am
languages_spoken: english
ODROIDs: XU4, N1
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by DarkBahamut » Sat Sep 10, 2016 4:26 am

majorowe wrote:Don't get me wrong, I love my C2 so far, but it is nevertheless slightly disappointing to learn the 4x2GHz board I bought is actually only (stable at) 4x1.5GHz.

Hardkernel have updated their product descriptions, as has pollin.de, but you still see

Amlogic S905 (ARM® Cortex®-A53(ARMv8) 2Ghz quad core CPU)

on eu.diigiit.com and other sites.

Personally I'm happy to rest zen over the issue, but HK would want to be more careful in future.
I'm generally in the camp that if the performance is what you expected then I don't think the clock speed matters too much. Hardkernel's provided benchmarks would have been at 1.5GHz after all. That said, having the correct advertised information is important, I agree.

I feel it's worth noting this isn't the first time the information has been (or is) wrong though. The XU4's product and wiki pages list incorrect DDR3 clockspeeds and have done for years. That will definitely impact performance in bandwidth intensive applications or when using the GPU. It's even been brought up before and the pages still remain incorrect to this day.

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by brad » Sat Sep 10, 2016 8:30 am

Snk wrote:Edit: I did it!
Using vi is much better to edit.
Hi Snk, if you would like to edit the config files in a Windows gui program you can try Notepad++ (https://notepad-plus-plus.org/)

Normal word processing programs such as Wordpad or Libreoffice will change the format of the file and cause errors.

LordadmiralDrake
Posts: 82
Joined: Wed Mar 30, 2016 6:24 pm
languages_spoken: english
ODROIDs: Odroid C2
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by LordadmiralDrake » Sun Sep 11, 2016 7:15 am

In general, use plain text editors (like Notepad or ConText) to modify such config files. Don't use rich text editors like WordPad, Word or Writer, since those will store additional data in the file that you don't want there.
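If in doubt about what an editor did to the file, a rough Python check like the one below can list the usual suspects. The function name is made up, and which of these the C2's boot loader actually tolerates is an assumption, so treat the output as hints rather than a verdict.

```python
def boot_ini_problems(data: bytes) -> list[str]:
    """List formatting changes a rich-text or Windows editor may introduce."""
    problems = []
    if data.startswith(b"{\\rtf"):
        problems.append("file is RTF, not plain text")
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("UTF-8 byte order mark at start of file")
    if b"\r\n" in data:
        problems.append("Windows (CRLF) line endings")
    if any(byte > 0x7F for byte in data):
        problems.append("non-ASCII bytes, e.g. typographic quotes")
    return problems


print(boot_ini_problems(b'setenv max_freq "1656"\n'))  # []
```

To use it on a real file, read it in binary mode, for example with `open(path, "rb").read()`, so the bytes are inspected exactly as stored.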

joy
Posts: 629
Joined: Fri Oct 02, 2015 1:44 pm
languages_spoken: english
ODROIDs: ODROID-C1+, XU4, X
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by joy » Sun Sep 11, 2016 10:13 am

One more thing we can try is adjusting the vcck voltage.
We found that it is possible to increase the vcck voltage a little, so we are checking its effect and running some tests.
It may affect booting with 2 or more cores at 1.896GHz and higher CPU frequencies.
It could also worsen system stability at high temperature because the overall current will increase.
But it is worth trying.
When it is done, I will share the results here.
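As a first-order illustration of why a voltage bump raises current and heat: dynamic CMOS power scales roughly with V² × f. The Python sketch below applies that rule of thumb; the +5% voltage step is a made-up figure for illustration, not a value from Hardkernel, and leakage power is ignored.

```python
def relative_dynamic_power(voltage_ratio: float, freq_ratio: float) -> float:
    """Approximate dynamic power relative to baseline using P ~ V^2 * f."""
    return voltage_ratio ** 2 * freq_ratio


# Hypothetical +5% vcck, and a clock change from 1536 MHz to 1896 MHz.
ratio = relative_dynamic_power(1.05, 1896 / 1536)
print(f"{ratio:.2f}")  # 1.36
```

Under these assumptions the cores would dissipate roughly a third more dynamic power, which fits the concern about stability at high temperature.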

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by brad » Sun Sep 11, 2016 11:19 am

joy wrote:One more thing we can try is adjusting the vcck voltage.
We found that it is possible to increase the vcck voltage a little, so we are checking its effect and running some tests.
If I understand correctly the vcck is the supply directly to the a53 cores as opposed to vdd-ee for the everything else domain? Would be interesting to see the test results when you are finished.

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by brad » Sun Sep 11, 2016 3:07 pm

I managed to get the cpufreq CPU driver working in the 4.8 next kernel with some new patches and am seeing similar crashes at higher speeds. I am able to change to 1.944GHz x 4 and run a sysbench, but any bigger load test breaks, and I haven't built cpuburn-a53 yet. I will do some more experimenting to see if I can run a single CPU at 2GHz or more.

Here is a test...

Code: Select all

root@odroid64:/sys/devices/system/cpu/cpu0/cpufreq# echo 1944000 > scaling_max_freq
root@odroid64:/sys/devices/system/cpu/cpu0/cpufreq# sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          4.8597s
    total number of events:              10000
    total time taken by event execution: 19.4206
    per-request statistics:
         min:                                  1.88ms
         avg:                                  1.94ms
         max:                                 17.44ms
         approx.  95 percentile:               1.89ms

Threads fairness:
    events (avg/stddev):           2500.0000/4.36
    execution time (avg/stddev):   4.8552/0.00
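The `echo 1944000 > scaling_max_freq` step above takes the value in kHz. A minimal Python sketch of the same write is shown below; the function name is hypothetical, the sysfs directory is the one from the shell prompt above, and on a real board it needs root.

```python
from pathlib import Path

CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def set_max_freq_mhz(mhz: int, cpufreq_dir: Path = CPUFREQ_DIR) -> int:
    """Write the limit in kHz, as cpufreq expects, and return the value read back."""
    node = cpufreq_dir / "scaling_max_freq"
    node.write_text(f"{mhz * 1000}\n")
    return int(node.read_text().strip())
```

Comparing the returned value with the requested one gives a quick hint whether the kernel accepted the new limit or kept a different one.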

mlinuxguy
Posts: 793
Joined: Thu Feb 28, 2013 10:28 am
languages_spoken: english
ODROIDs: X, X2, XU, XU3, XU4, C1, C1+, C2, N1, USB-IO
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by mlinuxguy » Mon Sep 12, 2016 12:42 am

@brad: can you test if adding these L2 cache entries to the DTSI file improves performance on the 4.8 kernel?
http://forum.odroid.com/viewtopic.php?f ... 17#p158716

You could run 2 benchmarks with L2 cache entries added and not added to the dtsi file
The sysbench test
and
# 7zr b ref: http://forum.odroid.com/viewtopic.php?f ... 50#p158238
from: apt install p7zip

brad
Posts: 775
Joined: Tue Mar 29, 2016 1:22 pm
languages_spoken: english
ODROIDs: C2 N1
Location: Australia
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by brad » Mon Sep 12, 2016 1:41 pm

@mlinuxguy There does not appear to be any difference with the cache entry in the device tree; both results are the same at 1.752GHz. I'm not sure if it is enabled correctly with just the device tree entries.

Code: Select all

odroid@odroid64:~$ 7zr b

7-Zip (A) 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,4 CPUs)

RAM size:    1988 MB,  # CPU hardware threads:   4
RAM usage:    850 MB,  # Benchmark threads:      4

Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS

22:    2248   303    720   2187  |    63426   399   1432   5722
23:    2248   306    747   2291  |    62293   400   1426   5700
24:    2238   309    778   2407  |    60513   396   1417   5614
25:    2221   309    820   2536  |    60386   399   1422   5678
----------------------------------------------------------------
Avr:          307    767   2355               399   1424   5679
Tot:          353   1095   4017
odroid@odroid64:~$ sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          5.2246s
    total number of events:              10000
    total time taken by event execution: 20.8845
    per-request statistics:
         min:                                  2.09ms
         avg:                                  2.09ms
         max:                                  2.35ms
         approx.  95 percentile:               2.09ms

Threads fairness:
    events (avg/stddev):           2500.0000/0.71
    execution time (avg/stddev):   5.2211/0.00

The cpu entries were changed as follows, and the cache entry was also added:

Code: Select all

                cpu0: cpu@0 {
                        device_type = "cpu";
                        compatible = "arm,cortex-a53", "arm,armv8";
                        reg = <0x0 0x0>;
                        enable-method = "psci";
                        next-level-cache = <&A53_L2>;
                        clocks = <&scpi_dvfs 0>;
-----
                A53_L2: l2-cache1 {
                        compatible = "cache";
                };
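Since this thread started because benchmark results did not change with the reported clock, it may be worth checking whether the sysbench times actually scale with frequency. The Python sketch below compares the two runs posted in this thread (5.2246 s at 1.752GHz and 4.8597 s at 1.944GHz). The function names are made up, and the two runs came from different sessions, so the result is only indicative.

```python
import re


def total_time_seconds(sysbench_output: str) -> float:
    """Extract the 'total time' value from sysbench 0.4.x text output."""
    match = re.search(r"total time:\s+([\d.]+)s", sysbench_output)
    if match is None:
        raise ValueError("no 'total time' line found")
    return float(match.group(1))


def scaling_efficiency(t_slow: float, f_slow_mhz: float,
                       t_fast: float, f_fast_mhz: float) -> float:
    """Measured speedup divided by the clock ratio; 1.0 means perfect scaling."""
    return (t_slow / t_fast) / (f_fast_mhz / f_slow_mhz)


eff = scaling_efficiency(5.2246, 1752, 4.8597, 1944)
print(f"{eff:.2f}")  # 0.97
```

A value close to 1.0 suggests the clock change is real; a value that drops toward the inverse of the clock ratio would mean the benchmark did not speed up at all.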

stmicro
Posts: 236
Joined: Tue Apr 28, 2015 4:23 pm
languages_spoken: english, chinese
ODROIDs: Many Odroids and Rpis.
Location: shenzhen china
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by stmicro » Mon Sep 12, 2016 7:39 pm

Great job. Thanks to HK and the other advanced members. My C2 is flying at a 1.75GHz clock on the kernel 3.14.77-81 update.

Jojo
Posts: 524
Joined: Mon May 18, 2015 12:13 am
languages_spoken: english, german
ODROIDs: C1, C1+, C2, HC1, HC2, VU8C
Location: Germany
Contact:

Re: No performance difference between 1.5, 1.75 & 2GHz

Unread post by Jojo » Wed Sep 14, 2016 7:04 pm

Hi,

first of all: thank you guys! Great job!

Short test result:
C2 with eMMC, Kernel 3.14.77-81
1,75 GHz on all cores:
- Boots up fine
- Crunching SETI-WUs on two cores seems to be stable (temps at 72°C)
- Crunching SETI-WUs on 3+ cores seems NOT to be stable, C2 freezes almost immediately

1,68 GHz on all cores:

- Boots up fine
- Crunching SETI-WUs on three cores seems to be stable (temps at >90°C, thermal clock throttling)
How to ask questions the smart way:
http://www.catb.org/esr/faqs/smart-questions.html
