Suggested gcc/g++ compiler options for C2
I know I have seen bits and pieces of this in some of the C2 threads, but it would be nice to know which compiler/linker options are recommended for compiling C/C++ code on the C2.
Compile for 64 bits versus 32 bits?
Thanks
Re: Suggested gcc/g++ compiler options for C2
I currently have this set in my env
# echo $CFLAGS
-march=armv8.1-a -mtune=cortex-a53 -fexpensive-optimizations -fprefetch-loop-arrays
I suspect it's not the "ultimate" setting, but it seems to boost execution speed over the default repo code.
Re: Suggested gcc/g++ compiler options for C2
mlinuxguy wrote: -march=armv8.1-a
Are we armv8.1, or just armv8?
I think the difference is entirely related to hypervisor support, so it may not matter for most things. Is that the default Ubuntu setting? Because that may explain the issues with "Illegal Instruction" for libvirt-bin/qemu.
Re: Suggested gcc/g++ compiler options for C2
Ok, I modified my CFLAGS and now just use armv8.
From the ARM docs on 8.1:
Code: Select all
The enhancements introduced with ARMv8.1 fall into two categories:
Changes to the instruction set.
Changes to the exception model and memory translation.
Instruction set enhancements
ARMv8.1 includes the following additions to the A64 instruction set:
A set of AArch64 atomic read-write instructions
Additions to the Advanced SIMD instruction set for both AArch32 and AArch64 to enable opportunities for some library optimizations:
Signed Saturating Rounding Doubling Multiply Accumulate, Returning High Half
Signed Saturating Rounding Doubling Multiply Subtract, Returning High Half
The instructions are added in vector and scalar forms.
A set of AArch64 load and store instructions that can provide memory access order that is limited to configurable address regions.
As well as the additions, the optional CRC instructions in v8.0 become a requirement in ARMv8.1.
When I tried to build the Linux crash utility:
extract from config.log:
Code: Select all
configure:4254: gcc -march=armv8 -mtune=cortex-a53 -fprefetch-loop-arrays conftest.c >&5
conftest.c:1:0: error: unknown value 'armv8' for -march
Per the GCC docs:
Code: Select all
-march=name
Specify the name of the target architecture and, optionally, one or more feature modifiers. This option has the form -march=arch{+[no]feature}*.
The permissible values for arch are ‘armv8-a’, ‘armv8.1-a’ or native.
Change it to: -march=armv8-a
And the compile/build works.
- meveric
Re: Suggested gcc/g++ compiler options for C2
Code: Select all
root@odroid-jessie64:~# cat /proc/cpuinfo | grep Features
Features : fp asimd crc32
-march=name
This specifies the name of the target ARM architecture. GCC uses this name to determine what kind of instructions it can emit when generating assembly code. This option can be used in conjunction with or instead of the -mcpu= option. Permissible names are: ‘armv2’, ‘armv2a’, ‘armv3’, ‘armv3m’, ‘armv4’, ‘armv4t’, ‘armv5’, ‘armv5t’, ‘armv5e’, ‘armv5te’, ‘armv6’, ‘armv6j’, ‘armv6t2’, ‘armv6z’, ‘armv6kz’, ‘armv6-m’, ‘armv7’, ‘armv7-a’, ‘armv7-r’, ‘armv7-m’, ‘armv7e-m’, ‘armv7ve’, ‘armv8-a’, ‘armv8-a+crc’, ‘armv8.1-a’, ‘armv8.1-a+crc’, ‘iwmmxt’, ‘iwmmxt2’, ‘ep9312’.
-march=armv7ve is the armv7-a architecture with virtualization extensions.
-march=armv8-a+crc enables code generation for the ARMv8-A architecture together with the optional CRC32 extensions.
-march=native causes the compiler to auto-detect the architecture of the build computer. At present, this feature is only supported on GNU/Linux, and not all architectures are recognized. If the auto-detect is unsuccessful the option has no effect.
Re: Suggested gcc/g++ compiler options for C2
The native option (e.g. -march=native) is currently broken on gcc 5.3.1:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70133
It looks like a Linaro issue, but it's being looked into.
Re: Suggested gcc/g++ compiler options for C2
According to these references:
- http://infocenter.arm.com/help/topic/co ... 13854.html
- https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
- https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
Code: Select all
-march=armv8-a+crc -mtune=cortex-a53 -mfpu=neon-fp-armv8
Interesting flags related to a53:
Code: Select all
-mfix-cortex-a53-835769
Enable or disable the workaround for the ARM Cortex-A53 erratum number 835769. This involves inserting a NOP instruction between memory instructions and 64-bit integer multiply-accumulate instructions.
-mfix-cortex-a53-843419
Enable or disable the workaround for the ARM Cortex-A53 erratum number 843419. This erratum workaround is made at link time and this will only pass the corresponding flag to the linker.
Re: Suggested gcc/g++ compiler options for C2
GCC on the Hardkernel supplied Ubuntu SD card doesn't know about the -mfpu option... in fact there appears to be no 'mfpu' string present in the executable.
gcc: error: unrecognized command line option ‘-mfpu=neon’
gcc version 5.3.1 20160225 (Ubuntu/Linaro 5.3.1-10ubuntu2)
I'm particularly interested in giving the FPU a workout, as I think my application gives GCC room to use it. Short of compiling GCC myself, any ideas on how I can find a version that supports it?
- memeka
Re: Suggested gcc/g++ compiler options for C2
Is NEON the best on ARMv8, or is there anything new?
Re: Suggested gcc/g++ compiler options for C2
How are you guys using -mfpu with GCC? Which distro are you using? As per above, the GCC on the Ubuntu card supplied with my C2 does not support it.
Re: Suggested gcc/g++ compiler options for C2
They are using Debian Jessie with GCC 4.x. Ubuntu 16.04 is using GCC 5.x.
I am not sure what the goal is here, but there is no need to specify additional compiler flags. The default for aarch64 enables both SIMD (NEON) and FPU.
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
‘fp’
Enable floating-point instructions. This is on by default for all possible values for options -march and -mcpu.
‘simd’
Enable Advanced SIMD instructions. This also enables floating-point instructions. This is on by default for all possible values for options -march and -mcpu.
Re: Suggested gcc/g++ compiler options for C2
memeka wrote: is neon the best on ARMv8, or is there anything new?
For AArch64 and AArch32 it's Advanced SIMD (NEON), which adds new instructions.
Re: Suggested gcc/g++ compiler options for C2
Code: Select all
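int main()
{
    // Plain double arithmetic, compiled without any extra flags.
    double x = 123;
    double y = x * x + x;   // appears as fmul/fadd in the disassembly
    return (int)x;          // appears as fcvtzs in the disassembly
}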
Code: Select all
$ g++ main.cpp -o main
$ gdb ./main
GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./main...(no debugging symbols found)...done.
(gdb) b main
Breakpoint 1 at 0x4005b4
(gdb) r
Starting program: /srv/disk0/c2/compiletest/main
Breakpoint 1, 0x00000000004005b4 in main ()
(gdb) disassemble
Dump of assembler code for function main:
0x00000000004005a8 <+0>: sub sp, sp, #0x10
0x00000000004005ac <+4>: adrp x0, 0x400000
0x00000000004005b0 <+8>: add x0, x0, #0x5e8
=> 0x00000000004005b4 <+12>: ldr x0, [x0]
0x00000000004005b8 <+16>: str x0, [sp]
0x00000000004005bc <+20>: ldr d1, [sp]
0x00000000004005c0 <+24>: ldr d0, [sp]
0x00000000004005c4 <+28>: fmul d1, d1, d0
0x00000000004005c8 <+32>: ldr d0, [sp]
0x00000000004005cc <+36>: fadd d0, d1, d0
0x00000000004005d0 <+40>: str d0, [sp,#8]
0x00000000004005d4 <+44>: ldr d0, [sp]
0x00000000004005d8 <+48>: fcvtzs w0, d0
0x00000000004005dc <+52>: add sp, sp, #0x10
0x00000000004005e0 <+56>: ret
End of assembler dump.
(gdb)
Re: Suggested gcc/g++ compiler options for C2
crashoverride wrote: They are using Debian Jessie with GCC 4.x. Ubuntu 16.04 is using GCC 5.x.
I am not sure what the goal is here, but there is no need to specify additional compiler flags. The default for aarch64 enables both SIMD (NEON) and FPU.
Hmm, interesting. I've played with optimisation flags on the C1, and they have made a difference, so I just assumed that they were necessary for 64 bit ARMv8 too.
GCC can be a little mysterious, so it's sometimes hard to figure out what's going on under the hood. As an example, I read an article about how GCC compiles switch statements. When I inspected the assembly output of my application, I was surprised to find that a large switch statement (about 35 cases) was being compiled to a sequence of comparisons rather than a jump table, and the order of the comparisons seemed quite random (certainly not the order in the source). Redoing some of the defines and shifting some blocks of code around has improved things, around 15% faster, but it's still doing a strange hybrid of comparisons and a jump table, even though the case values are limited to the range 0 to 255. For some reason it thinks that a jump table of 1024/2048 bytes is not the right thing to do.
I guess this is a long winded way of saying that sometimes it would be nice to force GCC to do something a specific way, and be sure it is happening, rather than let its mysterious "heuristics" decide.
Re: Suggested gcc/g++ compiler options for C2
Here are some generic GCC flags that I've found increase speed (in my case) on both the C1 and C2. If you have an application which loops hard, they may help produce noticeably more efficient code.
-finline-limit=5000 (5000 is an arbitrary number I chose)
-fipa-pta
-fvariable-expansion-in-unroller
-fwhole-program
-fomit-frame-pointer
Note that the last two options may cause problems if you're not compiling a standalone program.
Example:
gcc -O3 ai-benchmark.c
9768 bytes, 5.90 sec
gcc -O3 -finline-limit=5000 -fipa-pta -fvariable-expansion-in-unroller -fwhole-program -fomit-frame-pointer ai-benchmark.c
9764 bytes, 4.96 sec