Getting OpenCL and clinfo to Work

hominoid
Posts: 887
Joined: Tue Feb 28, 2017 3:55 am
languages_spoken: english
ODROIDs: C2, C4, XU4, MC1, N1, N2, N2L, N2+, HC4, M1, H2, H3+
Location: Lake Superior Basin, USA
Has thanked: 128 times
Been thanked: 410 times
Contact:

Post by hominoid »

It has been nagging me that clinfo does not work on the XU4, so over the last month or two I've been digging into it to figure out why.

Code: Select all

root@c0n0:~# apt install clinfo
root@c0n0:~# clinfo
Number of platforms                               0

I had also noticed lots of posts asking about clinfo not working on other SBCs, but found no solutions for any SBC. So I thought it was important to investigate this first, to make sure OpenCL was indeed set up correctly before going further with trying to fix OpenCL kernels for the sgminer project. The only time I had seen it fail on other platforms (x86_64) was when there had been an issue with the GPU driver.
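For anyone who wants to script this check, here is a minimal sketch that pulls the platform count out of clinfo's text output. The helper name platform_count is my own invention, not part of clinfo, and it assumes the "Number of platforms" line looks like the output shown above.

```shell
# platform_count: read clinfo output on stdin and print the number reported
# on the "Number of platforms" line. Hypothetical helper, not part of clinfo.
# Fails (non-zero exit) if no such line is present.
platform_count() {
    awk '/^Number of platforms/ { print $NF; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Typical use on the board itself (requires clinfo to be installed):
#   n=$(clinfo | platform_count) && [ "$n" -gt 0 ] || echo "no OpenCL platform visible"
```

A count of 0 means the ICD loader found no vendor driver, which is the situation described above.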

The good news is that I recently got clinfo to work correctly, and it reports a lot of information about the Mali GPU. It looks great. The added information will help in understanding and tuning the GPU. It appears that a vendor ICD file for the ARM GPU needed to be set up in a specific location.

Code: Select all

root@c0n0:/etc# clinfo

Number of platforms                               1
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   ARM Platform
Number of devices                                 2
  Device Name                                     Mali-T628
  Device Vendor                                   ARM
  Device Vendor ID                                0x6200010
  Device Version                                  OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Driver Version                                  1.2
  Device OpenCL C Version                         OpenCL C 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               4
  Max clock frequency                             600MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes
    char                                                16 / 16
    short                                                8 / 8
    int                                                  4 / 4
    long                                                 2 / 2
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2090344448 (1.947GiB)
  Error Correction support                        No
  Max memory allocation                           522586112 (498.4MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        <printDeviceInfo:89: get CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : error -30>
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory

  Device Name                                     Mali-T628
  Device Vendor                                   ARM
  Device Vendor ID                                0x6200010
  Device Version                                  OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Driver Version                                  1.2
  Device OpenCL C Version                         OpenCL C 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               2
  Max clock frequency                             600MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes
    char                                                16 / 16
    short                                                8 / 8
    int                                                  4 / 4
    long                                                 2 / 2
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2090344448 (1.947GiB)
  Error Correction support                        No
  Max memory allocation                           522586112 (498.4MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        <printDeviceInfo:89: get CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : error -30>
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     8
  Max size of kernel argument                     1024
  Queue properties
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (2)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-T628
    Device Name                                   Mali-T628
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (2)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-T628
    Device Name                                   Mali-T628

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.8
  ICD loader Profile                              OpenCL 1.2
        NOTE:   your OpenCL library declares to support OpenCL 1.2,
                but it seems to support up to OpenCL 2.1 too.
On x86 platforms, it appears that the GPU vendors set up the vendor ICD files and OpenCL libraries during driver installation. This might be why I have not seen clinfo working anywhere on ARM. Should the ICD file be part of the setup done by HK (Hardkernel) as the vendor?

Install the Mali framebuffer driver package and clinfo, if they are not already installed.

Code: Select all

apt-get install mali-fbdev clinfo

Set up the vendor ICD file.

Code: Select all

# -p creates missing parent directories and does not fail if they already exist
mkdir -p /etc/OpenCL/vendors
echo "/usr/lib/arm-linux-gnueabihf/mali-egl/libOpenCL.so" > /etc/OpenCL/vendors/armocl.icd

clinfo should now report correctly.
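If clinfo still reports 0 platforms after this step, one thing worth ruling out is an ICD file that points at a library that is not there. The following is a rough sketch of such a check. check_icd_dir is a hypothetical helper name, and it assumes each .icd file holds an absolute library path on its first line, as in the armocl.icd file above (an ICD file that holds only a bare library name would be reported as missing by this check).

```shell
# check_icd_dir [DIR]: for every *.icd file in DIR (default /etc/OpenCL/vendors),
# report whether the library path on its first line exists.
# Hypothetical helper; assumes absolute paths inside the .icd files.
check_icd_dir() {
    dir="${1:-/etc/OpenCL/vendors}"
    rc=0
    for icd in "$dir"/*.icd; do
        # If the glob matched nothing, the pattern is left unexpanded
        [ -e "$icd" ] || { echo "no .icd files in $dir"; return 1; }
        # Read the first line; tolerate a missing trailing newline
        IFS= read -r lib < "$icd" || true
        if [ -e "$lib" ]; then
            echo "ok: $icd -> $lib"
        else
            echo "missing: $icd -> $lib"
            rc=1
        fi
    done
    return $rc
}

# Usage on the board:
#   check_icd_dir /etc/OpenCL/vendors
```

The function returns non-zero if any ICD file points at a missing library, so it can be used in a setup script before calling clinfo.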

The OpenCL libraries and include files are not needed for clinfo, but there is no standard location for installing them. I have read many things; this post seemed to have the best handle on the subject, although it is dated. Here is specifically how to use the consensus locations that AMD, NVIDIA and Intel follow(ed) for libraries and include files. With these locations, no explicit references are needed to link against the OpenCL libraries.

Download the latest ARM Computer Vision and Machine Learning library to ~/, or use an existing copy:
https://github.com/ARM-software/ComputeLibrary

Code: Select all

cd /opt
tar -xvzf ~/arm_compute-v18.01-bin.tar.gz
cd ~/
rm arm_compute-v18.01-bin.tar.gz

Code: Select all

# Create the target directories first; -p makes this safe to re-run
mkdir -p /usr/include/CL /usr/lib/OpenCL/vendors/arm
cp /opt/arm_compute-v18.01-bin/include/CL/* /usr/include/CL/
cp /opt/arm_compute-v18.01-bin/lib/linux-armv7a-cl/* /usr/lib/OpenCL/vendors/arm/
# Tell the dynamic linker about the new directory and rebuild its cache
echo "/usr/lib/OpenCL/vendors/arm" > /etc/ld.so.conf.d/opencl-vendor-arm.conf
ldconfig
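
To confirm that the dynamic linker actually picked up the new directory, the cache listing printed by ldconfig -p can be searched. This is only a sketch: lib_in_cache is a made-up helper name, and it assumes the usual "name (flags) => path" line format of ldconfig -p.

```shell
# lib_in_cache NAME: read `ldconfig -p` style output on stdin and print the
# resolved path of the first entry whose library name starts with NAME.
# Hypothetical helper; exits non-zero if no entry matches.
lib_in_cache() {
    awk -v name="$1" '
        index($1, name) == 1 { print $NF; found = 1; exit }
        END { exit !found }'
}

# Usage on the board:
#   ldconfig -p | lib_in_cache libOpenCL.so
```

If the printed path is not the one you expected, another directory listed earlier in the linker configuration may be providing a library with the same name.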
All help and comments are welcomed and appreciated.
EDITED: Corrected information
Last edited by hominoid on Sun Feb 18, 2018 6:50 am, edited 1 time in total.

odroid
Site Admin
Posts: 41864
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English, Korean
ODROIDs: ODROID
Has thanked: 3433 times
Been thanked: 1920 times
Contact:

Re: Getting OpenCL and clinfo to work?

Post by odroid »

I just tried a simple convolution example and it worked.
viewtopic.php?f=95&t=5559#p210460

I will try to look into "ocl-icd-opencl-dev" source package early next week.

Re: Getting OpenCL and clinfo to work?

Post by hominoid »

@odroid, I spent some more time on this and answered my own questions.
The only thing required to get clinfo working is creating the vendor ICD file under /etc/OpenCL/vendors.
mali-fbdev needs to be installed, but no other libraries are needed. I have updated my first post to reflect the correct procedure. The only outstanding question is when, where and how /etc/OpenCL/vendors/armocl.icd should be created.
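
On the "how" part, here is one possible shape for a small, re-runnable step that an image build or a package post-install script could call. It is only a sketch: write_icd is a hypothetical function, the defaults are simply the paths used in this thread, and nothing here says how Hardkernel's packages actually handle it.

```shell
# write_icd LIB [DIR] [NAME]: register LIB with the ICD loader by writing its
# path into DIR/NAME, but only if LIB actually exists.
# Hypothetical helper; defaults follow the paths used in this thread.
write_icd() {
    lib="$1"
    dir="${2:-/etc/OpenCL/vendors}"
    name="${3:-armocl.icd}"
    [ -e "$lib" ] || { echo "library not found: $lib" >&2; return 1; }
    mkdir -p "$dir" || return 1
    # Overwrite rather than append, so re-running gives the same result
    printf '%s\n' "$lib" > "$dir/$name"
}

# Usage on the board (as root):
#   write_icd /usr/lib/arm-linux-gnueabihf/mali-egl/libOpenCL.so
```

Refusing to write the file when the library is missing avoids leaving a dangling ICD entry behind if mali-fbdev is not installed.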

Torkel
Posts: 8
Joined: Fri Nov 14, 2014 6:22 pm
languages_spoken: english, deutsch
ODROIDs: u3, XU4
Has thanked: 0
Been thanked: 0
Contact:

Re: Getting OpenCL and clinfo to Work

Post by Torkel »

Hi, I followed your instructions to get OpenCL to work on an ODROID XU4 with Ubuntu MATE 18.04.
clinfo says:

Code: Select all

Number of platforms                               1
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   ARM Platform
Number of devices                                 2
  Device Name                                     Mali-T628
  Device Vendor                                   ARM
  Device Vendor ID                                0x6200010
  Device Version                                  OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Driver Version                                  1.2
  Device OpenCL C Version                         OpenCL C 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               4
  Max clock frequency                             600MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2090397696 (1.947GiB)
  Error Correction support                        No
  Max memory allocation                           522599424 (498.4MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        131072 (128KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory

  Device Name                                     Mali-T628
  Device Vendor                                   ARM
  Device Vendor ID                                0x6200010
  Device Version                                  OpenCL 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Driver Version                                  1.2
  Device OpenCL C Version                         OpenCL C 1.2 v1.r12p0-04rel0.03af15950392f3702b248717f4938b82
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               2
  Max clock frequency                             600MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2090397696 (1.947GiB)
  Error Correction support                        No
  Max memory allocation                           522599424 (498.4MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        131072 (128KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-T628
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (2)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-T628
    Device Name                                   Mali-T628
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (2)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-T628
    Device Name                                   Mali-T628

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1
But now SMPlayer breaks with:

Code: Select all

This is SMPlayer version 18.2.2 (revision 8937), running on Linux
Segmentation fault
And compiling my favorite fractal generator with OpenCL support stops with the following message:

Code: Select all

/usr/include/CL/cl.hpp:177:10: fatal error: GL/gl.h: No such file or directory
 #include <GL/gl.h>
          ^~~~~~~~~
compilation terminated.

Makefile:2243: recipe for target 'algebra.o' failed
make: *** [algebra.o] Error 1

mhhh, what to do?
Last edited by Torkel on Wed Aug 08, 2018 4:29 pm, edited 1 time in total.

meveric
Posts: 12126
Joined: Mon Feb 25, 2013 2:41 pm
languages_spoken: german, english
ODROIDs: X2, U2, U3, XU-Lite, XU3, XU3-Lite, C1, XU4, C2, C1+, XU4Q, HC1, N1, Go, H2 (N4100), N2, H2 (J4105), GoA, C4, GoA v1.1, H2+, HC4, GoS
Has thanked: 93 times
Been thanked: 675 times
Contact:

Re: Getting OpenCL and clinfo to Work

Post by meveric »

Torkel wrote:mhhh, what to do?
Using [code] tags to prevent scrolling through numerous pages of output when someone tries to follow this thread :P

Aside from that, it complains about missing OpenGL headers.
Donate to support my work on the ODROID GameStation Turbo Image for U2/U3 XU3/XU4 X2 X C1 as well as many other releases.
Check out the Games and Emulators section to find some of my work or check the files in my repository to find the software i build for ODROIDs.
If you want to add my repository to your image read my HOWTO integrate my repo into your image.

Torkel
Posts: 8
Joined: Fri Nov 14, 2014 6:22 pm
languages_spoken: english, deutsch
ODROIDs: u3, XU4
Has thanked: 0
Been thanked: 0
Contact:

Re: Getting OpenCL and clinfo to Work

Post by Torkel »

Sorry :-D
I had used QuickReply and there is no editor :-D
I've updated my post and you're right, it's much shorter :-D

Which package do I have to install? I tried mesa-common-dev, but it made no difference!
And starting smplayer as root gives this output:

Code: Select all

smplayer
QStandardPaths: wrong ownership on runtime directory /run/user/1000, 1000 instead of 0
QStandardPaths: wrong ownership on runtime directory /run/user/1000, 1000 instead of 0

** (smplayer:5439): WARNING **: 09:37:50.742: Unable to connect to dbus: The connection is closed
Qt: Session management error: None of the authentication protocols specified are supported

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.830: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.830: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.830: g_dbus_connection_get_unique_name: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.909: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.910: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:50.910: g_dbus_connection_get_unique_name: assertion 'G_IS_DBUS_CONNECTION (connection)' failed
This is SMPlayer version 18.2.2 (revision 8937), running on Linux

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:51.259: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:51.259: g_dbus_connection_register_object: assertion 'G_IS_DBUS_CONNECTION (connection)' failed

(smplayer:5439): GLib-GIO-CRITICAL **: 09:37:51.259: g_dbus_connection_get_unique_name: assertion 'G_IS_DBUS_CONNECTION (connection)' failed
Segmentation fault


Last edited by Torkel on Wed Aug 08, 2018 4:46 pm, edited 1 time in total.


Re: Getting OpenCL and clinfo to Work

Post by meveric »

libgl1-mesa-dev should be the package for the OpenGL headers.
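A quick way to confirm whether the header the compiler asked for is actually present, before and after installing libgl1-mesa-dev, is sketched below. The function name `check_header` and its optional include-root argument are made up for illustration; only `/usr/include` and `GL/gl.h` come from the error message earlier in the thread.

```shell
#!/bin/sh
# check_header is a hypothetical helper, not part of any package.
# Usage: check_header <relative header path> [include root]
check_header() {
    hdr_root="${2:-/usr/include}"
    if [ -f "$hdr_root/$1" ]; then
        echo "found: $hdr_root/$1"
    else
        echo "missing: $hdr_root/$1"
        return 1
    fi
}

# The header that cl.hpp failed to include in the build log.
check_header GL/gl.h || echo "hint: try 'sudo apt-get install libgl1-mesa-dev'"
```

On Debian and Ubuntu, `dpkg -S /usr/include/GL/gl.h` should name the package that owns the file once it is installed.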


Re: Getting OpenCL and clinfo to Work

Post by Torkel »

Thanks.
But now compiling stops with:

Code: Select all

In file included from ../src/include_header_wrapper.hpp:62:0,
                 from ../src/algebra.hpp:51,
                 from ../src/algebra.cpp:40:
/usr/include/CL/cl.hpp: In function ‘void cl::detail::fence()’:
/usr/include/CL/cl.hpp:1041:27: error: ‘_mm_mfence’ was not declared in this scope
     inline void fence() { _mm_mfence(); }
                           ^~~~~~~~~~
/usr/include/CL/cl.hpp:1041:27: note: suggested alternative: ‘fence’
     inline void fence() { _mm_mfence(); }
                           ^~~~~~~~~~
                           fence
Makefile:2243: recipe for target 'algebra.o' failed
make: *** [algebra.o] Error 1



Re: Getting OpenCL and clinfo to Work

Post by meveric »

Torkel wrote:

Code: Select all

/usr/include/CL/cl.hpp:1041:27: error: ‘_mm_mfence’ was not declared in this scope
     inline void fence() { _mm_mfence(); }
                           ^~~~~~~~~~
I'm not quite sure why. The error says that _mm_mfence was not declared, which means one of the headers should contain a declaration of _mm_mfence but doesn't. Maybe an #include is missing, or your headers are generally not correct.
Something like this is really hard to figure out if you have no experience.
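One possible explanation, offered as an assumption rather than a diagnosis: `_mm_mfence` is an x86 SSE2 intrinsic (declared in `<emmintrin.h>` on x86 compilers), and the XU4 is an ARM board, so a header that calls it unconditionally would not compile there. The sketch below only reports whether a header mentions the intrinsic and which architecture the machine reports. The function name `uses_x86_fence` is made up for illustration.

```shell
#!/bin/sh
# uses_x86_fence is a hypothetical helper: it succeeds if the given file
# mentions the x86 SSE2 intrinsic _mm_mfence anywhere.
uses_x86_fence() {
    grep -q '_mm_mfence' "$1" 2>/dev/null
}

hdr=/usr/include/CL/cl.hpp   # path taken from the compiler error
arch=$(uname -m)             # e.g. armv7l on a 32-bit ARM system

if uses_x86_fence "$hdr"; then
    echo "$hdr references _mm_mfence; machine architecture is $arch"
else
    echo "$hdr does not reference _mm_mfence (or is not installed)"
fi
```

If the header does reference the intrinsic on an ARM machine, a different version of the OpenCL C++ header is probably needed, which would fit the header-replacement workaround described later in the thread.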


Re: Getting OpenCL and clinfo to Work

Post by Torkel »

Before following the above instructions, I installed the opencl-headers package and tried compiling with and without OpenCL support.
Both builds compiled. The non-OpenCL build started without problems; the OpenCL build would not start, but that was OK. After installing clinfo and the mali-fbdev package, neither the non-OpenCL version of the program nor SMPlayer starts. It all worked fine before, damn!


Re: Getting OpenCL and clinfo to Work

Post by hominoid »

Torkel wrote:Before following the above instructions, I installed the opencl-headers package and tried compiling with and without OpenCL support.
Both builds compiled. The non-OpenCL build started without problems; the OpenCL build would not start, but that was OK. After installing clinfo and the mali-fbdev package, neither the non-OpenCL version of the program nor SMPlayer starts. It all worked fine before, damn!
@Torkel, sorry for the slow reply, but this is my busy time of the year. I had a few minutes, was catching up on the latest posts, and saw your questions. I had what might be a similar issue with OpenCL header versions that I never fully resolved, but I had a workaround. I posted about it here in my March 16, 2018 post, but that was on Ubuntu 16.04 using version 18.03 of the ARM Computer Vision and Machine Learning library. I have not had much time lately to work with Ubuntu 18.04 and OpenCL. For the workaround, I just copied the OpenCL headers from the version of the ARM Computer Vision and Machine Learning library I was using over the ones in /usr/include/CL.

Code: Select all

cp ./arm_compute-v18.03-bin-linux/include/CL/* /usr/include/CL/
I have to work all weekend but will check back as soon as possible to see if I can help any more or offer any new ideas.
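Since the workaround overwrites system headers, it may be worth keeping a copy of the originals first. The following is a minimal sketch, not what was actually run for the post: the function name `replace_headers` and the `.orig` backup suffix are assumptions, and it needs sufficient privileges when the destination is /usr/include/CL.

```shell
#!/bin/sh
# replace_headers is a hypothetical helper.
# Usage: replace_headers <source dir> <destination dir>
# Copies <destination dir> to <destination dir>.orig once, then copies the
# files from <source dir> over the destination.
replace_headers() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "no such source directory: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "no such destination directory: $dst" >&2; return 1; }
    # Keep the first backup; do not overwrite it on later runs.
    [ -e "$dst.orig" ] || cp -a "$dst" "$dst.orig" || return 1
    cp "$src"/* "$dst"/
}

# Example with the paths from the post (run as root):
# replace_headers ./arm_compute-v18.03-bin-linux/include/CL /usr/include/CL
```

Restoring the distribution headers is then a matter of copying the `.orig` directory back over /usr/include/CL.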


Re: Getting OpenCL and clinfo to Work

Post by Torkel »

I got it to work!
- I flashed a new official Ubuntu MATE 18.04 image
- ran

Code: Select all

 sudo apt-get install mali-x11 --reinstall 
- installed the packages needed to run and compile "mandelbulber2-2.14" with OpenCL support

Code: Select all

sudo apt-get install build-essential libqt5gui5 qt5-default libpng16-16 \
    libpng-dev qttools5-dev qttools5-dev-tools libgomp1 libgsl-dev \
    libsndfile1-dev qtmultimedia5-dev libqt5multimedia5-plugins liblzo2-2 \
    liblzo2-dev opencl-headers
Compiling worked without problems, and the program started.
I enabled OpenCL in the program settings, and it shows both Mali GPUs (due to program limitations it can only use one at the moment).
But rendering an image with OpenCL took twice as long as without it!!
clinfo shows 0 platforms.
How can I speed it up?
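On the `0 platforms` result: ICD loaders such as ocl-icd discover vendor drivers through small text files in `/etc/OpenCL/vendors`, each holding the name or path of a driver library. The sketch below lists those files and flags absolute paths that do not exist. The function name `list_icds` is made up for illustration, and what the Mali entry should contain on a given image is not covered in this part of the thread.

```shell
#!/bin/sh
# list_icds is a hypothetical helper.
# Usage: list_icds [vendor directory]   (default: /etc/OpenCL/vendors)
list_icds() {
    icd_dir="${1:-/etc/OpenCL/vendors}"
    icd_found=0
    for icd_file in "$icd_dir"/*.icd; do
        [ -f "$icd_file" ] || continue          # glob matched nothing
        icd_found=1
        # The first line of an ICD file names the driver library.
        IFS= read -r icd_lib < "$icd_file" || true
        case "$icd_lib" in
            "") icd_state="empty file" ;;
            /*) if [ -e "$icd_lib" ]; then icd_state="exists"
                else icd_state="MISSING"; fi ;;
            *)  icd_state="bare name, resolved by the dynamic linker" ;;
        esac
        echo "$icd_file -> $icd_lib ($icd_state)"
    done
    [ "$icd_found" -eq 1 ] || echo "no .icd files in $icd_dir"
}

list_icds
```

If this prints no `.icd` files on a fresh image, that would be consistent with clinfo reporting zero platforms even though the application finds the GPUs by linking against the vendor library directly.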
