Mali-G31 OpenCL Comparison

Post by hominoid »

This comparison of the Odroid-C4’s Mali-G31 with the Odroid-N2’s Mali-G52 uses the OpenCL cryptocurrency miner sgminer-arm and the OpenCL password recovery tool hashcat to evaluate GPGPU performance. Both boards were tested with their OEM passive cooling and the following OS and application versions.

Odroid-C4 Mali-G31
Linux odroid-c4 4.9.218-25 #1 SMP PREEMPT Mon Jun 8 13:54:52 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
gcc version 9.3, llvm 9.0, pocl 1.5, arm compute v20.02.1

hominoid@odroid-c4:~$ clinfo
Number of platforms                               2
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.1, RELOC, SLEEF, FP16, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G31
  Device Vendor                                   ARM
  Device Vendor ID                                0x70930000
  Device Version                                  OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Driver Version                                  2.0
  Device OpenCL C Version                         OpenCL C 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Max clock frequency                             750MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes                 
    char                                                16 / 4       
    short                                                8 / 2       
    int                                                  4 / 1       
    long                                                 2 / 1       
    half                                                 8 / 2        (cl_khr_fp16)
    float                                                4 / 1       
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    64, Little-Endian
  Global memory size                              3886833664 (3.62GiB)
  Error Correction support                        No
  Max memory allocation                           971708416 (926.7MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Shared Virtual Memory (SVM) capabilities (ARM)  
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        65536 (64KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   32 bytes
    Pitch alignment for 2D image buffers          64 pixels
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                2097152 (2MiB)
    Max size                                      16777216 (16MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-cortex-a55
  Device Vendor                                   ARM
  Device Vendor ID                                0x13b5
  Device Version                                  OpenCL 1.2 pocl HSTR: pthread-aarch64-unknown-linux-gnu-cortex-a55
  Driver Version                                  1.5
  Device OpenCL C Version                         OpenCL C 1.2 pocl
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               4
  Max clock frequency                             1908MHz
  Device Partition                                (core)
    Max number of sub-devices                     4
    Supported partition types                     equally, by counts
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2918959104 (2.718GiB)
  Error Correction support                        No
  Max memory allocation                           1073741824 (1024MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            67108864 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
  Local memory type                               Global
  Local memory size                               16777216 (16MiB)
  Max number of constant args                     8
  Max constant buffer size                        16777216 (16MiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
  printf() buffer size                            16777216 (16MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp16 cl_khr_fp64


NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContext(NULL, ...) [other]              Success [POCL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1

sgminer-arm 5.6-RC1
CFLAGS="-Ofast -Wall -march=armv8-a+crypto -mtune=cortex-a55 -fexpensive-optimizations -fprefetch-loop-arrays -std=gnu99 -I/opt/arm_compute-v20.02.1-bin-linux/include/CL" LDFLAGS="-L/opt/arm_compute-v20.02.1-bin-linux/lib/linux-arm64-v8a-neon-cl" ./configure --disable-git-version --disable-adl --disable-adl-checks
./sgminer -k Lyra2Rev2 -o stratum+tcp://mona.suprnova.cc:2995 -u user -p password -I 14 -w 32 -d 0

hashcat v6.0.0-4-g5628317d
CFLAGS="-march=armv8-a+crypto -mtune=cortex-a55 -fexpensive-optimizations -fprefetch-loop-arrays"

hominoid@odroid-c4:~/hashcat$ ./hashcat -I
hashcat (v6.0.0-4-g5628317d+) starting...

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: ARM
  Name....: ARM Platform
  Version.: OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53

  Backend Device ID #1
    Type...........: GPU
    Vendor.ID......: 2147483648
    Vendor.........: ARM
    Name...........: Mali-G31
    Version........: OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
    Processor(s)...: 1
    Clock..........: 750
    Memory.Total...: 3706 MB (limited to 926 MB allocatable in one block)
    Memory.Free....: 3642 MB
    OpenCL.Version.: OpenCL C 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
    Driver.Version.: 2.0

OpenCL Platform ID #2
  Vendor..: The pocl project
  Name....: Portable Computing Language
  Version.: OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.1, RELOC, SLEEF, FP16, POCL_DEBUG

  Backend Device ID #2
    Type...........: CPU
    Vendor.ID......: 2147483648
    Vendor.........: ARM
    Name...........: pthread-cortex-a55
    Version........: OpenCL 1.2 pocl HSTR: pthread-aarch64-unknown-linux-gnu-cortex-a55
    Processor(s)...: 4
    Clock..........: 1908
    Memory.Total...: 2783 MB (limited to 1024 MB allocatable in one block)
    Memory.Free....: 2719 MB
    OpenCL.Version.: OpenCL C 1.2 pocl
    Driver.Version.: 1.5
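
The memory figures that clinfo and hashcat report for the Mali-G31 can be cross-checked with a little arithmetic. The sketch below uses only the byte counts from the clinfo output above. The observations that the maximum single allocation is a quarter of global memory, and that hashcat's "MB" values match the byte counts divided by 1024² and truncated, are read off these numbers, not taken from driver or hashcat documentation.

```python
# Byte counts copied from the Mali-G31 clinfo output above.
GLOBAL_MEM_BYTES = 3886833664
MAX_ALLOC_BYTES = 971708416

MIB = 1024 ** 2
GIB = 1024 ** 3


def to_mib(n_bytes):
    """Convert a byte count to MiB as a float."""
    return n_bytes / MIB


def to_gib(n_bytes):
    """Convert a byte count to GiB as a float."""
    return n_bytes / GIB


# Share of global memory that one buffer may occupy.
alloc_fraction = MAX_ALLOC_BYTES / GLOBAL_MEM_BYTES

print(f"global memory : {to_gib(GLOBAL_MEM_BYTES):.2f} GiB "
      f"({GLOBAL_MEM_BYTES // MIB} MiB, truncated)")
print(f"max allocation: {to_mib(MAX_ALLOC_BYTES):.1f} MiB "
      f"({MAX_ALLOC_BYTES // MIB} MiB, truncated)")
print(f"max allocation / global memory = {alloc_fraction}")
```

The truncated values (3706 and 926) line up with the Memory.Total line in the hashcat output, and the rounded values (3.62 GiB and 926.7 MiB) with the clinfo output.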

Odroid-N2 Mali-G52
Linux odroid-n2 4.9.219+ #1 SMP PREEMPT Fri Apr 24 14:32:44 EDT 2020 aarch64 aarch64 aarch64 GNU/Linux
gcc version 8.3.0, llvm 9.0, pocl 1.5, arm compute v20.02.1

hominoid@odroid-n2:~$ clinfo
Number of platforms                               2
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_subgroups cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.0, RELOC, SLEEF, FP16, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G52
  Device Vendor                                   ARM
  Device Vendor ID                                0x72120000
  Device Version                                  OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58
  Driver Version                                  2.0
  Device OpenCL C Version                         OpenCL C 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               2
  Max clock frequency                             750MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             384x384x384
  Max work group size                             384
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 4       
    short                                                8 / 2       
    int                                                  4 / 1       
    long                                                 2 / 1       
    half                                                 8 / 2        (cl_khr_fp16)
    float                                                4 / 1       
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    64, Little-Endian
  Global memory size                              3887480832 (3.62GiB)
  Error Correction support                        No
  Max memory allocation                           971870208 (926.8MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Shared Virtual Memory (SVM) capabilities (ARM)  
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        131072 (128KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   32 bytes
    Pitch alignment for 2D image buffers          64 pixels
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                2097152 (2MiB)
    Max size                                      16777216 (16MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_subgroups cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-cortex-a53
  Device Vendor                                   ARM
  Device Vendor ID                                0x13b5
  Device Version                                  OpenCL 1.2 pocl HSTR: pthread-aarch64-unknown-linux-gnu-cortex-a73
  Driver Version                                  1.5
  Device OpenCL C Version                         OpenCL C 1.2 pocl
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               6
  Max clock frequency                             1896MHz
  Device Partition                                (core)
    Max number of sub-devices                     6
    Supported partition types                     equally, by counts
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              2919346176 (2.719GiB)
  Error Correction support                        No
  Max memory allocation                           1073741824 (1024MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            67108864 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
  Local memory type                               Global
  Local memory size                               16777216 (16MiB)
  Max number of constant args                     8
  Max constant buffer size                        16777216 (16MiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
  printf() buffer size                            16777216 (16MiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_fp16 cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContext(NULL, ...) [other]              Success [POCL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G52
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G52
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G52

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1
sgminer-arm 5.5.6-RC1
CFLAGS="-Ofast -Wall -march=armv8-a+crypto -mtune=cortex-a73.cortex-a53 -fexpensive-optimizations -fprefetch-loop-arrays -std=gnu99 -I/opt/arm_compute-v20.02.1-bin-linux/include/CL" LDFLAGS="-L/opt/arm_compute-v20.02.1-bin-linux/lib/linux-arm64-v8a-neon-cl" ./configure --disable-git-version --disable-adl --disable-adl-checks
./sgminer -k Lyra2Rev2 -o stratum+tcp://mona.suprnova.cc:2995 -u user -p password -I 17 -w 64 -d 0

hashcat v6.0.0-4-g5628317d
CFLAGS="-march=armv8-a+crypto -mtune=cortex-a73.cortex-a53 -fexpensive-optimizations -fprefetch-loop-arrays"

Code: Select all

hominoid@odroid-n2:~/hashcat$ ./hashcat -I
hashcat (v6.0.0-4-g5628317d+) starting...

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: ARM
  Name....: ARM Platform
  Version.: OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58

  Backend Device ID #1
    Type...........: GPU
    Vendor.ID......: 2147483648
    Vendor.........: ARM
    Name...........: Mali-G52
    Version........: OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58
    Processor(s)...: 2
    Clock..........: 750
    Memory.Total...: 3707 MB (limited to 926 MB allocatable in one block)
    Memory.Free....: 3643 MB
    OpenCL.Version.: OpenCL C 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58
    Driver.Version.: 2.0

OpenCL Platform ID #2
  Vendor..: The pocl project
  Name....: Portable Computing Language
  Version.: OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.0, RELOC, SLEEF, FP16, POCL_DEBUG

  Backend Device ID #2
    Type...........: CPU
    Vendor.ID......: 2147483648
    Vendor.........: ARM
    Name...........: pthread-cortex-a53
    Version........: OpenCL 1.2 pocl HSTR: pthread-aarch64-unknown-linux-gnu-cortex-a73
    Processor(s)...: 6
    Clock..........: 1896
    Memory.Total...: 2784 MB (limited to 1024 MB allocatable in one block)
    Memory.Free....: 2720 MB
    OpenCL.Version.: OpenCL C 1.2 pocl
    Driver.Version.: 1.5
The Mali-G31 and Mali-G52 each ran sgminer-arm Lyra2REv2 against the live Monacoin blockchain for approximately 1 hour. Both passively cooled systems handled the load without any obvious throttling.
[Attachment: sgminer pwr-therm.png]
Mali-G31 sgminer-arm OpenCL Lyra2REv2

Code: Select all

sgminer 5.5.6-ARM-RC1 - Started: [2020-06-21 12:01:45] - [0 days 00:58:11]
--------------------------------------------------------------------------------
(5s):38.64K (avg):38.19Kh/s | A:0  R:0  HW:1  WU:0.138/m
ST: 2  SS: 0  NB: 39  LW: 3653  GF: 5  RF: 0
Connected to mona.suprnova.cc (stratum) diff 256 as user hominoid.c0n0
Block: f0be4eb9...  Diff:2.06M  Started: [12:56:53]  Best share: 5.322
--------------------------------------------------------------------------------
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
GPU 0:                | 38.63K/38.19Kh/s | R:  0.0% HW:1 WU:0.138/m I:14
--------------------------------------------------------------------------------
[12:01:26] Started sgminer 5.5.6-ARM-RC1
[12:01:26] * using Jansson 2.7
[12:01:27] Probing for an alive pool
[12:01:27] mona.suprnova.cc difficulty changed to 256
[12:01:28] Startup GPU initialization... Using settings from pool mona.suprnova.cc.
[12:01:28] Startup Pool No = 0
[12:01:28] Building binary lyra2rev2Mali-G31gw32l8lgtc16384.bin
[12:01:34] Initialising kernel lyra2rev2.cl with nfactor 10, n 1024
[12:11:36] Stratum connection to mona.suprnova.cc interrupted
[12:11:37] mona.suprnova.cc not responding!
[12:11:38] mona.suprnova.cc difficulty changed to 256
[12:22:17] Stratum connection to mona.suprnova.cc interrupted
[12:22:17] mona.suprnova.cc difficulty changed to 256
[12:33:10] Stratum connection to mona.suprnova.cc interrupted
[12:33:16] mona.suprnova.cc difficulty changed to 256
[12:43:54] Stratum connection to mona.suprnova.cc interrupted
[12:43:55] mona.suprnova.cc difficulty changed to 256
[12:54:06] Stratum connection to mona.suprnova.cc interrupted
[12:54:06] mona.suprnova.cc difficulty changed to 256

hominoid@odroid-c4:~/sgminer-arm$ ./start-mona
[13:02:30] 
Summary of runtime statistics:
                    
[13:02:30] Started at [2020-06-21 12:01:45]                    
[13:02:30] Pool: stratum+tcp://mona.suprnova.cc:2995                    
[13:02:30] Runtime: 1 hrs : 0 mins : 45 secs                    
[13:02:30] Average hashrate: 38.2 Kilohash/s                    
[13:02:30] Solved blocks: 0                    
[13:02:30] Best share difficulty: 5.322                    
[13:02:30] Share submissions: 0                    
[13:02:30] Accepted shares: 0                    
[13:02:30] Rejected shares: 0                    
[13:02:30] Accepted difficulty shares: 0                    
[13:02:30] Rejected difficulty shares: 0                    
[13:02:30] Hardware errors: 1                    
[13:02:30] Utility (accepted shares / min): 0.00/min                    
[13:02:30] Work Utility (diff1 shares solved / min): 0.13/min
                    
[13:02:30] Stale submissions discarded due to new blocks: 0                    
[13:02:30] Unable to get work from server occasions: 5                    
[13:02:30] Work items generated locally: 3811                    
[13:02:30] Submitting work remotely delay occasions: 0                    
[13:02:30] New blocks detected on network: 39
                    
[13:02:30] Summary of per device statistics:
                    
[13:02:30] GPU0                | (5s):38.62K (avg):38.21Kh/s | A:0 R:0 HW:1 WU:0.132/m                    
[13:02:30]  
Mali-G52 sgminer-arm OpenCL Lyra2REv2

Code: Select all

sgminer 5.5.6-ARM-RC1 - Started: [2020-06-21 16:17:10] - [0 days 00:55:39]
--------------------------------------------------------------------------------
(5s):109.8K (avg):94.79Kh/s | A:0  R:0  HW:3  WU:0.359/m
ST: 2  SS: 0  NB: 42  LW: 3481  GF: 5  RF: 0
Connected to mona.suprnova.cc (stratum) diff 256 as user hominoid.c0n0
Block: 23365878...  Diff:1.86M  Started: [17:12:37]  Best share: 11.36
--------------------------------------------------------------------------------
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
GPU 0:                | 95.83K/94.79Kh/s | R:  0.0% HW:3 WU:0.359/m I:17
--------------------------------------------------------------------------------
[16:16:59] Started sgminer 5.5.6-ARM-RC1
[16:16:59] * using Jansson 2.7
[16:16:59] Probing for an alive pool
[16:16:59] mona.suprnova.cc difficulty changed to 256
[16:17:00] Startup GPU initialization... Using settings from pool mona.suprnova.cc.
[16:17:00] Startup Pool No = 0
[16:17:00] Building binary lyra2rev2Mali-G52gw64l8lgtc131072.bin
[16:17:03] Initialising kernel lyra2rev2.cl with nfactor 10, n 1024
[16:27:05] Stratum connection to mona.suprnova.cc interrupted
[16:27:05] mona.suprnova.cc difficulty changed to 256
[16:37:14] Stratum connection to mona.suprnova.cc interrupted
[16:37:14] mona.suprnova.cc difficulty changed to 256
[16:47:38] Stratum connection to mona.suprnova.cc interrupted
[16:47:38] mona.suprnova.cc difficulty changed to 256
[16:58:32] Stratum connection to mona.suprnova.cc interrupted
[16:58:32] mona.suprnova.cc difficulty changed to 256
[17:09:04] Stratum connection to mona.suprnova.cc interrupted
[17:09:05] mona.suprnova.cc difficulty changed to 256


hominoid@odroid-n2:~/sgminer-arm$ ./start-mona
[17:17:17] 
Summary of runtime statistics:
                    
[17:17:17] Started at [2020-06-21 16:17:10]                    
[17:17:17] Pool: stratum+tcp://mona.suprnova.cc:2995                    
[17:17:17] Runtime: 1 hrs : 0 mins : 6 secs                    
[17:17:17] Average hashrate: 94.8 Kilohash/s                    
[17:17:17] Solved blocks: 0                    
[17:17:17] Best share difficulty: 11.36                    
[17:17:17] Share submissions: 0                    
[17:17:17] Accepted shares: 0                    
[17:17:17] Rejected shares: 0                    
[17:17:17] Accepted difficulty shares: 0                    
[17:17:17] Rejected difficulty shares: 0                    
[17:17:17] Hardware errors: 3                    
[17:17:17] Utility (accepted shares / min): 0.00/min                    
[17:17:17] Work Utility (diff1 shares solved / min): 0.33/min
                    
[17:17:17] Stale submissions discarded due to new blocks: 0                    
[17:17:17] Unable to get work from server occasions: 5                    
[17:17:17] Work items generated locally: 3757                    
[17:17:17] Submitting work remotely delay occasions: 0                    
[17:17:17] New blocks detected on network: 44
                    
[17:17:17] Summary of per device statistics:
                    
[17:17:17] GPU0                | (5s):95.75K (avg):94.84Kh/s | A:0 R:0 HW:3 WU:0.333/m                    
[17:17:17]                      
Odroid-C4 Mali-G31 Only Hashcat OpenCL

Code: Select all

hominoid@odroid-c4:~/hashcat$ ./hashcat -D 2 -b 
hashcat (v6.0.0-4-g5628317d+) starting in benchmark mode...

OpenCL API (OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53) - Platform #1 [ARM]
========================================================================================
* Device #1: Mali-G31, 3642/3706 MB (926 MB allocatable), 1MCU

OpenCL API (OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.1, RELOC, SLEEF, FP16, POCL_DEBUG) - Platform #2 [The pocl project]
============================================================================================================================
* Device #2: pthread-cortex-a55, skipped

Benchmark relevant options:
===========================
* --opencl-device-types=2
* --optimized-kernel-enable

Hashmode: 0 - MD5

Speed.#1.........: 32035.8 kH/s (64.45ms) @ Accel:8 Loops:1024 Thr:256 Vec:1

Hashmode: 100 - SHA1

Speed.#1.........:  8076.5 kH/s (64.09ms) @ Accel:8 Loops:256 Thr:256 Vec:1

Hashmode: 1400 - SHA2-256

Speed.#1.........:  4672.5 kH/s (55.02ms) @ Accel:1 Loops:1024 Thr:256 Vec:1

Hashmode: 1700 - SHA2-512

Speed.#1.........:  1047.5 kH/s (61.69ms) @ Accel:1 Loops:256 Thr:256 Vec:1

Hashmode: 1000 - NTLM

Speed.#1.........: 42327.7 kH/s (48.51ms) @ Accel:8 Loops:1024 Thr:256 Vec:1

Hashmode: 3000 - LM

Speed.#1.........:  1260.6 kH/s (102.19ms) @ Accel:2 Loops:1024 Thr:64 Vec:1

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........: 27988.9 kH/s (74.01ms) @ Accel:8 Loops:1024 Thr:256 Vec:1

Hashmode: 5600 - NetNTLMv2

Speed.#1.........:  1899.1 kH/s (68.15ms) @ Accel:1 Loops:512 Thr:256 Vec:1

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

Speed.#1.........:   104.6 kH/s (623.41ms) @ Accel:1 Loops:1024 Thr:64 Vec:1

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:     8888 H/s (55.56ms) @ Accel:2 Loops:1000 Thr:256 Vec:1

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:       52 H/s (74.53ms) @ Accel:4 Loops:4 Thr:8 Vec:1

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:       78 H/s (83.10ms) @ Accel:2 Loops:64 Thr:256 Vec:1

Hashmode: 7500 - Kerberos 5, etype 23, AS-REQ Pre-Auth

Speed.#1.........:   307.9 kH/s (52.37ms) @ Accel:2 Loops:128 Thr:64 Vec:1

Hashmode: 13100 - Kerberos 5, etype 23, TGS-REP

Speed.#1.........:   299.9 kH/s (53.71ms) @ Accel:2 Loops:128 Thr:64 Vec:1

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:       79 H/s (68.95ms) @ Accel:2 Loops:256 Thr:256 Vec:1

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 12899)

Speed.#1.........:       39 H/s (64.38ms) @ Accel:2 Loops:64 Thr:256 Vec:1

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 1023)

Speed.#1.........:      415 H/s (66.78ms) @ Accel:16 Loops:7 Thr:256 Vec:1

Hashmode: 11600 - 7-Zip (Iterations: 16384)

Speed.#1.........:      442 H/s (142.87ms) @ Accel:1 Loops:4096 Thr:256 Vec:1

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:       64 H/s (248.72ms) @ Accel:1 Loops:16384 Thr:256 Vec:1

Hashmode: 13000 - RAR5 (Iterations: 32799)

Speed.#1.........:       41 H/s (95.48ms) @ Accel:2 Loops:256 Thr:256 Vec:1

Hashmode: 6211 - TrueCrypt RIPEMD160 + XTS 512 bit (Iterations: 1999)

Speed.#1.........:      377 H/s (84.11ms) @ Accel:2 Loops:128 Thr:256 Vec:1

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 24569)

Speed.#1.........:       61 H/s (87.00ms) @ Accel:2 Loops:256 Thr:256 Vec:1

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 499)

Speed.#1.........:     2671 H/s (75.16ms) @ Accel:4 Loops:124 Thr:256 Vec:1

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 200459)
Odroid-N2 Mali-G52 Only Hashcat OpenCL

Code: Select all

hominoid@odroid-n2:~/hashcat$ ./hashcat -D 2 -b
hashcat (v6.0.0-4-g5628317d+) starting in benchmark mode...

OpenCL API (OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58) - Platform #1 [ARM]
========================================================================================
* Device #1: Mali-G52, 3643/3707 MB (926 MB allocatable), 2MCU

OpenCL API (OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.0, RELOC, SLEEF, FP16, POCL_DEBUG) - Platform #2 [The pocl project]
============================================================================================================================
* Device #2: pthread-cortex-a53, skipped

Benchmark relevant options:
===========================
* --opencl-device-types=2
* --optimized-kernel-enable

Hashmode: 0 - MD5

Speed.#1.........:   184.5 MH/s (67.27ms) @ Accel:32 Loops:512 Thr:384 Vec:1

Hashmode: 100 - SHA1

Speed.#1.........: 24878.8 kH/s (62.36ms) @ Accel:4 Loops:512 Thr:384 Vec:1

Hashmode: 1400 - SHA2-256

Speed.#1.........: 22409.8 kH/s (69.27ms) @ Accel:2 Loops:1024 Thr:384 Vec:1

Hashmode: 1700 - SHA2-512

Speed.#1.........:  4954.6 kH/s (78.45ms) @ Accel:2 Loops:256 Thr:384 Vec:1

Hashmode: 22000 - WPA-PBKDF2-PMKID+EAPOL (Iterations: 4095)

Speed.#1.........:     1125 H/s (84.48ms) @ Accel:2 Loops:256 Thr:384 Vec:1

Hashmode: 1000 - NTLM

Speed.#1.........:   236.7 MH/s (52.26ms) @ Accel:32 Loops:512 Thr:384 Vec:1

Hashmode: 3000 - LM

clEnqueueNDRangeKernel(): CL_INVALID_WORK_GROUP_SIZE

Speed.#1.........:        0 H/s (0.00ms) @ Accel:32 Loops:512 Thr:64 Vec:1

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........:   163.0 MH/s (76.22ms) @ Accel:16 Loops:1024 Thr:384 Vec:1

Hashmode: 5600 - NetNTLMv2

Speed.#1.........: 11067.6 kH/s (70.12ms) @ Accel:2 Loops:512 Thr:384 Vec:1

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

clEnqueueNDRangeKernel(): CL_INVALID_WORK_GROUP_SIZE

Speed.#1.........:        0 H/s (0.00ms) @ Accel:2 Loops:512 Thr:64 Vec:1

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:    61168 H/s (94.42ms) @ Accel:8 Loops:1000 Thr:384 Vec:1

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:       49 H/s (78.87ms) @ Accel:4 Loops:2 Thr:8 Vec:1

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:      306 H/s (62.99ms) @ Accel:4 Loops:32 Thr:384 Vec:1

Hashmode: 7500 - Kerberos 5, etype 23, AS-REQ Pre-Auth

Speed.#1.........:   345.5 kH/s (46.56ms) @ Accel:1 Loops:128 Thr:64 Vec:1

Hashmode: 13100 - Kerberos 5, etype 23, TGS-REP

Speed.#1.........:   307.0 kH/s (52.49ms) @ Accel:1 Loops:128 Thr:64 Vec:1

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:      211 H/s (77.18ms) @ Accel:2 Loops:256 Thr:384 Vec:1

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 12899)

Speed.#1.........:      136 H/s (55.15ms) @ Accel:1 Loops:128 Thr:384 Vec:1

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 1023)

Speed.#1.........:     1088 H/s (76.25ms) @ Accel:16 Loops:7 Thr:384 Vec:1

Hashmode: 11600 - 7-Zip (Iterations: 16384)

Speed.#1.........:     2671 H/s (68.87ms) @ Accel:1 Loops:4096 Thr:384 Vec:1

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:      164 H/s (291.95ms) @ Accel:1 Loops:16384 Thr:384 Vec:1

Hashmode: 13000 - RAR5 (Iterations: 32799)

Speed.#1.........:      205 H/s (57.95ms) @ Accel:2 Loops:256 Thr:384 Vec:1

Hashmode: 6211 - TrueCrypt RIPEMD160 + XTS 512 bit (Iterations: 1999)

Speed.#1.........:     2112 H/s (89.22ms) @ Accel:4 Loops:128 Thr:384 Vec:1

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 24569)

Speed.#1.........:      125 H/s (127.87ms) @ Accel:2 Loops:256 Thr:384 Vec:1

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 499)

Speed.#1.........:    13500 H/s (43.63ms) @ Accel:4 Loops:124 Thr:384 Vec:1

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 200459)

Speed.#1.........:       13 H/s (74.62ms) @ Accel:1 Loops:256 Thr:384 Vec:1

Started: Sun Jun 21 15:09:01 2020
Stopped: Sun Jun 21 15:29:45 2020
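The two GPU-only runs above report speeds in mixed units (H/s, kH/s, MH/s), which makes them awkward to compare by eye. Below is a minimal sketch, not part of hashcat, that normalizes a few `Speed.#1` lines copied from those runs to H/s and prints the G52/G31 ratio. The helper names `parse_speed` and `speedups` are made up for illustration.

```python
import re

# Matches lines such as "Speed.#1.........: 32035.8 kH/s (64.45ms) @ ..."
SPEED_RE = re.compile(r"Speed\.#(\d+|\*)\.*:\s*([\d.]+)\s*([kMG]?)H/s")
UNIT = {"": 1.0, "k": 1e3, "M": 1e6, "G": 1e9}


def parse_speed(line):
    """Return (device, hashes_per_second), or None if this is not a Speed line."""
    m = SPEED_RE.search(line)
    if m is None:
        return None
    device, value, prefix = m.groups()
    return device, float(value) * UNIT[prefix]


def speedups(base, other):
    """For each hash mode present in both dicts, return other/base in H/s."""
    out = {}
    for mode in base:
        if mode in other:
            out[mode] = parse_speed(other[mode])[1] / parse_speed(base[mode])[1]
    return out


# Lines copied verbatim from the Mali-G31-only and Mali-G52-only benchmarks.
G31 = {
    "MD5": "Speed.#1.........: 32035.8 kH/s (64.45ms) @ Accel:8 Loops:1024 Thr:256 Vec:1",
    "SHA1": "Speed.#1.........:  8076.5 kH/s (64.09ms) @ Accel:8 Loops:256 Thr:256 Vec:1",
    "NTLM": "Speed.#1.........: 42327.7 kH/s (48.51ms) @ Accel:8 Loops:1024 Thr:256 Vec:1",
    "bcrypt": "Speed.#1.........:       52 H/s (74.53ms) @ Accel:4 Loops:4 Thr:8 Vec:1",
}
G52 = {
    "MD5": "Speed.#1.........:   184.5 MH/s (67.27ms) @ Accel:32 Loops:512 Thr:384 Vec:1",
    "SHA1": "Speed.#1.........: 24878.8 kH/s (62.36ms) @ Accel:4 Loops:512 Thr:384 Vec:1",
    "NTLM": "Speed.#1.........:   236.7 MH/s (52.26ms) @ Accel:32 Loops:512 Thr:384 Vec:1",
    "bcrypt": "Speed.#1.........:       49 H/s (78.87ms) @ Accel:4 Loops:2 Thr:8 Vec:1",
}

if __name__ == "__main__":
    for mode, ratio in speedups(G31, G52).items():
        print(f"{mode:8s} G52/G31 = {ratio:.2f}x")
```

For these four modes the Mali-G52 comes out at roughly 5.8x (MD5), 3.1x (SHA1) and 5.6x (NTLM) the Mali-G31, while bcrypt is about even at 0.94x. The LM and descrypt modes are left out because the Mali-G52 run failed on them with CL_INVALID_WORK_GROUP_SIZE.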
Odroid-C4 Mali-G31 and Cortex-A55 Hashcat OpenCL

Code: Select all

hominoid@odroid-c4:~/hashcat$ ./hashcat -D 1,2 -b
hashcat (v6.0.0-4-g5628317d+) starting in benchmark mode...

OpenCL API (OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53) - Platform #1 [ARM]
========================================================================================
* Device #1: Mali-G31, 3642/3706 MB (926 MB allocatable), 1MCU

OpenCL API (OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.1, RELOC, SLEEF, FP16, POCL_DEBUG) - Platform #2 [The pocl project]
============================================================================================================================
* Device #2: pthread-cortex-a55, 2719/2783 MB (1024 MB allocatable), 4MCU

Benchmark relevant options:
===========================
* --opencl-device-types=1,2
* --optimized-kernel-enable

Hashmode: 0 - MD5

Speed.#1.........: 32042.7 kH/s (64.45ms) @ Accel:8 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 49385.7 kH/s (10.23ms) @ Accel:1024 Loops:128 Thr:1 Vec:4
Speed.#*.........: 81428.5 kH/s

Hashmode: 100 - SHA1

Speed.#1.........:  8084.3 kH/s (63.85ms) @ Accel:2 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 23280.5 kH/s (22.09ms) @ Accel:1024 Loops:128 Thr:1 Vec:4
Speed.#*.........: 31364.7 kH/s

Hashmode: 1400 - SHA2-256

Speed.#1.........:  4706.1 kH/s (54.67ms) @ Accel:1 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 10209.0 kH/s (50.89ms) @ Accel:1024 Loops:128 Thr:1 Vec:4
Speed.#*.........: 14915.1 kH/s

Hashmode: 1700 - SHA2-512

Speed.#1.........:  1042.2 kH/s (61.87ms) @ Accel:1 Loops:256 Thr:256 Vec:1
Speed.#2.........:  3180.5 kH/s (82.01ms) @ Accel:512 Loops:128 Thr:1 Vec:2
Speed.#*.........:  4222.8 kH/s

Hashmode: 1000 - NTLM

Speed.#1.........: 42407.7 kH/s (48.51ms) @ Accel:8 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 98912.1 kH/s (41.80ms) @ Accel:1024 Loops:1024 Thr:1 Vec:4
Speed.#*.........:   141.3 MH/s

Hashmode: 3000 - LM

Speed.#1.........:  1273.3 kH/s (101.34ms) @ Accel:2 Loops:1024 Thr:64 Vec:1
Speed.#2.........: 27378.2 kH/s (74.61ms) @ Accel:512 Loops:1024 Thr:1 Vec:4
Speed.#*.........: 28651.5 kH/s

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........: 27917.9 kH/s (74.01ms) @ Accel:8 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 55011.8 kH/s (4.43ms) @ Accel:1024 Loops:64 Thr:1 Vec:4
Speed.#*.........: 82929.7 kH/s

Hashmode: 5600 - NetNTLMv2

Speed.#1.........:  1897.2 kH/s (68.22ms) @ Accel:1 Loops:512 Thr:256 Vec:1
Speed.#2.........:  3205.0 kH/s (81.37ms) @ Accel:512 Loops:128 Thr:1 Vec:4
Speed.#*.........:  5102.2 kH/s

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

Speed.#1.........:   102.6 kH/s (636.87ms) @ Accel:1 Loops:1024 Thr:64 Vec:1
Speed.#2.........:  1120.5 kH/s (56.79ms) @ Accel:16 Loops:1024 Thr:1 Vec:4
Speed.#*.........:  1223.1 kH/s

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:     6430 H/s (54.10ms) @ Accel:2 Loops:1000 Thr:256 Vec:1
Speed.#2.........:    12246 H/s (18.97ms) @ Accel:1024 Loops:62 Thr:1 Vec:4
Speed.#*.........:    18676 H/s

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:       46 H/s (85.25ms) @ Accel:8 Loops:2 Thr:8 Vec:1
Speed.#2.........:     1301 H/s (92.69ms) @ Accel:32 Loops:32 Thr:1 Vec:4
Speed.#*.........:     1347 H/s

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:       85 H/s (76.60ms) @ Accel:2 Loops:64 Thr:256 Vec:1
Speed.#2.........:      849 H/s (59.63ms) @ Accel:64 Loops:1024 Thr:1 Vec:2
Speed.#*.........:      933 H/s

Hashmode: 7500 - Kerberos 5, etype 23, AS-REQ Pre-Auth

Speed.#1.........:   331.9 kH/s (48.54ms) @ Accel:2 Loops:128 Thr:64 Vec:1
Speed.#2.........:  1129.2 kH/s (57.64ms) @ Accel:4 Loops:64 Thr:64 Vec:4
Speed.#*.........:  1461.1 kH/s

Hashmode: 13100 - Kerberos 5, etype 23, TGS-REP

Speed.#1.........:   305.0 kH/s (52.86ms) @ Accel:2 Loops:128 Thr:64 Vec:1
Speed.#2.........:  1125.5 kH/s (57.73ms) @ Accel:1 Loops:256 Thr:64 Vec:4
Speed.#*.........:  1430.5 kH/s

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:       79 H/s (68.47ms) @ Accel:2 Loops:256 Thr:256 Vec:1
Speed.#2.........:      205 H/s (53.15ms) @ Accel:512 Loops:128 Thr:1 Vec:4
Speed.#*.........:      284 H/s

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 1023)

Speed.#1.........:      409 H/s (67.79ms) @ Accel:16 Loops:7 Thr:256 Vec:1
Speed.#2.........:     1448 H/s (57.90ms) @ Accel:64 Loops:511 Thr:1 Vec:2
Speed.#*.........:     1857 H/s

Hashmode: 11600 - 7-Zip (Iterations: 16384)

Speed.#1.........:      432 H/s (142.73ms) @ Accel:1 Loops:4096 Thr:256 Vec:1
Speed.#2.........:     1093 H/s (57.19ms) @ Accel:64 Loops:4096 Thr:1 Vec:4
Speed.#*.........:     1526 H/s

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:       63 H/s (255.06ms) @ Accel:1 Loops:16384 Thr:256 Vec:1
Speed.#2.........:      152 H/s (52.08ms) @ Accel:32 Loops:16384 Thr:1 Vec:4
Speed.#*.........:      215 H/s

Hashmode: 13000 - RAR5 (Iterations: 32799)

Speed.#1.........:       42 H/s (94.99ms) @ Accel:2 Loops:256 Thr:256 Vec:1
Speed.#2.........:      121 H/s (63.92ms) @ Accel:64 Loops:1024 Thr:1 Vec:4
Speed.#*.........:      163 H/s

Hashmode: 6211 - TrueCrypt RIPEMD160 + XTS 512 bit (Iterations: 1999)

Speed.#1.........:      369 H/s (84.40ms) @ Accel:1 Loops:256 Thr:256 Vec:1
Speed.#2.........:      889 H/s (71.04ms) @ Accel:64 Loops:512 Thr:1 Vec:4
Speed.#*.........:     1258 H/s

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 24569)

Speed.#1.........:       60 H/s (88.92ms) @ Accel:2 Loops:256 Thr:256 Vec:1
Speed.#2.........:      321 H/s (66.09ms) @ Accel:128 Loops:1024 Thr:1 Vec:4
Speed.#*.........:      381 H/s

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 499)

Speed.#1.........:     2525 H/s (61.85ms) @ Accel:2 Loops:249 Thr:256 Vec:1
Speed.#2.........:     7591 H/s (64.08ms) @ Accel:128 Loops:499 Thr:1 Vec:4
Speed.#*.........:    10116 H/s
Odroid-N2 Mali-G52 and Cortex-A73 Hashcat OpenCL

Code: Select all

hominoid@odroid-n2:~/hashcat$ ./hashcat -D 1,2 -b
hashcat (v6.0.0-4-g5628317d+) starting in benchmark mode...

OpenCL API (OpenCL 2.0 git.c8adbf9.122c9daed32dbba4b3056f41a2f23c58) - Platform #1 [ARM]
========================================================================================
* Device #1: Mali-G52, 3643/3707 MB (926 MB allocatable), 2MCU

OpenCL API (OpenCL 1.2 pocl 1.5, Debug+Asserts, LLVM 9.0.0, RELOC, SLEEF, FP16, POCL_DEBUG) - Platform #2 [The pocl project]
============================================================================================================================
* Device #2: pthread-cortex-a53, 2720/2784 MB (1024 MB allocatable), 6MCU

Benchmark relevant options:
===========================
* --opencl-device-types=1,2
* --optimized-kernel-enable

Hashmode: 0 - MD5

Speed.#1.........:   185.6 MH/s (67.16ms) @ Accel:16 Loops:1024 Thr:384 Vec:1
Speed.#2.........: 58561.4 kH/s (53.31ms) @ Accel:1024 Loops:512 Thr:1 Vec:4
Speed.#*.........:   244.2 MH/s

Hashmode: 100 - SHA1

Speed.#1.........: 30482.1 kH/s (50.96ms) @ Accel:4 Loops:512 Thr:384 Vec:1
Speed.#2.........: 34149.1 kH/s (91.64ms) @ Accel:1024 Loops:512 Thr:1 Vec:4
Speed.#*.........: 64631.2 kH/s

Hashmode: 1400 - SHA2-256

Speed.#1.........: 22455.9 kH/s (69.28ms) @ Accel:2 Loops:1024 Thr:384 Vec:1
Speed.#2.........: 15672.1 kH/s (49.82ms) @ Accel:256 Loops:512 Thr:1 Vec:4
Speed.#*.........: 38128.0 kH/s

Hashmode: 1700 - SHA2-512

Speed.#1.........:  5322.6 kH/s (73.41ms) @ Accel:2 Loops:256 Thr:384 Vec:1
Speed.#2.........:  5201.3 kH/s (75.22ms) @ Accel:128 Loops:512 Thr:1 Vec:2
Speed.#*.........: 10523.9 kH/s

Hashmode: 22000 - WPA-PBKDF2-PMKID+EAPOL (Iterations: 4095)

Speed.#1.........:     1210 H/s (78.07ms) @ Accel:2 Loops:256 Thr:384 Vec:1
Speed.#2.........:     1474 H/s (64.82ms) @ Accel:256 Loops:256 Thr:1 Vec:4
Speed.#*.........:     2684 H/s

Hashmode: 1000 - NTLM

Speed.#1.........:   238.3 MH/s (52.26ms) @ Accel:32 Loops:512 Thr:384 Vec:1
Speed.#2.........:   102.6 MH/s (60.78ms) @ Accel:1024 Loops:1024 Thr:1 Vec:4
Speed.#*.........:   340.9 MH/s

Hashmode: 3000 - LM

clEnqueueNDRangeKernel(): CL_INVALID_WORK_GROUP_SIZE

Speed.#1.........:        0 H/s (0.00ms) @ Accel:32 Loops:512 Thr:64 Vec:1
Speed.#2.........: 46056.8 kH/s (67.27ms) @ Accel:512 Loops:1024 Thr:1 Vec:4
Speed.#*.........: 46056.8 kH/s

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........:   163.6 MH/s (76.23ms) @ Accel:16 Loops:1024 Thr:384 Vec:1
Speed.#2.........: 63393.2 kH/s (49.13ms) @ Accel:512 Loops:1024 Thr:1 Vec:4
Speed.#*.........:   227.0 MH/s

Hashmode: 5600 - NetNTLMv2

Speed.#1.........: 11016.3 kH/s (70.85ms) @ Accel:2 Loops:512 Thr:384 Vec:1
Speed.#2.........:  3433.5 kH/s (56.77ms) @ Accel:32 Loops:1024 Thr:1 Vec:4
Speed.#*.........: 14449.8 kH/s

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

clEnqueueNDRangeKernel(): CL_INVALID_WORK_GROUP_SIZE

Speed.#1.........:        0 H/s (0.00ms) @ Accel:2 Loops:512 Thr:64 Vec:1
Speed.#2.........:  1963.5 kH/s (49.21ms) @ Accel:16 Loops:1024 Thr:1 Vec:4
Speed.#*.........:  1963.5 kH/s

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:    62318 H/s (94.46ms) @ Accel:16 Loops:500 Thr:384 Vec:1
Speed.#2.........:    20587 H/s (72.66ms) @ Accel:512 Loops:500 Thr:1 Vec:4
Speed.#*.........:    82904 H/s

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:       49 H/s (77.83ms) @ Accel:4 Loops:2 Thr:8 Vec:1
Speed.#2.........:     1360 H/s (53.06ms) @ Accel:16 Loops:32 Thr:1 Vec:4
Speed.#*.........:     1409 H/s

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:      308 H/s (62.74ms) @ Accel:4 Loops:32 Thr:384 Vec:1
Speed.#2.........:     1262 H/s (60.30ms) @ Accel:128 Loops:512 Thr:1 Vec:2
Speed.#*.........:     1569 H/s

Hashmode: 7500 - Kerberos 5, etype 23, AS-REQ Pre-Auth

Speed.#1.........:   249.6 kH/s (130.42ms) @ Accel:2 Loops:128 Thr:64 Vec:1
Speed.#2.........:  1564.9 kH/s (62.49ms) @ Accel:2 Loops:128 Thr:64 Vec:4
Speed.#*.........:  1814.5 kH/s

Hashmode: 13100 - Kerberos 5, etype 23, TGS-REP

Speed.#1.........:   298.0 kH/s (54.32ms) @ Accel:1 Loops:128 Thr:64 Vec:1
Speed.#2.........:  1567.0 kH/s (62.44ms) @ Accel:2 Loops:128 Thr:64 Vec:4
Speed.#*.........:  1865.0 kH/s

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:      215 H/s (75.47ms) @ Accel:2 Loops:256 Thr:384 Vec:1
Speed.#2.........:      248 H/s (65.71ms) @ Accel:128 Loops:512 Thr:1 Vec:4
Speed.#*.........:      463 H/s

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 12899)

Speed.#1.........:      139 H/s (54.27ms) @ Accel:1 Loops:128 Thr:384 Vec:1
Speed.#2.........:      192 H/s (79.05ms) @ Accel:256 Loops:128 Thr:1 Vec:2
Speed.#*.........:      331 H/s

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 1023)

Speed.#1.........:     1019 H/s (81.51ms) @ Accel:16 Loops:7 Thr:384 Vec:1
Speed.#2.........:     2408 H/s (70.28ms) @ Accel:256 Loops:127 Thr:1 Vec:2
Speed.#*.........:     3426 H/s

Hashmode: 11600 - 7-Zip (Iterations: 16384)

Speed.#1.........:     2660 H/s (68.96ms) @ Accel:1 Loops:4096 Thr:384 Vec:1
Speed.#2.........:     1593 H/s (58.71ms) @ Accel:64 Loops:4096 Thr:1 Vec:4
Speed.#*.........:     4253 H/s

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:      174 H/s (274.38ms) @ Accel:1 Loops:16384 Thr:384 Vec:1
Speed.#2.........:      214 H/s (55.84ms) @ Accel:32 Loops:16384 Thr:1 Vec:4
Speed.#*.........:      389 H/s

Hashmode: 13000 - RAR5 (Iterations: 32799)

Speed.#1.........:      211 H/s (56.47ms) @ Accel:2 Loops:256 Thr:384 Vec:1
Speed.#2.........:      191 H/s (60.95ms) @ Accel:64 Loops:1024 Thr:1 Vec:4
Speed.#*.........:      401 H/s

Hashmode: 6211 - TrueCrypt RIPEMD160 + XTS 512 bit (Iterations: 1999)

Speed.#1.........:     2117 H/s (89.24ms) @ Accel:4 Loops:128 Thr:384 Vec:1
Speed.#2.........:     1204 H/s (79.16ms) @ Accel:256 Loops:128 Thr:1 Vec:4
Speed.#*.........:     3321 H/s

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 24569)

Speed.#1.........:      125 H/s (127.48ms) @ Accel:2 Loops:256 Thr:384 Vec:1
Speed.#2.........:      418 H/s (76.41ms) @ Accel:256 Loops:512 Thr:1 Vec:4
Speed.#*.........:      543 H/s

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 499)

Speed.#1.........:    13019 H/s (44.93ms) @ Accel:4 Loops:124 Thr:384 Vec:1
Speed.#2.........:    11989 H/s (40.38ms) @ Accel:256 Loops:249 Thr:1 Vec:4
Speed.#*.........:    25007 H/s

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 200459)

Speed.#1.........:       14 H/s (68.00ms) @ Accel:1 Loops:256 Thr:384 Vec:1
Speed.#2.........:       25 H/s (78.84ms) @ Accel:512 Loops:128 Thr:1 Vec:2
Speed.#*.........:       39 H/s

Started: Sun Jun 21 15:33:28 2020
Stopped: Sun Jun 21 15:55:53 2020
Summary
The Mali-G31 is an entry-level device and the smallest ARM GPU. It averaged 38.21 kH/s on Lyra2REv2 compared to 94.84 kH/s for the Mali-G52, so the Mali-G52 delivered roughly 2.5 times the raw hash rate. The Mali-G31’s efficiency was 14.74 kH/s per watt compared to 27.44 kH/s per watt for the Mali-G52; in other words, the Mali-G52 produced 1.86 times as many hashes per watt as the Mali-G31 for the sgminer-arm OpenCL workload. Other Mali GPU sgminer-arm Lyra2REv2 results are available in the Mali-G52 OpenCL Comparison thread.
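The ratios in the summary can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the figures quoted above. The implied power draw is an assumption on my part: it treats efficiency as average hash rate divided by average power, which the post does not state in text.

```python
def compare(rate_a, eff_a, rate_b, eff_b):
    """Compare device B against device A.

    rate_* are average hash rates in kH/s, eff_* are kH/s per watt.
    """
    return {
        "throughput_ratio": rate_b / rate_a,
        "efficiency_ratio": eff_b / eff_a,
        # Implied average power, ASSUMING efficiency = hash rate / power.
        "power_a_watts": rate_a / eff_a,
        "power_b_watts": rate_b / eff_b,
    }


# A = Mali-G31 (Odroid-C4), B = Mali-G52 (Odroid-N2), figures from the summary.
result = compare(38.21, 14.74, 94.84, 27.44)

if __name__ == "__main__":
    for key, value in result.items():
        print(f"{key}: {value:.2f}")
```

Under that assumption the Mali-G52 draws about 3.46 W against about 2.59 W for the Mali-G31, so it uses about a third more power to deliver about 2.48 times the hash rate, which is where the 1.86 times per-watt figure comes from.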

kubak
Posts: 1
Joined: Tue Sep 08, 2020 11:51 pm
languages_spoken: english
ODROIDs: Odroid-C4
Has thanked: 0
Been thanked: 0
Contact:

Re: Mali-G31 OpenCL Comparison

Post by kubak »

Hi, can I ask where you got your OpenCL driver for the Odroid-C4? I flashed the latest Ubuntu image, but the OpenCL driver that is packaged with it seems unresponsive. No output from clinfo, and always returns error codes for any OpenCL calls.

Thanks!

Re: Mali-G31 OpenCL Comparison

Post by hominoid »

@kubak,
I can't try it myself right now, but here are some things to check. Make sure you're running the minimal image, which supports the frame buffer, and that it's loaded as described in the release notes. In the past you needed to set up the icd file and a libOpenCL.so symbolic link as illustrated here.

Re: Mali-G31 OpenCL Comparison

Post by hominoid »

I had a chance to take a look, and I do remember having to fix a dependency problem between the mali-fbdev and ocl-icd-libopencl1 packages for OpenCL to work. Here is how to fix it, based on the release notes for the N2. It needs to be added to the C4 release notes.

Code: Select all

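# Remove the packaged Mali driver, then install clinfo and the ICD loader
sudo apt purge mali-fbdev
sudo apt install clinfo ocl-icd-libopencl1
# Download the mali-fbdev package and unpack it by hand in the home directory
cd ~
apt download mali-fbdev
ar -xv mali-fbdev_*.deb
tar -xvf data.tar.xz
# Drop the libOpenCL.so bundled with mali-fbdev; ocl-icd-libopencl1 provides the loader
rm usr/lib/aarch64-linux-gnu/libOpenCL.so*
sudo cp -r usr/* /usr/
# Register libmali.so with the ICD loader (needs root, hence sudo)
sudo mkdir -p /etc/OpenCL/vendors/
echo libmali.so | sudo tee /etc/OpenCL/vendors/mali.icd
# Clean up the unpacked files
rm ~/control.tar.xz ~/data.tar.xz ~/debian-binary ~/mali-fbdev_*.deb
rm -r ~/etc ~/usr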
It should report correctly now.

Code: Select all

hominoid@odroid-c4:~$ clinfo
Number of platforms                               1
  Platform Name                                   ARM Platform
  Platform Vendor                                 ARM
  Platform Version                                OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory
  Platform Extensions function suffix             ARM

  Platform Name                                   ARM Platform
Number of devices                                 1
  Device Name                                     Mali-G31
  Device Vendor                                   ARM
  Device Vendor ID                                0x70930000
  Device Version                                  OpenCL 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Driver Version                                  2.0
  Device OpenCL C Version                         OpenCL C 2.0 git.c8adbf9.ad00b04c1b60847de257177231dc1a53
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               1
  Max clock frequency                             750MHz
  Device Partition                                (core)
    Max number of sub-devices                     0
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              4
  Preferred / native vector sizes                 
    char                                                16 / 4       
    short                                                8 / 2       
    int                                                  4 / 1       
    long                                                 2 / 1       
    half                                                 8 / 2        (cl_khr_fp16)
    float                                                4 / 1       
    double                                               0 / 0        (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    64, Little-Endian
  Global memory size                              3886776320 (3.62GiB)
  Error Correction support                        No
  Max memory allocation                           971694080 (926.7MiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Shared Virtual Memory (SVM) capabilities (ARM)  
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        65536 (64KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   32 bytes
    Pitch alignment for 2D image buffers          64 pixels
    Max 2D image size                             65536x65536 pixels
    Max 3D image size                             65536x65536x65536 pixels
    Max number of read image args                 128
    Max number of write image args                64
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                2097152 (2MiB)
    Max size                                      16777216 (16MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_create_command_queue cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory cl_arm_shared_virtual_memory

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
  clCreateContext(NULL, ...) [default]            Success [ARM]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 ARM Platform
    Device Name                                   Mali-G31

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1
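
If clinfo still reports no platforms after the steps above, it may help to look at what is actually registered with the ICD loader. This is only a rough sketch: the function name `list_icds` is made up for illustration, and it assumes the vendor files live in /etc/OpenCL/vendors as in the commands earlier in this post. It prints each .icd file together with the library name on its first line.

```shell
#!/bin/sh
# list_icds DIR: print "<icd file>: <library>" for each vendor file in DIR.
# Hypothetical helper; DIR defaults to the path used in the fix above.
list_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        [ -f "$icd" ] || continue    # skip when the glob matched nothing
        printf '%s: %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}

list_icds /etc/OpenCL/vendors

# If clinfo is installed, list the platforms and devices it can see.
if command -v clinfo >/dev/null 2>&1; then
    clinfo -l
fi
```

After the fix above, the first part should show a mali.icd entry pointing at libmali.so. An empty listing would suggest the vendor file was never written.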

odroid
Site Admin
Posts: 35923
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English, Korean
ODROIDs: ODROID
Has thanked: 1325 times
Been thanked: 918 times
Contact:

Re: Mali-G31 OpenCL Comparison

Post by odroid »

@hominoid,
Thanks, we've updated the wiki page.
https://wiki.odroid.com/odroid-c4/os_im ... figuration
