🤠

ROCm Burn in '25


ROCm, revisited

The '23 version

https://zenn.dev/manyan3/articles/bcc35b169243cc

Commercial burn-in tools take a lot of preparation, so I built my own burn-in (load) tool.

https://github.com/mi-kaneyon/rocm_lpoweroading

Checking whether setup has become easier after two years

https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html

  • Installation appears to go considerably further than it did before
  • However, a few components still do not seem to install
Install log (last part)

Setting up libglx-dev:amd64 (1.4.0-1) ...
Setting up libgl-dev:amd64 (1.4.0-1) ...
Setting up mesa-common-dev:amd64 (23.2.1-1ubuntu3.1~22.04.3) ...
Setting up rocm-opencl-dev (2.0.0.60002-115~22.04) ...
Setting up rocm-opencl-sdk (6.0.2.60002-115~22.04) ...
Setting up rocm-clang-ocl (0.5.0.60002-115~22.04) ...
Setting up rocm-utils (6.0.2.60002-115~22.04) ...
Setting up rocm (6.0.2.60002-115~22.04) ...
Errors were encountered while processing:
 amdgpu-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)
Please reboot system for all settings to take effect.
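
When the install log is long, the failing package is easy to miss. As a rough sketch (not part of the official procedure), the helper below pulls the package names out of the "Errors were encountered while processing:" section of saved apt output. The function name failed_packages and the file name install.log are hypothetical.

```shell
# Print the package names that apt/dpkg lists under
# "Errors were encountered while processing:".
# Reads the saved log from stdin.
failed_packages() {
  awk '
    /^Errors were encountered while processing:/ { in_list = 1; next }
    # Package names follow as indented lines.
    in_list && /^[ \t]+[^ \t]/ { print $1; next }
    # Any other line ends the list.
    { in_list = 0 }
  '
}

# Usage (install.log is a hypothetical file holding the saved apt output):
#   failed_packages < install.log
```

For the log above this would print amdgpu-dkms. From there, dpkg -s amdgpu-dkms and dkms status are reasonable next things to look at.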

rocminfo

rocminfo
ROCk module version 6.10.5 is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.14
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                            
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========              
HSA Agents              
==========              
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 5 5600G with Radeon Graphics
  Uuid:                    CPU-XX                            
  Marketing Name:          AMD Ryzen 5 5600G with Radeon Graphics
  Vendor Name:             CPU                                
  Feature:                 None specified                    
  Profile:                 FULL_PROFILE                      
  Float Round Mode:        NEAR                              
  Max Queue Number:        0(0x0)                            
  Queue Min Size:          0(0x0)                            
  Queue Max Size:          0(0x0)                            
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                  
  Chip ID:                 0(0x0)                            
  ASIC Revision:           0(0x0)                            
  Cacheline Size:          64(0x40)                          
  Max Clock Freq. (MHz):   4464                              
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            12                                
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Memory Properties:      
  Features:                None
  Pool Info:              
    Pool 1                  
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    78115916(0x4a7f44c) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                              
    Pool 2                  
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    78115916(0x4a7f44c) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                              
    Pool 3                  
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    78115916(0x4a7f44c) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                              
    Pool 4                  
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    78115916(0x4a7f44c) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                              
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1031                            
  Uuid:                    GPU-XX                            
  Marketing Name:          AMD Radeon RX 6700 XT              
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                      
  Float Round Mode:        NEAR                              
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                          
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      3072(0xc00) KB                    
    L3:                      98304(0x18000) KB                  
  Chip ID:                 29663(0x73df)                      
  ASIC Revision:           0(0x0)                            
  Cacheline Size:          128(0x80)                          
  Max Clock Freq. (MHz):   2855                              
  BDFID:                   4608                              
  Internal Node ID:        1                                  
  Compute Unit:            40                                
  SIMDs per CU:            2                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:      
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE                              
  Wavefront Size:          32(0x20)                          
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                          
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)            
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)            
    y                        4294967295(0xffffffff)            
    z                        4294967295(0xffffffff)            
  Max fbarriers/Workgrp:   32                                
  Packet Processor uCode:: 120                                
  SDMA engine uCode::      80                                
  IOMMU Support::          None                              
  Pool Info:              
    Pool 1                  
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    12566528(0xbfc000) KB              
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                  
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    12566528(0xbfc000) KB              
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                  
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1031        
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                  
      Default Rounding Mode:   NEAR                              
      Default Rounding Mode:   NEAR                              
      Fast f16:                TRUE                              
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)            
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)            
        y                        4294967295(0xffffffff)            
        z                        4294967295(0xffffffff)            
      FBarrier Max Size:       32                                
*******                  
Agent 3                  
*******                  
  Name:                    gfx90c                            
  Uuid:                    GPU-XX                            
  Marketing Name:          AMD Radeon Graphics                
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                      
  Float Round Mode:        NEAR                              
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                          
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      1024(0x400) KB                    
  Chip ID:                 5688(0x1638)                      
  ASIC Revision:           0(0x0)                            
  Cacheline Size:          64(0x40)                          
  Max Clock Freq. (MHz):   1900                              
  BDFID:                   12288                              
  Internal Node ID:        2                                  
  Compute Unit:            7                                  
  SIMDs per CU:            4                                  
  Shader Engines:          1                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:       APU
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE                              
  Wavefront Size:          64(0x40)                          
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                          
  Max Work-item Per CU:    2560(0xa00)                        
  Grid Max Size:           4294967295(0xffffffff)            
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)            
    y                        4294967295(0xffffffff)            
    z                        4294967295(0xffffffff)            
  Max fbarriers/Workgrp:   32                                
  Packet Processor uCode:: 472                                
  SDMA engine uCode::      40                                
  IOMMU Support::          None                              
  Pool Info:              
    Pool 1                  
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    39057956(0x253fa24) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                  
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    39057956(0x253fa24) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                  
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx90c:xnack-  
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                  
      Default Rounding Mode:   NEAR                              
      Default Rounding Mode:   NEAR                              
      Fast f16:                TRUE                              
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)            
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)            
        y                        4294967295(0xffffffff)            
        z                        4294967295(0xffffffff)            
      FBarrier Max Size:       32                                
*** Done ***            
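
The full rocminfo dump is long, and usually the only question is which GPUs the runtime sees. The following is a minimal sketch that reduces the output to one line per GPU agent. It assumes the layout shown above (agent-level fields indented by two spaces, with Name and Marketing Name appearing before Device Type); the function name list_gpu_agents is hypothetical.

```shell
# Print "<Name><TAB><Marketing Name>" for every agent whose Device Type is GPU.
# Reads rocminfo output from stdin.
list_gpu_agents() {
  awk '
    # Drop the "Key:" prefix and surrounding whitespace from a line.
    function value(line) {
      sub(/^[^:]*:[ \t]*/, "", line)
      sub(/[ \t]+$/, "", line)
      return line
    }
    # Only agent-level fields (two-space indent) are matched;
    # the deeper-indented ISA "Name:" lines are ignored.
    /^  Name:/           { name = value($0) }
    /^  Marketing Name:/ { marketing = value($0) }
    /^  Device Type:/    { if (value($0) == "GPU") print name "\t" marketing }
  '
}

# Usage:
#   rocminfo | list_gpu_agents
```

On this machine that should list gfx1031 (AMD Radeon RX 6700 XT) and gfx90c (AMD Radeon Graphics), and skip the CPU agent.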



---

(mmlab) amd@amd-MS-7B86:~/Desktop$ clinfo
Number of platforms: 2
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.1 AMD-APP (3635.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.1 AMD-APP (3635.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback


  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon RX 6700 XT
  Device Topology: PCI[ B#18, D#0, F#0 ]
  Max compute units: 20
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 2855Mhz
  Address bits: 64
  Max memory allocation: 10937905968
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 128
  Cache size: 16384
  Global memory size: 12868124672
  Constant buffer size: 10937905968
  Max number of constant args: 8
  Local memory type: Local
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 2347971376
  Max global variable size: 10937905968
  Max global variable preferred total size: 12868124672
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 32
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x7f29757f1010
  Name: gfx1031
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3635.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program


  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon Graphics
  Device Topology: PCI[ B#48, D#0, F#0 ]
  Max compute units: 7
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 1900Mhz
  Address bits: 64
  Max memory allocation: 33996044896
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 16384
  Global memory size: 39995346944
  Constant buffer size: 33996044896
  Max number of constant args: 8
  Local memory type: Local
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 3931273824
  Max global variable size: 33996044896
  Max global variable preferred total size: 39995346944
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 64
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x7f29757f1010
  Name: gfx90c:xnack-
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3635.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program


  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon RX 6700 XT
  Device Topology: PCI[ B#18, D#0, F#0 ]
  Max compute units: 20
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 2855Mhz
  Address bits: 64
  Max memory allocation: 10937905968
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 128
  Cache size: 16384
  Global memory size: 12868124672
  Constant buffer size: 10937905968
  Max number of constant args: 8
  Local memory type: Local
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 2347971376
  Max global variable size: 10937905968
  Max global variable preferred total size: 12868124672
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 32
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x7f29757f1010
  Name: gfx1031
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3635.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program


  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon Graphics
  Device Topology: PCI[ B#48, D#0, F#0 ]
  Max compute units: 7
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 1900Mhz
  Address bits: 64
  Max memory allocation: 33996044896
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 16384
  Global memory size: 39995346944
  Constant buffer size: 33996044896
  Max number of constant args: 8
  Local memory type: Local
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 3931273824
  Max global variable size: 33996044896
  Max global variable preferred total size: 39995346944
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 64
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x7f29757f1010
  Name: gfx90c:xnack-
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3635.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program

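Working around the install error (solution)

  • From the official documentation
  • The AMD GPU driver (amdgpu-dkms in the log above) appears to have failed to install
  • rocm-smi does not return a response (abnormal response)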
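
A tool that never returns is awkward to check by hand. As a sketch, the wrapper below runs a command under the coreutils timeout utility and reports whether it finished, timed out, or failed. The function name probe and the 10-second limit are arbitrary choices for illustration.

```shell
# Run a command with a time limit and report how it ended.
# Prints "ok", "timeout", or "failed(<exit code>)".
probe() {
  local limit="$1"
  shift
  local rc=0
  # timeout exits with 124 when the time limit is reached.
  timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0)   echo "ok" ;;
    124) echo "timeout" ;;
    *)   echo "failed($rc)" ;;
  esac
}

# Usage:
#   probe 10 rocm-smi
#   probe 10 rocminfo
```

Running the same probe before and after reinstalling gives a quick way to see whether rocm-smi has started responding.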

wget https://repo.radeon.com/amdgpu-install/6.3.3/ubuntu/jammy/amdgpu-install_6.3.60303-1_all.deb
sudo apt install ./amdgpu-install_6.3.60303-1_all.deb


sudo apt install ./amdgpu-install_6.3.60303-1_all.deb did not work, so I installed the package with dpkg instead:


sudo dpkg -i amdgpu-install_6.3.60303-1_all.deb

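  • With this, the installation succeeds and the output in the collapsible sections above (rocminfo, clinfo) appears

Things are much better than before, but ROCm still does not install without a fight.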

Stress-testing software with ROCm support (prototype)

A power-loading suite that supports ROCm

https://github.com/mi-kaneyon/rocmcuda-powerloader
