PAC 2022

Problem

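The Generalized Plasmon-Pole (GPP) model is a general-purpose model used for GW estimates in first-principles calculations, with applications in quantum computing, energy storage and conversion, photovoltaics, nanoelectronics, and related fields. Its core algorithm, shown in the figure, computes correlations between complex-valued matrices and vectors, where each element is a tensor-like accumulation. The algorithm is implemented as a four-level nested loop, so its computational cost is high and it is well suited to GPU computation.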
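To make the four-level loop structure concrete, here is a minimal CPU reference sketch in plain C++. It is not the contest kernel: the function name `gpp_accumulate`, the array names (`aqsm`, `wtilde`, `weight`, `achtemp`), their layouts, and the accumulated expression itself are all assumptions chosen for illustration. It only shows the general shape of a complex-valued accumulation over bands, two plane-wave indices, and frequencies.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical GPP-like accumulation (names, layouts, and formula are assumptions).
//   aqsm   : nbands x ncouls complex matrix, row-major
//   wtilde : ncouls x ncouls complex matrix, row-major
//   weight : one real weight per frequency point
// Returns achtemp[iw] = sum over (n1, igp, ig) of
//   conj(aqsm[n1][igp]) * aqsm[n1][ig] * wtilde[ig][igp] * weight[iw]
std::vector<cplx> gpp_accumulate(const std::vector<cplx>& aqsm,
                                 const std::vector<cplx>& wtilde,
                                 const std::vector<double>& weight,
                                 std::size_t nbands,
                                 std::size_t ncouls) {
    assert(aqsm.size() == nbands * ncouls);
    assert(wtilde.size() == ncouls * ncouls);

    const std::size_t nw = weight.size();
    std::vector<cplx> achtemp(nw, cplx(0.0, 0.0));

    for (std::size_t n1 = 0; n1 < nbands; ++n1) {            // loop 1: bands
        for (std::size_t igp = 0; igp < ncouls; ++igp) {     // loop 2: column index
            const cplx left = std::conj(aqsm[n1 * ncouls + igp]);
            for (std::size_t ig = 0; ig < ncouls; ++ig) {    // loop 3: row index
                const cplx term =
                    left * aqsm[n1 * ncouls + ig] * wtilde[ig * ncouls + igp];
                for (std::size_t iw = 0; iw < nw; ++iw) {    // loop 4: frequencies
                    achtemp[iw] += term * weight[iw];
                }
            }
        }
    }
    return achtemp;
}
```

In a GPU port, the outer loops would typically be mapped to work-items and the per-frequency sums computed as a reduction. The serial version above is mainly useful as a correctness baseline for such a port.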

Frequently used commands on the GPU platform

Load the environment

source /opt/intel/oneapi/setvars.sh
-------->
:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.0.17(1)-release
   args: Using "$@" for setvars.sh arguments: 
:: advisor -- latest
:: ccl -- latest
:: clck -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: inspector -- latest
:: intelpython -- latest
:: ipp -- latest
:: ippcp -- latest
:: ipp -- latest
:: itac -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

List hardware

clinfo -l
-------->
Platform #0: Intel(R) FPGA Emulation Platform for OpenCL(TM)
 `-- Device #0: Intel(R) FPGA Emulation Device
Platform #1: Intel(R) OpenCL
 `-- Device #0: Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
Platform #2: Intel(R) OpenCL HD Graphics
 +-- Device #0: Intel(R) Graphics [0x020a]
 `-- Device #1: Intel(R) Graphics [0x020a]

sycl-ls
-------->
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.13.3.0.16_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz 3.0 [2022.13.3.0.16_160000]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[opencl:gpu:3] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]

GPU utilization

sudo /usr/bin/intel_gpu_top -d pci:card=1
sudo /usr/bin/intel_gpu_top -d pci:card=2
-------->
intel-gpu-top -    0/   0 MHz;    0% RC6;        0 irqs/s

      IMC reads:   ------ (null)/s
     IMC writes:   ------ (null)/s

          ENGINE      BUSY                                                                                                                          MI_SEMA MI_WAIT
       Blitter/0    0.00% |                                                                                                                       |      0%      0%
       Blitter/1    0.00% |                                                                                                                       |      0%      0%
         Video/0    0.00% |                                                                                                                       |      0%      0%
         Video/1    0.00% |                                                                                                                       |      0%      0%
         Video/2    0.00% |                                                                                                                       |      0%      0%
         Video/3    0.00% |                                                                                                                       |      0%      0%
         Video/4    0.00% |                                                                                                                       |      0%      0%
         Video/5    0.00% |                                                                                                                       |      0%      0%
         Video/6    0.00% |                                                                                                                       |      0%      0%
         Video/7    0.00% |                                                                                                                       |      0%      0%
         Video/8    0.00% |                                                                                                                       |      0%      0%
         Video/9    0.00% |                                                                                                                       |      0%      0%
        Video/10    0.00% |                                                                                                                       |      0%      0%
        Video/11    0.00% |                                                                                                                       |      0%      0%
        Video/12    0.00% |                                                                                                                       |      0%      0%
        Video/13    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/0    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/1    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/2    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/3    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/4    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/5    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/6    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/7    0.00% |                                                                                                                       |      0%      0%
       Compute/0    0.00% |                                                                                                                       |      0%      0%
       Compute/1    0.00% |                                                                                                                       |      0%      0%
       Compute/2    0.00% |                                                                                                                       |      0%      0%
       Compute/3    0.00% |                                                                                                                       |      0%      0%
       Compute/4    0.00% |                                                                                                                       |      0%      0%
       Compute/5    0.00% |                                                                                                                       |      0%      0%
       Compute/6    0.00% |                                                                                                                       |      0%      0%
       Compute/7    0.00% |                                                                                                                       |      0%      0%

  PID            NAME                                                                                                                                          

Other query commands

clinfo | grep -E 'units|frequency|memory size'
-------->
  Max compute units                               64
  Max clock frequency                             2900MHz
  Global memory size                              270139891712 (251.6GiB)
  Local memory size                               262144 (256KiB)
  Max compute units                               64
  Max clock frequency                             2900MHz
  Global memory size                              270139891712 (251.6GiB)
  Local memory size                               32768 (32KiB)
  Max compute units                               960
  Max clock frequency                             1400MHz
  Global memory size                              32482365440 (30.25GiB)
  Local memory size                               65536 (64KiB)
  Max compute units                               960
  Max clock frequency                             1400MHz
  Global memory size                              32482365440 (30.25GiB)
  Local memory size                               65536 (64KiB)
