PAC 2022

Last updated 2 years ago

Problem

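The Generalized Plasmon-Pole (GPP) model is a general-purpose model for GW calculations in first-principles simulation, with applications in quantum computing, energy storage and conversion, photovoltaics, nanoelectronics, and related fields. Its core algorithm, shown in the figure, computes correlations between complex-valued matrices and vectors; each element is an accumulation similar to a tensor contraction. The algorithm is implemented as a four-level nested loop, so it is computationally expensive and well suited to GPUs.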

Frequently used commands on the GPU platform

Load the environment

source /opt/intel/oneapi/setvars.sh
-------->
:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.0.17(1)-release
   args: Using "$@" for setvars.sh arguments: 
:: advisor -- latest
:: ccl -- latest
:: clck -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: inspector -- latest
:: intelpython -- latest
:: ipp -- latest
:: ippcp -- latest
:: ipp -- latest
:: itac -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

Inspect the hardware

clinfo -l
-------->
Platform #0: Intel(R) FPGA Emulation Platform for OpenCL(TM)
 `-- Device #0: Intel(R) FPGA Emulation Device
Platform #1: Intel(R) OpenCL
 `-- Device #0: Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
Platform #2: Intel(R) OpenCL HD Graphics
 +-- Device #0: Intel(R) Graphics [0x020a]
 `-- Device #1: Intel(R) Graphics [0x020a]

or

sycl-ls
-------->
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.13.3.0.16_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz 3.0 [2022.13.3.0.16_160000]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[opencl:gpu:3] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]
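
The listing above can also be checked from a script, for example to confirm that both Level Zero GPUs are visible before launching a job. The following is a minimal sketch, not part of the original notes: `count_sycl_devices` is a hypothetical helper name, and it assumes the `[backend:type:index]` prefix format shown in the `sycl-ls` output above.

```shell
# Hypothetical helper: count devices of one backend/type in `sycl-ls`
# output read from stdin. Assumes each line starts with "[backend:type:N]".
count_sycl_devices() {
    # $1 = backend (e.g. ext_oneapi_level_zero), $2 = device type (e.g. gpu)
    grep -c "^\[$1:$2:"
}

# Only query the real tool when it is installed.
if command -v sycl-ls >/dev/null 2>&1; then
    sycl-ls | count_sycl_devices ext_oneapi_level_zero gpu || true
fi

# To pin a run to one device, oneAPI releases of this period read the
# SYCL_DEVICE_FILTER variable (newer releases use ONEAPI_DEVICE_SELECTOR).
# ./my_app is a placeholder binary name, not something from these notes:
#   SYCL_DEVICE_FILTER=level_zero:gpu:0 ./my_app
```

Counting by prefix rather than by device name keeps the check independent of how the driver labels the GPU (here `Intel(R) Graphics [0x020a]`).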

GPU utilization

sudo /usr/bin/intel_gpu_top -d pci:card=1
sudo /usr/bin/intel_gpu_top -d pci:card=2
-------->
intel-gpu-top -    0/   0 MHz;    0% RC6;        0 irqs/s

      IMC reads:   ------ (null)/s
     IMC writes:   ------ (null)/s

          ENGINE      BUSY                                                                                                                          MI_SEMA MI_WAIT
       Blitter/0    0.00% |                                                                                                                       |      0%      0%
       Blitter/1    0.00% |                                                                                                                       |      0%      0%
         Video/0    0.00% |                                                                                                                       |      0%      0%
         Video/1    0.00% |                                                                                                                       |      0%      0%
         Video/2    0.00% |                                                                                                                       |      0%      0%
         Video/3    0.00% |                                                                                                                       |      0%      0%
         Video/4    0.00% |                                                                                                                       |      0%      0%
         Video/5    0.00% |                                                                                                                       |      0%      0%
         Video/6    0.00% |                                                                                                                       |      0%      0%
         Video/7    0.00% |                                                                                                                       |      0%      0%
         Video/8    0.00% |                                                                                                                       |      0%      0%
         Video/9    0.00% |                                                                                                                       |      0%      0%
        Video/10    0.00% |                                                                                                                       |      0%      0%
        Video/11    0.00% |                                                                                                                       |      0%      0%
        Video/12    0.00% |                                                                                                                       |      0%      0%
        Video/13    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/0    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/1    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/2    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/3    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/4    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/5    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/6    0.00% |                                                                                                                       |      0%      0%
  VideoEnhance/7    0.00% |                                                                                                                       |      0%      0%
       Compute/0    0.00% |                                                                                                                       |      0%      0%
       Compute/1    0.00% |                                                                                                                       |      0%      0%
       Compute/2    0.00% |                                                                                                                       |      0%      0%
       Compute/3    0.00% |                                                                                                                       |      0%      0%
       Compute/4    0.00% |                                                                                                                       |      0%      0%
       Compute/5    0.00% |                                                                                                                       |      0%      0%
       Compute/6    0.00% |                                                                                                                       |      0%      0%
       Compute/7    0.00% |                                                                                                                       |      0%      0%

  PID            NAME                                                                                                                                          

Other query commands

clinfo | grep -E 'units|frequency|memory size'
-------->
  Max compute units                               64
  Max clock frequency                             2900MHz
  Global memory size                              270139891712 (251.6GiB)
  Local memory size                               262144 (256KiB)
  Max compute units                               64
  Max clock frequency                             2900MHz
  Global memory size                              270139891712 (251.6GiB)
  Local memory size                               32768 (32KiB)
  Max compute units                               960
  Max clock frequency                             1400MHz
  Global memory size                              32482365440 (30.25GiB)
  Local memory size                               65536 (64KiB)
  Max compute units                               960
  Max clock frequency                             1400MHz
  Global memory size                              32482365440 (30.25GiB)
  Local memory size                               65536 (64KiB)
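
Because the filtered output repeats the same four properties for every device, it can be easier to read folded into one line per device. The sketch below is an illustration rather than part of the original notes: `summarize_clinfo` is a hypothetical helper, and it assumes each device's block starts with its `Max compute units` line, as in the listing above.

```shell
# Hypothetical helper: fold filtered `clinfo` output (stdin) into one line
# per device. Assumes "Max compute units" is the first line of each device.
summarize_clinfo() {
    awk '
        function print_row() {
            printf "device %d: %s CUs, %s, global %s bytes, local %s bytes\n",
                   n++, units, freq, gmem, lmem
        }
        /Max compute units/   { if (seen) print_row(); seen = 1; units = $NF }
        /Max clock frequency/ { freq = $NF }
        /Global memory size/  { gmem = $(NF-1) }   # byte count precedes "(…GiB)"
        /Local memory size/   { lmem = $(NF-1) }   # byte count precedes "(…KiB)"
        END                   { if (seen) print_row() }
    '
}

# Only query the real tool when it is installed.
if command -v clinfo >/dev/null 2>&1; then
    clinfo | grep -E 'units|frequency|memory size' | summarize_clinfo
fi
```

The devices are numbered in the order `clinfo` prints them, so on this machine the two 960-unit entries would correspond to the two GPUs.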