This page records commands I use for various applications.
CONFIG
DeePMD build and run procedure
Preparations:
Create a virtual environment (optional):

    python3 -m venv --system-site-packages ./test
    source /home/cyq/test/bin/activate
Install tensorflow, horovod, and mpi4py:

    pip install tensorflow==2.7.0
    HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_TENSORFLOW=1 HOROVOD_GPU_OPERATIONS=NCCL pip install --no-cache-dir horovod mpi4py
Clone the repository and install:

    cd test
    git clone --recursive -b asc-2022 git://github.com/deepmodeling/deepmd-kit.git
    cd deepmd-kit
    export DP_VARIANT=cuda
    pip install .
Unpack the datasets:

    unzip asc-deepmd-kit-before.zip
    unzip asc-water.zip
    unzip asc-copper.zip
    unzip asc-mgalcu.zip
Set the environment variables (adjust as needed):

    export CUDA_VISIBLE_DEVICES=0,1,2,3
    export OMP_NUM_THREADS=8
    export TF_INTRA_OP_PARALLELISM_THREADS=8
    export TF_INTER_OP_PARALLELISM_THREADS=1
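When training is launched from a Python wrapper instead of an interactive shell, the same settings can be applied from Python. The following is only a sketch: the helper name `apply_settings` is hypothetical, the values are copied from the export lines above, and it assumes the variables are set before TensorFlow or DeePMD-kit is imported.

```python
import os

# Values copied from the export lines above; adjust per machine.
THREAD_SETTINGS = {
    "CUDA_VISIBLE_DEVICES": "0,1,2,3",
    "OMP_NUM_THREADS": "8",
    "TF_INTRA_OP_PARALLELISM_THREADS": "8",
    "TF_INTER_OP_PARALLELISM_THREADS": "1",
}


def apply_settings(settings, env=os.environ):
    """Set each variable only if it is not already set; return the effective values."""
    effective = {}
    for name, value in settings.items():
        # setdefault keeps a value exported by the shell and fills in the rest.
        effective[name] = env.setdefault(name, value)
    return effective


if __name__ == "__main__":
    for name, value in apply_settings(THREAD_SETTINGS).items():
        print(f"{name}={value}")
```

Using `setdefault` means a value already exported in the shell takes precedence over the defaults in the script.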
Train (single process first, then multi-process with Horovod):

    cd water
    dp train input.json --skip-neighbor-stat
    CUDA_VISIBLE_DEVICES=0,1,2,3 horovodrun -np 2 dp train --mpi-log=workers input.json --skip-neighbor-stat
Check GPU status:
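The note does not record which command was used for this step. One common option, assuming NVIDIA GPUs with the `nvidia-smi` tool on the PATH, is to query utilization and memory in CSV form. The sketch below wraps that query in Python; the function names are hypothetical, and fields are kept as strings because `nvidia-smi` may report values such as `[N/A]`.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in output order.
QUERY = "index,utilization.gpu,memory.used,memory.total"
KEYS = ("index", "util_percent", "mem_used_mib", "mem_total_mib")


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        rows.append(dict(zip(KEYS, fields)))
    return rows


def query_gpus():
    """Return one dict per GPU, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```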
PyPI mirrors in China

    pip install ... -i url

    Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple
    University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
    Huazhong University of Science and Technology: http://pypi.hustunique.com/
    Shandong University of Technology: http://pypi.sdutlinux.org/
    Douban: http://pypi.douban.com/simple/
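For scripted installs, the `pip install ... -i url` pattern can be assembled in Python. This is a minimal sketch: `pip_install_command` is a hypothetical helper, and the Tsinghua URL from the list above is used as the default index. It only builds and prints the command so that nothing is installed by accident.

```python
import sys

# Default mirror taken from the list above.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(packages, index_url=TSINGHUA_INDEX):
    """Build the argument list for `pip install <packages> -i <index_url>`."""
    # Running pip as a module of the current interpreter targets the active environment.
    return [sys.executable, "-m", "pip", "install", *packages, "-i", index_url]


if __name__ == "__main__":
    print(" ".join(pip_install_command(["tensorflow==2.7.0"])))
```

To actually run the install, the returned list can be passed to `subprocess.run(..., check=True)`.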
Version for the iFLYTEK machine (oneAPI)
Set up the environment:

    source /opt/rh/devtoolset-9/enable
    source /home/zsc/intel/oneapi/setvars.sh
    source /home/cyq/miniconda3/bin/activate
    conda activate t1
Install tensorflow, horovod, and mpi4py:

    pip install tensorflow==2.7.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
    export CPLUS_INCLUDE_PATH=/home/zsc/intel/oneapi/compiler/2022.0.2/linux/include:$CPLUS_INCLUDE_PATH
    export CPLUS_INCLUDE_PATH=/home/zsc/intel/oneapi/compiler/2022.0.2/linux/include/sycl:$CPLUS_INCLUDE_PATH
    HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_TENSORFLOW=1 HOROVOD_GPU_OPERATIONS=NCCL pip install --no-cache-dir horovod mpi4py -i https://pypi.tuna.tsinghua.edu.cn/simple
Clone the repository and install:

    cd test
    git clone --recursive -b asc-2022 git://github.com/deepmodeling/deepmd-kit.git
    cd deepmd-kit
    export DP_VARIANT=cuda
    pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple
Set the environment variables (adjust as needed):

    export CUDA_VISIBLE_DEVICES=2,3
    export OMP_NUM_THREADS=8
    export TF_INTRA_OP_PARALLELISM_THREADS=8
    export TF_INTER_OP_PARALLELISM_THREADS=1
After unpacking the datasets, train:

    cd water
    export LD_LIBRARY_PATH=/home/cyq/miniconda3/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/home/cyq:$LD_LIBRARY_PATH
    dp train input.json --skip-neighbor-stat
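Timing in Python
Snippets to paste around the code being profiled; replace NAME with a label for the section.

    import time

    # Time a single section: put the code to measure between the two time.time() calls.
    start_time = time.time()
    print("###################### BEGIN ####################")
    end_time = time.time()
    print("PROFILEINFO ######### NAME time = %f s #############" % (end_time - start_time))
    print("PROFILEINFO ###################### END ####################")

    # Accumulate time across batches. total_time_NAME must be initialized to 0
    # before the loop; cur_batch and stop_batch come from the surrounding training loop.
    start_time = time.time()
    end_time = time.time()
    total_time_NAME = end_time - start_time + total_time_NAME
    if cur_batch == stop_batch - 1:
        # Report the total on the last batch, then reset the accumulator.
        print("PROFILEINFO ######### NAME time = %f s #############" % (total_time_NAME))
        total_time_NAME = 0

    # Quick marker to confirm that a code path is reached.
    print("hello!")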
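The accumulate-and-report pattern above can also be packaged as a context manager, so each timed section needs a single `with` line. This is a sketch rather than what the original notes used: the names `profile`, `report`, and `totals` are hypothetical, and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic clock intended for measuring intervals.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per section name.
totals = defaultdict(float)


@contextmanager
def profile(name):
    """Add the time spent inside the `with` block to totals[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the timed code raises, so partial time is still recorded.
        totals[name] += time.perf_counter() - start


def report(name):
    """Print the accumulated time in the PROFILEINFO format, reset it, and return the line."""
    line = "PROFILEINFO ######### %s time = %f s #############" % (name, totals[name])
    print(line)
    totals[name] = 0.0
    return line


if __name__ == "__main__":
    stop_batch = 3  # hypothetical number of batches
    for cur_batch in range(stop_batch):
        with profile("step"):
            time.sleep(0.01)  # stand-in for the code being timed
        if cur_batch == stop_batch - 1:
            report("step")
```

Keeping the `PROFILEINFO` prefix means the output can still be filtered from the training log with a simple text search.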