xiaoqiugood's personal blog http://blog.sciencenet.cn/u/xiaoqiugood

VASP error: forrtl: error (78): process killed (SIGTERM)

Viewed 20756 times | 2020-5-22 00:51 | Personal category: structural optimization | System category: research notes

Abnormal termination of VASP calculations and how to fix it




1. Error:

Image              PC                Routine            Line        Source

vasp.5.4.4         0000000004EA0CCA  Unknown               Unknown  Unknown

libpthread-2.12.s  00000036CB60F710  Unknown               Unknown  Unknown

libmpi.so.12.0     00002AE947CF2C18  Unknown               Unknown  Unknown

forrtl: error (78): process killed (SIGTERM)

Image              PC                Routine            Line        Source

vasp.5.4.4         0000000004EA0CCA  Unknown               Unknown  Unknown

libpthread-2.12.s  00000036CB60F710  Unknown               Unknown  Unknown

libmpi.so.12.0     00002B4B72541407  Unknown               Unknown  Unknown

libmpi.so.12       00002B4B722B9B65  PMPIDI_CH3I_Progr     Unknown  Unknown

libmpi.so.12.0     00002B4B7245C243  Unknown               Unknown  Unknown

libmpi.so.12.0     00002B4B722654B8  Unknown               Unknown  Unknown

libmpi.so.12       00002B4B722696E6  PMPI_Allreduce        Unknown  Unknown

libmpifort.so.12.  00002B4B71E17FF1  mpi_allreduce_        Unknown  Unknown

vasp.5.4.4         0000000000446975  Unknown               Unknown  Unknown

vasp.5.4.4         00000000007F4426  Unknown               Unknown  Unknown

vasp.5.4.4         0000000000D9D2BE  Unknown               Unknown  Unknown

vasp.5.4.4         00000000013917BD  Unknown               Unknown  Unknown

vasp.5.4.4         000000000136E8A1  Unknown               Unknown  Unknown

vasp.5.4.4         0000000000438D1E  Unknown               Unknown  Unknown

libc-2.12.so       00000036CB21ED5D  __libc_start_main     Unknown  Unknown

vasp.5.4.4         0000000000438C29  Unknown               Unknown  Unknown

forrtl: error (78): process killed (SIGTERM)

Image              PC                Routine            Line        Source

vasp.5.4.4         0000000004EA0CCA  Unknown               Unknown  Unknown

libpthread-2.12.s  00000036CB60F710  Unknown               Unknown  Unknown

libmpi.so.12.0     00002AE947CF2C18  Unknown               Unknown  Unknown

forrtl: error (78): process killed (SIGTERM)

Image              PC                Routine            Line        Source

vasp.5.4.4         0000000004EA0CCA  Unknown               Unknown  Unknown

libpthread-2.12.s  00000036CB60F710  Unknown               Unknown  Unknown

libmpi.so.12.0     00002B4B72541407  Unknown               Unknown  Unknown

libmpi.so.12       00002B4B722B9B65  PMPIDI_CH3I_Progr     Unknown  Unknown

libmpi.so.12.0     00002B4B7245C243  Unknown               Unknown  Unknown

libmpi.so.12.0     00002B4B722654B8  Unknown               Unknown  Unknown

libmpi.so.12       00002B4B722696E6  PMPI_Allreduce        Unknown  Unknown

libmpifort.so.12.  00002B4B71E17FF1  mpi_allreduce_        Unknown  Unknown

vasp.5.4.4         0000000000446975  Unknown               Unknown  Unknown

vasp.5.4.4         00000000007F4426  Unknown               Unknown  Unknown

vasp.5.4.4         0000000000D9D2BE  Unknown               Unknown  Unknown

vasp.5.4.4         00000000013917BD  Unknown               Unknown  Unknown

vasp.5.4.4         000000000136E8A1  Unknown               Unknown  Unknown

vasp.5.4.4         0000000000438D1E  Unknown               Unknown  Unknown

libc-2.12.so       00000036CB21ED5D  __libc_start_main     Unknown  Unknown

vasp.5.4.4         0000000000438C29  Unknown               Unknown  Unknown

forrtl: error (78): process killed (SIGTERM)


2. Solution

http://muchong.com/html/201005/2084704.html

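According to the post linked above, the error output suggests that some shared libraries cannot be found. Note that the SIGTERM message itself only says the process was killed by an external signal, so a missing library is one possible cause to rule out rather than a certainty. Parallel programs such as VASP are usually compiled on the head node but run on the compute nodes. Some cluster installers (certain vendors in particular) put all software on the head node and export it to the compute nodes over a network share (NFS); others install everything on every compute node. Either way, you need to check whether the shared libraries can be found on each compute node. Use the following command to see which shared library locations are currently set: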
echo $LD_LIBRARY_PATH




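Then check whether the directories that hold the shared libraries you need appear in the output of that command.
For example, for the first library in the error output, look up where it is installed: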


locate libmpi.so.12.0

/usr/mpi/gcc/openmpi-1.10.3rc4/lib64/libmpi.so.12.0.3


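The file libmpi.so.12.0.3 is in /usr/mpi/gcc/openmpi-1.10.3rc4/lib64/, but your shared library path contains only /mpi/lib, so the compute nodes clearly cannot find this shared file and you have to add its directory by hand. To do so, add the following line to the .bashrc (or .bash_profile) file in your home directory: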
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/mpi/gcc/openmpi-1.10.3rc4/lib64
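
If .bashrc is sourced repeatedly, a plain export like the one above keeps appending the same directory. The following is a minimal sketch of a guard against that, not part of the original post; the function name add_lib_dir is hypothetical.

```shell
# Hypothetical helper (an assumption, not from the original post):
# append a directory to LD_LIBRARY_PATH only if it is not already an entry.
add_lib_dir() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*)
            # Already listed: leave the variable unchanged.
            ;;
        *)
            # Append, adding the ':' separator only when the variable is non-empty.
            export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}"
            ;;
    esac
}

# Example call, using the directory reported by locate:
# add_lib_dir /usr/mpi/gcc/openmpi-1.10.3rc4/lib64
```

Wrapping the variable in colons before matching means only whole entries count, so /usr/lib does not match /usr/lib64.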


export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/mpi/gcc/openmpi-1.10.3rc4/lib64:/lib:/lib/i686/nosegneg:/lib64
Repeat the steps above until every shared library named in the errors can be found by the system.
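
One way to confirm that nothing is still missing is to run ldd on the VASP binary from inside a job on a compute node and filter for unresolved entries. This is a sketch under assumptions: the helper name list_missing_libs and the binary path in the usage comment are hypothetical, and it relies on the "=> not found" marker that ldd prints for unresolved libraries.

```shell
# Hypothetical helper (an assumption, not from the original post):
# read ldd output on stdin and print the name of each library that the
# dynamic linker reports as "not found". Prints nothing if all resolve.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on a compute node (replace the path with your own VASP binary):
# ldd /path/to/vasp_std | list_missing_libs
```

An empty result means every library resolved in that environment. In that case the SIGTERM probably has another cause, for example the job being killed by the scheduler.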



echo $LD_LIBRARY_PATH

/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/mpi/mic/lib:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/ipp/lib/intel64:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/tbb/lib/intel64/gcc4.4:/public/home/users/application/compiler/parallel_studio_xe_2017/debugger_2017/iga/lib:/public/home/users/application/compiler/parallel_studio_xe_2017/debugger_2017/libipt/intel64/lib:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/daal/lib/intel64_lin:/public/home/users/application/compiler/parallel_studio_xe_2017/compilers_and_libraries_2017.4.196/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/public/software/mpi/openmpi/1.6.5/intel/lib:/public/software/compiler/intel/composer_xe_2015.2.164/compiler/lib/intel64:/public/software/compiler/intel/composer_xe_2015.2.164/mkl/lib/intel64::/opt/gridview//pbs/dispatcher/lib:/usr/local/lib64:/usr/local/lib
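
A value this long is hard to read on one line. The sketch below, which is not from the original post, prints one entry per line and marks directories that do not exist on the current node; the function name check_lib_dirs is hypothetical. Empty entries, such as the one produced by the "::" in the output above, are reported separately because the dynamic linker treats them as the current directory.

```shell
# Hypothetical helper (an assumption, not from the original post):
# print each entry of a colon-separated path list on its own line, prefixed
# with "ok" if the directory exists on this node and "missing" otherwise.
# With no argument it inspects the current LD_LIBRARY_PATH.
check_lib_dirs() {
    local list="${1:-${LD_LIBRARY_PATH:-}}"
    if [ -z "$list" ]; then
        echo "LD_LIBRARY_PATH is empty"
        return 0
    fi
    printf '%s\n' "$list" | tr ':' '\n' | while IFS= read -r dir; do
        if [ -z "$dir" ]; then
            echo "empty"            # empty entry between two colons
        elif [ -d "$dir" ]; then
            echo "ok $dir"
        else
            echo "missing $dir"
        fi
    done
}

# Usage: run inside a job so that the compute node's environment is inspected.
# check_lib_dirs
```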




https://blog.sciencenet.cn/blog-567091-1234328.html
