VASP Problems & Solutions + NELMDL Explained

Posted 2012-9-17 10:59 | Category: Research notes | version

Compilation of VASP on Opteron/Rocks cluster Ametisti with Pathscale 2.2.1 Fortran compiler

These observations apply to the serial version of VASP. The parallel section dealing with VASP on Ametisti using the MPI interface will be included later.

I used the makefile from Janne Blomqvist at Helsinki University of Technology (Thanks Janne).

A few problems showed up after a successful compilation.

Serial version problems:

Problem 1: Error EDDDAV: Call to ZHEGV failed.

Problem 2: lib-4201 : UNRECOVERABLE library error

Problem 3: WARNING in EDDRMM: call to ZHEGV failed, returncode = 6 3 9

Parallel version problems:

Will be added at some later date

Problem 1)

The test calculation for the copper system (benchmark.tar.gz, downloaded from the VASP ftp-server) stops with an error message:

"Error EDDDAV: Call to ZHEGV failed. Returncode = 9 1 8".

The actual numbers change at the end of the error, but the message means that a LAPACK library call failed.

Solution:

The file davidson.F must be compiled at a lower optimization level.

Add the following lines to the end of the VASP Makefile:

davidson.o : davidson.F
	$(CPP)
	$(FC) $(FFLAGS) -O1 -c $*$(SUFFIX)

(You remembered to indent the second and third lines with the TAB key instead of spaces, right?)

Problem 2)

After lowering the davidson subroutine optimization level the calculation ends with another error:

"lib-4201 : UNRECOVERABLE library error: Unable to find error message (check NLSPATH, file lib.cat)

Encountered during a direct access unformatted READ from unit 21. Fortran unit 21 is connected to a direct unformatted unblocked file: "TMPCAR"

/opt/gridengine/default/spool//compute-0-4/job_scripts/10170: line 31: 16997 Aborted (core dumped) ./vasp_path_serial >vasp_path_serial.out"

Solution:

Add the line IWAVPR=10 to your INCAR file, or change the existing IWAVPR line to this value. This comes from the FAQ section of the VASP manual, page 149:

"Question: I am running VASP on a SGI Origin, and the simple benchmark (benchmark.tar.gz) fails with lib-4201 :

UNRECOVERABLE library error READ operation tried to read past the end-of-record.

Encountered during a direct access unformatted READ from unit 21

Fortran unit 21 is connected to a direct unformatted unblocked file:

"TMPCAR" IOT Trap

Abort (core dumped)

Answer: VASP extrapolates the wave functions between molecular dynamics time steps. To store the wave functions of the previous time steps either a temporary scratch file (TMPCAR) is used (IWAVPR=1-9) or large work arrays are allocated (IWAVPR=11-19). On the SGI, the version that uses a temporary scratch file does not compile correctly, and hence the user has to set IWAVPR to 10."
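Setting a tag such as IWAVPR in an existing INCAR can be scripted. The following is only a minimal sketch: `set_incar_tag` is a hypothetical helper, not part of VASP, and it assumes one tag per line (no `;`-separated tags) and drops any trailing comment on the line it replaces.

```python
import re


def set_incar_tag(incar_text: str, tag: str, value: str) -> str:
    """Return INCAR text in which `tag` is set to `value`.

    Assumption: each tag sits on its own line. An existing line for the
    tag is replaced (first occurrence only); otherwise a line is appended.
    """
    # [ \t]* instead of \s* so that preceding blank lines are not swallowed.
    pattern = re.compile(
        rf"^[ \t]*{re.escape(tag)}[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f"{tag} = {value}"
    if pattern.search(incar_text):
        # A function replacement avoids backslash interpretation in `value`.
        return pattern.sub(lambda _match: new_line, incar_text, count=1)
    separator = "" if incar_text.endswith("\n") or not incar_text else "\n"
    return f"{incar_text}{separator}{new_line}\n"


if __name__ == "__main__":
    print(set_incar_tag("ISTART = 0\nIWAVPR = 1\n", "IWAVPR", "10"))
```

The same helper could be reused for the other INCAR changes mentioned in this post (ALGO, ISYM, POTIM), under the same one-tag-per-line assumption.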

Problem 3)

When running the Hg benchmark (bench.Hg.tar.gz), the OUTCAR file has numerous lines saying:

"WARNING in EDDRMM: call to ZHEGV failed, returncode = 6 3 9"

Solution:

This issue is addressed in the VASP support forum (http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?3.214)

A short summary is given here. Possible reasons (this may, once again, be connected to failing LAPACK calls):

1) The diagonalization algorithm is not stable for your system

-> Set ALGO = Normal or ALGO = Fast in your INCAR file

2) Your geometry is not reasonable. Either the initial structure is bad, or the ionic relaxation algorithm is producing a bad structure

-> Switch to a different ionic relaxation scheme (IBRION line in your INCAR)

-> Reduce the size of the first step by lowering the POTIM value in your INCAR

3) There is a problem with your LAPACK installation

-> Use the LAPACK routines that came with the VASP library tar-file (vasp.4.lib)

4) If the error persists even after the ALGO change described in 1), there may be a problem with LAPACK.

-> Disable the calls to the ZHEGV subroutine. This can be done by commenting out the line

# define USE_ZHEEVX

in davidson.F, subrot.F and wavpre_noio.F.

-> Recompile VASP after changing these files
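To see how often the ZHEGV failure occurs, and in which routine, the messages can be counted. This is a rough sketch that assumes the messages look exactly like the two quoted in this post ("Error EDDDAV: Call to ZHEGV failed..." and "WARNING in EDDRMM: call to ZHEGV failed..."); `count_zhegv_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches both message variants quoted in this post and captures the
# name of the routine that reported the failure (e.g. EDDDAV, EDDRMM).
ZHEGV_RE = re.compile(r"(?:WARNING in|Error) (EDD\w+): [Cc]all to ZHEGV failed")


def count_zhegv_failures(lines):
    """Count ZHEGV failure messages per reporting routine."""
    counts = Counter()
    for line in lines:
        match = ZHEGV_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "WARNING in EDDRMM: call to ZHEGV failed, returncode = 6 3 9",
        "Error EDDDAV: Call to ZHEGV failed. Returncode = 9 1 8",
    ]
    print(count_zhegv_failures(sample))
```

For a real run, pass the open OUTCAR (or stdout) file object as `lines`; iterating over a file yields one line at a time.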

Reposted from http://blog.163.com/wangle_xq/blog/static/1302592200961033436254/

Additions:

1. ERROR: charge density could not be read from file CHGCAR for ICHARG>10
and
ERROR: while reading WAVECAR, plane wave coefficients changed 23962, 11981

Possible fixes:

(1) An incorrect ISMEAR setting sometimes causes this error.

(2) Check whether the FFT meshes have changed.

(3) Check that CHGCAR really is in the working directory at runtime.

(4) Check that the FFT meshes of the CHGCAR are compatible.

More verbosely: grep the OUTCAR file from the SCF calculation and find the values of NGXF, NGYF, and NGZF. Set these parameters explicitly in the INCAR file that you use when calculating the band structure (the ICHARG=11 run).
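The grep step can also be done in Python. This is a minimal sketch, assuming the OUTCAR reports the fine FFT grid on a single line of the form `dimension x,y,z NGXF=    48 NGYF=   48 NGZF=   48`; both function names are hypothetical helpers.

```python
import re

# Assumed OUTCAR layout: the three fine-grid values appear on one line.
FINE_GRID_RE = re.compile(
    r"NGXF\s*=\s*(\d+)\s+NGYF\s*=\s*(\d+)\s+NGZF\s*=\s*(\d+)"
)


def fine_fft_grid(outcar_text):
    """Return (NGXF, NGYF, NGZF) from OUTCAR text, or None if not found."""
    match = FINE_GRID_RE.search(outcar_text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def incar_grid_lines(grid):
    """Format the grid as INCAR lines for the ICHARG=11 run."""
    ngxf, ngyf, ngzf = grid
    return f"NGXF = {ngxf}\nNGYF = {ngyf}\nNGZF = {ngzf}\n"


if __name__ == "__main__":
    sample = "   dimension x,y,z NGXF=    48 NGYF=   48 NGZF=   48\n"
    grid = fine_fft_grid(sample)
    if grid is not None:
        print(incar_grid_lines(grid), end="")
```

Read the SCF run's OUTCAR into a string, pass it to `fine_fft_grid`, and paste the printed lines into the INCAR of the band-structure run.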

2. WARNING: Sub-Space-Matrix is not hermitian in DAV 1, -18.497193968206293

Error reading item 'IMAGES' from file INCAR
Possible fix: use IALGO=48 instead of the default IALGO.

3. "mpirun has exited due to process rank 3 with PID 15887 on node linuxtest2 exiting without
calling "finalize". This may have caused other processes in the application to be terminated by
signals sent by mpirun (as reported here)."
Related answers:

(1) The solution depends on why one process stops abruptly. If the process stops because of a fatal error, there is not much you can do unless you have a mechanism to trap exceptions. If you can trap exceptions, you can insert code at several places that makes all processes check for an exception. If any process finds one, all processes call MPI_FINALIZE and stop.
On the other hand, if the process stops abruptly because of a stop (Fortran) or exit (C, C++) statement, you should revise the code so that stop or exit statements appear only in sections of code that all processes execute.

(2) Regarding your first question, as Dion says in the previous answer, the first thing to check is the following: make sure that when you run the command

mpirun -np N -machinefile hosts.txt mpipython <your ESyS-Particle script filename>

N is equal to NProc + 1, where NProc is the total number of processing units (tasks) that you want to use for your simulation, i.e., the total number of sub-domains into which your model is subdivided.
For example, if you do not subdivide your model into sub-domains, i.e., you run a simple sequential simulation rather than a parallel one, then NProc = 1 and N = 2. In this case hosts.txt must contain the name of the host (localhost in the case of a dual-core desktop or laptop) on a single line.

Then, be sure that you put the hosts.txt file in the folder where your ESyS-Particle script is. In this case, it means hosts.txt must be in the same folder as the GravityTut.py file, where you execute the command

mpirun -np 2 -machinefile hosts.txt `which mpipython` GravityTut.py

Notice that, compared to the version of the command you used, I added the following option

-machinefile hosts.txt

This tells mpirun to run the program mpipython ... on the hosts listed in the line(s) of hosts.txt.
localhost inside hosts.txt simply means that you are using the same host from which you run the mpirun command, that is, your desktop/laptop. On a compute cluster, you would instead list the names of the individual nodes on which to run your (parallel) program.
On multi-core desktops/laptops, you just need to use the word localhost. To run the program in parallel, repeat it on separate lines, once for each process you want to use.
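The bookkeeping in this answer (N = NProc + 1, and one host name per line in hosts.txt) can be written down as a small sketch. Both function names are hypothetical, and how many lines the file should contain is left to the caller, since the answer gives it case by case.

```python
def mpirun_process_count(nproc):
    """Value for `mpirun -np`, following the rule N = NProc + 1 quoted above."""
    if nproc < 1:
        raise ValueError("NProc must be at least 1")
    return nproc + 1


def hosts_file_text(n_lines, host="localhost"):
    """Contents of hosts.txt: the host name repeated on `n_lines` lines."""
    return "".join(f"{host}\n" for _ in range(n_lines))


if __name__ == "__main__":
    # Sequential case from the answer: NProc = 1, so -np is 2,
    # and hosts.txt holds the host name on a single line.
    print(mpirun_process_count(1))
    print(hosts_file_text(1), end="")
```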

(3) I solved this problem by setting ISYM=0.

4. hit a member that was already found in another star
I also solved this problem by setting ISYM=0.

Other answers:

(1) This message usually occurs only if the unit cell has a very strange shape. Please check the symmetry information in OUTCAR.

(2) Choosing the lattice vectors and basis as suggested by VASP should solve the problem.

(3) For reasonable lattices (Bravais matrices) this warning usually does not show up; it usually indicates that the angle between two lattice vectors is close to 0 or 180 degrees. Please continue only if you are sure that your cell geometry is correct. VASP will, of course, continue to calculate with that lattice as usual.

5. The same error as in the third item above, with one more question

After a relaxation I copied CONTCAR to POSCAR for a static calculation. Why does this error appear, when the original POSCAR worked fine?
vasp.4.6.36 17Feb09 complex
POSCAR found : 2 types and 16 ions
LDA part: xc-table for Ceperly-Alder, standard interpolation
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
vasp 00000000004AE4C6 Unknown Unknown Unknown
vasp 00000000004AA464 Unknown Unknown Unknown
vasp 000000000041CBE8 Unknown Unknown Unknown
vasp 0000000000415CBC Unknown Unknown Unknown
libc.so.6 0000003A4DC1C3FB Unknown Unknown Unknown
vasp 0000000000415BEA Unknown Unknown Unknown
[the same traceback is printed six more times]
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 396 on
node node4 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

Possible fixes: (1) I solved this problem by setting ISYM=0.

(2) Recompile the program with a different optimization level, or check the ulimit settings.

6. VERY BAD NEWS! internal error in subroutine IBZKPT:
Reciprocal lattice and k-lattice belong to different class of lattices. 48

VERY BAD NEWS! internal error in subroutine IBZKPT:
Reciprocal lattice and k-lattice belong to different class of lattices. 48
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ... 2
reading WAVECAR
entering main loop
N E dE d eps ncg rms rms(c)

Solution: unresolved.

To be added later.

7. ZBRENT: fatal internal in brackting system-shutdown; contact gK immediately

This message means that during the geometry optimization no reasonable next step could be found with Brent's algorithm (linear bisection). Please check the following:
1) Is the system already converged? (Look at the forces in OUTCAR, especially if you use the total energy change as the convergence criterion for the ionic optimization.)
2) If not: is each ionic step converged electronically? (This is needed to obtain reasonable forces.)
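Check 1) can be made concrete. With a negative EDIFFG, VASP stops the relaxation once the norm of the force on every ion is below |EDIFFG|. The sketch below applies that test to a list of force vectors; it assumes the forces (in eV/Å) have already been copied from the TOTAL-FORCE block of OUTCAR, and the function names are hypothetical.

```python
import math


def max_force_norm(forces):
    """Largest Euclidean norm among per-ion force vectors (fx, fy, fz)."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def forces_converged(forces, ediffg):
    """True if every force norm is below |EDIFFG| (force criterion only)."""
    if ediffg >= 0:
        raise ValueError("this check applies only to a negative EDIFFG")
    return max_force_norm(forces) < abs(ediffg)


if __name__ == "__main__":
    # Made-up forces for illustration, not taken from a real calculation.
    forces = [(0.003, 0.004, 0.0), (0.0, -0.001, 0.002)]
    print(max_force_norm(forces), forces_converged(forces, -1e-2))
```

If this returns True for the last ionic step, the structure is already converged by the force criterion and the ZBRENT message mostly reflects a step that is too small to resolve.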

Well, I encountered the same problem when optimizing a geometry. I set EDIFFG=-1E-2, and every SCF cycle converged. But in the last SCF cycle the non-local force acting on the ions is -.178E-14 0.000E+00 0.000E+00; that does not seem small enough, right? The total drift is 0.000063 -0.000007 -0.000005.
I don't know whether the optimization finished properly,
or which parameter I need to set so that the optimization continues.

head admin (reply): Sometimes, if the calculation is already highly converged, VASP cannot interpolate the next step to within the numerical accuracy because the step would simply be too small (this behaviour is machine-dependent). If you have used IBRION=2, it may help to switch to IBRION=1 and additionally set ADDGRID=.TRUE.

As the head admin said: I fixed my NEB optimization by switching from IBRION=2 (conjugate gradient) to IBRION=1 (quasi-Newton). Thanks.

Reposted from http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?3.869

8. Information: wavefunction orthogonal band 40 0.3870

On an SGI Origin 3000 IRIX, we have a user with an issue running VASP 4.6.28 on a very small input. His output is below, but the error is “ERROR FEXCP: supplied Exchange-correletion table is too small, maximal index : 2147483647”. The user tells me that the code crashes when IWAVPR = 2 is used. Would it help if I supplied the user's input files?

Thank you,

Chris

> mpirun -np 8 $WORKDIR/vasp.4.6-mpi/vasp

MPI:cpuset name = 5063-0

running on 8 nodes

distr: one band on 1 nodes, 8 groups

vasp.4.6.28 25Jul05 complex

POSCAR found : 2 types and 14 ions

LDA part: xc-table for Ceperly-Alder, standard interpolation

POSCAR, INCAR and KPOINTS ok, starting setup

FFT: planning ... 1

reading WAVECAR

prediction of wavefunctions initialized - no I/O

entering main loop

N E dE d eps ncg rms rms(c)

RMM: 1 0.257110684240E+03 0.25711E+03 -0.66168E+03 512 0.416E+02

RMM: 2 0.542260330471E+02 -0.20288E+03 -0.22272E+03 512 0.117E+02

RMM: 3 -0.209237073829E+02 -0.75150E+02 -0.78091E+02 512 0.802E+01

RMM: 4 -0.512862710105E+02 -0.30363E+02 -0.24864E+02 512 0.534E+01

RMM: 5 -0.604342554867E+02 -0.91480E+01 -0.81863E+01 512 0.305E+01

RMM: 6 -0.639606371480E+02 -0.35264E+01 -0.30751E+01 512 0.191E+01

RMM: 7 -0.653312726206E+02 -0.13706E+01 -0.12794E+01 512 0.118E+01

RMM: 8 -0.659387291604E+02 -0.60746E+00 -0.56097E+00 512 0.784E+00

RMM: 9 -0.664261562862E+02 -0.48743E+00 -0.47733E+00 1194 0.503E+00

RMM: 10 -0.664457885097E+02 -0.19632E-01 -0.21726E-01 1302 0.127E+00

RMM: 11 -0.664470230231E+02 -0.12345E-02 -0.12460E-02 1195 0.231E-01

RMM: 12 -0.664470249490E+02 -0.19260E-05 -0.12383E-04 1208 0.553E-02 0.102E+01

RMM: 13 -0.635837645668E+02 0.28633E+01 -0.70831E+00 1026 0.764E+00 0.170E+00

RMM: 14 -0.634626945503E+02 0.12107E+00 -0.53221E-01 1053 0.227E+00 0.470E-01

RMM: 15 -0.634338682456E+02 0.28826E-01 -0.83545E-02 1067 0.817E-01 0.242E-01

RMM: 16 -0.634205023438E+02 0.13366E-01 -0.25201E-02 1058 0.422E-01 0.200E-01

RMM: 17 -0.634083709340E+02 0.12131E-01 -0.14890E-02 1033 0.311E-01 0.125E-01

RMM: 18 -0.634005700848E+02 0.78008E-02 -0.17008E-02 1035 0.352E-01 0.613E-02

RMM: 19 -0.634008881412E+02 -0.31806E-03 -0.49093E-03 1027 0.202E-01 0.313E-02

RMM: 20 -0.634032494445E+02 -0.23613E-02 -0.20939E-03 1024 0.187E-01

1 F= -.63403249E+02 E0= -.63432026E+02 d E =-.634032E+02

curvature: 0.00 expect dE= 0.000E+00 dE for cont linesearch 0.000E+00

trial: gam= 0.00000 g(F)= 0.271E+01 g(S)= 0.000E+00 ort = 0.000E+00 (trialstep = 0.100E+01)

search vector abs. value= 0.271E+01

bond charge predicted

N E dE d eps ncg rms rms(c)

RMM: 1 -0.622380488974E+02 -0.62238E+02 -0.80856E+01 1034 0.250E+01 0.257E+00

RMM: 2 -0.623627374801E+02 -0.12469E+00 -0.37779E+00 1092 0.575E+00 0.182E+00

RMM: 3 -0.620302994364E+02 0.33244E+00 -0.62853E-01 1090 0.197E+00 0.544E-01

RMM: 4 -0.620175914925E+02 0.12708E-01 -0.91897E-02 1109 0.776E-01 0.183E-01

RMM: 5 -0.620171460094E+02 0.44548E-03 -0.11059E-02 1156 0.288E-01 0.566E-02

RMM: 6 -0.620170080844E+02 0.13792E-03 -0.10960E-03 1121 0.987E-02 0.194E-02

RMM: 7 -0.620170386641E+02 -0.30580E-04 -0.20009E-04 1090 0.475E-02 0.106E-02

RMM: 8 -0.620171699581E+02 -0.13129E-03 -0.11087E-04 1043 0.360E-02 0.547E-03

RMM: 9 -0.620173525227E+02 -0.18256E-03 -0.10615E-04 1017 0.333E-02 0.302E-03

RMM: 10 -0.620175143216E+02 -0.16180E-03 -0.13935E-04 997 0.327E-02 0.126E-03

RMM: 11 -0.620175602815E+02 -0.45960E-04 -0.61231E-05 962 0.176E-02 0.775E-04

RMM: 12 -0.620175725727E+02 -0.12291E-04 -0.29721E-05 871 0.100E-02 0.408E-04

RMM: 13 -0.620175758385E+02 -0.32658E-05 -0.69854E-06 773 0.448E-03 0.284E-04

RMM: 14 -0.620175801449E+02 -0.43064E-05 -0.57089E-07 748 0.415E-03 0.136E-04

RMM: 15 -0.620175808736E+02 -0.72864E-06 0.12261E-06 654 0.218E-03

2 F= -.62017581E+02 E0= -.62066802E+02 d E =0.138567E+01

trial-energy change: 1.385669 1 .order 0.301372 -2.714160 3.316904

step: 0.2486(harm= 0.4500) dis= 0.08163 next Energy= -63.723907 (dE=-0.321E+00)

Information: wavefunction orthogonal band 28 0.8265

Information: wavefunction orthogonal band 31 0.8884

Information: wavefunction orthogonal band 32 0.7410

Information: wavefunction orthogonal band 32 0.8199

Information: wavefunction orthogonal band 32 0.8947

Information: wavefunction orthogonal band 32 0.8917

Information: wavefunction orthogonal band 31 0.8413

Information: wavefunction orthogonal band 32 0.8751

Information: wavefunction orthogonal band 32 0.7849

Information: wavefunction orthogonal band 28 0.8995

Information: wavefunction orthogonal band 31 0.8844

Information: wavefunction orthogonal band 32 0.8960

Information: wavefunction orthogonal band 29 0.7967

Information: wavefunction orthogonal band 32 0.8830

Information: wavefunction orthogonal band 32 0.7903

Information: wavefunction orthogonal band 28 0.8881

Information: wavefunction orthogonal band 32 0.8822

Information: wavefunction orthogonal band 31 0.8943

Information: wavefunction orthogonal band 27 0.8857

Information: wavefunction orthogonal band 30 0.8459

Information: wavefunction orthogonal band 31 0.8960

Information: wavefunction orthogonal band 32 0.8219

Information: wavefunction orthogonal band 30 0.8979

Information: wavefunction orthogonal band 31 0.8921

Information: wavefunction orthogonal band 32 0.8013

bond charge predicted

prediction of wavefunctions

N E dE d eps ncg rms rms(c)

ERROR FEXCP: supplied Exchange-correletion table

is too small, maximal index : 2147483647

Solution:

It seems that the first ionic relaxation steps lead to an unreasonable geometry and hence an unreasonable electron density. Please check your XDATCAR file and decrease POTIM in INCAR if this is the case. It may also help to switch to a different ionic relaxation algorithm (IBRION).

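NELMDL explained

The NELMDL keyword:

A) Purpose: NELMDL specifies the number of non-self-consistent electronic steps at the start of the calculation (in the manual's words, "NELMDL gives the number of non-selfconsistent steps at the beginning"); the aim is to make calculations faster. "Non-self-consistent" means that the charge density is kept fixed. Since the charge density is used to set up the Hamiltonian, it also means that the initial Hamiltonian is kept fixed.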

B) Default value:

NELMDL = -5 (if ISTART=0, INIWAV=1, and IALGO=8)

NELMDL = -12 (if ISTART=0, INIWAV=1, and IALGO=48)

NELMDL = 0 (otherwise)

NELMDL might be positive or negative.

A positive number means that a delay is applied after each ionic movement (i.e. at every ionic step) -- in general not a convenient option.

A negative value results in a delay only for the start-configuration (i.e. only before the first ionic step).
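The sign convention can be summarized in a few lines of Python. This is only an illustration of the rule stated above, not VASP code; `delayed_steps` is a hypothetical name and ionic steps are counted from 1.

```python
def delayed_steps(nelmdl, ionic_step):
    """Number of non-self-consistent electronic steps at the start of an ionic step.

    nelmdl > 0: the delay is applied at every ionic step.
    nelmdl < 0: the delay is applied only for the start configuration
                (ionic_step == 1).
    nelmdl == 0: no delay.
    """
    if nelmdl > 0:
        return nelmdl
    if nelmdl < 0 and ionic_step == 1:
        return -nelmdl
    return 0


if __name__ == "__main__":
    for step in (1, 2, 3):
        print(step, delayed_steps(-5, step), delayed_steps(3, step))
```

With NELMDL = -5 only the first ionic step starts with five fixed-density steps, while NELMDL = 3 repeats the delay at every ionic step, which is why the positive form is rarely convenient.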

C) Why can NELMDL reduce the time a calculation takes?

Charge density is used to set up the Hamiltonian, then the wavefunctions are optimized iteratively so that they get closer to the exact wavefunctions of this Hamiltonian. From the optimized wavefunctions a new charge density is calculated, which is then mixed with the old input-charge density. The manual (p. 105) gives a brief flowchart of this procedure.

In general the initial guessed wavefunctions are far from correct. During the first NELMDL non-self-consistent steps the charge density, and therefore the initial Hamiltonian, is kept fixed and only the wavefunctions are optimized. Once the wavefunctions are reasonably close to the exact wavefunctions of the initial Hamiltonian, the charge density is optimized as well. This takes much less computing time than optimizing the charge density and the wavefunctions together from the very first step.

To repost this article, please contact the original author for permission and state that it comes from the ScienceNet blog of 王达.
Link: https://blog.sciencenet.cn/blog-671981-613568.html
