You must not attempt to run jobs on the login node. Run jobs by submitting them to the queueing system, which will distribute them among the nodes.
The essence of using a queueing system is to plan ahead: you cannot submit a job and expect it to run immediately. If you need to run a job that does a large amount of file reading/writing, you must arrange to do this on the scratch space (/local) on the nodes. The processors are fast enough that heavy reading/writing to the user area jams up the network connections and creates serious problems on the fileservers.
The queueing system on bluecrystalp2 is torque (a development of OpenPBS), using the Moab scheduler (a commercially supported development of Maui). You prepare your job and submit it to the queueing system from the login node. It will then be run on the nodes as soon as resources are available. You may submit as many jobs as you want and the system will run them as soon as it can.
Users can each have up to a soft limit of 256 and a hard limit of 320 cores (processors) in use at any one time. This means that if the system is busy, your number of cores in use can go up to the soft limit. If the system has unused resources, with no jobs eligible to run, then more of your jobs up to the core hard limit can run.
You can submit more jobs than this and the extra jobs will stay in the queues until they are eligible to run.
However, you should note that a single job asking for more than the soft limit of cores will never run, even if there are spare resources on the system.
Also note that parallel jobs should be close-packed onto nodes, i.e. use the minimum number of nodes that provides the number of cores you need.
Always ask for complete nodes for a parallel job. Otherwise, if another job shares a node with your parallel job, the cleanup system that deletes dangling processes when that job starts or ends may delete your processes too. This does not apply to serial jobs, of course.
Similarly each user can have up to a soft limit of 64 and a hard limit of 128 jobs running at any one time (as the system is weighted to favour parallel jobs).
Each user may have a maximum of 10,000 jobs in each of the veryshort and short queues at one time, with an overall limit of 20,000 jobs for each of these queues.
Each user may have a maximum of 3,000 jobs in each of the medium and long queues at one time, with an overall limit of 10,000 jobs for each of these queues.
These limits are to prevent the queuing system becoming overwhelmed.
If there are idle processors available and no queued jobs, a job will run at once. If the system is at capacity and there are jobs waiting to run, a fair shares system will determine the priority in which these jobs will run. There are contributions from several sources to calculate the priority.
Some useful commands:
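The commands below are the standard Torque and Moab tools for inspecting and managing jobs; the names are standard for these systems, though you should verify availability on bluecrystalp2:

```shell
qstat -a          # list all jobs in the queues
qstat -u $USER    # list only your own jobs
showq             # Moab's view of running, idle and blocked jobs
checkjob <jobid>  # detailed Moab information about one job
qdel <jobid>      # delete (cancel) a job
pbsnodes -a       # show the state of every node
```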
There are various queues on bluecrystalp2. For normal use, the best method is not to specify a queue. Your job is then sent by default to a routeing queue (named default), which will send it to the appropriate queue to run. The queues are based on the amount of walltime you ask for.
The standard execution queues are:
The shorter queues can access a few more nodes than the longer queues. This is so that a job which could run and finish very quickly is not unreasonably held up.
The special queues are:
To run jobs in these special queues, use the qsub flag -q queue_name, e.g. -q himem.
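For example, to submit a hypothetical job script myjob.sh (the filename is illustrative) to the himem queue:

```shell
qsub -q himem myjob.sh
```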
By default, the queueing system will create two files: Your_jobname.oYour_jobIDno for standard output and Your_jobname.eYour_jobIDno for standard error (although they may well only contain a couple of system announcements).
These files will be in the directory you were in when you submitted the job. You can use different filenames for the .o and .e files the queueing system creates with the options -o your_filename1 (for stdout) and -e your_filename2 (for stderr), or merge the two into stdout with the option -j oe (you can also use the -o option with this).
Note that while your job is running these files are stored in the / partition on the node. This is a fairly small partition. If your job is liable to create a large stdout or stderr file, in your job submission script you should redirect this output to a file in your user disk space. If the / partition on the node becomes full you will lose this output.
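A minimal sketch of such a redirection inside a job script, assuming a hypothetical program my_program and a results directory in your home space:

```shell
#!/bin/bash
#PBS -l walltime=1:00:00,nodes=1:ppn=1

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# send the potentially large output to your own disk space,
# not to the node's small / partition
./my_program > $HOME/results/my_program.log 2>&1
```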
The two most useful options to qsub are -m [abe] and -l resource_list.
The -m option directs the system to send you email: a - if the job aborts, b - when it begins, and e - when it ends. By default, this email will remain on bluecrystalp2, and you can check for messages by running pine when you log in to bluecrystal. If you wish to send this email to your usual email address, you can either also use the qsub flag -M your_email_address, or create a file called .forward (there really is a dot at the beginning) in your home directory containing a single line: your usual email address. This file must be writeable only by you, i.e. permissions -rw-r-----. The email will then be forwarded to where you normally collect your email. The mail clients mutt and pine are available on bluecrystalp2.
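Creating the .forward file can be sketched as below; the address is a placeholder for your own:

```shell
# run this in your home directory; the address below is a placeholder
echo "you@example.com" > .forward
chmod 640 .forward   # -rw-r----- : writeable only by you
```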
If you use the option for the system to send you an email when the job ends, it will also contain some useful information on how long the job took and how much memory it needed etc.
Even if you do not use the -m qsub flag, if your job is aborted, an email will be sent.
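Putting the mail flags together, a hypothetical submission requesting mail when the job begins, ends or aborts, sent to an external address (the address and script name are illustrative):

```shell
qsub -m abe -M you@example.com myjob.sh
```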
It is important to ask for appropriate resources when you submit your job. See the man page pbs_resources_linux for information. This is the -l option, which takes a comma-separated list.
Important resources are:
For a parallel job, you should use the minimum number of nodes that will give you the number of cores you require, i.e. for 32 cores use 4 nodes with 8 cores on each, not 8 nodes with 4 cores on each. By close packing jobs onto the minimum number of nodes, nodes are made free for other parallel jobs. If a job is loose-packed across nodes, other jobs are held up, resources are wasted, and the job itself will have to wait longer before it starts running, as it needs more nodes.
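As a sketch, the close-packed 32-core request above would look like this in a job script (the walltime value is illustrative):

```shell
# 32 cores close-packed onto 4 whole nodes, 8 cores per node
#PBS -l nodes=4:ppn=8,walltime=12:00:00
```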
You submit jobs to the queueing system using the command qsub. The normal way to use qsub is to write a shell-script. Essentially this is a file containing a command, or list of commands as you would type them on the command line to run the job interactively. Supposing this shell-script is called filename, you would submit it with the command:
qsub [options] filename
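A minimal complete example of such a shell-script, assuming a hypothetical MPI program my_parallel_program (the exact mpirun invocation may differ on bluecrystalp2):

```shell
#!/bin/bash
#PBS -l walltime=2:00:00,nodes=1:ppn=8
#PBS -j oe

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# launch the hypothetical parallel program on the 8 requested cores
mpirun -np 8 ./my_parallel_program
```

If you saved this as myscript.sh, you would submit it with qsub myscript.sh.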