You must not attempt to run jobs on the login node. Run jobs by submitting them to the queueing system, which will distribute them among the nodes.
The essence of using a queueing system is to plan ahead: you cannot submit a job and expect it to run immediately. If you need to run a job that does a large amount of file reading/writing, you must arrange to do this on the scratch space (/local) on the nodes. The processors are fast enough that heavy reading/writing to the user area jams up the network connections and creates serious problems on the fileservers.
The queueing system on bluecrystalp2 is torque (a development of OpenPBS), using the Moab scheduler (a commercially supported development of Maui). You prepare your job and submit it to the queueing system from the login node. It will then be run on the nodes as soon as resources are available. You may submit as many jobs as you want and the system will run them as soon as it can.
Users can each have up to a soft limit of 256 and a hard limit of 320 cores (processors) in use at any one time. This means that if the system is busy, your number of cores in use can go up to the soft limit. If the system has unused resources, with no jobs eligible to run, then more of your jobs up to the core hard limit can run.
You can submit more jobs than this and the extra jobs will stay in the queues until they are eligible to run.
However, you should note that a single job asking for more than the soft limit of cores will never run, even if there are spare resources on the system.
Also note that parallel jobs should be close-packed onto nodes, i.e. use the minimum number of nodes that provides the number of cores you need.
Always ask for complete nodes for a parallel job. Otherwise, if another job shares a node with your parallel job, the cleanup system that deletes dangling processes when that job starts or ends may delete your processes too. This does not apply to serial jobs, of course.
Similarly each user can have up to a soft limit of 64 and a hard limit of 128 jobs running at any one time (as the system is weighted to favour parallel jobs).
Each user may have a maximum of 10,000 jobs in each of the veryshort and short queues at one time, with an overall limit of 20,000 jobs for each of these queues.
Each user may have a maximum of 3,000 jobs in each of the medium and long queues at one time, with an overall limit of 10,000 jobs for each of these queues.
These limits are to prevent the queuing system becoming overwhelmed.
If there are idle processors available and no queued jobs, a job will run at once. If the system is at capacity and there are jobs waiting to run, a fair shares system will determine the priority in which these jobs will run. There are contributions from several sources to calculate the priority.
Some useful commands:
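The commands below are the standard Torque and Moab tools for inspecting and managing jobs; the names are standard for these systems, though you should verify availability on bluecrystalp2:

```shell
qstat -a          # list all jobs in the queues
qstat -u $USER    # list only your own jobs
showq             # Moab's view of running, idle and blocked jobs
checkjob <jobid>  # detailed Moab information about one job
qdel <jobid>      # delete (cancel) a job
pbsnodes -a       # show the state of every node
```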
There are various queues on bluecrystalp2. For normal use, the best method is not to specify a queue. Your job is then sent by default to a routeing queue (named default), which will send it to the appropriate queue to run. The queues are based on the amount of walltime you ask for.
The standard execution queues are:
The shorter queues can access a few more nodes than the longer queues. This is so that a job which could run and finish very quickly is not unreasonably held up.
The special queues are:
To run jobs in these special queues, use the qsub flag -q queue_name, e.g. -q himem.
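For example, to submit a hypothetical job script myjob.sh (the filename is illustrative) to the himem queue:

```shell
qsub -q himem myjob.sh
```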
By default, the queueing system will create two files: Your_jobname.oYour_jobIDno for standard output and Your_jobname.eYour_jobIDno for standard error (although they may well only contain a couple of system announcements).
These files will be in the directory you were in when you submitted the job. You can use different filenames for the .o and .e files the queueing system creates with the options -o your_filename1 (for stdout) and -e your_filename2 (for stderr), or merge the two into stdout with the option -j oe (you can also use the -o option with this).
Note that while your job is running these files are stored in the / partition on the node. This is a fairly small partition. If your job is liable to create a large stdout or stderr file, in your job submission script you should redirect this output to a file in your user disk space. If the / partition on the node becomes full you will lose this output.
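A minimal sketch of such a redirection inside a job script, assuming a hypothetical program my_program and a results directory in your home space:

```shell
#!/bin/bash
#PBS -l walltime=1:00:00,nodes=1:ppn=1

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# send the potentially large output to your own disk space,
# not to the node's small / partition
./my_program > $HOME/results/my_program.log 2>&1
```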
The two most useful options to qsub are -m [abe] and -l resource_list.
The -m option directs the system to send you email: a - if the job aborts, b - when it begins, and e - when it ends. By default, this email will remain on bluecrystalp2, and you can check for messages by running pine when you log in to bluecrystal. If you wish to send this email to your usual email address, you can either also use the qsub flag -M your_email_address, or create a file called .forward (there really is a dot at the beginning) in your home directory containing a single line: your usual email address. This file must be writeable only by you, i.e. permissions -rw-r-----. The email will then be forwarded to where you normally collect your email. The mail clients mutt and pine are available on bluecrystalp2.
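Creating the .forward file can be sketched as below; the address is a placeholder for your own:

```shell
# run this in your home directory; the address below is a placeholder
echo "you@example.com" > .forward
chmod 640 .forward   # -rw-r----- : writeable only by you
```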
If you use the option for the system to send you an email when the job ends, it will also contain some useful information on how long the job took and how much memory it needed etc.
Even if you do not use the -m qsub flag, if your job is aborted, an email will be sent.
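Putting the mail flags together, a hypothetical submission requesting mail when the job begins, ends or aborts, sent to an external address (the address and script name are illustrative):

```shell
qsub -m abe -M you@example.com myjob.sh
```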
It is important to ask for appropriate resources when you submit your job. See the man page pbs_resources_linux for information. This is the -l option, which takes a comma-separated list.
Important resources are:
For a parallel job, you should use the minimum number of nodes that will give you the number of cores you require, i.e. for 32 cores use 4 nodes with 8 cores on each, not 8 nodes with 4 cores on each. By close packing jobs onto the minimum number of nodes, nodes are made free for other parallel jobs. If a job is loose-packed across nodes, other jobs are held up, resources are wasted, and the job itself will have to wait longer before it starts running, as it needs more nodes.
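As a sketch, the close-packed 32-core request above would look like this in a job script (the walltime value is illustrative):

```shell
# 32 cores close-packed onto 4 whole nodes, 8 cores per node
#PBS -l nodes=4:ppn=8,walltime=12:00:00
```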
You submit jobs to the queueing system using the command qsub. The normal way to use qsub is to write a shell-script. Essentially this is a file containing a command, or list of commands as you would type them on the command line to run the job interactively. Supposing this shell-script is called filename, you would submit it with the command:
qsub [options] filename
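A minimal complete example of such a shell-script, assuming a hypothetical MPI program my_parallel_program (the exact mpirun invocation may differ on bluecrystalp2):

```shell
#!/bin/bash
#PBS -l walltime=2:00:00,nodes=1:ppn=8
#PBS -j oe

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# launch the hypothetical parallel program on the 8 requested cores
mpirun -np 8 ./my_parallel_program
```

If you saved this as myscript.sh, you would submit it with qsub myscript.sh.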