PBS is a batch job queue manager that is configured through the qmgr command-line interface:
Qmgr: p s
#
# Create queues and set their attributes.
#
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq enabled = True
set queue workq started = True
#
# Set server attributes.
#
set server scheduling = True
set server default_queue = workq
set server log_events = 127
set server mail_from = adm
set server resources_default.ncpus = 1
set server resources_available.mem = 950mb
set server resources_available.ncpus = 1
set server resources_default.ncpus = 1
set server scheduler_iteration = 60

Qmgr: p n @localhost
#
# Create nodes and set their properties.
#
#
# Create and define node taz
#
# create node taz # unsupported operation
set node taz state = free
set node taz ntype = time-shared
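The same kind of settings can be scripted instead of typed at the Qmgr: prompt. The sketch below builds the command lines for a simple execution queue, in the form shown in the dump above, and can optionally hand them to `qmgr -c`. The helper names `queue_commands` and `apply_commands` are made up for this example and are not part of PBS; actually applying the commands assumes that qmgr is on the PATH and that the user has manager rights on the server.

```python
import subprocess


def queue_commands(name, priority=None, extra=None):
    """Return the qmgr command lines that create and start an execution queue.

    Hypothetical helper: it only formats strings, it does not talk to PBS.
    """
    cmds = [
        "create queue %s" % name,
        "set queue %s queue_type = Execution" % name,
    ]
    if priority is not None:
        cmds.append("set queue %s Priority = %d" % (name, priority))
    # extra is a list of (attribute, value) pairs, e.g. ("max_running", 1)
    for attr, value in (extra or []):
        cmds.append("set queue %s %s = %s" % (name, attr, value))
    cmds.append("set queue %s enabled = True" % name)
    cmds.append("set queue %s started = True" % name)
    return cmds


def apply_commands(cmds, dry_run=True):
    """Print the commands, or send each one to qmgr when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(cmd)
        else:
            # Assumes qmgr is installed and the caller is a PBS manager.
            subprocess.check_call(["qmgr", "-c", cmd])


if __name__ == "__main__":
    # Dry run only: prints the commands without contacting the server.
    apply_commands(queue_commands("workq"))
```

Keeping `dry_run=True` by default makes it possible to review the generated lines before anything is sent to the server.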
Here is the configuration of the cluster at work:
Qmgr: p s
#
# Create queues and set their attributes.
#
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq Priority = 10
set queue workq enabled = True
set queue workq started = True
#
# Create and define queue bigmem
#
create queue bigmem
set queue bigmem queue_type = Execution
set queue bigmem Priority = 15
set queue bigmem resources_max.mem = 4gb
set queue bigmem resources_min.mem = 2gb
set queue bigmem resources_default.mem = 2gb
set queue bigmem resources_default.nodes = mpku50373
set queue bigmem enabled = True
set queue bigmem started = True
#
# Create and define queue fast
#
create queue fast
set queue fast queue_type = Execution
set queue fast Priority = 150
set queue fast max_running = 1
set queue fast resources_max.walltime = 01:00:00
set queue fast enabled = True
set queue fast started = True
#
# Set server attributes.
#
set server scheduling = True
set server max_user_run = 10
set server default_queue = workq
set server log_events = 127
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server scheduler_iteration = 60
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 10000

Qmgr: p n @localhost
#
# Create nodes and set their properties.
#
#
# Create and define node mpku50373
#
create node mpku50373
set node mpku50373 ntype = time-shared
set node mpku50373 state = free
set node mpku50373 Priority = 10
set node mpku50373 resources_available.arch = linux
set node mpku50373 resources_available.mem = 7954872kb
set node mpku50373 resources_available.ncpus = 2
set node mpku50373 resv_enable = True
#
# Create and define node mpkl40162
#
create node mpkl40162
set node mpkl40162 ntype = time-shared
set node mpkl40162 state = free
set node mpkl40162 Priority = 100
set node mpkl40162 resources_available.arch = linux
set node mpkl40162 resources_available.mem = 1024864kb
set node mpkl40162 resources_available.ncpus = 2
set node mpkl40162 resv_enable = True
Another configuration at work:
[kiefferj@sstrnp9o048B tmp]$ qmgr
Max open servers: 4
Qmgr: p s
#
# Create queues and set their attributes.
#
#
# Create and define queue fast
#
create queue fast
set queue fast queue_type = Execution
set queue fast Priority = 100
set queue fast max_running = 3
set queue fast enabled = True
set queue fast started = True
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq Priority = 10
set queue workq enabled = True
set queue workq started = True
#
# Set server attributes.
#
set server scheduling = True
set server default_queue = workq
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_available.ncpus = 4
set server resources_default.ncpus = 1
set server scheduler_iteration = 60
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.1.6

Qmgr: p n @localhost
#
# Create nodes and set their properties.
#
#
# Create and define node sstrnp9o048b
#
create node sstrnp9o048b
set node sstrnp9o048b state = busy
set node sstrnp9o048b np = 4
set node sstrnp9o048b ntype = time-shared
set node sstrnp9o048b status = opsys=linux
set node sstrnp9o048b status += uname=Linux sstrnp9o048B.d6.f1.enterprise 2.6.20-1.2300.fc5 #1 SMP Sun Mar 11 19:29:01 EDT 2007 x86_64
set node sstrnp9o048b status += sessions=6075 6111 13399 23284 20761 23794 24300 25988 28154 29293
set node sstrnp9o048b status += nsessions=10
set node sstrnp9o048b status += nusers=1
set node sstrnp9o048b status += idletime=56
set node sstrnp9o048b status += totmem=41789632kb
set node sstrnp9o048b status += availmem=32677724kb
set node sstrnp9o048b status += physmem=9021640kb
set node sstrnp9o048b status += ncpus=4
set node sstrnp9o048b status += loadave=4.01
set node sstrnp9o048b status += netload=3366063078
set node sstrnp9o048b status += state=busy
set node sstrnp9o048b status += jobs=34052.sstrnp9o048b.d6.f1.enterprise 34055.sstrnp9o048b.d6.f1.enterprise 34056.sstrnp9o048b.d6.f1.enterprise 34057.sstrnp9o048b.d6.f1.enterprise
set node sstrnp9o048b status += rectime=1175098629
It is important to edit the file /var/spool/torque/mom_priv/config and add the servers that are allowed to connect to this node, for example:
$clienthost patagonia
$logevent 0x1ff
$usescp
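When several compute nodes need the same file, its content can be generated rather than edited by hand. The following is a minimal sketch that only reuses the three directives shown above; the function name `mom_config` is hypothetical, and writing one `$clienthost` line per allowed host is an assumption about how additional hosts would be listed.

```python
import sys


def mom_config(client_hosts, logevent=0x1ff, use_scp=True):
    """Build the text of a mom_priv/config file (hypothetical helper).

    Assumption: each allowed host gets its own $clienthost line.
    """
    lines = ["$clienthost %s" % host for host in client_hosts]
    lines.append("$logevent 0x%x" % logevent)
    if use_scp:
        # Written bare, exactly as in the example above.
        lines.append("$usescp")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print instead of writing: the real file lives in
    # /var/spool/torque/mom_priv/ and needs root access.
    sys.stdout.write(mom_config(["patagonia"]))
```

The output can be redirected to the config file on each node; pbs_mom normally has to be restarted before it picks up the change.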
A script that submits a calculation with its input data inlined in the job file, useful for clusters of machines where an NFS mount is not possible:
#!/usr/bin/python
# This program submits a MOPAC job through the PBS queue manager
# with the input data inlined in the job script.
# Jerome Kieffer 29/03/2007
import os, sys

if len(sys.argv) != 2 or not os.path.isfile(sys.argv[1]):
    raise SystemExit("Please give the name of the file to process")
# Read the input file; its lines are embedded in the job script below.
data = [i.strip() for i in open(sys.argv[1])]
infile = sys.argv[1]
path, filename = os.path.split(infile)
basename = os.path.splitext(filename)[0]
pbs = basename + ".pbs"
print("submitting file : %s as pbs job : %s" % (filename, pbs))
cwd = os.getcwd()
os.chdir(os.path.join(cwd, path))
f = open(pbs, "w")
# Header of the generated job: PBS directives, imports, then the inlined data.
f.write("#!/usr/bin/python\n#PBS -l mem=100mb\n#PBS -l ncpus=1\nimport sys,os,tempfile\nexe='MOPAC2007'\ndata=")
f.write(str(data))
# Body of the generated job: rebuild the input in a temporary file, run MOPAC,
# send the .arc file to stdout and the .out file to stderr, then clean up.
f.write("""
filename=tempfile.mkstemp()[1]
f=open(filename+".dat","w")
for i in data:f.write(i+"\\n")
f.close()
os.system("%s %s.dat "%(exe,filename))
if os.path.isfile(filename+".arc"):
    arc=open(filename+".arc")
    for i in arc: sys.stdout.write(i)
    os.remove(filename+".arc")
os.remove(filename+".dat")
if os.path.isfile(filename+".out"):
    out=open(filename+".out")
    for i in out: sys.stderr.write(i)
    os.remove(filename+".out")
""")
f.close()
os.system("qsub %s" % pbs)
os.chdir(cwd)
A simplistic script to launch Gaussian03 through the PBS queue:
#!/usr/bin/python
# This program submits a Gaussian'03 job through the PBS queue manager
import os, sys

if len(sys.argv) < 2:
    print("Enter the name of the file you want to submit to Gaussian03")
    sys.exit(1)
infile = sys.argv[1]
path, filename = os.path.split(infile)
basename = os.path.splitext(filename)[0]
pbs = basename + ".pbs"
cwd = os.getcwd()
os.chdir(os.path.join(cwd, path))
f = open(pbs, "w")
# The job script moves to the input directory, sets up Gaussian and runs it.
f.write("#!/bin/sh\n#PBS -l mem=2000mb\ncd %s\nexport g03root=/opt\nexport GAUSS_SCRDIR=/tmp\nsource $g03root/g03/bsd/g03.profile\nnice g03 %s \n" % (os.path.join(cwd, path), filename))
f.close()
os.system("qsub %s" % pbs)
os.chdir(cwd)
PBS configuration of the Head machine:
Max open servers: 4
Qmgr: p s
#
# Create queues and set their attributes.
#
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq Priority = 10
set queue workq enabled = True
set queue workq started = True
#
# Create and define queue gaussian
#
create queue gaussian
set queue gaussian queue_type = Execution
set queue gaussian Priority = 20
set queue gaussian acl_host_enable = False
set queue gaussian acl_hosts = node-B304-S022-Laptop
set queue gaussian acl_hosts += gaussian
set queue gaussian acl_hosts += sissapp089b
set queue gaussian acl_hosts += sissapp018b
set queue gaussian acl_hosts += sissapp093b
set queue gaussian enabled = True
set queue gaussian started = True
#
# Create and define queue mopac
#
create queue mopac
set queue mopac queue_type = Execution
set queue mopac Priority = 5
set queue mopac acl_host_enable = False
set queue mopac acl_hosts = box
set queue mopac acl_hosts += node-B304-S022-Laptop
set queue mopac acl_hosts += node-B104-expl
set queue mopac acl_hosts += head
set queue mopac enabled = True
set queue mopac started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = admin@head
set server managers += kiefferj@head
set server operators = admin@head
set server operators += kiefferj@head
set server default_queue = workq
set server log_events = 3
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server scheduler_iteration = 20
set server node_ping_rate = 60
set server node_check_rate = 30
set server tcp_timeout = 6
set server log_level = 1
set server pbs_version = 2.1.6
A small remark: the host ACL of a queue (set queue mopac acl_host_enable = False together with the acl_hosts entries) actually controls which hosts are allowed to SUBMIT a job to that queue, not where the job is executed. Node properties are used instead for that purpose, and they work better.
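To illustrate the property-based approach, a job can ask for a node carrying a given property through its `-l nodes=` resource request. The sketch below writes such a job header. The property name `mopac` and the helper `job_header` are hypothetical, and the example assumes that the property has already been assigned to the relevant nodes on the server side; none of this is taken from the configuration above.

```python
def job_header(prop, nodes=1, mem=None, shell="/bin/sh"):
    """Return the first lines of a PBS job script requesting nodes with `prop`.

    Hypothetical helper; `prop` must match a property defined for the nodes.
    """
    lines = [
        "#!%s" % shell,
        # Ask for `nodes` node(s) that carry the given property.
        "#PBS -l nodes=%d:%s" % (nodes, prop),
    ]
    if mem:
        lines.append("#PBS -l mem=%s" % mem)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Show the header that would start a job restricted to "mopac" nodes.
    print(job_header("mopac", mem="100mb"))
```

The returned text could replace the hard-coded header strings in the submission scripts above, so that the target nodes are chosen by property rather than by queue ACL.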