Defining Job Classes as Templates for Jobs

Engineers working on Univa Grid Engine use an automated test environment during development and during the test phase before a new version is released. This test suite automatically installs test clusters and runs several thousand test scenarios to detect whether any Univa Grid Engine functionality is broken.

For each bug fixed in the Univa Grid Engine code base, a new test scenario is added to the test suite. When new functionality is implemented, engineers also have to create new tests to keep the test coverage high.

Today I finished the 91st test scenario that tests job class functionality. Some of these new tests ensure that the ownership of a job class is respected. Others validate the code that should prevent users from deriving new jobs from job classes that do not allow this. Ownership and accessibility are two characteristics that have to be defined when new job classes are created in Univa Grid Engine 8.1.

Defining Job Classes

A job class is a new object type in Univa Grid Engine 8.1. Objects of this type can be defined by managers and also by users of a Univa Grid Engine cluster to prepare templates for jobs. Those objects can later be used to create jobs.

Like other configuration objects in Univa Grid Engine, each job class is defined by a set of configuration attributes. This set can be divided into two categories. The first category contains the attributes that define the job class itself. The second contains the attributes that form the template, which is eventually instantiated into new jobs.

Attributes describing a Job Class

The following attributes describe characteristics of a job class:

jcname

The jcname attribute defines a name that uniquely identifies a job class.

There is one particular job class with the special name template. It acts as a template for all other job classes, and its configuration can only be adjusted by users having the manager role in Univa Grid Engine. This gives manager accounts control over default settings, some of which can also be set so that they cannot be changed.

variant_list

Job classes may, for instance, represent an application type in a cluster. If the same application should be started with different settings in one cluster, or if the resource selection applied by the Univa Grid Engine system should depend on the mode in which the application is executed, then it is possible to define one job class with multiple variants. A job class variant can be seen as a copy of a job class that differs from the original only in some aspects.

The variant_list job class attribute defines the names of all existing job class variants. If the keyword NONE is used, or if the list contains only the word default, then the job class has only one variant. If multiple comma-separated names are listed, then the job class has multiple variants. The default variant always has to exist. If the variant_list attribute does not contain the word default, then the Univa Grid Engine system adds it automatically.

Other commands that require a reference to a job class can either use the jcname to refer to the default variant of a job class, or they can reference a different variant by combining the jcname with the name of a specific variant. Both names have to be separated by a dot (.) character.
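The variant rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not code from Univa Grid Engine. The function names and the variant names short and long are made up, and placing default first when it is added automatically is my assumption.

```python
def parse_variant_list(value):
    """Return the variant names described by a variant_list value.

    NONE means that only the default variant exists. If 'default' is not
    listed it is added; putting it first is an assumption of this sketch.
    """
    if value.strip() == "NONE":
        names = []
    else:
        names = [n.strip() for n in value.split(",") if n.strip()]
    if "default" not in names:
        names.insert(0, "default")
    return names


def split_reference(reference):
    """Split 'jcname' or 'jcname.variant' into (jcname, variant)."""
    jcname, _, variant = reference.partition(".")
    return jcname, variant or "default"


# Hypothetical variant names, for illustration only.
variants = parse_variant_list("short, long")
reference = split_reference("sleeper.short")
```

A plain jcname such as sleeper resolves to the default variant, while sleeper.short selects the variant named short.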

owner_list

The owner_list attribute denotes the ownership of a job class. By default the user that creates a job class will be the owner. Only this user and all managers are allowed to modify or delete the job class object. Managers and owners can also add further user names to this list to give these users modify and delete permissions. If a manager creates a job class then the owner_list will be NONE to express that only managers are allowed to modify or delete the corresponding job class. Even if a job class is owned only by managers it can still be used to create new jobs. The right to derive new jobs from a job class can be restricted with the user_list and xuser_list attributes explained below.
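As a rough sketch, the ownership rule boils down to a single check. The function name may_modify and the user names are hypothetical, and I assume here that owner_list holds comma-separated user names, which the description above does not state explicitly.

```python
def may_modify(user, owner_list, managers):
    """Return True if 'user' may modify or delete the job class.

    Managers are always allowed. Other users need to be named in
    owner_list. Assumption: owner_list is comma separated or NONE.
    """
    if user in managers:
        return True
    if owner_list.strip() == "NONE":
        return False
    owners = {name.strip() for name in owner_list.split(",")}
    return user in owners


# Hypothetical users, for illustration only.
managers = {"admin"}
owner_may_modify = may_modify("alice", "alice,bob", managers)
other_may_modify = may_modify("carol", "alice,bob", managers)
```

Note that this check covers only modifying and deleting the job class. Deriving jobs from it is governed by a separate rule.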

user_list

The user_list job class parameter contains a comma-separated list of Univa Grid Engine user access list names or user names. User names have to be prefixed with a percent character (%). Each user referenced in the user_list, and each user in at least one of the listed access lists, has the right to derive new jobs from this job class using the -jc switch of one of the submit commands. If the user_list parameter is set to NONE (the default), any user can use the job class to create new jobs unless access is explicitly excluded via the xuser_list parameter described below. If a user is covered by both xuser_list and user_list, the user is denied access to the job class.

xuser_list

The xuser_list job class parameter contains a comma-separated list of Univa Grid Engine user access list names or user names. User names have to be prefixed with a percent character (%). Each user referenced in the xuser_list, and each user in at least one of the listed access lists, is not allowed to derive new jobs from this job class. If the xuser_list parameter is set to NONE (the default), no user is excluded. If a user is covered by both xuser_list and user_list, the user is denied access to the job class.
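To make the interplay of user_list and xuser_list more concrete, here is a small Python sketch of the decision as described above. It is not taken from the Univa Grid Engine sources. The function names, the access list developers and the user names are hypothetical, and any special treatment of managers is left out.

```python
def parse_access_value(value):
    """Split a user_list or xuser_list value into (user names, access list names)."""
    users, access_lists = set(), set()
    if value.strip() != "NONE":
        for entry in value.split(","):
            entry = entry.strip()
            if entry.startswith("%"):
                users.add(entry[1:])      # '%' marks a plain user name
            elif entry:
                access_lists.add(entry)   # everything else names an access list
    return users, access_lists


def is_listed(user, value, acl_members):
    """True if 'user' is named directly or belongs to a listed access list."""
    users, access_lists = parse_access_value(value)
    return user in users or any(
        user in acl_members.get(name, set()) for name in access_lists
    )


def may_derive_job(user, user_list, xuser_list, acl_members):
    """Decide whether 'user' may derive jobs from the job class."""
    if is_listed(user, xuser_list, acl_members):
        return False                      # exclusion always wins
    if user_list.strip() == "NONE":
        return True                       # no restriction configured
    return is_listed(user, user_list, acl_members)


# Hypothetical access list and users, for illustration only.
acl_members = {"developers": {"alice", "bob"}}
bob_allowed = may_derive_job("bob", "developers", "%bob", acl_members)
```

In this example bob is a member of developers but is also named in xuser_list, so he is denied access.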

Example: Job Classes - Identity, Ownership, Access

Here is an example of the first part of a sleeper job class. It will be extended later to illustrate the use of job classes.

   jcname        sleeper
   variant_list  NONE
   owner_list    NONE 
   user_list     NONE
   xuser_list    NONE
   ...

sleeper is the unique name that identifies the job class (jcname sleeper). This job class defines only the default variant because no other variant names are specified (variant_list NONE). The job class does not specify an owner (owner_list NONE). As a result it can only be changed or deleted by users having the manager role. Managers and all other users are allowed to derive new jobs from this job class because creating new jobs is not restricted (user_list NONE; xuser_list NONE).

Attributes to Form a Job Template

In addition to the attributes mentioned previously, each job class has a set of attributes that form a job template. In most cases the names of those additional attributes correspond to the names of command line switches of the qsub command. The value of each of these attributes is either the keyword UNSPECIFIED or the same value that would be passed with the corresponding qsub command line switch.

All these additional job template attributes will be evaluated to form a virtual command line when a job class is used to instantiate a new job. All attributes for which the corresponding value contains the UNSPECIFIED keyword will be ignored whereas all others define the submit arguments for the new job that will be created.
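The idea of the virtual command line can be sketched as follows. This is a simplified illustration and not the real implementation. It assumes that every attribute name maps directly to the qsub switch of the same name, which holds only in most cases, and it passes the job command separately because I have not yet named the attribute that holds it. The function name is hypothetical.

```python
import shlex


def virtual_command_line(template, command):
    """Build the qsub arguments that a job template stands for.

    Attributes set to UNSPECIFIED are skipped. Assumption: the attribute
    name equals the qsub switch name (e.g. attribute 'N' becomes '-N').
    """
    args = []
    for name, value in template.items():
        if value == "UNSPECIFIED":
            continue
        args.append("-" + name)
        args.extend(shlex.split(value))
    return args + list(command)


# Template values chosen to mirror a sleeper-style job; 'q' is left unset.
template = {"S": "/bin/sh", "N": "Sleeper", "b": "y", "q": "UNSPECIFIED"}
arguments = virtual_command_line(template, ["/bin/sleep"])
```

The resulting list corresponds to the arguments -S /bin/sh -N Sleeper -b y /bin/sleep, and the unset q attribute contributes nothing.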

All template attributes can be divided into two groups. There are template attributes that accept simple attribute values (like a character sequence, a number or the value yes or no), and there are template attributes that accept a list of values or a list of key/value pairs (like the list of resource requests a job has or the list of queues where a job might get executed).

Instead of listing all template attributes here I will continue with an example. A description of all available template attributes can be found in the documentation of Univa Grid Engine 8.1. The default values that are used when the keyword UNSPECIFIED appears in a job class definition are documented there as well.

Example: Job Classes - With Job Template Parameters

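The second version of the sleeper job class adds job template attributes to the default variant. Its template values match the arguments of the second qsub call below, so the following two submissions create equivalent jobs: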

   > qsub -jc sleeper
   Your job 4097 ("Sleeper") has been submitted

   > qsub -S /bin/sh -N Sleeper -b y /bin/sleep
   Your job 4098 ("Sleeper") has been submitted

Job 4097 is derived from a job class whereas job 4098 is submitted conventionally. The parameters specified in the sleeper job class are identical to the command line arguments that are passed to the qsub command for job 4098. As a result both jobs are identical. Both use the same shell and job command and therefore they will sleep for 60 seconds after start. The only differences between the two jobs are the submit time and the job id. Users that try to change the jobs after they have been submitted will encounter one additional difference: it is not allowed to change the specification of job 4097. The reason for this will be explained soon...
