Scheduler

License Management with Univa License Orchestrator (2/2)

The following blog post gives an overview of the Univa License Orchestrator architecture and of how Univa Grid Engine clusters can be connected to the Univa License Orchestrator.


Core and Memory Binding of Jobs in Univa Grid Engine 8.1

Execution nodes in Grid Engine clusters usually have multiple sockets and multiple cores with a hierarchy of different caches. If it is handled correctly, this hardware architecture improves the performance of jobs and therefore the overall throughput of a cluster.

Univa Grid Engine is not only aware of the underlying hardware architecture of compute resources. It also provides the necessary semantics to give managers and users of a cluster full control over where jobs are executed and how they are handled.

Univa Grid Engine 8.1 is especially powerful in this respect. In this version the scheduler component is fully responsible for socket and core selection, which makes it possible to guarantee that core binding requests are honored. This was not the case in UGE 8.0, and it is still not the case in other available Grid Engine versions.

The scheduler is also aware of the memory allocation capabilities of the underlying hardware. As a result, particular memory allocation strategies can be selected so that jobs and the applications they run get faster access to available memory. This feature is also new in UGE 8.1.


How to Schedule GPU Resources

Assume the following scenario: You have a cluster of machines where some machines have a number of GPU cards attached. There are jobs that require one or multiple GPU cards for their calculation. Each GPU performs best when it is used by only one job at any point in time. How can a Grid Engine cluster be set up so that each job is told which GPU(s) it should use?
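
To make the "each job is told which GPU(s)" part concrete, here is a minimal sketch of the job side. It assumes that the scheduler hands the job a space-separated list of granted GPU ids in an environment variable. The variable name GRANTED_GPUS and the helper gpu_list_to_csv are hypothetical and only used for illustration; the actual mechanism depends on how the cluster is configured. CUDA_VISIBLE_DEVICES is the variable the CUDA runtime reads to restrict a process to certain devices.

```shell
#!/bin/sh
# gpu_list_to_csv converts a space-separated list of GPU ids ("0 2")
# into the comma-separated form ("0,2") used by CUDA_VISIBLE_DEVICES.
gpu_list_to_csv() {
    # $1 is left unquoted on purpose so that it is split on whitespace.
    set -- $1
    # Join the ids with commas inside a subshell so IFS is not changed globally.
    ( IFS=,; echo "$*" )
}

# GRANTED_GPUS is a hypothetical stand-in for whatever variable the
# scheduler exports to the job with the ids of the GPUs it was granted.
if [ -n "${GRANTED_GPUS:-}" ]; then
    CUDA_VISIBLE_DEVICES=$(gpu_list_to_csv "$GRANTED_GPUS")
    export CUDA_VISIBLE_DEVICES
fi
```

With such a wrapper at the top of a job script, an application started afterwards would only see the GPUs that were assigned to this job, provided the scheduler really guarantees exclusive assignment.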


How to Limit the Number of Slots for Parallel Jobs Depending on the Selected Parallel Environment

Assume you have a 128-slot Grid Engine cluster and two different parallel environments (PEs) named mpi_small and mpi_large.

> qconf -sp mpi_small
pe_name mpi_small
slots 128
…

> qconf -sp mpi_large
pe_name mpi_large
slots 128
...

How can a slot limit per job be defined depending on the PE that is chosen, while the overall number of slots for each PE type stays unlimited? Let's say you want to limit mpi_small jobs to a maximum of 32 slots per job and mpi_large jobs to a maximum of 64 slots per job. How can this be achieved?
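
The requirement itself can be written down as a small check. The following is only a sketch of the rule from the scenario (32 slots for mpi_small, 64 for mpi_large), not necessarily the solution the full post describes. The function names max_slots_for_pe and check_pe_request are hypothetical; in a real cluster a check like this would have to be hooked into job submission, for example through a job submission verifier script.

```shell
#!/bin/sh
# max_slots_for_pe prints the per-job slot limit for a PE name,
# or an empty string if no per-job limit applies to that PE.
# The limits are the example values from the scenario above.
max_slots_for_pe() {
    case "$1" in
        mpi_small) echo 32 ;;
        mpi_large) echo 64 ;;
        *)         echo "" ;;
    esac
}

# check_pe_request PE_NAME REQUESTED_SLOTS
# Returns 0 if the request is within the per-job limit, 1 otherwise.
check_pe_request() {
    pe_name=$1
    requested=$2
    limit=$(max_slots_for_pe "$pe_name")
    # PEs without a configured per-job limit are always accepted.
    [ -z "$limit" ] && return 0
    [ "$requested" -le "$limit" ]
}
```

For example, `check_pe_request mpi_small 48 || echo "rejected"` prints `rejected`, while the same request against mpi_large passes. The PE-wide slots value of 128 is untouched by this check, so several jobs can still share the PE.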


Integration of Job Classes into the Existing System

Extensive use of job classes has a positive impact on throughput in Univa Grid Engine 8.1 clusters. The reason is that job classes have been fully integrated into the core components of the system. For instance, the scheduling component can easily distinguish the different types of workloads. The scheduler algorithm that is responsible for finding resources for a job has also been improved. Details are explained below.
