Geres 1.5 Beta

Change Set: 23320
Released: Dec 31, 2013
Updated: Jan 14, 2014 by gmarchetti
Dev status: Beta

Release Notes

GeReS 1.5 Beta Release Notes

Generic Resource Scheduler 1.5 Beta

The Generic Resource Scheduler 1.5 Beta release is ready.
Major changes in this release are:
  • It has been completely rewritten in C# to take advantage of new Azure SDK 2.2 features (management API, OnMessage event mechanism) that are not available from Python, because the Python SDK depends on older releases of the Azure libraries. The rewrite limits us to Windows worker VMs.
  • It uses the .NET service management API (currently in preview) rather than the REST API. Although this API is still incomplete, it is much simpler than the REST one. It also removes the need to spawn PowerShell processes, although PowerShell is still more flexible than the .NET API.
  • It solves the problem of stopped but not deallocated instances. In this release, worker VMs use an “idle” queue to notify the autoscaler. The autoscaler removes idle VMs instead of just shutting them down. This reduces both running and storage costs. It also removes the need for a management certificate in the worker VMs, as they do not perform any service management operation.
  • It implements three task queues for prioritization: high, medium and low. Worker nodes pick tasks from the queues in that order.
  • It provides a set of command line utilities that can be used in custom scripts to submit jobs, query their status and cancel them.
  • It uses a Service Bus eventing mechanism to provide notifications of job status changes. The sample notifier application shows how simple it is to take advantage of it.
  • The job message format is now {<job id> <executable> <parameters>}. There is no need to provide a “time submitted” field, as the .NET API exposes the message insertion time.
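The job message format and the queue priority order described above can be illustrated with a short sketch. It is written in Python for brevity and is not the C# code shipped with this release. The in-memory queues, the names Job, parse_job_message and next_job, and the sample job ids and executables are all hypothetical. It also assumes that the three fields are separated by whitespace and that the executable name contains no spaces.

```python
from collections import deque
from typing import NamedTuple, Optional


class Job(NamedTuple):
    job_id: str
    executable: str
    parameters: str


def parse_job_message(body: str) -> Job:
    """Split '<job id> <executable> <parameters>' into its three fields."""
    text = body.strip()
    # Tolerate the surrounding braces shown in the release notes.
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1].strip()
    # Split on whitespace at most twice so parameters keep their own spaces.
    parts = text.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected at least a job id and an executable")
    parameters = parts[2] if len(parts) == 3 else ""
    return Job(parts[0], parts[1], parameters)


# Hypothetical in-memory stand-ins for the three task queues.
PRIORITIES = ("high", "medium", "low")


def next_job(queues: dict) -> Optional[Job]:
    """Return the first job found, checking high, then medium, then low."""
    for name in PRIORITIES:
        if queues[name]:
            return parse_job_message(queues[name].popleft())
    return None


queues = {name: deque() for name in PRIORITIES}
queues["low"].append("{job-1 cleanup.exe --all}")
queues["high"].append("{job-2 render.exe scene.dat 640 480}")

first = next_job(queues)   # taken from the high queue
second = next_job(queues)  # taken from the low queue
third = next_job(queues)   # all queues are empty
```

Although job-1 was submitted first, job-2 is returned first because the high queue is always checked before the others.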
A few things have not changed:
  • The software still uses VMs rather than worker roles. This is to maintain compatibility with any Windows application. V2 will provide a PaaS option.
  • We still assume that 1 job contains 1 task to be executed on 1 processor, hence we deploy small VMs.
  • The only PaaS role is the autoscaler. It operates without user input and runs nothing but the autoscaling component.
  • There is still no UI other than the command line environment. Work on several UI options is part of the v2 effort.
  • There is still no identity management or authentication. This is supposed to be handled separately. The agents and tasks will run with local user rights. V2 may offer some options to handle OAuth / SAML 2.
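The idle queue mechanism, combined with the assumption that one job needs one small VM, can be sketched as a single autoscaler pass. This is an illustration in Python, not the autoscaler's actual C# code. The function name plan_scaling, the idea that each idle message carries a VM name, the max_vms cap and the deliberately naive sizing policy are all assumptions.

```python
def plan_scaling(pending_jobs, running_vms, idle_messages, max_vms):
    """Return (vms_to_remove, number_of_vms_to_add) for one autoscaler pass.

    pending_jobs  -- number of jobs waiting in the task queues
    running_vms   -- set of VM names currently deployed
    idle_messages -- VM names read from the idle queue (assumed message content)
    max_vms       -- hypothetical upper bound on the deployment size
    """
    # Ignore duplicate notifications and names that are no longer deployed.
    idle = sorted({vm for vm in idle_messages if vm in running_vms})

    # Idle VMs are removed (deleted), not merely stopped, so they no longer
    # incur running or storage costs.
    remaining = len(running_vms) - len(idle)

    # One job per VM: add one VM per pending job, up to the cap.
    to_add = max(0, min(pending_jobs, max_vms - remaining))
    return idle, to_add


remove, add = plan_scaling(
    pending_jobs=3,
    running_vms={"vm1", "vm2", "vm3"},
    idle_messages=["vm2", "vm9", "vm2"],
    max_vms=4,
)
```

In this example vm2 is scheduled for removal and two VMs are added, because the cap of four leaves room for only two more once vm1 and vm3 are counted. A real policy would probably keep an idle VM alive when jobs are already waiting; the sketch does not attempt that.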

Installation Instructions

The package has 3 main components:
  1. The command line utilities
  2. The VM agent
  3. The autoscaler
The first 2 are console applications; the last one is a PaaS application to be run in a worker role.
They depend on the Azure SDK 2.2 for .NET and on the preview Azure Management libraries for .NET, which you can retrieve via NuGet.
You will need to create:
  • a storage account
  • a Service Bus namespace
You will need to enter your:
  • Service Bus namespace and shared secret key
  • storage account name and key
  • connection strings
  • administrator user name and password for the VMs
where indicated in the source files before compiling.
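As a reminder of what the connection strings usually look like, the sketch below assembles them from the account names and keys listed above. It is written in Python purely for illustration; the sources themselves are C#. The function names and the sample values are hypothetical, and the Service Bus form shown is the shared secret (issuer and key) style that matches the "shared secret key" mentioned above, with "owner" assumed as the issuer. Check the strings against what the Azure portal displays for your accounts.

```python
def storage_connection_string(account_name: str, account_key: str) -> str:
    """Build a storage account connection string from its name and key."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};AccountKey={account_key}"
    )


def service_bus_connection_string(namespace: str, secret: str,
                                  issuer: str = "owner") -> str:
    """Build a shared-secret Service Bus connection string.

    The issuer name 'owner' is an assumption; use the one shown for
    your namespace.
    """
    return (
        f"Endpoint=sb://{namespace}.servicebus.windows.net/;"
        f"SharedSecretIssuer={issuer};SharedSecretValue={secret}"
    )


# Placeholder values only; substitute your own before compiling.
storage = storage_connection_string("mystorageaccount", "<storage key>")
bus = service_bus_connection_string("mynamespace", "<shared secret key>")
```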
The command line utilities require no installer: just compile the code and run the resulting executables. You can build an installable package with a setup wizard if you want.
To create compute node images that contain the VM agent, you’ll have to:
  1. Create a VM in Azure from one of the available Windows Server 2008 R2 or 2012 images.
  2. Optionally, create a local user account.
  3. Copy vmagent.exe and its dependencies (e.g. the entire bin\debug or bin\release directory) to a directory on that VM, e.g. c:\vmagent.
  4. Create a scheduled task to run vmagent.exe at boot under the credentials of your choice. We suggest either a local user or local service. Make sure that the task can run whether the user is logged in or not.
  5. Sysprep the virtual machine.
    1. Make sure that you delete or rename the existing unattend.xml in c:\windows\panther. This was created by Azure when deploying your virtual machine.
    2. Select the “OOBE”, “Generalize” and shutdown options.
    3. In the Azure portal, capture the virtual machine and name the resulting image, e.g. geres15.
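Steps 4 and 5 above can also be carried out from a command prompt on the VM. The sketch below only assembles the command lines so they can be reviewed before anything is run; it is written in Python for illustration and is not part of the package. The task name "vmagent", the backup file name unattend.xml.bak, the default sysprep.exe location and the choice of the Local Service account are assumptions. Copying the agent files and capturing the image in the portal are not covered.

```python
AGENT_DIR = r"c:\vmagent"                     # directory used in step 3
UNATTEND = r"c:\windows\panther\unattend.xml"  # file created by Azure
SYSPREP = r"c:\windows\system32\sysprep\sysprep.exe"  # assumed default location


def image_prep_commands(agent_dir=AGENT_DIR,
                        run_as="NT AUTHORITY\\LOCALSERVICE"):
    """Return the command lines for the scheduled task and for sysprep."""
    return [
        # Step 4: run vmagent.exe at boot, whether or not a user is logged in.
        ["schtasks", "/Create", "/TN", "vmagent",
         "/TR", agent_dir + r"\vmagent.exe",
         "/SC", "ONSTART", "/RU", run_as],
        # Step 5.1: rename the unattend.xml left behind by the Azure deployment.
        ["cmd", "/c", "ren", UNATTEND, "unattend.xml.bak"],
        # Step 5.2: generalize, boot to OOBE next time, then shut down.
        [SYSPREP, "/generalize", "/oobe", "/shutdown"],
    ]


commands = image_prep_commands()
for command in commands:
    # Print for review; nothing is executed by this sketch.
    print(" ".join(command))
```

On the VM itself, each list could be passed to subprocess.run(command, check=True) from an elevated prompt, or the printed lines could simply be typed in by hand. The sysprep command shuts the machine down, so it must come last.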
The autoscaler can be deployed directly from Visual Studio, or you can create a .cspkg file, upload it to a blob, and then deploy it using PowerShell, System Center or other tools.
We advise deploying the autoscaler in a different service from the compute nodes, or at least in a different role within the same service. Note that the service is not deleted when the last VM in the compute node deployment is removed.

Observations

This is not production-quality software, nor is it intended to be. It is an example of what can be done with Azure to implement a job management tool.
In particular, there is no provision to authenticate users or check the processes they spawn.
We are considering a few more ideas for further development:
  • An option to request the size and number of VMs to be deployed in the job message.
  • An option to have more than 1 task executed by the same node. This makes sense with multi-core VMs.
  • An option to deploy MPI clusters in one go, e.g. if the job is tagged “MPI”, deploy all requested nodes from an image containing an MPI toolkit.
  • A simple web frontend for user authentication and job submission.
  • The use of the Windows Azure Scheduler API and, when available, of the Azure autoscaling API.
