Job Schedulers


FastX lets admins fully customize how a session is launched from the GUI, so custom scripts and job scheduling tools can be integrated easily. FastX ships several reference implementations and libraries to help admins integrate their own infrastructure.

Pending Sessions

When a FastX session is launched, there may be a delay between the launch request and the session actually starting. For example, a Slurm job scheduler may have to wait for a compute node to become available before it can process the job. During this stage, the session is “pending”: it is in the database, but it does not contain the started parameter. Custom schedulers can implement session actions that manipulate pending sessions.
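As a minimal sketch of the distinction described above, the following Python separates pending sessions from running ones by checking for the started parameter. It assumes session records are available as plain dictionaries and that a missing or empty started value means the session is pending; the helper names are hypothetical and not part of FastX.

```python
def is_pending(session):
    """Return True if the session record has no 'started' value yet.

    Assumption: a missing or None 'started' entry marks a pending session.
    """
    return session.get("started") is None


def split_sessions(sessions):
    """Split a list of session records into (pending, running)."""
    pending = [s for s in sessions if is_pending(s)]
    running = [s for s in sessions if not is_pending(s)]
    return pending, running
```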

Common actions are terminate (cancelling a job) and logs (writing custom logs to the job).
Any of the session actions can be implemented. The only caveat is that the action must return the same object as it would for a running session.

Example scripts with inline documentation are found in the /usr/lib/fastx/4/integration/schedule/test directory.

Running Sessions

Once a session has started and is running, it connects to the FastX web server and updates the database. At this point the scheduler's work is complete, and the FastX session actions send messages directly to the session itself. Schedule scripts are no longer used once the session is running.

Default Scheduler

The default scheduler is located at /usr/lib/fastx/4/integration/schedule/default
This scheduler replaces the method of starting sessions used in previous versions. It can also be used as a reference for creating custom schedulers.

scheduleDir

To enable a custom job scheduler, add a scheduleDir option with the path to your custom schedule directory in your app.

To set a default scheduler, add scheduleDir to apps/default.ini.

Scheduling Scripts

When launching a session with a custom scheduler, FastX searches the scheduleDir directory for well-defined scripts to execute. These scripts are documented here for reference. See the /usr/lib/fastx/4/integration/schedule/test directory for more documentation.

metadata.ini

This file contains metadata that the schedule scripts will use.

# display name
name=
# display description
description=
# use a different scheduleDir as the parent.
# the job scheduler will check all the parents for the first script it finds with the proper name
parent=

#custom metadata for each script
[start]
# command line arguments to pass to the script
args[]=1
args[]=2
args[]=3
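
The parent lookup described in the comments above can be sketched as follows. This is an illustration, not FastX's own resolver: it assumes that parent= holds a path to another schedule directory, that each directory's metadata.ini is read for its own parent, and that lookup stops at the first directory containing a file with the requested script name. The function names are hypothetical.

```python
import os


def read_parent(schedule_dir):
    """Return the top-level parent= value from metadata.ini, or None."""
    path = os.path.join(schedule_dir, "metadata.ini")
    if not os.path.isfile(path):
        return None
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if line.startswith("["):
                break  # top-level keys end at the first [section]
            if line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "parent":
                return value.strip() or None
    return None


def find_script(schedule_dir, name):
    """Walk the parent chain and return the first matching script path."""
    seen = set()
    current = schedule_dir
    while current and current not in seen:
        seen.add(current)  # guard against parent loops
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate):
            return candidate
        current = read_parent(current)
    return None
```

With this layout, a schedule directory only needs to contain the scripts it overrides; everything else falls through to its parent.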

preprocess

The preprocess script returns the forms and also performs load balancing. It lets you completely change the data that is sent to the start script. In most cases, this script does not need to be changed from the default.

input

{
  nodes: [] // array of nodes for load balancing
  me: {} // user data object of the user who is making the call
  start: {} // start data
  nodeID: "string" // nodeID of the node running the preprocessor
  loadBalancer: "string" // name of the builtin load balancer to use
  loadBalancerScript: "string" // name of the custom load balancer script to execute
  forms: [] // array of forms
}

output

The preprocess script should return a JSON result object with either a “form” stage or a “data” stage.

The form stage tells the client to display the HTML form to gather more information.

result: {
   stage: "form",
   form: "html of the form as a string",
   originalFormData: { } // start object that was sent to the form
}

The data stage tells the server to pass the result.start object to the start script.

The result.start object should contain a nodeID parameter, which tells the web server which node to run the start command on.

result: {
   stage: "data",
   start: {} //data to pass to the start script. must contain a nodeID param
}
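
A minimal sketch of the two stages, written as a Python function that maps the input object to a result object. How the input reaches the script and how the result is returned are not covered here. The queue field and the form HTML are hypothetical, and falling back to the preprocessor's own nodeID is an assumption made for the example rather than FastX's load-balancing behavior.

```python
# Hypothetical form asking for a "queue" value; not a FastX-supplied form.
FORM_HTML = '<form><input name="queue" placeholder="Queue name"></form>'


def preprocess(payload):
    """Return a form stage until 'queue' is known, then a data stage."""
    start = dict(payload.get("start") or {})

    if "queue" not in start:
        # Ask the client to gather more information from the user.
        return {
            "result": {
                "stage": "form",
                "form": FORM_HTML,
                "originalFormData": start,
            }
        }

    # Assumption: run on the node that executed the preprocessor
    # unless a nodeID was already chosen.
    start.setdefault("nodeID", payload.get("nodeID"))
    return {"result": {"stage": "data", "start": start}}
```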

start

The start script is the main script that is used to create custom job schedulers. In previous versions, FastX had a global variable named LAUNCHER_SCRIPT which was used as the launcher script.

FastX extends the start script so that each individual app can run its own launcher. This way you can have a heterogeneous cluster (some sessions start immediately, some may use Slurm, others start Virtual Machines …)

The typical method for start is to fork a process and then execute /usr/lib/fastx/4/scripts/start --json=0, passing the input['start'] data to the forked process.
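
The fork-and-execute pattern can be sketched in Python as shown here. The --json=0 flag and the defaultStartScript field come from this page; sending the start data to the child as JSON on standard input is an assumption, so check the example scripts in the test directory for the actual hand-off. The function name launch_start is hypothetical.

```python
import json
import subprocess


def launch_start(payload, args=("--json=0",)):
    """Spawn the default start script and hand it the start object.

    Assumption: the start data is written to the child's stdin as JSON.
    """
    cmd = [payload["defaultStartScript"], *args]
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        start_new_session=True,  # detach from the scheduler script's session
    )
    proc.stdin.write(json.dumps(payload["start"]).encode())
    proc.stdin.close()
    return proc  # the caller decides whether to wait
```

Returning the process object instead of waiting lets a scheduler script reply to the web server immediately while the session starts in the background.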

input

{
 start: {} // Start Object. Pass to the fastx/4/scripts/start --json 0
 me: {} // user object of the current user who is executing the program
 nodeID: "string" // nodeID the script was launched from
 userData: {} // custom user data object
 defaultStartScript: "string" // path to fastx/4/scripts/start
 weblink: {} // weblink object to set on weblinks
 rootDir: "string" // base dir of fastx /usr/lib/fastx/4
 varDir: $FX_VAR_DIR
 configDir: $FX_CONFIG_DIR
 localDir: $FX_LOCAL_DIR
 tmpDir: $FX_TEMP_DIR
 uid: 1000 // user's uid
 gid: 1000 // user's gid
 homedir: "string" // user's $HOME directory
 shell: "/bin/bash" // user's $SHELL
 login: "user" // linux username
}

output

The output of this script tells the web server that the session has been scheduled. You can add a custom scheduling message, as well as a schedule object that is passed to the session data.

{
 result: {
   "output": [
       { "line": "This is stdout", "type": "stdout" },
       { "line": "This is stderr", "type": "stderr"}
   ],
   "stdout": "This is stdout as one string",
   "stderr": "This is stderr as one string",
   "message": "Test Start Message that the user can display",
   "schedule": { // this object will be passed to the session data to be used with other actions
      "jobId": 123,
      "param2": "some_string"
   }
 }
}
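
As a sketch, a start script could assemble this object from whatever its job submission command printed. The helper build_start_result is hypothetical. The field names follow the example above, and listing all stdout lines before all stderr lines is a simplification, since the original interleaving is not preserved.

```python
def build_start_result(stdout, stderr, message, schedule):
    """Build the start script's result object from captured output."""
    output = [{"line": line, "type": "stdout"} for line in stdout.splitlines()]
    output += [{"line": line, "type": "stderr"} for line in stderr.splitlines()]
    return {
        "result": {
            "output": output,
            "stdout": stdout,
            "stderr": stderr,
            "message": message,
            # Kept with the session data for later actions.
            "schedule": schedule,
        }
    }
```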

disconnect

This action disconnects clients from a session. Because a pending session is not yet running, this action is typically not used.

input

{
  session: {} // session data object
  me: {} // user data object of user who is executing
}

output

{ result: {} }

exec

This action is used to execute custom commands.

input

{
  session: {} // session data object
  me: {} // user data object of user who is executing
}

output

{ result: {} }

log

You can use this action to display custom logs to the user. Typically this is used for debugging.

input

{
  session: {} // session data object
  me: {} // user data object of user who is executing
}

output

{ result: {
"log1": "output string 1",
"log2": "these logs will be displayed to the user when they go to logs",
"log3": "log string can be anything"
} }

status

Pending jobs may fail before ever contacting the web server. In these instances, the pending sessions stay in the session list until they are manually purged. The status script is a way to periodically check on the job to verify that it is still active. Exit code 0 means the job is still active, and the web server does nothing. A nonzero exit code tells the web server to remove the pending session from the database.

Make sure to clean up any data in the status script.

input

{
  session: {} // session data object
  me: {} // user data object of user who is executing
}

output

{ result: { } }
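
A sketch of the exit-code logic for a Slurm-backed scheduler follows. It assumes the schedule object returned by the start script is available as session.schedule with a jobId field, and that a job is active when squeue -h -j prints a row for that job ID. Both are assumptions for illustration; the query command is a parameter so it can be swapped for another scheduler.

```python
import subprocess


def job_is_active(job_id, query=("squeue", "-h", "-j")):
    """Return True if the query command succeeds and lists the job."""
    proc = subprocess.run(
        [*query, str(job_id)], capture_output=True, text=True
    )
    return proc.returncode == 0 and proc.stdout.strip() != ""


def status_exit_code(payload, query=("squeue", "-h", "-j")):
    """Return 0 while the job is active, nonzero to purge the session."""
    # Assumption: the schedule object lives at session["schedule"].
    schedule = payload.get("session", {}).get("schedule") or {}
    job_id = schedule.get("jobId")
    if job_id is None:
        return 1
    return 0 if job_is_active(job_id, query) else 1
```

A real status script would finish by passing this value to sys.exit after doing any cleanup for jobs that are no longer active.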

terminate

This action should be used to manually cancel a job. Put any job cancellation code in the terminate script.

input

{
  session: {} // session data object
  me: {} // user data object of user who is executing
}

output

{ result: { 
"status" : "message you can send to the user to see"
} }
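
For a Slurm job, the cancellation code might look like the following sketch. It assumes the job ID is stored at session.schedule.jobId (an assumption about where the schedule object ends up) and uses scancel as the cancel command. The command is a parameter so that another scheduler's tool can be substituted.

```python
import subprocess


def terminate(payload, cancel=("scancel",)):
    """Cancel the scheduled job and return a status message for the user."""
    # Assumption: the schedule object lives at session["schedule"].
    schedule = payload.get("session", {}).get("schedule") or {}
    job_id = schedule.get("jobId")
    if job_id is None:
        return {"result": {"status": "No job ID recorded for this session"}}

    proc = subprocess.run(
        [*cancel, str(job_id)], capture_output=True, text=True
    )
    if proc.returncode == 0:
        status = f"Cancelled job {job_id}"
    else:
        status = f"Could not cancel job {job_id}: {proc.stderr.strip()}"
    return {"result": {"status": status}}
```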

Weblink

If WEBLINK_PORT is set in fastx.env or in the environment, the following object is available. The launcher script can use these options to launch weblinks.

{
   "logLevel": "debug",
   "url": WEBLINK_URL,
   "uri": WEBLINK_URI,
   "primary": primary_token,
   "login": "loginname"
}