Table of Contents
- 1. APIs
- 2. RSL Specification v1.1
- GRAM5 Commands
- globusrun - Execute and manage jobs via GRAM
- globus-job-cancel - Cancel a GRAM batch job
- globus-job-clean - Cancel and clean up a GRAM batch job
- globus-job-get-output - Retrieve the output and error streams from a GRAM job
- globus-job-run - Execute a job using GRAM
- globus-job-status - Check the status of a GRAM5 job
- globus-job-submit - Submit a batch job using GRAM
- globus-personal-gatekeeper - Manage a user's personal gatekeeper daemon
- globus-gram-audit - Load GRAM4 and GRAM5 audit records into a database
- globus-gatekeeper - Authorize and execute a grid service on behalf of a user
- globus-job-manager - Execute and monitor jobs
- globus-job-manager-event-generator - Create LRM-independent SEG files for the job manager to use
- globus-fork-starter - Start and monitor a fork job
- 3. Job description
- 4. Semantics and syntax of protocols
- A. Errors
- Glossary
Table 1.1. GRAM Client APIs
| Name | Purpose |
|---|---|
| | Low-level functions for processing GRAM protocol messages. Symbolic constants for RSL attributes, signals, and job states. |
| | Functions for submitting job requests, sending signals, and listening for job state updates. |
| | Functions for parsing and manipulating job specifications in the RSL language. |
This document specifies the existing RSL v1.0 implementation and interfaces as provided in the GT 5.0.0 release. It serves both as a reference and as introductory text.
The Globus Resource
Specification Language (RSL) provides a common interchange language to
describe resources. The various components of the Globus Resource
Management architecture manipulate RSL strings to perform their management
functions in cooperation with the other components in the system. The RSL
provides the skeletal syntax used to compose complicated resource
descriptions, and the various resource management components introduce
specific <ATTRIBUTE,VALUE>
pairings into this common structure. Each attribute in a resource
description serves as a parameter to control the behavior of one or more
components in the resource management
system.
The core element of the RSL syntax is the relation.
Relations associate an attribute name with a value, e.g. the relation
executable=a.out provides the name of an executable in a
resource request. There are two generative syntactic structures
in the RSL that are used to build more complicated resource descriptions
out of the basic relations: compound requests and
value sequences. In addition, the RSL
syntax includes a facility to both introduce and dereference string
substitution variables.
The simplest form of compound request, utilized by all resource management components, is the conjunct-request. The conjunct-request expresses a conjunction of simple relations or compound requests (like a boolean AND). The most common conjunct-request in Globus RSL strings is the combination of multiple relations such as executable name, node count, executable arguments, and output files for a basic GRAM job request. Similarly, the core RSL syntax includes a disjunct-request form to represent disjunctive relations (like a boolean OR). Currently, however, no resource management component utilizes the disjunct-request form.
The last form of compound request is the multi-request.
The multi-request expresses multiple parallel resources that
make up a resource description. The multi-request form differs
from the conjunction and disjunction in two ways: multi-requests introduce
new variable scope, meaning variables defined in one clause of a
multi-request are not visible to the other clauses, and multi-requests
introduce a non-reducible hierarchy to the resource
description. Whereas relations within a conjunct-request can be thought of
as constraints on the resource being described, the
subclauses of a
multi-request are best thought of as individual resource descriptions that
together constitute an abstract
resource collection; the same attributes may be
constrained in different ways in each
subclause without causing a logical contradiction. An example of a
contradiction would be to constrain the executable
attribute to be two conflicting values within a conjunction.
The simplest form of value in the RSL syntax is the string literal. When explicitly quoted, literals can contain any character, and many common literals that don't contain special characters can appear without quotes. Values can also be variable references, in which case the variable reference is in essence replaced with the string value defined for that variable. RSL descriptions can also express string-concatenation of values, especially useful to construct long strings out of several variable references. String concatenation is supported with both an explicit concatenation operator and implicit concatenation for many idiomatic constructions involving variable references and literals.
In addition to the simple value forms given above, the RSL syntax includes the value sequence to express ordered sets of values. The value sequence syntax is used primarily for defining variables and for providing the argument list for a program.
Each RSL string consists of a sequence of RSL tokens, whitespace, and comments. The RSL tokens are either special syntax or regular unquoted literals, where special syntax contains one or more of the following listed special characters and unquoted literals are made of sequences of characters excluding the special characters.
The complete set of special characters that cannot appear as part of an unquoted literal is:
- + (plus)
- & (ampersand)
- | (pipe)
- ( (left paren)
- ) (right paren)
- = (equal)
- < (left angle)
- > (right angle)
- ! (exclamation)
- " (double quote)
- ' (apostrophe)
- ^ (caret)
- # (pound)
- $ (dollar)
These characters can be used only in the special syntactic forms described in this specification, or as characters within quoted literals.
Quoted literals are introduced with the "
(double quote) or ' (single quote/apostrophe)
and consist of all the characters up to (but not including) the next solo
double or single quote, respectively. To escape a quote character within a
quoted literal, the appearance of the quote character twice in a row is
converted to a single instance of the character and the literal continues
until the next solo quote character. For any quoted literal, there is only
one possible escape sequence, e.g. within a literal delimited by the single
quote character only the single quote character uses the escape notation
and the double quote character can appear without escape.
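The quoting rule above can be sketched in Python as an encoder that produces an RSL literal from an arbitrary string. This is only an illustration: the function name quote_literal is hypothetical, and treating whitespace and the empty string as requiring quotes is an assumption rather than something the text states.

```python
# Special characters that cannot appear in an unquoted literal (listed above).
SPECIAL_CHARS = set('+&|()=<>!"\'^#$')


def quote_literal(value: str, quote: str = '"') -> str:
    """Return an RSL literal for value, adding quotes only when needed.

    Assumption: whitespace and the empty string also require quoting.
    """
    if quote not in ('"', "'"):
        raise ValueError("quote must be a double or single quote")
    needs_quotes = value == "" or any(
        ch in SPECIAL_CHARS or ch.isspace() for ch in value
    )
    if not needs_quotes:
        return value
    # Only the chosen delimiter is escaped, by doubling it; the other
    # quote character can appear as-is.
    return quote + value.replace(quote, quote * 2) + quote


print(quote_literal("a.out"))
print(quote_literal("arg 2"))
print(quote_literal('Welcome to "The Grid"'))
print(quote_literal('Welcome to "The Grid"', quote="'"))
```

Choosing the single-quote delimiter for a value that contains double quotes avoids the doubled-quote escapes entirely, which is often easier to read.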
Quoted literals can also be introduced with an alternate
user delimiter notation. User delimited literals are
introduced with the ^ (caret) character followed
immediately by a user-provided delimiter; the literal consists of all
the characters after the user's delimiter up to (but not including) the
next solo instance of the delimiter. The delimiter itself may be escaped
within the literal by providing two instances in a row, just as the regular
quote delimiters are escaped in regular quoted literals.
RSL string comments use a notation similar to comments in the C programming
language. Comments are introduced by the prefix (*.
Comments continue to the first
terminating suffix *) and cannot be nested. Comments are
stripped from the RSL string during processing and are syntactically
equivalent to whitespace.
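As a rough illustration of the comment rule, the following Python sketch replaces each comment with a single space while leaving quoted literals untouched. The function name strip_comments is hypothetical, and this is not the parser used by the Globus libraries; it only mirrors the behavior described in the text (no nesting, comments equivalent to whitespace).

```python
def strip_comments(rsl: str) -> str:
    """Replace each (* ... *) comment outside quoted literals with a space."""
    out = []
    i, n = 0, len(rsl)
    while i < n:
        ch = rsl[i]
        if ch in ('"', "'") or (ch == "^" and i + 1 < n):
            # Copy a quoted literal verbatim, honouring doubled delimiters.
            start = i
            if ch == "^":
                i += 1  # user-delimited literal: next character is the delimiter
            delim = rsl[i]
            i += 1
            while i < n:
                if rsl[i] == delim:
                    if i + 1 < n and rsl[i + 1] == delim:
                        i += 2  # doubled delimiter: escaped, literal continues
                        continue
                    i += 1  # solo delimiter: literal ends
                    break
                i += 1
            out.append(rsl[start:i])
        elif rsl.startswith("(*", i):
            # Comments run to the first "*)" and cannot be nested.
            end = rsl.find("*)", i + 2)
            if end < 0:
                raise ValueError("unterminated comment")
            out.append(" ")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_comments("(* a comment *) & (executable = a.out (* note *))"))
```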
Example 2.1. Quoted Literal Examples
Assign the value Hello. Welcome to "The Grid" to
the attribute arguments, using double-quote as the
delimiter and the escaping sequence.
arguments = "Hello. Welcome to ""The Grid"""
Assign the value Hello. Welcome to "The Grid" to
the attribute arguments using the single-quote delimiter.
arguments = 'Hello. Welcome to "The Grid"'
Assign the value Hello. Welcome to "The Grid" to
the attribute arguments using a user-defined quoting
character !.
arguments = ^!Hello. Welcome to "The Grid"!
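The three notations can be decoded with one small routine. The Python sketch below is a minimal illustration of the escaping rule, not the Globus RSL parser; the function name read_quoted_literal is hypothetical.

```python
from typing import Tuple


def read_quoted_literal(text: str, pos: int = 0) -> Tuple[str, int]:
    """Decode the quoted literal that starts at text[pos].

    Returns the decoded value and the index just past the closing delimiter.
    """
    if text[pos] == "^":
        pos += 1  # user-delimited form: the next character is the delimiter
    elif text[pos] not in ('"', "'"):
        raise ValueError("not a quoted literal")
    delim = text[pos]
    pos += 1
    chars = []
    while pos < len(text):
        ch = text[pos]
        if ch != delim:
            chars.append(ch)
            pos += 1
        elif text.startswith(delim * 2, pos):
            chars.append(delim)  # a doubled delimiter stands for one instance
            pos += 2
        else:
            return "".join(chars), pos + 1  # a solo delimiter ends the literal
    raise ValueError("unterminated quoted literal")


# All three notations decode to the same value.
for literal in (
    '"Hello. Welcome to ""The Grid"""',
    "'Hello. Welcome to \"The Grid\"'",
    '^!Hello. Welcome to "The Grid"!',
):
    value, _ = read_quoted_literal(literal)
    print(value)
```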
RSL strings can introduce and reference string variables. String
substitution variables are defined in a special relation using the
rsl_substitution attribute, and the definitions affect
variable references made in the same conjunct-request (or
disjunct-request), as well as references made within any multi-request
nested inside one of the clauses of the conjunction (or disjunction). Each
multi-request introduces a new variable scope for each subrequest, and
variable definitions do not escape the closest enclosing scope.
Within any given scope, variable definitions are processed left-to-right in the resource description. Outermost scopes are processed before inner scopes, and the definitions in inner scopes augment the inherited definitions with new and/or updated variable definitions.
Variable definitions and variable references are processed in a single pass, with each definition updating the environment prior to processing the next definition. The value provided in a variable definition may include a reference to a previously-defined variable. References to variables that are not yet provided with definitions in the standard RSL variable processing order are replaced with an empty literal string.
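The processing order described above can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: values are already tokenized into literal and variable-reference parts, and the names process_definitions and evaluate are hypothetical, not part of any Globus API.

```python
from typing import Dict, List, Optional, Tuple

Part = Tuple[str, str]  # ("lit", text) or ("var", variable name)
Definition = Tuple[str, List[Part]]


def evaluate(parts: List[Part], env: Dict[str, str]) -> str:
    """Concatenate parts; references to undefined variables become ''."""
    return "".join(
        text if kind == "lit" else env.get(text, "") for kind, text in parts
    )


def process_definitions(
    definitions: List[Definition],
    inherited: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Process definitions left to right in a single pass.

    An inner scope starts from a copy of the inherited environment, so its
    definitions augment or update the inherited ones without escaping.
    """
    env = dict(inherited or {})
    for name, parts in definitions:
        env[name] = evaluate(parts, env)  # sees only earlier definitions
    return env


outer = process_definitions([
    ("TOPDIR", [("lit", "/home/nobody")]),
    ("DATADIR", [("var", "TOPDIR"), ("lit", "/data")]),
    ("EARLY", [("var", "LATER")]),  # LATER is not defined yet
    ("LATER", [("lit", "x")]),
])
inner = process_definitions([("TOPDIR", [("lit", "/tmp")])], inherited=outer)
print(outer)
print(inner)
```

Copying the inherited environment for each inner scope is what keeps a definition inside one subrequest of a multi-request from being visible to its siblings.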
The RSL syntax is extensible because it defines structure without too many keywords. Each Globus resource management component introduces additional attributes to the set recognized by RSL-aware components, so it is difficult to provide a complete listing of attributes which might appear in a resource description. Resource management components are designed to utilize attributes they recognize and pass unrecognized relations through unchanged. This allows powerful compositions of different resource management functions.
The following listing summarizes the attribute names utilized by existing resource management components in the standard Globus release. Please see the individual component documentation for discussion of the attribute semantics.
arguments - The command line arguments for the executable. Use quotes if a space is required within a single argument.
count - The number of executions of the executable.
directory - Specifies the path of the directory the jobmanager will use as the default directory for the requested job.
dry_run - If dryrun = yes, then the jobmanager will not submit the job for execution and will return success.
environment - The environment variables that will be defined for the executable in addition to the default set that is given to the job by the jobmanager.
executable - The name of the executable file to run on the remote machine. If the value is a GASS URL, the file is transferred to the remote GASS cache before executing the job and removed after the job has terminated.
file_clean_up - Specifies a list of files which will be removed after the job is completed.
file_stage_in - Specifies a list of ("remote URL" "local file") pairs which indicate files to be staged to the nodes which will run the job.
file_stage_in_shared - Specifies a list of ("remote URL" "local file") pairs which indicate files to be staged into the cache. A symlink from the cache to the "local file" path will be made.
file_stage_out - Specifies a list of ("local file" "remote URL") pairs which indicate files to be staged from the job to a GASS-compatible file server.
gass_cache - Specifies a location that overrides the default GASS cache location.
gram_my_job - Obsolete and ignored.
host_count - Only applies to clusters of SMP computers, such as newer IBM SP systems. Defines the number of nodes ("pizza boxes") to distribute the "count" processes across.
job_type - Specifies how the jobmanager should start the job. Possible values are single (even if the count > 1, only start 1 process or thread), multiple (start count processes or threads), mpi (use the appropriate method, e.g. mpirun, to start a program compiled with a vendor-provided MPI library; the program is started with count nodes), and condor (start Condor jobs in the "condor" universe).
library_path - Specifies a list of paths to be appended to the system-specific library path environment variables.
max_cpu_time - Explicitly set the maximum cputime for a single execution of the executable. The value is given in minutes. The value will go through an atoi() conversion in order to get an integer. If the GRAM scheduler cannot set cputime, then an error will be returned.
max_memory - Explicitly set the maximum amount of memory for a single execution of the executable. The value is given in megabytes. The value will go through an atoi() conversion in order to get an integer. If the GRAM scheduler cannot set maxMemory, then an error will be returned.
max_time - The maximum walltime or cputime for a single execution of the executable. Walltime or cputime is selected by the GRAM scheduler being interfaced. The value is given in minutes. The value will go through an atoi() conversion in order to get an integer.
max_wall_time - Explicitly set the maximum walltime for a single execution of the executable. The value is given in minutes. The value will go through an atoi() conversion in order to get an integer. If the GRAM scheduler cannot set walltime, then an error will be returned.
min_memory - Explicitly set the minimum amount of memory for a single execution of the executable. The value is given in megabytes. The value will go through an atoi() conversion in order to get an integer. If the GRAM scheduler cannot set minMemory, then an error will be returned.
project - Target the job to be allocated to a project account as defined by the scheduler at the defined (remote) resource.
proxy_timeout - Obsolete and ignored. Now a job-manager-wide setting.
queue - Target the job to a queue (class) name as defined by the scheduler at the defined (remote) resource.
remote_io_url - Writes the given value (a URL base string) to a file, and adds the path to that file to the environment through the GLOBUS_REMOTE_IO_URL environment variable. If this is specified as part of a job restart RSL, the job manager will update the file's contents. This is intended for jobs that want to access files via GASS, but the URL of the GASS server has changed due to a GASS server restart.
restart - Start a new job manager, but instead of submitting a new job, start managing an existing job. The job manager will search for the job state file created by the original job manager. If it finds the file and successfully reads it, it will become the new manager of the job, sending callbacks on status and streaming stdout/err if appropriate. It will fail if it detects that the old jobmanager is still alive (via a timestamp in the state file). If stdout or stderr was being streamed over the network, new stdout and stderr attributes can be specified in the restart RSL and the jobmanager will stream to the new locations (useful when output is going to a GASS server started by the client that's listening on a dynamic port, and the client was restarted). The new job manager will return a new contact string that should be used to communicate with it. If a jobmanager is restarted multiple times, any of the previous contact strings can be given for the restart attribute.
rsl_substitution - Specifies a list of values which can be substituted into other RSL attributes' values through the $(SUBSTITUTION) mechanism.
save_state - Causes the jobmanager to save its job state information to a persistent file on disk. If the job manager exits or is suspended, the client can later start up a new job manager which can continue monitoring the job.
scratch_dir - Specifies the location in which to create a scratch subdirectory. A SCRATCH_DIRECTORY RSL substitution will be filled with the name of the directory which is created.
stderr - The name of the remote file to store the standard error from the job. If the value is a GASS URL, the standard error from the job is transferred dynamically during the execution of the job.
stderr_position - Specifies where in the file remote standard error streaming should be restarted from. Must be 0.
stdin - The name of the file to be used as standard input for the executable on the remote machine. If the value is a GASS URL, the file is transferred to the remote GASS cache before executing the job and removed after the job has terminated.
stdout - The name of the remote file to store the standard output from the job. If the value is a GASS URL, the standard output from the job is transferred dynamically during the execution of the job.
stdout_position - Specifies where in the file remote output streaming should be restarted from. Must be 0.
two_phase - Use a two-phase commit for job submission and completion. The job manager will respond to the initial job request with a WAITING_FOR_COMMIT error. It will then wait for a signal from the client before doing the actual job submission. The integer supplied is the number of seconds the job manager should wait before timing out. If the job manager times out before receiving the commit signal, or if a client issues a cancel signal, the job manager will clean up the job's files and exit, sending a callback with the job status as GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED. After the job manager sends a DONE or FAILED callback, it will wait for a commit signal from the client. If it receives one, it cleans up and exits as usual. If it times out and save_state was enabled, it will leave all of the job's files in place and exit (assuming the client is down and will attempt a job restart later). The timeout value can be extended via a signal. When one of the following errors occurs, the job manager does not delete the job state file when it exits: GLOBUS_GRAM_PROTOCOL_ERROR_COMMIT_TIMED_OUT, GLOBUS_GRAM_PROTOCOL_ERROR_TTL_EXPIRED, GLOBUS_GRAM_PROTOCOL_ERROR_JM_STOPPED, GLOBUS_GRAM_PROTOCOL_ERROR_USER_PROXY_EXPIRED. In these cases, it cannot be restarted, so the job manager will not wait for the commit signal after sending the FAILED callback.
username - Verify that the job is running as this user.
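Several of the limits above (max_cpu_time, max_memory, max_time, max_wall_time, min_memory) are described as passing through an atoi() conversion. The Python sketch below approximates how C's atoi() treats typical inputs, to show why a value such as "1.5" or "ten" may not mean what the submitter intended. The helper name atoi_like is hypothetical, overflow behavior is not modelled, and this is not the job manager's own code.

```python
import re

# Optional leading whitespace, an optional sign, then decimal digits.
_LEADING_INT = re.compile(r"\s*([+-]?\d+)")


def atoi_like(text: str) -> int:
    """Approximate C atoi(): parse a leading integer and ignore the rest.

    Returns 0 when the text does not start with an integer.
    """
    match = _LEADING_INT.match(text)
    return int(match.group(1)) if match else 0


for raw in ("30", " 45 minutes", "1.5", "ten"):
    print(repr(raw), "->", atoi_like(raw))
```

Given this behavior, it is safest to supply these attributes as plain decimal integers.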
The following are some simple example RSL strings to illustrate idiomatic usage with existing tools and to make concrete some of the more interesting cases of tokenization, concatenation, and variable semantics. These are meant to illustrate the use of the RSL notation without much regard for the specific details of a particular resource management component.
Typical GRAM5 resource descriptions contain at least a few relations in a conjunction:
Example 2.2. GRAM5 Job Request Examples
This example shows a conjunct request containing values that are unquoted literals and ordered sequences of a mix of quoted and unquoted literals.
(* this is a comment *)
& (executable = a.out (* <-- that is an unquoted literal *))
(directory = /home/nobody )
(arguments = arg1 "arg 2")
(count = 1)
This example demonstrates RSL substitutions, which can be used to make sure a string is used consistently multiple times in a resource description:
& (rsl_substitution = (TOPDIR "/home/nobody")
(DATADIR $(TOPDIR)"/data")
(EXECDIR $(TOPDIR)/bin) )
(executable = $(EXECDIR)/a.out
(* ^-- implicit concatenation *))
(directory = $(TOPDIR) )
(arguments = $(DATADIR)/file1
(* ^-- implicit concatenation *)
$(DATADIR) # /file2
(* ^-- explicit concatenation *)
'$(FOO)' (* <-- a quoted literal *))
(environment = (DATADIR $(DATADIR)))
(count = 1)
Performing all variable substitution and removing comments yields an equivalent RSL string:
& (rsl_substitution = (TOPDIR "/home/nobody")
(DATADIR "/home/nobody/data")
(EXECDIR "/home/nobody/bin") )
(executable = "/home/nobody/bin/a.out" )
(directory = "/home/nobody" )
(arguments = "/home/nobody/data/file1"
"/home/nobody/data/file2"
"$(FOO)" )
(environment = (DATADIR "/home/nobody/data"))
(count = 1)
Note that in the above variable-substitution example, the variable
substitution definitions are not automatically made a part of the job's
environment. An explicit environment attribute must be
used to add environment variables for the job. Also note that the third
value in the arguments clause is not a variable reference but
only a quoted literal that happens to contain one of the special characters.
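Conjunct-requests like the ones above are often generated by client programs. The following Python sketch builds one from a dictionary, quoting every value with double quotes so that no special character needs separate handling. The function names are hypothetical, and always quoting (including numeric values such as the count) is an assumption made for simplicity; it relies on quoted and unquoted literals being interchangeable as values.

```python
from typing import Dict, List


def _quote(text: str) -> str:
    """Wrap text in double quotes, doubling any embedded double quote."""
    return '"' + text.replace('"', '""') + '"'


def _render(value) -> str:
    """Render a string as a quoted literal and a list as a value sequence."""
    if isinstance(value, str):
        return _quote(value)
    return "(" + " ".join(_render(item) for item in value) + ")"


def conjunct_request(relations: Dict[str, List]) -> str:
    """Build '& (attribute = value ...) ...' from attribute -> list of values."""
    clauses = []
    for attribute, values in relations.items():
        rendered = " ".join(_render(value) for value in values)
        clauses.append(f"({attribute} = {rendered})")
    return "& " + " ".join(clauses)


request = conjunct_request({
    "executable": ["/home/nobody/bin/a.out"],
    "arguments": ["arg1", "arg 2"],
    "environment": [["DATADIR", "/home/nobody/data"]],
    "count": ["1"],
})
print(request)
```

Nested lists map onto value sequences, which is how the environment pair in the example is expressed.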
The following is a modified BNF grammar for the Resource
Specification Language. Lexical rules are provided for
the implicit concatenation sequences in the form of conventional regular
expressions; for the implicit-concat non-terminal
rules, whitespace is not allowed between juxtaposed non-terminals. Grammar
comments are provided in square brackets in a column to the right
of the productions, e.g. [comment], to help relate
productions in the grammar to the terminology used in the above discussion.
Regular expressions are provided for the terminal class
string-literal and for RSL comments. These regular
expressions make use of a common inverted character-class notation,
as popularized by the various lex tools.
Comments are syntactically equivalent to whitespace and can only appear
where the comment prefix cannot be mistaken for the trailing part of a
multi-character unquoted literal.
[Table: RSL Grammar]
Name
globusrun — Execute and manage jobs via GRAM
Synopsis
globusrun [-help] [-usage] [-version] [-versions]
globusrun { -p | -parse }
{ -f RSL_FILENAME | -file RSL_FILENAME | RSL_SPECIFICATION }
globusrun [-n] [-no-interrupt]
{ -r RESOURCE_CONTACT | -resource RESOURCE_CONTACT }
{ -a | -authenticate-only }
globusrun [-n] [-no-interrupt]
{ -r RESOURCE_CONTACT | -resource RESOURCE_CONTACT }
{ -j | -jobmanager-version }
globusrun [-n] [-no-interrupt] { -k | -kill } {JOB_ID}
globusrun [-n] [-no-interrupt] [-full-proxy] [-D] { -y | -refresh-proxy } {JOB_ID}
globusrun { -status } {JOB_ID}
globusrun [-q] [-quiet] [-o] [-output-enable] [-s] [-server] [-w] [-write-allow] [-n] [-no-interrupt] [-b] [-batch] [-F] [-fast-batch] [-full-proxy] [-D] [-d] [-dryrun]
{ -r RESOURCE_CONTACT | -resource RESOURCE_CONTACT }
{ -f RSL_FILENAME | -file RSL_FILENAME | RSL_SPECIFICATION }
Description
The globusrun program submits and manages jobs run on a local or remote job host. The jobs are controlled by the globus-job-manager program which interfaces with a local resource manager that schedules and executes the job.
The globusrun program can be run in a number of different modes chosen by command-line options.
When -help, -usage,
-version, or -versions command-line options
are used, globusrun will print out diagnostic information and then exit.
When the -p or -parse command-line option
is present, globusrun will verify the syntax of the RSL specification and then
terminate. If the syntax is valid, globusrun will print out the string
"RSL Parsed Successfully..." and exit with a zero exit code;
otherwise, it will print an error message and terminate with a non-zero exit
code.
When the -a or -authenticate-only
command-line option is present, globusrun will verify that the service named
by RESOURCE_CONTACT exists and the client's
credentials are granted permission to access that service. If authentication
is successful, globusrun will display the string "GRAM Authentication test
successful" and exit with a zero exit code; otherwise it will print
an explanation of the problem and will exit with a non-zero exit code.
When the -j or -jobmanager-version
command-line option is present, globusrun will attempt to determine the software
version that the service named by
RESOURCE_CONTACT is running. If successful, it will
display both the Toolkit version and the Job Manager package version and exit
with a zero exit code; otherwise, it will print an explanation of the problem
and exit with a non-zero exit code.
When the -k or -kill
command-line option is present, globusrun will attempt to terminate the
job named by JOB_ID. If successful, globusrun will
exit with zero; otherwise it will display an explanation of the problem and
exit with a non-zero exit code.
When the -y or -refresh-proxy
command-line option is present, globusrun will attempt to delegate a new
X.509 proxy to the job manager which is managing the job named by
JOB_ID. If successful, globusrun will
exit with zero; otherwise it will display an explanation of the problem and
exit with a non-zero exit code. This behavior can be modified by the
-full-proxy or -D command-line options
to enable full proxy delegation. The default is limited proxy delegation.
When the -status command-line option is present, globusrun will
attempt to determine the current state of the job. If successful, the state
will be printed to standard output and globusrun will exit with a zero exit code;
otherwise, a description of the error will be displayed and it will exit
with a non-zero exit code.
Otherwise, globusrun will submit the job to a GRAM service. By default, globusrun waits until the job has terminated or failed before exiting, displaying information about job state changes and, at exit time, the job exit code if it is provided by the GRAM service.
The globusrun program can also function as a GASS file
server to allow the globus-job-manager program to stage files between the machine on
which globusrun is executed and the GRAM service node. This behavior is controlled
by the -s, -o, and -w
command-line options.
Jobs submitted by globusrun can be monitored interactively or detached. To have
globusrun detach from the GRAM service after submitting the job, use the
-b or -F command-line options.
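A client script might assemble these invocations programmatically. The Python sketch below only uses flags that appear in the synopsis (-p, -q, -b, -r, -f); the helper names submit_argv and rsl_parses and the host grid.example.org are hypothetical. The second helper actually runs globusrun, so it requires the program to be installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def submit_argv(
    resource: str,
    rsl: Optional[str] = None,
    rsl_file: Optional[str] = None,
    batch: bool = False,
    quiet: bool = False,
) -> List[str]:
    """Build the argument vector for a job submission."""
    if (rsl is None) == (rsl_file is None):
        raise ValueError("give exactly one of rsl or rsl_file")
    argv = ["globusrun"]
    if quiet:
        argv.append("-q")
    if batch:
        argv.append("-b")  # detach after submitting the job
    argv += ["-r", resource]
    argv += ["-f", rsl_file] if rsl_file is not None else [rsl]
    return argv


def rsl_parses(rsl: str) -> bool:
    """Run 'globusrun -p'; a zero exit code means the RSL syntax is valid."""
    result = subprocess.run(
        ["globusrun", "-p", rsl], capture_output=True, text=True
    )
    return result.returncode == 0


print(submit_argv("grid.example.org", rsl="&(executable=/bin/date)", batch=True))
```

Passing the RSL as a separate list element, rather than building a shell command string, avoids a second layer of quoting on top of RSL's own quoting rules.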
Options
The full set of options to globusrun consists of:
-help - Display a help message to standard error and exit.
-usage - Display a one-line usage summary to standard error and exit.
-version - Display the software version of globusrun to standard error and exit.
-versions - Display the software version of all modules used by globusrun (including DiRT information) to standard error and then exit.
-p, -parse - Do a parse check on the job specification and print diagnostics. If a parse error occurs, globusrun exits with a non-zero exit code.
-f RSL_FILENAME, -file RSL_FILENAME - Read job specification from the file named by RSL_FILENAME.
-n, -no-interrupt - Disable handling of the SIGINT signal, so that the interrupt character (typically Control-C) causes globusrun to terminate without canceling the job.
-r RESOURCE_CONTACT, -resource RESOURCE_CONTACT - Submit the request to the resource specified by RESOURCE_CONTACT. A resource may be specified in the following ways:
  - HOST
  - HOST:PORT
  - HOST:PORT/SERVICE
  - HOST/SERVICE
  - HOST:/SERVICE
  - HOST::SUBJECT
  - HOST:PORT:SUBJECT
  - HOST/SERVICE:SUBJECT
  - HOST:/SERVICE:SUBJECT
  - HOST:PORT/SERVICE:SUBJECT
  If any of PORT, SERVICE, or SUBJECT is omitted, the defaults of 2811, jobmanager, and host@HOST are used respectively.
-j, -jobmanager-version - Print the software version being run by the service running at RESOURCE_CONTACT.
-k JOB_ID, -kill JOB_ID - Kill the job named by JOB_ID.
-D, -full-proxy - Delegate a full impersonation proxy to the service. By default, a limited proxy is delegated when needed.
-y, -refresh-proxy - Delegate a new proxy to the service processing JOB_ID.
-status - Display the current status of the job named by JOB_ID.
-q, -quiet - Do not display job state change or exit code information.
-o, -output-enable - Start a GASS server within the globusrun application that allows access to its standard output and standard error streams only. Also, augment the RSL_SPECIFICATION with a definition of the GLOBUSRUN_GASS_URL RSL substitution and add stdout and stderr clauses which redirect the output and error streams of the job to the output and error streams of the interactive globusrun command. If this is specified, then globusrun acts as though -q were also specified.
-s, -server - Start a GASS server within the globusrun application that allows access to its standard output and standard error streams for writing and any file local to the globusrun invocation for reading. Also, augment the RSL_SPECIFICATION with a definition of the GLOBUSRUN_GASS_URL RSL substitution and add stdout and stderr clauses which redirect the output and error streams of the job to the output and error streams of the interactive globusrun command. If this is specified, then globusrun acts as though -q were also specified.
-w, -write-allow - Start a GASS server within the globusrun application that allows access to its standard output and standard error streams for writing and any file local to the globusrun invocation for reading or writing. Also, augment the RSL_SPECIFICATION with a definition of the GLOBUSRUN_GASS_URL RSL substitution and add stdout and stderr clauses which redirect the output and error streams of the job to the output and error streams of the interactive globusrun command. If this is specified, then globusrun acts as though -q were also specified.
-b, -batch - Terminate after submitting the job to the GRAM service. The globusrun program will exit after the job hits any of the following states: PENDING, ACTIVE, FAILED, or DONE. The GASS-related options can be used to stage input files, but standard output, standard error, and file staging after the job completes will not be processed.
-F, -fast-batch - Terminate after submitting the job to the GRAM service. The globusrun program will exit after it receives a reply from the service. The JOB_ID will be displayed to standard output before terminating so that the job can be checked with the -status command-line option or modified by the -refresh-proxy or -kill command-line options.
-d, -dryrun - Submit the job with the dryrun attribute set to true. When this is done, the job manager will prepare to start the job but stop short of submitting it to the service. This can be used to detect problems with the RSL_SPECIFICATION.
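The RESOURCE_CONTACT forms accepted by the -r option can be recognized with a single regular expression. The Python sketch below is an illustration only: parse_contact and Contact are hypothetical names, the host names and port in the examples are placeholders, IPv6 literals are not handled, and omitted components are returned as None so that the caller can apply whatever defaults are appropriate.

```python
import re
from typing import NamedTuple, Optional


class Contact(NamedTuple):
    host: str
    port: Optional[int]
    service: Optional[str]
    subject: Optional[str]


# HOST, then optional ":PORT" (PORT may be empty), optional "/SERVICE",
# and optional ":SUBJECT" (which may itself contain "/" and ":").
_CONTACT = re.compile(
    r"(?P<host>[^:/]+)"
    r"(?::(?P<port>\d*))?"
    r"(?:/(?P<service>[^:]*))?"
    r"(?::(?P<subject>.+))?"
)


def parse_contact(text: str) -> Contact:
    """Split a resource contact string into its components."""
    match = _CONTACT.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized resource contact: {text!r}")
    port = int(match["port"]) if match["port"] else None
    return Contact(match["host"], port, match["service"] or None, match["subject"])


for contact in (
    "grid.example.org",
    "grid.example.org:9000/jobmanager-pbs",
    "grid.example.org::/O=Grid/CN=host/grid.example.org",
):
    print(parse_contact(contact))
```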
Environment
The following variables affect the execution of globusrun:
X509_USER_PROXY - Path to proxy credential.
X509_CERT_DIR - Path to trusted certificate directory.
Bugs
The globusrun program assumes any failure to contact the job means the
job has terminated. In fact, this may be due to the
globus-job-manager program exiting after all jobs it is
managing have reached the DONE or FAILED
states. In order to reliably detect job termination, the
two_phase RSL attribute should be used.
Name
globus-job-cancel — Cancel a GRAM batch job
Synopsis
globus-job-cancel [ -f | -force ] [ -q | -quiet ] JOBID
globus-job-cancel [-help] [-usage] [-version] [-versions]
Description
The globus-job-cancel program cancels the job named by JOBID.
Any cached files associated with the job will remain until
globus-job-clean is executed for the job.
By default, globus-job-cancel prompts the user prior to canceling the job. This behavior
can be overridden by specifying the -f or
-force command-line options.
Options
The full set of options to globus-job-cancel is:
-help, -usage - Display a help message to standard error and exit.
-version - Display the software version of the globus-job-cancel program to standard output.
-versions - Display the software version of the globus-job-cancel program including DiRT information to standard output.
-force, -f - Do not prompt to confirm job cancel and clean-up.
-quiet, -q - Do not print diagnostics for successful cancel. Implies -f.
Name
globus-job-clean — Cancel and clean up a GRAM batch job
Synopsis
globus-job-clean [ -r RESOURCE | -resource RESOURCE ]
[ -f | -force ] [ -q | -quiet ] JOBID
globus-job-clean [-help] [-usage] [-version] [-versions]
Description
The globus-job-clean program cancels the job named by JOBID if
it is still running, and then removes any cached files on the GRAM service node
related to that job. In order to do the file clean up, it submits a job which
removes the cache files. By default this cleanup job is submitted to the
default GRAM resource running on the same host as the job. This behavior can be
controlled by specifying a resource manager contact string as the parameter to
the -r or -resource option.
By default, globus-job-clean prompts the user prior to canceling the job. This behavior
can be overridden by specifying the -f or
-force command-line options.
Options
The full set of options to globus-job-clean are:
- -help, -usage
Display a help message to standard error and exit.
- -version
Display the software version of the globus-job-clean program to standard output.
- -versions
Display the software version of the globus-job-clean program including DiRT information to standard output.
- -resource RESOURCE, -r RESOURCE
Submit the clean-up job to the resource named by RESOURCE instead of the default GRAM service on the same host as the job contact.
- -force, -f
Do not prompt to confirm job cancel and clean-up.
- -quiet, -q
Do not print diagnostics for successful clean-up. Implies -f.
Name
globus-job-get-output — Retrieve the output and error streams from a GRAM job
Synopsis
globus-job-get-output [ -r RESOURCE | -resource RESOURCE ]
[ -out | -err ] [ -t LINES | -tail LINES ] [ -follow LINES | -f LINES ] JOBID
globus-job-get-output [-help] [-usage] [-version] [-versions]
Description
The globus-job-get-output program retrieves the output and error streams of the job named by
JOBID. By default, globus-job-get-output will retrieve all output
and error data from the job and display them to its own output and error
streams. Other behavior can be controlled by using command-line options.
The data retrieval is implemented by submitting another job which
simply displays the contents of the first job's output and error streams.
By default this retrieval job is submitted to the
default GRAM resource running on the same host as the job. This behavior can be
controlled by specifying a particular resource manager contact string as the
RESOURCE parameter to the -r or
-resource option.
Options
The full set of options to globus-job-get-output are:
- -help, -usage
Display a help message to standard error and exit.
- -version
Display the software version of the globus-job-get-output program to standard output.
- -versions
Display the software version of the globus-job-get-output program including DiRT information to standard output.
- -resource RESOURCE, -r RESOURCE
Submit the retrieval job to the resource named by RESOURCE instead of the default GRAM service on the same host as the job contact.
- -out
Retrieve only the standard output stream of the job. The default is to retrieve both standard output and standard error.
- -err
Retrieve only the standard error stream of the job. The default is to retrieve both standard output and standard error.
- -tail LINES, -t LINES
Print only the last LINES count lines of output from the data streams being retrieved. By default, the entire output and error file data is retrieved. This option cannot be used along with the -f or -follow options.
- -follow LINES, -f LINES
Print the last LINES count lines of output from the data streams being retrieved and then wait until canceled, printing any subsequent job output that occurs. By default, the entire output and error file data is retrieved. This option cannot be used along with the -t or -tail options.
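The option rules above can be sketched as a small command-line builder. This is an illustration only: the function name get_output_argv and its parameters are hypothetical, and the sketch only assembles an argument list without running anything. It enforces the documented restriction that -tail and -follow cannot be combined.

```python
def get_output_argv(job_id, resource=None, stream=None, tail=None, follow=None):
    """Build an argument list for globus-job-get-output (hypothetical helper).

    stream may be None (both streams), "out", or "err".
    """
    if tail is not None and follow is not None:
        # The documentation states these two options are mutually exclusive.
        raise ValueError("-tail and -follow cannot be used together")
    if stream not in (None, "out", "err"):
        raise ValueError("stream must be None, 'out', or 'err'")

    argv = ["globus-job-get-output"]
    if resource is not None:
        argv += ["-resource", resource]
    if stream is not None:
        argv.append("-" + stream)
    if tail is not None:
        argv += ["-tail", str(tail)]
    if follow is not None:
        argv += ["-follow", str(follow)]
    argv.append(job_id)  # JOBID comes last, as in the synopsis
    return argv
```

The returned list could be handed to subprocess.run on a host where the GRAM client tools are installed.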
Name
globus-job-run — Execute a job using GRAM
Synopsis
globus-job-run [-dumprsl] [-dryrun] [-verify]
[-file ARGUMENT_FILE]
SERVICE_CONTACT
[ -np PROCESSES | -count PROCESSES ]
[ -m MAX_TIME | -maxtime MAX_TIME ]
[ -p PROJECT | -project PROJECT ]
[ -q QUEUE | -queue QUEUE ]
[ -d DIRECTORY | -directory DIRECTORY ] [-env NAME=VALUE]...
[-stdin
[ -l | -s ]
STDIN_FILE
] [-stdout
[ -l | -s ]
STDOUT_FILE
] [-stderr
[ -l | -s ]
STDERR_FILE
]
[-x RSL_CLAUSE]
[ -l | -s ] EXECUTABLE [ARGUMENT...]
globus-job-run [-help] [-usage] [-version] [-versions]
Description
The globus-job-run program constructs a job description from
its command-line options and then submits the job to the GRAM service running
at SERVICE_CONTACT. The executable and arguments to
the executable are provided on the command-line after all other options.
Note that the -dumprsl, -dryrun,
-verify, and -file command-line options must
occur before the first non-option argument, the
SERVICE_CONTACT.
The globus-job-run program provides functionality similar to globusrun in that it allows interactive start-up of GRAM jobs. However, unlike globusrun, it uses command-line parameters to define the job instead of RSL expressions.
Options
The full set of options to globus-job-run are:
- -help, -usage
Display a help message to standard error and exit.
- -version
Display the software version of the globus-job-run program to standard output.
- -versions
Display the software version of the globus-job-run program including DiRT information to standard output.
- -dumprsl
Translate the command-line options to globus-job-run into an RSL expression that can be used with tools such as globusrun.
- -dryrun
Submit the job request to the GRAM service with the dryrun option enabled. When this option is used, the GRAM service prepares to execute the job but stops before submitting the job to the LRM. This can be used to diagnose some problems such as missing files.
- -verify
Submit the job request to the GRAM service with the dryrun option enabled and then without it enabled if the dryrun is successful.
- -file ARGUMENT_FILE
Read additional command-line options from ARGUMENT_FILE.
- -np PROCESSES, -count PROCESSES
Start PROCESSES instances of the executable as a single job.
- -m MAX_TIME, -maxtime MAX_TIME
Schedule the job to run for a maximum of MAX_TIME minutes.
- -p PROJECT, -project PROJECT
Request that the job use the allocation PROJECT when submitting the job to the LRM.
- -q QUEUE, -queue QUEUE
Request that the job be submitted to the LRM using the named QUEUE.
- -d DIRECTORY, -directory DIRECTORY
Run the job in the directory named by DIRECTORY. Input and output files will be interpreted relative to this directory. This directory must exist on the file system on the LRM-managed resource. If not specified, the job will run in the home directory of the user the job is running as.
- -env NAME=VALUE
Define an environment variable named by NAME with the value VALUE in the job environment. This option may be specified multiple times to define multiple environment variables.
- -stdin [-l | -s] STDIN_FILE
Use the file named by STDIN_FILE as the standard input of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-run is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -stdout [-l | -s] STDOUT_FILE
Use the file named by STDOUT_FILE as the destination for the standard output of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-run is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -stderr [-l | -s] STDERR_FILE
Use the file named by STDERR_FILE as the destination for the standard error of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-run is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -x RSL_CLAUSE
Add a set of custom RSL attributes described by RSL_CLAUSE to the job description. The clause must be an RSL conjunction and may contain one or more attributes. This can be used to include attributes which cannot be defined by other command-line options of globus-job-run.
- -l
When included outside the context of the -stdin, -stdout, or -stderr command-line options, the -l option alters the interpretation of the executable path. If the -l option is specified, then the executable is interpreted to be on a file system local to the LRM.
- -s
When included outside the context of the -stdin, -stdout, or -stderr command-line options, the -s option alters the interpretation of the executable path. If the -s option is specified, then the executable is interpreted to be on the file system where globus-job-run is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
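The ordering rule from the description (options such as -dryrun must precede SERVICE_CONTACT, while job options and the executable follow it) can be sketched as an argument-list builder. The function name job_run_argv, its parameters, and the host name in the usage line are hypothetical. Only a few of the documented options are covered, and nothing is executed.

```python
def job_run_argv(contact, executable, args=(), *, dryrun=False, count=None,
                 env=None, stage_executable=False):
    """Build an argument list for globus-job-run (hypothetical helper)."""
    argv = ["globus-job-run"]
    if dryrun:
        argv.append("-dryrun")  # must appear before SERVICE_CONTACT
    argv.append(contact)        # first non-option argument
    if count is not None:
        argv += ["-np", str(count)]
    for name, value in (env or {}).items():
        argv += ["-env", f"{name}={value}"]  # may be repeated
    if stage_executable:
        argv.append("-s")       # executable is on the submitting host
    argv.append(executable)
    argv.extend(args)           # arguments to the executable come last
    return argv


example = job_run_argv("grid.example.org", "/bin/echo", ["hello"],
                       dryrun=True, count=2, env={"MODE": "test"})
print(" ".join(example))
```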
Name
globus-job-status — Check the status of a GRAM5 job
Synopsis
globus-job-status JOBID
globus-job-status [-help] [-usage] [-version] [-versions]
Description
The globus-job-status program checks the status of a GRAM
job by sending a status request to the job manager contact for that job
specified by the JOBID parameter. If
successful, it will print the job status to standard output. The states
supported by globus-job-status are:
- PENDING
The job has been submitted to the LRM but has not yet begun execution.
- ACTIVE
The job has begun execution.
- FAILED
The job has failed.
- SUSPENDED
The job is currently suspended by the LRM.
- DONE
The job has completed.
- UNSUBMITTED
The job has been accepted by GRAM, but not yet submitted to the LRM.
- STAGE_IN
The job has been accepted by GRAM and is currently staging files prior to being submitted to the LRM.
- STAGE_OUT
The job has completed execution and is currently staging files from the service node to other http, GASS, or GridFTP servers.
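A script that polls globus-job-status usually needs to know when to stop. The sketch below models the states listed above and treats DONE and FAILED as final, which matches how the globusrun notes describe job termination. The names JobState, TERMINAL_STATES, and is_finished are hypothetical, and the sketch assumes the program prints exactly one state name.

```python
from enum import Enum


class JobState(Enum):
    """Job states printed by globus-job-status, as listed above."""
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    FAILED = "FAILED"
    SUSPENDED = "SUSPENDED"
    DONE = "DONE"
    UNSUBMITTED = "UNSUBMITTED"
    STAGE_IN = "STAGE_IN"
    STAGE_OUT = "STAGE_OUT"


# Assumption: only these two states mean the job will not change state again.
TERMINAL_STATES = frozenset({JobState.DONE, JobState.FAILED})


def is_finished(status_output):
    """Return True if the printed status names a terminal state."""
    return JobState(status_output.strip()) in TERMINAL_STATES
```

An unrecognized state name raises ValueError, so unexpected output is reported as an error and not mistaken for a running job.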
Options
The full set of options to globus-job-status are:
- -help, -usage
Display a help message to standard error and exit.
- -version
Display the software version of the globus-job-status program to standard output.
- -versions
Display the software version of the globus-job-status program including DiRT information to standard output.
ENVIRONMENT
The following variables affect the execution of globus-job-status:
- X509_USER_PROXY
- Path to proxy credential.
- X509_CERT_DIR
- Path to trusted certificate directory.
Name
globus-job-submit — Submit a batch job using GRAM
Synopsis
globus-job-submit [-dumprsl] [-dryrun] [-verify]
[-file ARGUMENT_FILE]
SERVICE_CONTACT
[ -np PROCESSES | -count PROCESSES ]
[ -m MAX_TIME | -maxtime MAX_TIME ]
[ -p PROJECT | -project PROJECT ]
[ -q QUEUE | -queue QUEUE ]
[ -d DIRECTORY | -directory DIRECTORY ] [-env NAME=VALUE]...
[-stdin
[ -l | -s ]
STDIN_FILE
] [-stdout
[ -l | -s ]
STDOUT_FILE
] [-stderr
[ -l | -s ]
STDERR_FILE
]
[-x RSL_CLAUSE]
[ -l | -s ] EXECUTABLE [ARGUMENT...]
globus-job-submit [-help] [-usage] [-version] [-versions]
Description
The globus-job-submit program constructs a job description
from
its command-line options and then submits the job to the GRAM service running
at SERVICE_CONTACT. The executable and arguments to
the executable are provided on the command-line after all other options.
Note that the -dumprsl, -dryrun,
-verify, and -file command-line options must
occur before the first non-option argument, the
SERVICE_CONTACT.
The globus-job-submit program provides functionality similar to globusrun in that it allows batch submission of GRAM jobs. However, unlike globusrun, it uses command-line parameters to define the job instead of RSL expressions.
To retrieve the output and error streams of the job, use the program globus-job-get-output. To reclaim resources used by the job by deleting cached files and job state, use the program globus-job-clean. To cancel a batch job submitted by globus-job-submit, use the program globus-job-cancel.
Options
The full set of options to globus-job-submit are:
- -help, -usage
Display a help message to standard error and exit.
- -version
Display the software version of the globus-job-submit program to standard output.
- -versions
Display the software version of the globus-job-submit program including DiRT information to standard output.
- -dumprsl
Translate the command-line options to globus-job-submit into an RSL expression that can be used with tools such as globusrun.
- -dryrun
Submit the job request to the GRAM service with the dryrun option enabled. When this option is used, the GRAM service prepares to execute the job but stops before submitting the job to the LRM. This can be used to diagnose some problems such as missing files.
- -verify
Submit the job request to the GRAM service with the dryrun option enabled and then without it enabled if the dryrun is successful.
- -file ARGUMENT_FILE
Read additional command-line options from ARGUMENT_FILE.
- -np PROCESSES, -count PROCESSES
Start PROCESSES instances of the executable as a single job.
- -m MAX_TIME, -maxtime MAX_TIME
Schedule the job to run for a maximum of MAX_TIME minutes.
- -p PROJECT, -project PROJECT
Request that the job use the allocation PROJECT when submitting the job to the LRM.
- -q QUEUE, -queue QUEUE
Request that the job be submitted to the LRM using the named QUEUE.
- -d DIRECTORY, -directory DIRECTORY
Run the job in the directory named by DIRECTORY. Input and output files will be interpreted relative to this directory. This directory must exist on the file system on the LRM-managed resource. If not specified, the job will run in the home directory of the user the job is running as.
- -env NAME=VALUE
Define an environment variable named by NAME with the value VALUE in the job environment. This option may be specified multiple times to define multiple environment variables.
- -stdin [-l | -s] STDIN_FILE
Use the file named by STDIN_FILE as the standard input of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-submit is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -stdout [-l | -s] STDOUT_FILE
Use the file named by STDOUT_FILE as the destination for the standard output of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-submit is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -stderr [-l | -s] STDERR_FILE
Use the file named by STDERR_FILE as the destination for the standard error of the job. If the -l option is specified, then this file is interpreted to be on a file system local to the LRM. If the -s option is specified, then this file is interpreted to be on the file system where globus-job-submit is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
- -x RSL_CLAUSE
Add a set of custom RSL attributes described by RSL_CLAUSE to the job description. The clause must be an RSL conjunction and may contain one or more attributes. This can be used to include attributes which cannot be defined by other command-line options of globus-job-submit.
- -l
When included outside the context of the -stdin, -stdout, or -stderr command-line options, the -l option alters the interpretation of the executable path. If the -l option is specified, then the executable is interpreted to be on a file system local to the LRM.
- -s
When included outside the context of the -stdin, -stdout, or -stderr command-line options, the -s option alters the interpretation of the executable path. If the -s option is specified, then the executable is interpreted to be on the file system where globus-job-submit is being executed, and the file will be staged via GASS. If neither is specified, the local behavior is assumed.
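The description above names the companion programs used after a batch submission. As a rough sketch of that life cycle, the helper below lists the follow-up commands for a job in the order a script might run them. The function name batch_followup_commands and the placeholder job identifier are hypothetical, and the sketch only builds argument lists.

```python
def batch_followup_commands(job_id):
    """Return follow-up commands for a job submitted with globus-job-submit.

    Order: check status, fetch output, then cancel (if needed) and clean up.
    """
    return [
        ["globus-job-status", job_id],
        ["globus-job-get-output", job_id],
        # -f skips the confirmation prompt, which suits unattended scripts.
        ["globus-job-clean", "-f", job_id],
    ]


for command in batch_followup_commands("JOBID"):
    print(" ".join(command))
```

In practice a script would wait for globus-job-status to report DONE or FAILED before running the last two commands.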
Name
globus-personal-gatekeeper — Manage a user's personal gatekeeper daemon
Synopsis
globus-personal-gatekeeper [-help] [-usage] [-version] [-versions] [-list] [-directory CONTACT]
globus-personal-gatekeeper [-debug] {-start} [-jmtype LRM] [-auditdir AUDIT_DIRECTORY] [-port PORT] [-log [=DIRECTORY]] [-seg] [-acctfile ACCOUNTING_FILE]
globus-personal-gatekeeper [-killall] [-kill]
Description
The globus-personal-gatekeeper command is a utility which manages a gatekeeper and job manager service for a single user. Depending on the command-line arguments it will operate in one of several modes. In the first set of arguments indicated in the synopsis, the program provides information about the globus-personal-gatekeeper command or about instances of the globus-personal-gatekeeper that are running currently. The second set of arguments indicated in the synopsis provide control over starting a new globus-personal-gatekeeper instance. The final set of arguments provide control for terminating one or more globus-personal-gatekeeper instances.
The -start mode will create a new subdirectory of $HOME/.globus
and write the configuration
files needed to start a globus-gatekeeper daemon which will
invoke the globus-job-manager service when new authenticated
connections are made to its service port. The
globus-personal-gatekeeper then exits, printing the contact
string for the new gatekeeper prefixed by GRAM contact: to
standard output. In addition to the arguments described above, any arguments
described in globus-job-manager(8) can be appended to the
command-line and will be added to the job manager configuration for the service
started by the globus-gatekeeper.
The new globus-gatekeeper will continue to run in the
background until killed by invoking
globus-personal-gatekeeper with the -kill or
-killall argument. When killed, it will kill the
globus-gatekeeper and globus-job-manager
processes, remove state files and configuration data, and then exit. Jobs which
are running when the personal gatekeeper is killed will continue to run, but
their job directory will be destroyed so they may fail in the LRM.
The full set of command-line options to globus-personal-gatekeeper consists of:
- -help, -usage
Print command-line option summary and exit.
- -version
Print software version.
- -versions
Print software version including DiRT information.
- -list
Print a list of all currently running personal gatekeepers. These entries will be printed one per line.
- -directory CONTACT
Print the configuration directory for the personal gatekeeper with the contact string CONTACT.
- -debug
Print additional debugging information when starting a personal gatekeeper. This option is ignored in other modes.
- -start
Start a new personal gatekeeper process.
- -jmtype LRM
Use LRM as the local resource manager interface. If not provided when starting a personal gatekeeper, the job manager will use the default fork LRM.
- -auditdir AUDIT_DIRECTORY
Write audit report files to AUDIT_DIRECTORY. If not provided, the job manager will not write any audit files.
- -port PORT
Listen for gatekeeper TCP/IP connections on the port PORT. If not provided, the gatekeeper will let the operating system choose.
- -log[=DIRECTORY]
Write job manager log files to DIRECTORY. If DIRECTORY is omitted, the default of $HOME will be used. If this option is not present, the job manager will not write any log files.
- -seg
Try to use the SEG mechanism to receive job state change information, instead of polling for it. This requires either the system administrator or the user to run an instance of the globus-job-manager-event-generator program for the LRM specified by the -jmtype option.
- -acctfile ACCOUNTING_FILE
Write gatekeeper accounting entries to ACCOUNTING_FILE. If not provided, no accounting records are written.
Examples
This example shows the output when starting a new personal gatekeeper which
will schedule jobs via the lsf LRM, with debugging enabled.
% globus-personal-gatekeeper -start -jmtype lsf
verifying setup... done.
GRAM contact: personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User
This example shows the output when listing the current active personal gatekeepers.
% globus-personal-gatekeeper -list
personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User
This example shows the output when querying the configuration directory for the above personal gatekeeper.
% globus-personal-gatekeeper -directory "personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User"
/home/juser/.globus/.personal-gatekeeper.personal-grid.example.org.1337
This example shows the output when killing the above personal gatekeeper.
% globus-personal-gatekeeper -kill "personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User"
killing gatekeeper: "personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User"
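The contact strings in the examples above have the shape host:port:subject. A script that captures them may want the parts separately. The following is a minimal sketch under that assumption. The names GatekeeperContact and parse_contact are hypothetical, and the sketch does not handle contact strings in other forms (for example, IPv6 literal hosts).

```python
from typing import NamedTuple


class GatekeeperContact(NamedTuple):
    host: str
    port: int
    subject: str


def parse_contact(contact):
    """Split a host:port:subject contact string into its three parts."""
    # Split at most twice so any colon inside the subject is preserved.
    host, port, subject = contact.split(":", 2)
    return GatekeeperContact(host, int(port), subject)


parsed = parse_contact(
    "personal-grid.example.org:57846:/DC=org/DC=example/CN=Joe User"
)
print(parsed.host, parsed.port, parsed.subject)
```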
Name
globus-gram-audit — Load GRAM4 and GRAM5 audit records into a database
Synopsis
globus-gram-audit [--conf CONFIG_FILE] [--check] [--delete] [--audit-directory AUDITDIR]
Description
The globus-gram-audit program loads audit records to an
SQL-based database. It reads $GLOBUS_LOCATION/etc/globus-job-manager.conf
by default to determine the audit directory and then uploads all files in that
directory that contain valid audit records to the database configured by the
globus_gram_job_manager_auditing_setup_scripts
package. If the upload completes successfully, the audit files will be removed.
The full set of command-line options to globus-gram-audit consists of:
--conf | Use |
--check | Check whether the insertion of a record was successful by querying the database after inserting the records. This is used in tests. |
--delete | Delete audit records from the database right after inserting them. This is used in tests to avoid filling the database with test records. |
--audit-directory | Look for audit records in DIR, instead of looking in the directory specified in the job manager configuration. This is used in tests to control which records are loaded to the database and then deleted. |
--query | Perform the given SQL query on the audit database. This uses the database information from the configuration file to determine how to contact the database. |
FILES
The globus-gram-audit program uses the following files (paths
relative to $GLOBUS_LOCATION):
etc/globus-gram-job-manager.conf | GRAM5 job manager configuration. It includes the default path to the audit directory. |
etc/globus-gram-audit.conf | Audit configuration. It includes the information needed to contact the audit database. |
Name
globus-gatekeeper — Authorize and execute a grid service on behalf of a user
Synopsis
globus-gatekeeper [-help]
[-conf PARAMETER_FILE]
[-test] [ -d | -debug ]
{ -inetd | -f }
[ -p PORT | -port PORT ]
[-home PATH] [ -l LOGFILE | -logfile LOGFILE ]
[-acctfile ACCTFILE]
[-e LIBEXECDIR]
[-launch_method
{ fork_and_exit | fork_and_wait | dont_fork }
]
[-grid_services SERVICEDIR]
[-globusid GLOBUSID]
[-gridmap GRIDMAP]
[-x509_cert_dir TRUSTED_CERT_DIR]
[-x509_cert_file TRUSTED_CERT_FILE]
[-x509_user_cert CERT_PATH]
[-x509_user_key KEY_PATH]
[-x509_user_proxy PROXY_PATH]
[-k]
[-globuskmap KMAP]
Description
The globus-gatekeeper program is a meta-server similar to inetd or xinetd that starts other services after authenticating the TCP connection using GSSAPI.
The most common use for the globus-gatekeeper program is to start instances of the globus-job-manager(8) service. A single globus-gatekeeper deployment can handle multiple different service configurations by having entries in the grid-services directory.
Typically, users interact with the globus-gatekeeper program via client applications such as globusrun(1), globus-job-submit, or tools such as CoG jglobus or Condor-G.
The full set of command-line options to globus-gatekeeper consists of:
- -help
Display a help message to standard error and exit.
- -conf PARAMETER_FILE
Load configuration parameters from PARAMETER_FILE. The parameters in that file are treated as additional command-line options.
- -test
Parse the configuration file and print out the POSIX user id of the globus-gatekeeper process, service home directory, service execution directory, and X.509 subject name, and then exit.
- -d, -debug
Run the globus-gatekeeper process in the foreground.
- -inetd
Flag to indicate that the globus-gatekeeper process was started via inetd or a similar super-server. If this flag is set and the globus-gatekeeper was not started via inetd, a warning will be printed in the gatekeeper log.
- -f
Flag to indicate that the globus-gatekeeper process should run in the foreground. This flag has no effect when the globus-gatekeeper is started via inetd.
- -p PORT, -port PORT
Listen for connections on the TCP/IP port PORT. This option has no effect if the globus-gatekeeper is started via inetd or a similar service. If not specified and the gatekeeper is running as root, the default of 754 is used. Otherwise, the gatekeeper defaults to an ephemeral port.
- -home PATH
Sets the gatekeeper deployment directory to PATH. This is used to interpret relative paths for accounting files, libexecdir, certificate paths, and also to set the GLOBUS_LOCATION environment variable in the service environment. If not specified, the gatekeeper uses its working directory.
- -l LOGFILE, -logfile LOGFILE
Write status log entries to LOGFILE.
- -acctfile ACCTFILE
Set the path to write accounting records to ACCTFILE. If not set, no accounting records will be written.
- -e LIBEXECDIR
Look for service executables in LIBEXECDIR. If not specified, the default of HOME/libexec is used.
- -launch_method fork_and_exit|fork_and_wait|dont_fork
Determine how to launch services. The method may be either fork_and_exit (the service runs completely independently of the gatekeeper, which exits after creating the new service process), fork_and_wait (the service is run in a separate process from the gatekeeper but the gatekeeper does not exit until the service terminates), or dont_fork, where the gatekeeper process becomes the service process via the exec() system call.
- -grid_services SERVICEDIR
Look for service descriptions in SERVICEDIR. If this is a relative path, it is interpreted relative to the HOME value. If this is not specified, the default of HOME/etc/grid-services is used.
- -globusid GLOBUSID
Sets the GLOBUSID environment variable to GLOBUSID. This variable is used to construct the gatekeeper contact string if it cannot be parsed from the service credential.
- -gridmap GRIDMAP
Use the file at GRIDMAP to map GSSAPI names to POSIX user names. If not specified, the default of HOME/etc/grid-mapfile is used.
- -x509_cert_dir TRUSTED_CERT_DIR
Use the directory TRUSTED_CERT_DIR to locate trusted CA X.509 certificates. The gatekeeper sets the environment variable X509_CERT_DIR to this value.
- -x509_cert_file TRUSTED_CERT_FILE
OBSOLETE GSI OPTION
- -x509_user_cert CERT_PATH
Read the service X.509 certificate from CERT_PATH. The gatekeeper sets the X509_USER_CERT environment variable to this value.
- -x509_user_key KEY_PATH
Read the private key for the service from KEY_PATH. The gatekeeper sets the X509_USER_KEY environment variable to this value.
- -x509_user_proxy PROXY_PATH
Read the X.509 proxy certificate from PROXY_PATH. The gatekeeper sets the X509_USER_PROXY environment variable to this value.
- -k
Assume authentication with Kerberos 5 GSSAPI instead of X.509 GSSAPI.
- -globuskmap KMAP
Assume authentication with Kerberos 5 GSSAPI instead of X.509 GSSAPI and use KMAP as the path to the Kerberos principal to POSIX user mapping file.
ENVIRONMENT
The following variables affect the execution of globus-gatekeeper:
- X509_CERT_DIR
- Directory containing X.509 trust anchors and signing policy files.
- X509_USER_PROXY
- Path to file containing an X.509 proxy.
- X509_USER_CERT
- Path to file containing an X.509 user certificate.
- X509_USER_KEY
- Path to file containing an X.509 user key.
Name
globus-job-manager — Execute and monitor jobs
Synopsis
globus-job-manager {-type LRM} [-conf CONFIG_PATH] [-help] [-globus-host-manufacturer MANUFACTURER] [-globus-host-cputype CPUTYPE] [-globus-host-osname OSNAME] [-globus-host-osversion OSVERSION] [-globus-gatekeeper-host HOST] [-globus-gatekeeper-port PORT] [-globus-gatekeeper-subject SUBJECT] [-home GLOBUS_LOCATION] [-target-globus-location TARGET_GLOBUS_LOCATION] [-condor-arch ARCH] [-condor-os OS] [-history HISTORY_DIRECTORY] [-scratch-dir-base SCRATCH_DIRECTORY] [-enable-syslog] [-stdio-log LOG_DIRECTORY] [-log-levels LEVELS] [-state-file-dir STATE_DIRECTORY] [-globus-tcp-port-range PORT_RANGE] [-x509-cert-dir TRUSTED_CERTIFICATE_DIRECTORY] [-cache-location GASS_CACHE_DIRECTORY] [-k] [-extra-envvars VAR=VAL,...] [-seg-module SEG_MODULE] [-audit-directory AUDIT_DIRECTORY] [-globus-toolkit-version TOOLKIT_VERSION] [-disable-streaming] [-disable-usagestats] [-usagestats-targets TARGET] [-service-tag SERVICE_TAG]
Description
The globus-job-manager program is a service which starts and controls GRAM jobs which are executed by a local resource management system, such as LSF or Condor. The globus-job-manager program is typically started by the globus-gatekeeper program and not directly by a user. It runs until all jobs it is managing have terminated or its delegated credentials have expired.
Typically, users interact with the globus-job-manager program via client applications such as globusrun, globus-job-submit, or tools such as CoG jglobus or Condor-G.
The full set of command-line options to globus-job-manager consists of:
-help- Display a help message to standard error and exit
-typeLRM- Execute jobs using the local resource manager named
LRM. -confCONFIG_PATH- Read additional command-line arguments from the file
CONFIG_PATH. If present, this must be the first command-line argument to the globus-job-manager program. -globus-host-manufacturerMANUFACTURER- Indicate the manufacturer of the system which the jobs will execute on. This parameter sets the value of the
$(GLOBUS_HOST_MANUFACTURER)RSL substitution toMANUFACTURER -globus-host-cputypeCPUTYPE- Indicate the CPU type of the system which the jobs will execute on. This parameter sets the value of the
$(GLOBUS_HOST_CPUTYPE)RSL substitution toCPUTYPE -globus-host-osnameOSNAME- Indicate the operating system type of the system which the jobs will execute on. This parameter sets the value of the
$(GLOBUS_HOST_OSNAME)RSL substitution toOSNAME -globus-host-osversionOSVERSION- Indicate the operating system version of the system which the jobs will execute on. This parameter sets the value of the
$(GLOBUS_HOST_OSVERSION)RSL substitution toOSVERSION -globus-gatekeeper-hostHOST- Indicate the host name of the machine which the job was submitted to. This parameter sets the value of the
$(GLOBUS_GATEKEEPER_HOST)RSL substitution toHOST -globus-gatekeeper-portPORT- Indicate the TCP port number of gatekeeper to which jobs are submitted to. This parameter sets the value of the
$(GLOBUS_GATEKEEPER_PORT)RSL substitution toPORT -globus-gatekeeper-subjectSUBJECT- Indicate the X.509 identity of the gatekeeper to which jobs are submitted to. This parameter sets the value of the
$(GLOBUS_GATEKEEPER_SUBJECT)RSL substitution toSUBJECT -homeGLOBUS_LOCATION- Indicate the path where the Globus Toolkit(r) is installed on the service node. This is used by the job manager to locate its support and configuration files.
-target-globus-location TARGET_GLOBUS_LOCATION- Indicate the path where the Globus Toolkit(r) is installed on the execution host. If this is omitted, the value specified as a parameter to -home is used. This parameter sets the value of the $(GLOBUS_LOCATION) RSL substitution to TARGET_GLOBUS_LOCATION.
-history HISTORY_DIRECTORY- Configure the job manager to write job history files to HISTORY_DIRECTORY. These files are described in the FILES section below.
-scratch-dir-base SCRATCH_DIRECTORY- Configure the job manager to use SCRATCH_DIRECTORY as the default scratch directory root if a relative path is specified in the job RSL's scratch_dir attribute.
-enable-syslog- Configure the job manager to write log messages via syslog. Logging is further controlled by the argument to the -log-levels parameter described below.
-stdio-log LOG_DIRECTORY- Configure the job manager to write log messages to files in the LOG_DIRECTORY directory. Files will be named LOG_DIRECTORY/gram_YYYYMMDD.log. Logging is further controlled by the argument to the -log-levels parameter described below. The LOG_DIRECTORY value can include variables derived from the job manager environment using the same syntax as RSL substitutions. For example, -stdio-log $(HOME) would cause each user's logs to be stored in their individual home directories.
-log-levels LEVELS- Configure the job manager to write log messages of certain levels to syslog and/or log files. The available log levels are FATAL, ERROR, WARN, INFO, DEBUG, and TRACE. Multiple values can be combined with the | character. The default value of logging when enabled is FATAL|ERROR.
-state-file-dir STATE_DIRECTORY- Configure the job manager to write state files to STATE_DIRECTORY. If not specified, the job manager uses the default of $GLOBUS_LOCATION/tmp/gram_job_state/. This directory must be writable by all users and be on a file system which supports POSIX advisory file locks.
-globus-tcp-port-range PORT_RANGE- Configure the job manager to restrict its TCP/IP communication to use ports in the range described by PORT_RANGE. This value is also made available in the job environment via the GLOBUS_TCP_PORT_RANGE environment variable.
-x509-cert-dir TRUSTED_CERTIFICATE_DIRECTORY- Configure the job manager to search TRUSTED_CERTIFICATE_DIRECTORY for its list of trusted CA certificates and their signing policies. This value is also made available in the job environment via the X509_CERT_DIR environment variable.
-cache-location GASS_CACHE_DIRECTORY- Configure the job manager to use the path GASS_CACHE_DIRECTORY for its temporary GASS-cache files. This value is also made available in the job environment via the GLOBUS_GASS_CACHE_DEFAULT environment variable.
-k- Configure the job manager to assume it is using Kerberos for authentication instead of X.509 certificates. This disables some certificate-specific processing in the job manager.
-extra-envvars VAR=VAL,...- Configure the job manager to define a set of environment variables in the job environment beyond those defined in the base job environment. The format of the parameter to this argument is a comma-separated sequence of VAR=VAL pairs, where VAR is the variable name and VAL is the variable's value.
-seg-module SEG_MODULE- Configure the job manager to use the scheduler event generator module named by SEG_MODULE to detect job state change events from the local resource manager, in place of the less efficient polling operations used in GT2. To use this, one instance of the globus-job-manager-event-generator must be running to process events for the LRM into a generic format that the job manager can parse.
-audit-directory AUDIT_DIRECTORY- Configure the job manager to write audit records to the directory named by AUDIT_DIRECTORY. These records can be loaded into a database using the globus-gram-audit program.
-globus-toolkit-version TOOLKIT_VERSION- Configure the job manager to use TOOLKIT_VERSION as the version for audit and usage stats records.
-service-tag SERVICE_TAG- Configure the job manager to use SERVICE_TAG as a unique identifier to allow multiple GRAM instances to use the same job state directories without interfering with each other's jobs. If not set, the value untagged will be used.
-disable-streaming- Configure the job manager to disable file streaming. This is propagated to the LRM script interface but has no effect in GRAM5.
-disable-usagestats- Disable sending of any usage stats data, even if -usagestats-targets is present in the configuration.
-usagestats-targets TARGET- Send usage packets to a data collection service for analysis. The TARGET string consists of a comma-separated list of HOST:PORT combinations, each containing an optional list of data to send. See Usage Stats Packets for more information about the tags. Special tag strings of all (which enables all tags) and default may be used, or a sequence of characters for the various tags.
-condor-arch ARCH- Set the architecture specification for condor jobs to be ARCH in job classified ads generated by the GRAM5 condor LRM script. This is required for the condor LRM but ignored for all others.
-condor-os OS- Set the operating system specification for condor jobs to be OS in job classified ads generated by the GRAM5 condor LRM script. This is required for the condor LRM but ignored for all others.
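As an illustration of the two compound option formats described above, the following Python sketch splits a -log-levels argument on the | character and a -extra-envvars argument into VAR=VAL pairs. It is not taken from the job manager source. The function names are hypothetical, and the handling of whitespace, letter case, and values that themselves contain commas is an assumption, since the text above does not specify it.

```python
# Hypothetical helpers; not part of the Globus Toolkit.
LOG_LEVELS = ("FATAL", "ERROR", "WARN", "INFO", "DEBUG", "TRACE")


def parse_log_levels(value):
    """Split a -log-levels argument such as 'FATAL|ERROR' into level names."""
    levels = [part.strip() for part in value.split("|") if part.strip()]
    unknown = [level for level in levels if level not in LOG_LEVELS]
    if unknown:
        raise ValueError("unknown log level(s): " + ", ".join(unknown))
    return levels


def parse_extra_envvars(value):
    """Split a -extra-envvars argument of the form 'VAR=VAL,...' into a dict.

    Assumption: values do not contain commas; only VAR=VAL pairs are handled.
    """
    env = {}
    for pair in value.split(","):
        if not pair:
            continue
        name, sep, val = pair.partition("=")
        if not sep or not name:
            raise ValueError("expected VAR=VAL, got: " + repr(pair))
        env[name] = val
    return env


if __name__ == "__main__":
    print(parse_log_levels("FATAL|ERROR"))
    print(parse_extra_envvars("TMPDIR=/scratch,LANG=C"))
```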
Environment
The following variables affect the execution of globus-job-manager:
HOME- User's home directory.
LOGNAME- User's name.
JOBMANAGER_SYSLOG_ID- String to prepend to syslog audit messages.
JOBMANAGER_SYSLOG_FAC- Facility to log syslog audit messages as.
JOBMANAGER_SYSLOG_LVL- Priority level to use for syslog audit messages.
GATEKEEPER_JM_ID- Job manager ID to be used in syslog audit records.
GATEKEEPER_PEER- Peer information to be used in syslog audit records.
GLOBUS_ID- Credential information to be used in syslog audit records.
GLOBUS_JOB_MANAGER_SLEEP- Time (in seconds) to sleep when the job manager is started. [For debugging purposes only]
GRID_SECURITY_HTTP_BODY_FD- File descriptor of an open file which contains the initial job request and to which the initial job reply should be sent. This file descriptor is inherited from the globus-gatekeeper.
X509_USER_PROXY- Path to the X.509 user proxy which was delegated by the client to the globus-gatekeeper program to be used by the job manager.
GRID_SECURITY_CONTEXT_FD- File descriptor containing an exported security context that the job manager should use to reply to the client which submitted the job.
Files
$HOME/.globus/job/HOSTNAME/LRM.TAG.red- Job manager delegated user credential.
$HOME/.globus/job/HOSTNAME/LRM.TAG.lock- Job manager state lock file.
$HOME/.globus/job/HOSTNAME/LRM.TAG.pid- Job manager pid file.
$HOME/.globus/job/HOSTNAME/LRM.TAG.sock- Job manager socket for inter-job manager communications.
$HOME/.globus/job/HOSTNAME/JOB_ID/- Job-specific state directory.
$HOME/.globus/job/HOSTNAME/JOB_ID/stdin- Standard input which has been staged from a remote URL.
$HOME/.globus/job/HOSTNAME/JOB_ID/stdout- Standard output which will be staged to a remote URL.
$HOME/.globus/job/HOSTNAME/JOB_ID/stderr- Standard error which will be staged to a remote URL.
$HOME/.globus/job/HOSTNAME/JOB_ID/x509_user_proxy- Job-specific delegated credential.
$GLOBUS_LOCATION/tmp/gram_job_state/job.HOSTNAME.JOB_ID- Job state file.
$GLOBUS_LOCATION/tmp/gram_job_state/job.HOSTNAME.JOB_ID.lock- Job state lock file. In most cases this will be a symlink to the job manager lock file.
$GLOBUS_LOCATION/etc/globus-job-manager.conf- Default location of the global job manager configuration file.
$GLOBUS_LOCATION/etc/grid-services/jobmanager-LRM- Default location of the LRM-specific gatekeeper configuration file.
Name
globus-job-manager-event-generator — Create LRM-independent SEG files for the job manager to use
Synopsis
globus-job-manager-event-generator [-help] {-scheduler LRM} [-background] [-pidfile PIDPATH]
Description
The globus-job-manager-event-generator program is a utility which uses LRM-specific SEG parsers to generate an LRM-independent log file that a job manager instance can use to process job status change events. This program runs independently of all globus-job-manager instances so that only one process needs to deal with the LRM interface. The globus-job-manager-event-generator program can be run as a privileged user if required to interface with the LRM.
In order for globus-job-manager-event-generator to handle events for a particular LRM, the globus_scheduler_event_generator_job_manager_setup setup package must be configured after the LRM-specific setup package has been run. This can be forced by gpt-postinstall -force or by running the command cd $GLOBUS_LOCATION/setup/globus; ./setup-seg-job-manager.pl.
The full set of command-line options to globus-job-manager-event-generator consists of:
-help- Print command-line option summary and exit.
-scheduler LRM- Process events for the local resource manager named by LRM.
-background- Run globus-job-manager-event-generator as a background process. It will fork a new process, print out its process ID and then the original process will terminate.
-pidfile PIDPATH- Write the process ID of an instance of globus-job-manager-event-generator to the file named by PIDPATH. This file can be used to kill or monitor the globus-job-manager-event-generator process.
Files
globus-job-manager-seg.conf- Configuration file for globus-job-manager-event-generator. Each line consists of a string of the form LRM_log_path=PATH, which indicates the directory containing LRM-independent format SEG log files for the LRM. This file is created by running the globus_scheduler_event_generator_job_manager_setup setup package.
Name
globus-fork-starter — Start and monitor a fork job
Synopsis
globus-fork-starter
Description
The globus-fork-starter program executes jobs specified on its standard input stream, recording the job state changes to a file defined in the $GLOBUS_LOCATION/etc/globus-fork.conf configuration file. It runs until its standard input stream is closed and all jobs it is managing have terminated. The log generated by this program can be used by the SEG to provide job state changes and exit codes to the GRAM service. The globus-fork-starter program is typically started by the fork GRAM module.
The globus-fork-starter program expects its input to be a series of task definitions, separated by the newline character, each representing a separate job. Each task definition contains a number of fields, separated by the semicolon character. The first field is always the literal string 100 indicating the message format, and the second field is a unique job tag that distinguishes the reply from this program when multiple jobs are submitted. The remaining fields contain attribute bindings. The supported attributes are:
directory- Working directory of the job
environment- Comma-separated list of strings defining environment variables. The form of these strings is var=value
count- Number of processes to start
executable- Full path to the executable to run
arguments- Comma-separated list of command-line arguments for the job
stdin- Full path to a file containing the input of the job
stdout- Full path to a file to write the output of the job to
stderr- Full path to a file to write the error stream of the job to
Within each field, the following characters may be escaped by preceding them with the backslash character:
- backslash (\)
- semicolon (;)
- comma (,)
- equal (=)
Additionally, newline can be represented within a field by using the escape sequence \n.
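The escaping rules above can be sketched in Python as follows. This is an illustration only, not code from the Globus Toolkit. It assumes the semicolon is the field separator (as the escape list and the reply format suggest) and that each attribute binding is written as name=value. Both helper names are hypothetical. The environment attribute is left out because the text does not say whether the = inside each var=value string is escaped.

```python
# Hypothetical helpers; the name=value binding syntax is an assumption.
def escape_field(text):
    """Escape backslash, semicolon, comma, and equal sign; encode newline as \\n."""
    out = []
    for ch in text:
        if ch == "\n":
            out.append("\\n")
        elif ch in "\\;,=":
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def build_task(tag, **attrs):
    """Build one task definition line: 100;TAG;name=value;...

    List or tuple values are joined with unescaped commas after each
    element has been escaped individually.
    """
    fields = ["100", escape_field(tag)]
    for name, value in attrs.items():
        if isinstance(value, (list, tuple)):
            rendered = ",".join(escape_field(str(item)) for item in value)
        else:
            rendered = escape_field(str(value))
        fields.append(name + "=" + rendered)
    return ";".join(fields) + "\n"


if __name__ == "__main__":
    line = build_task(
        "job1",
        directory="/tmp",
        executable="/bin/echo",
        arguments=["hello, world", "a=b"],
        count=1,
    )
    print(line, end="")
```

Escaping each list element before joining keeps a literal comma inside an argument distinct from the comma that separates arguments.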
For each job the globus-fork-starter processes, it replies by writing a single line to standard output. The replies again consist of a number of fields separated by the semicolon character.
For a successful job start, the first field of the reply is the literal 101, the second field is the tag from the input, and the third field is a comma-separated list of SEG job identifiers, each of which consists of the concatenation of a UUID and a process ID. The globus-fork-starter program will write state changes to the SEG log using these job identifiers.
For a failure, the first field of the reply is the literal 102, the second field is the tag from the input, the third field is the integer representation of a GRAM error code, and the fourth field is a string explaining the error.
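A reader of these replies might be sketched as below. The function name and the sample identifiers are hypothetical, and the sketch assumes that reply fields contain no escaped semicolons. How the UUID and process ID are joined inside a job identifier is not specified above, so the identifiers are treated as opaque strings.

```python
# Hypothetical reply parser for the 101 (success) and 102 (failure) formats.
def parse_reply(line):
    """Parse one reply line into a dict describing success or failure."""
    # Limit the split so a free-form error message keeps any later semicolons.
    fields = line.rstrip("\n").split(";", 3)
    if len(fields) >= 3 and fields[0] == "101":
        return {
            "ok": True,
            "tag": fields[1],
            "job_ids": fields[2].split(","),
        }
    if len(fields) == 4 and fields[0] == "102":
        return {
            "ok": False,
            "tag": fields[1],
            "error_code": int(fields[2]),
            "message": fields[3],
        }
    raise ValueError("unrecognized reply: " + repr(line))


if __name__ == "__main__":
    # The identifiers below are made-up placeholders.
    print(parse_reply("101;job1;id-a,id-b\n"))
    print(parse_reply("102;job2;5;the executable does not exist\n"))
```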
Jobs are described in GRAM5's job description language. For detailed schema information, see the GRAM5 RSL documentation. For more information and examples, see the globusrun section in the GRAM5 User's Guide.
The GRAM Protocol is used to handle communication between the Gatekeeper, Job Manager, and GRAM Clients. The protocol is based on a subset of the HTTP/1.1 protocol, with a small set of message types and responses sent as the body of the HTTP requests and responses. This document describes GRAM Protocol version 2 as used by GRAM5. This is compatible with the GRAM Protocol parsers in GRAM2, with extensions.
GRAM messages are framed in HTTP/1.1 messages. However, only a small subset of the HTTP specification is used or understood by the GRAM system. All GRAM requests are HTTP POST messages. Only the following HTTP headers are understood:
- Host
- Content-Type (set to "application/x-globus-gram" in all cases)
- Content-Length
- Connection (set to "close" in all HTTP responses)
Only the following status codes are supported in the HTTP Status-Line of a response:
- 200 OK
- 403 Forbidden
- 404 Not Found
- 500 Internal Server Error
- 400 Bad Request
All messages use the carriage return (ASCII value 13) followed by line feed
(ASCII value 10) sequence to delimit lines. In all cases, a blank line
separates the HTTP header from the message body. All
application/x-globus-gram message bodies consist of
attribute names followed by a colon, a space, and then the value of the
attribute. When the value may contain a newline or double-quote character,
a special escaping rule is used to encapsulate the complete string. This
encapsulation consists of surrounding the string with double-quotes, and
escaping all double-quote and backslash characters within the string with a
backslash. All other characters are sent without modification. For example,
the string
rsl: &( executable = "/bin/echo" ) ( arguments = "hello" )
becomes
rsl: "&( executable = \"/bin/echo\" ) ( arguments = \"hello\" )"
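The quoting rule described above can be sketched in Python as a pair of helpers. The names quote_value and unquote_value are hypothetical and are not part of the GRAM protocol library. The sketch quotes a value only when it contains a newline or a double-quote character, as the text describes.

```python
# Hypothetical helpers illustrating the application/x-globus-gram quoting rule.
def quote_value(value):
    """Quote a value if it contains a newline or a double-quote character."""
    if '"' in value or "\n" in value:
        # Escape backslashes first so the escapes added for quotes survive.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return value


def unquote_value(text):
    """Reverse quote_value; unquoted text is returned unchanged."""
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        return text
    body = text[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])  # keep the escaped character literally
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    rsl = '&( executable = "/bin/echo" ) ( arguments = "hello" )'
    print("rsl: " + quote_value(rsl))
```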
In GRAM5, protocol extensions are supported in the status update messages. These extensions are implemented as extra attribute names after all of the attributes defined in the messages below. Older GRAM protocol parsers will ignore those extensions that occur after the attributes in the messages defined below. In GRAM5, the following extensions are used:
exit-code- Job exit code. Sent in job state callbacks and in job status replies when the job completes.
gt3-failure-type- Failure detail type for staging errors. Sent in job state callbacks and in job status replies when a job fails.
gt3-failure-message- Failure detail message for more context for errors. Sent in job state callbacks and in job status replies when a job fails.
gt3-failure-source- Failure detail message for the source of a failed file transfer. Sent in job state callbacks and in job status replies when a job fails.
gt3-failure-destination- Failure detail message for the destination of a failed file transfer. Sent in job state callbacks and in job status replies when a job fails.
version- Job manager package version. Sent in all messages from the job manager.
toolkit-version- Toolkit release that the job manager is running. Sent in all messages from the job manager.
This is the only form of quoting which application/x-globus-gram messages support. Use of % HEX HEX escapes (such as seen in URL encodings) is not meaningful for this protocol.
A ping request is used to verify that the gatekeeper is configured properly to handle a named service. The ping request consists of the following:
    POST ping/job-manager-name HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
The values of the message-specific strings are
job-manager-name- The name of the service to have the gatekeeper check. The service name corresponds to one of the gatekeeper's configured grid-services, and is usually of the form "jobmanager-LRM".
host-name- The name of the host on which the gatekeeper is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string "2".
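To make the framing concrete, the following Python sketch assembles the bytes of a ping request from the fields described above. It only builds the message. Sending it also requires the GSSAPI-authenticated, wrapped connection that the gatekeeper expects, which is outside this sketch. The function name and the host name are hypothetical, and terminating the body line with CRLF is an assumption.

```python
# Hypothetical builder for a GRAM ping request; it does not open a connection.
CRLF = "\r\n"


def build_ping(job_manager_name, host_name, version="2"):
    """Return the ping request as bytes, with Content-Length computed from the body."""
    body = "protocol-version: " + version + CRLF
    header = (
        "POST ping/" + job_manager_name + " HTTP/1.1" + CRLF
        + "Host: " + host_name + CRLF
        + "Content-Type: application/x-globus-gram" + CRLF
        + "Content-Length: " + str(len(body.encode("ascii"))) + CRLF
        + CRLF  # blank line separating the HTTP header from the body
    )
    return (header + body).encode("ascii")


if __name__ == "__main__":
    # gram.example.org is a placeholder host name.
    print(build_ping("jobmanager-fork", "gram.example.org").decode("ascii"))
```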
A job request is used to schedule a job remotely using GRAM. The job request consists of the HTTP framing described above with the request-URI consisting of job-manager-name, where job-manager-name is the name of the service to use to schedule the job. The format of a job request message consists of the following:
    POST job-manager-name[@user-name] HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    job-state-mask: mask
    callback-url: callback-contact
    rsl: rsl-description
The values of the emphasized text items are as below:
job-manager-name- The name of the service to submit the job request to. The service name corresponds to one of the gatekeeper's configured grid-services, and is usually of the form jobmanager-LRM.
user-name- Starting with GT4.0, a client may request that a certain account be used by the gatekeeper to start the job manager. This is done optionally by appending the @ symbol and the local user name that the job should be run as to the job-manager-name. If the @ and username are not present, then the first grid map entry will be used. If the client credential is not authorized in the grid map to use the specified account, an authorization error will occur in the gatekeeper.
host-name- The name of the host on which the gatekeeper is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
mask- An integer representation of the job state mask. This value is obtained from a bitwise-OR of the job state values which the client wishes to receive job status callbacks about. The meanings of the various job state values are defined in the GRAM Protocol API documentation.
callback-contact- An https URL which defines a GRAM protocol listener which will receive job state updates. The job status update messages are defined below.
rsl-description- A quoted string containing the RSL description of the job request.
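The body of a job request can be sketched in Python as below. The numeric job state values in JOB_STATE are assumptions recalled from the GRAM Protocol API and should be checked against that documentation before use. The function names and the callback URL are hypothetical. The RSL is always quoted here because rsl-description is described as a quoted string.

```python
# ASSUMPTION: numeric state values; verify against the GRAM Protocol API docs.
JOB_STATE = {
    "PENDING": 1,
    "ACTIVE": 2,
    "FAILED": 4,
    "DONE": 8,
    "SUSPENDED": 16,
    "UNSUBMITTED": 32,
    "STAGE_IN": 64,
    "STAGE_OUT": 128,
}


def state_mask(*names):
    """Bitwise-OR the values of the named job states into one integer mask."""
    mask = 0
    for name in names:
        mask |= JOB_STATE[name]
    return mask


def quote(value):
    """Wrap a value in double quotes, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_job_request_body(mask, callback_contact, rsl, version="2"):
    """Return the message body of a job request, one attribute per CRLF line."""
    lines = [
        "protocol-version: " + version,
        "job-state-mask: " + str(mask),
        "callback-url: " + callback_contact,
        "rsl: " + quote(rsl),
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    mask = state_mask("ACTIVE", "DONE", "FAILED")
    # The callback URL below is a placeholder.
    print(build_job_request_body(
        mask, "https://client.example.org:40000/", '&(executable = "/bin/date")'))
```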
A status request is used by a GRAM client to get the current job state of a running job. This type of message can only be sent to a job manager's job-contact (as returned in the reply to a job request message). The format of a status request message consists of the following:

    POST job-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "status"
The values of the emphasized text items are as below:
job-contact- The job contact string returned in a response to a job request message, or determined by querying the MDS system.
host-name- The name of the host on which the job manager is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
A callback register request is used by a GRAM client to register a new callback contact to receive GRAM job state updates. This type of message can only be sent to a job manager's job-contact (as returned in the reply to a job request message). The format of a callback register request message consists of the following:

    POST job-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "register mask callback-contact"
The values of the emphasized text items are as below:
job-contact- The job contact string returned in a response to a job request message, or determined by querying the MDS system.
host-name- The name of the host on which the job manager is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
mask- An integer representation of the job state mask. This value is obtained from a bitwise-OR of the job state values which the client wishes to receive job status callbacks about. The meanings of the various job state values are defined in the GRAM Protocol API documentation.
callback-contact- An https URL which defines a GRAM protocol listener which will receive job state updates. The job status update messages are defined below.
A callback unregister request is used by a GRAM client to request that the job manager no longer send job state updates to the specified callback contact. This type of message can only be sent to a job manager's job-contact (as returned in the reply to a job request message). The format of a callback unregister request message consists of the following:

    POST job-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "unregister callback-contact"
The values of the emphasized text items are as below:
job-contact- The job contact string returned in a response to a job request message, or determined by querying the MDS system.
host-name- The name of the host on which the job manager is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string "2".
callback-contact- An https URL which defines a GRAM protocol listener which should no longer receive job state updates. The job status update messages are defined below.
A job cancel request is used by a GRAM client to request that the job manager terminate a job. This type of message can only be sent to a job manager's job-contact (as returned in the reply to a job request message). The format of a job cancel request message consists of the following:

    POST job-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "cancel"
The values of the emphasized text items are as below:
job-contact- The job contact string returned in a response to a job request message, or determined by querying the MDS system.
host-name- The name of the host on which the job manager is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
A job signal request is used by a GRAM client to request that the job manager process a signal for a job. The arguments to the various signals are discussed in the protocol library documentation. The format of a job signal request message consists of the following:

    POST job-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "signal"
The values of the emphasized text items are as below:
job-contact- The job contact string returned in a response to a job request message, or determined by querying the MDS system.
host-name- The name of the host on which the job manager is running. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
signal- A quoted string containing the signal number and its parameters.
A job status update message is sent by the job manager to all registered callback contacts when the job's status changes. The format of the job status update messages is as follows:
    POST callback-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    job-manager-url: job-contact
    status: status-code
    failure-code: failure-code
The values of the emphasized text items are as below:
callback-contact- The callback contact string registered with the job manager either by being passed as the callback-contact in a job request message or in a callback register message.
host-name- The host part of the callback-contact URL. This exists only for compatibility with the HTTP/1.1 protocol.
message-size- The length of the content of the message, not including the HTTP/1.1 header.
version- The version of the GRAM protocol which is being used. For the protocol defined in this document, the value must be the string 2.
job-contact- The job contact of the job which has changed states.
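A callback listener has to split the body of a status update into the attributes listed above and any GRAM5 extension attributes that follow them. The Python sketch below shows one way to do that. The function names and the sample job contact are hypothetical, and the sketch assumes that no quoted value spans more than one line.

```python
# Hypothetical parser for the body of a job status update message.
CORE_ATTRIBUTES = ("protocol-version", "job-manager-url", "status", "failure-code")


def unquote(text):
    """Remove surrounding double quotes and resolve backslash escapes."""
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        return text
    body, out, i = text[1:-1], [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def parse_status_update(body):
    """Return (core, extensions) dicts parsed from 'name: value' lines."""
    core, extensions = {}, {}
    for line in body.split("\r\n"):
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank or malformed lines
        target = core if name in CORE_ATTRIBUTES else extensions
        target[name] = unquote(value)
    return core, extensions


if __name__ == "__main__":
    # The job contact below is a placeholder.
    sample = (
        "protocol-version: 2\r\n"
        "job-manager-url: https://gram.example.org:50000/1/2/\r\n"
        "status: 8\r\n"
        "failure-code: 0\r\n"
        "exit-code: 0\r\n"
    )
    print(parse_status_update(sample))
```

Keeping extension attributes in a separate dictionary mirrors the rule that older parsers ignore attributes appearing after the defined ones.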
A proxy delegation message is sent by the client to the job manager to initiate a delegation handshake to generate a new proxy credential for the job manager. This credential is used by the job manager or the job when making further secured connections. The format of the delegation message is as follows:
    POST callback-contact HTTP/1.1
    Host: host-name
    Content-Type: application/x-globus-gram
    Content-Length: message-size

    protocol-version: version
    "renew"
If a successful (200) reply is sent in response to this message, then the client will proceed with a GSI delegation handshake. The tokens in this handshake will be framed with a 4-byte big-endian token length header. The framed tokens will then be wrapped using the GLOBUS_IO_SECURE_CHANNEL_MODE_SSL_WRAP wrapping mode. The job manager will frame response tokens in the same manner. After the job manager receives its final delegation token, it will respond with another response message that indicates whether the delegation was processed or not. This response message is a standard GRAM response message.
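The 4-byte big-endian length header used to frame delegation tokens can be sketched in Python as follows. Only the framing step is shown. The GSI handshake itself and the SSL wrapping applied to the framed tokens are not covered, and both function names are hypothetical.

```python
import struct


def frame_token(token):
    """Prefix a token with its length as a 4-byte big-endian unsigned integer."""
    return struct.pack(">I", len(token)) + token


def read_framed_token(buffer):
    """Extract one framed token from the front of buffer.

    Returns (token, remaining_bytes), or (None, buffer) when the buffer
    does not yet hold a complete frame.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack(">I", buffer[:4])
    if len(buffer) < 4 + length:
        return None, buffer
    return buffer[4:4 + length], buffer[4 + length:]


if __name__ == "__main__":
    framed = frame_token(b"example-token")
    print(framed[:4].hex(), read_framed_token(framed))
```

Returning the unconsumed bytes lets a caller accumulate data from the connection and retry until a whole frame has arrived.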
The following security attributes are needed to communicate with the Gatekeeper:
- Authentication must be done using GSSAPI mutual authentication
- Messages must be wrapped with support for the delegation message. When using Globus I/O, this is accomplished by using the GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP wrapping mode.
As the GRAM service processes a job, the job undergoes a series of state transitions. These states and their meanings follow:
Table 4.1. GRAM Job States
| State | Meaning |
|---|---|
| | Initial job state |
| | Job staging in progress |
| | Job submitted to LRM, awaiting execution |
| | Job executing |
| | Job made progress executing but is now suspended |
| | Job staging in progress after job completed |
| | Job completed successfully |
| | Job was canceled or failed |
Table A.1. GRAM5 Errors
| Error Code | Reason | Possible Solutions |
|---|---|---|
| 1 | one of the RSL parameters is not supported | Check RSL documentation |
| 2 | the RSL length is greater than the maximum allowed | Use RSL substitutions to reduce length of RSL strings |
| 3 | an I/O operation failed | Enable trace logging and report to gram-dev@globus.org |
| 4 | jobmanager unable to set default to the directory requested | Check that RSL directory attribute refers to a directory that exists on the target system. |
| 5 | the executable does not exist | Check that the RSL executable attribute refers to an executable that exists on the target system. |
| 6 | of an unused INSUFFICIENT_FUNDS | Unimplemented feature. |
| 7 | authentication with the remote server failed | Check that the contact string contains the proper X.509 DN. |
| 8 | the user cancelled the job | Don't cancel jobs you want to complete. |
| 9 | the system cancelled the job | Check RSL requirements such as maximum time and memory are valid for the job. |
| 10 | data transfer to the server failed | Check gatekeeper and/or job manager logs to see why the process failed. |
| 11 | the stdin file does not exist | Check that the RSL stdin attribute refers to a file that exists on the target system or has a valid ftp, gsiftp, http, or https URL. |
| 12 | the connection to the server failed (check host and port) | Check that the service is running on the expected TCP/IP port. Check that no firewall prevents contacting that TCP/IP port. Check for runtime configuration errors. |
| 13 | the provided RSL 'maxtime' value is not an integer | Check that the RSL maxtime value evaluates to an integer. |
| 14 | the provided RSL 'count' value is not an integer | Check that the RSL count value evaluates to an integer. |
| 15 | the job manager received an invalid RSL | Check that the RSL string can be parsed by using globusrun -p RSL. |
| 16 | the job manager failed in allowing others to make contact | Check job manager log. |
| 17 | the job failed when the job manager attempted to run it | Verify that the LRM is configured properly. |
| 18 | an invalid paradyn was specified | OBSOLETE IN GRAM2 |
| 19 | the provided RSL 'jobtype' value is invalid | The RSL jobtype attribute is not indicated as supported by the LRM. Valid jobtype values are single, multiple, mpi, and condor. |
| 20 | the provided RSL 'myjob' value is invalid | OBSOLETE IN GRAM5 |
| 21 | the job manager failed to locate an internal script argument file | Check that exists and is executable. Check that the LRM-specific perl module is located in directory and is valid. The command perl -I$GLOBUS_LOCATION/lib/perl $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/LRM.pm can be used to check if there are any syntax errors in the script. |
| 22 | the job manager failed to create an internal script argument file | Check that your home directory is writable and not full. |
| 23 | the job manager detected an invalid job state | Check job manager logs. |
| 24 | the job manager detected an invalid script response | Check job manager logs. This is likely a bug in the LRM script. |
| 25 | the job manager detected an invalid script status | Check job manager logs. This is likely a bug in the LRM script. |
| 26 | the provided RSL 'jobtype' value is not supported by this job manager | Check that the RSL jobtype attribute is implemented by the LRM script. Note that some job types require configuration |
| 27 | unused ERROR_UNIMPLEMENTED | LRM does not support some feature included in the job request. |
| 28 | the job manager failed to create an internal script submission file | Check that the user's home file system is not full. Check job manager log |
| 29 | the job manager cannot find the user proxy | Check that client is delegating a proxy when authenticating with the gatekeeper. Check that the user's home filesystem and the /tmp file system are not full. |
| 30 | the job manager failed to open the user proxy | Check that the user's home filesystem and the /tmp file system are not full. |
| 31 | the job manager failed to cancel the job as requested | Check that the user's home filesystem and the /tmp file system are not full. |
| 32 | system memory allocation failed | Check job manager log for details. |
| 33 | the interprocess job communication initialization failed | OBSOLETE IN GRAM5 |
| 34 | the interprocess job communication setup failed | OBSOLETE IN GRAM5 |
| 35 | the provided RSL 'host count' value is invalid | Check that the RSL host_count attribute evaluates to an integer. |
| 36 | one of the provided RSL parameters is unsupported | Check job manager log for details about invalid parameter. |
| 37 | the provided RSL 'queue' parameter is invalid | Check that the RSL queue attribute evaluates to a string that corresponds to an LRM-specific queue name. |
| 38 | the provided RSL 'project' parameter is invalid | Check that the RSL project attribute evaluates to a string that corresponds to an LRM-specific project name. |
| 39 | the provided RSL string includes variables that could not be identified | Check that all RSL substitutions are defined before being used in the job description. |
| 40 | the provided RSL 'environment' parameter is invalid | Check that the RSL environment attribute contains a sequence of VARIABLE VALUE pairs. |
| 41 | the provided RSL 'dryrun' parameter is invalid | Remove the RSL dryrun attribute from the job description. |
| 42 | the provided RSL is invalid (an empty string) | Include a non-empty RSL string in your job submission request. |
| 43 | the job manager failed to stage the executable | Check that the file service hosting the executable is reachable from the GRAM5 service node. Check that the executable exists on the file service node. Check that there is sufficient disk space in the user's home directory on the service node to store the executable. |
| 44 | the job manager failed to stage the stdin file | Check that the file service hosting the standard input file is reachable from the GRAM5 service node. Check that the standard input file exists on the file service node. Check that there is sufficient disk space in the user's home directory on the service node to store the standard input file. |
| 45 | the requested job manager type is invalid | OBSOLETE IN GRAM5 |
| 46 | the provided RSL 'arguments' parameter is invalid | OBSOLETE IN GRAM2 |
| 47 | the gatekeeper failed to run the job manager | Check the gatekeeper or job manager logs for more information. |
| 48 | the provided RSL could not be properly parsed | Check that the RSL string can be parsed by using globusrun -p RSL. |
| 49 | there is a version mismatch between GRAM components | Ask the system administrator to upgrade the GRAM service to GRAM2 or GRAM5. |
| 50 | the provided RSL 'arguments' parameter is invalid | Check that the RSL arguments attribute evaluates to a sequence of strings. |
| 51 | the provided RSL 'count' parameter is invalid | Check that the RSL count attribute evaluates to a positive integer value. |
| 52 | the provided RSL 'directory' parameter is invalid | Check that the RSL directory attribute evaluates to a string. |
| 53 | the provided RSL 'dryrun' parameter is invalid | Check that the RSL dryrun attribute evaluates to either yes or no. |
| 54 | the provided RSL 'environment' parameter is invalid | Check that the RSL environment attribute evaluates to a sequence of VARIABLE, VALUE pairs. |
| 55 | the provided RSL 'executable' parameter is invalid | Check that the RSL executable attribute evaluates to a string value. |
| 56 | the provided RSL 'host_count' parameter is invalid | Check that the RSL host_count attribute evaluates to a positive integer value. |
| 57 | the provided RSL 'jobtype' parameter is invalid | Check that the RSL jobtype attribute evaluates to one of single, multiple, mpi, or condor. |
| 58 | the provided RSL 'maxtime' parameter is invalid | Check that the RSL maxtime attribute evaluates to a positive integer value. |
| 59 | the provided RSL 'myjob' parameter is invalid | OBSOLETE IN GRAM5. |
| 60 | the provided RSL 'paradyn' parameter is invalid | OBSOLETE IN GRAM2. |
| 61 | the provided RSL 'project' parameter is invalid | Check that the RSL project attribute evaluates to a string value. |
| 62 | the provided RSL 'queue' parameter is invalid | Check that the RSL queue attribute evaluates to a string value. |
| 63 | the provided RSL 'stderr' parameter is invalid | Check that the RSL stderr attribute evaluates to a string value or a sequence of DESTINATION URLs with optional CACHE_TAG string parameters. |
| 64 | the provided RSL 'stdin' parameter is invalid | Check that the RSL stdin attribute evaluates to a string value. |
| 65 | the provided RSL 'stdout' parameter is invalid | Check that the RSL stdout attribute evaluates to a string value or a sequence of DESTINATION URLs with optional CACHE_TAG string parameters. |
| 66 | the job manager failed to locate an internal script | Check job manager log for more details. |
| 67 | the job manager failed on the system call pipe() | OBSOLETE IN GRAM5 |
| 68 | the job manager failed on the system call fcntl() | OBSOLETE IN GRAM2 |
| 69 | the job manager failed to create the temporary stdout filename | OBSOLETE IN GRAM5 |
| 70 | the job manager failed to create the temporary stderr filename | OBSOLETE IN GRAM5 |
| 71 | the job manager failed on the system call fork() | OBSOLETE IN GRAM2 |
| 72 | the executable file permissions do not allow execution | Check that the RSL executable attribute refers to an executable program or script. |
| 73 | the job manager failed to open stdout | Check that the RSL stdout attribute refers to one or more valid destination files or URLs. |
| 74 | the job manager failed to open stderr | Check that the RSL stderr attribute refers to one or more valid destination files or URLs. |
| 75 | the cache file could not be opened in order to relocate the user proxy | Check that the user's home directory is writable and not full on the GRAM5 service node. |
| 76 | cannot access cache files in ~/.globus/.gass_cache, check permissions, quota, and disk space | Check that the user's home directory is writable and not full on the GRAM5 service node. |
| 77 | the job manager failed to insert the contact in the client contact list | Check job manager log. |
| 78 | the contact was not found in the job manager's client contact list | Don't attempt to unregister callback contacts that are not registered. |
| 79 | connecting to the job manager failed. Possible reasons: job terminated, invalid job contact, network problems, ... | Check that the job manager process is running. Check that the job manager credential has not expired. Check that the job manager contact refers to the correct TCP/IP host and port. Check that the job manager contact is not blocked by a firewall. |
| 80 | the syntax of the job contact is invalid | Check the syntax of job contact string. |
| 81 | the executable parameter in the RSL is undefined | Include the RSL executable in all job requests. |
| 82 | the job manager service is misconfigured. condor arch undefined | Add the -condor-arch option to the command line or configuration file for a job manager configured to use the condor LRM. |
| 83 | the job manager service is misconfigured. condor os undefined | Add the -condor-os option to the command line or configuration file for a job manager configured to use the condor LRM. |
| 84 | the provided RSL 'min_memory' parameter is invalid | Check that the RSL min_memory attribute evaluates to a positive integer value. |
| 85 | the provided RSL 'max_memory' parameter is invalid | Check that the RSL max_memory attribute evaluates to a positive integer value. |
| 86 | the RSL 'min_memory' value is not zero or greater | Check that the RSL min_memory attribute evaluates to a positive integer value. |
| 87 | the RSL 'max_memory' value is not zero or greater | Check that the RSL max_memory attribute evaluates to a positive integer value. |
| 88 | the creation of a HTTP message failed | Check job manager log. |
| 89 | parsing incoming HTTP message failed | Check job manager log. |
| 90 | the packing of information into a HTTP message failed | Check job manager log. |
| 91 | an incoming HTTP message did not contain the expected information | Check job manager log. |
| 92 | the job manager does not support the service that the client requested | Check that the client is talking to the correct service. |
| 93 | the gatekeeper failed to find the requested service | OBSOLETE IN GRAM2 |
| 94 | the jobmanager does not accept any new requests (shutting down) | Execute queries before the job has been cleaned up. |
| 95 | the client failed to close the listener associated with the callback URL | Call globus_gram_client_callback_disallow() with a valid callback contact. |
| 96 | the gatekeeper contact cannot be parsed | Check the syntax of the gatekeeper contact string you are attempting to contact. |
| 97 | the job manager could not find the 'poe' command | OBSOLETE IN GRAM2 |
| 98 | the job manager could not find the 'mpirun' command | Configure the LRM script with mpirun in your path. |
| 99 | the provided RSL 'start_time' parameter is invalid | OBSOLETE IN GRAM2 |
| 100 | the provided RSL 'reservation_handle' parameter is invalid | OBSOLETE IN GRAM2 |
| 101 | the provided RSL 'max_wall_time' parameter is invalid | Check that the RSL max_wall_time attribute evaluates to a positive integer. |
| 102 | the RSL 'max_wall_time' value is not zero or greater | Check that the RSL max_wall_time attribute evaluates to a positive integer. |
| 103 | the provided RSL 'max_cpu_time' parameter is invalid | Check that the RSL max_cpu_time attribute evaluates to a positive integer. |
| 104 | the RSL 'max_cpu_time' value is not zero or greater | Check that the RSL max_cpu_time attribute evaluates to a positive integer. |
| 105 | the job manager is misconfigured, a scheduler script is missing | Check that the administrator has configured the LRM by running its setup script. |
| 106 | the job manager is misconfigured, a scheduler script has invalid permissions | Check that the administrator has installed the script. Check that the file system containing that script allows file execution. |
| 107 | the job manager failed to signal the job | OBSOLETE IN GRAM2 |
| 108 | the job manager did not recognize/support the signal type | Check that your signal operation is using the correct signal constant. |
| 109 | the job manager failed to get the job id from the local scheduler | OBSOLETE IN GRAM2 |
| 110 | the job manager is waiting for a commit signal | Send a two-phase commit signal to the job manager to acknowledge receiving the job contact from the job manager. |
| 111 | the job manager timed out while waiting for a commit signal | Send a two-phase commit signal to the job manager to acknowledge receiving the job contact from the job manager. Increase the two-phase commit time out for your job. Check that the job manager contact TCP/IP port is reachable from your client. |
| 112 | the provided RSL 'save_state' parameter is invalid | Check that the RSL save_state attribute is set to yes or no. |
| 113 | the provided RSL 'restart' parameter is invalid | Check that the RSL restart attribute evaluates to a string containing a job contact string. |
| 114 | the provided RSL 'two_phase' parameter is invalid | Check that the RSL two_phase attribute evaluates to a positive integer. |
| 115 | the RSL 'two_phase' value is not zero or greater | Check that the RSL two_phase attribute evaluates to a positive integer. |
| 116 | the provided RSL 'stdout_position' parameter is invalid | OBSOLETE IN GRAM5 |
| 117 | the RSL 'stdout_position' value is not zero or greater | OBSOLETE IN GRAM5 |
| 118 | the provided RSL 'stderr_position' parameter is invalid | OBSOLETE IN GRAM5 |
| 119 | the RSL 'stderr_position' value is not zero or greater | OBSOLETE IN GRAM5 |
| 120 | the job manager restart attempt failed | OBSOLETE IN GRAM2 |
| 121 | the job state file doesn't exist | Check that the job contact you are trying to restart matches one that the job manager returned to you. |
| 122 | could not read the job state file | Check that the state file directory is not full. |
| 123 | could not write the job state file | Check that the state file directory is not full. |
| 124 | old job manager is still alive | Contact the returned job manager contact to manage the job you are trying to restart. |
| 125 | job manager state file TTL expired | OBSOLETE IN GRAM2 |
| 126 | it is unknown if the job was submitted | Check job manager log. |
| 127 | the provided RSL 'remote_io_url' parameter is invalid | Check that the RSL remote_io_url attribute evaluates to a string value. |
| 128 | could not write the remote io url file | Check that the user's home file system on the job manager service node is writable and not full. |
| 129 | the standard output/error size is different | Send a stdio update signal to redirect the job manager output to a new URL. |
| 130 | the job manager was sent a stop signal (job is still running) | Submit a restart request to monitor the job. |
| 131 | the user proxy expired (job is still running) | Generate a new proxy and then submit a restart request to monitor the job. |
| 132 | the job was not submitted by original jobmanager | OBSOLETE IN GRAM2 |
| 133 | the job manager is not waiting for that commit signal | Do not send a commit signal to a job that is not waiting for a commit signal. |
| 134 | the provided RSL scheduler specific parameter is invalid | Check the LRM-specific documentation to determine what values are legal for the RSL extensions implemented by the LRM. |
| 135 | the job manager could not stage in a file | Check that the file service hosting the file to stage is reachable from the GRAM5 service node. Check that the file to stage exists on the file service node. Check that there is sufficient disk space in the user's home directory on the service node to store the file to stage. |
| 136 | the scratch directory could not be created | Check that the directory named by the RSL scratch_dir attribute exists and is writable. Check that the directory named by the RSL scratch_dir attribute is not full. |
| 137 | the provided 'gass_cache' parameter is invalid | Check that the RSL gass_cache attribute evaluates to a string. |
| 138 | the RSL contains attributes which are not valid for job submission | Do not use restart- or signal-only RSL attributes when submitting a job. |
| 139 | the RSL contains attributes which are not valid for stdio update | Do not use submit- or restart-only RSL attributes when sending a stdio update signal to a job. |
| 140 | the RSL contains attributes which are not valid for job restart | Do not use submit- or signal-only RSL attributes when restarting a job. |
| 141 | the provided RSL 'file_stage_in' parameter is invalid | Check that the RSL file_stage_in attribute evaluates to a sequence of SOURCE DESTINATION pairs. |
| 142 | the provided RSL 'file_stage_in_shared' parameter is invalid | Check that the RSL file_stage_in_shared attribute evaluates to a sequence of SOURCE DESTINATION pairs. |
| 143 | the provided RSL 'file_stage_out' parameter is invalid | Check that the RSL file_stage_out attribute evaluates to a sequence of SOURCE DESTINATION pairs. |
| 144 | the provided RSL 'gass_cache' parameter is invalid | Check that the RSL gass_cache attribute evaluates to a string. |
| 145 | the provided RSL 'file_cleanup' parameter is invalid | Check that the RSL file_clean_up attribute evaluates to a sequence of strings. |
| 146 | the provided RSL 'scratch_dir' parameter is invalid | Check that the RSL scratch_dir attribute evaluates to a string. |
| 147 | the provided scheduler-specific RSL parameter is invalid | Check the LRM-specific documentation to determine what values are legal for the RSL extensions implemented by the LRM. |
| 148 | a required RSL attribute was not defined in the RSL spec | Check that the RSL executable attribute is present in your job request RSL. Check that the RSL restart attribute is present in your restart RSL. |
| 149 | the gass_cache attribute points to an invalid cache directory | Check that the RSL gass_cache attribute evaluates to a directory that exists or can be created. Check that the user's home file system is writable and not full. |
| 150 | the provided RSL 'save_state' parameter has an invalid value | Check that the RSL save_state attribute has a value of yes or no. |
| 151 | the job manager could not open the RSL attribute validation file | Check that is present and readable on the job manager service node. Check that is readable on the job manager service node if present. |
| 152 | the job manager could not read the RSL attribute validation file | Check that is valid. Check that is valid if present. |
| 153 | the provided RSL 'proxy_timeout' is invalid | Check that the RSL proxy_timeout attribute evaluates to a positive integer. |
| 154 | the RSL 'proxy_timeout' value is not greater than zero | Check that the RSL proxy_timeout attribute evaluates to a positive integer. |
| 155 | the job manager could not stage out a file | Check that the source file being staged exists on the job manager service node. Check that the directory of the destination file being staged exists on the file service node. Check that the directory of the destination file being staged is writable by the user. Check that the destination file service is reachable by the job manager service node. |
| 156 | the job contact string does not match any which the job manager is handling | Check that the job contact string matches one returned from a job request. |
| 157 | proxy delegation failed | Check that the job manager service node trusts the signer of your credential. Check that you trust the signer of the job manager service node's credential. |
| 158 | the job manager could not lock the state lock file | Check that the file system holding the job state directory supports POSIX advisory locking. Check that the job state directory is writable by the user on the service node. Check that the job state directory is not full. |
| 159 | an invalid globus_io_clientattr_t was used. | Check that you have initialized the globus_io_clientattr_t attribute prior to using it with the GRAM client API. |
| 160 | an null parameter was passed to the gram library | Check that you are passing legal values to all GRAM API calls. |
| 161 | the job manager is still streaming output | OBSOLETE IN GRAM5 |
| 162 | the authorization system denied the request | Check with your GRAM system administrator to allow a particular certificate to be authorized. |
| 163 | the authorization system reported a failure | Check with your system administrator to verify that the authorization system is configured properly. |
| 164 | the authorization system denied the request - invalid job id | Check with your system administrator to verify that the authorization system is configured properly. Use a credential which is authorized to interact with a particular GRAM job. |
| 165 | the authorization system denied the request - not authorized to run the specified executable | Check with your system administrator to verify that the authorization system is configured properly. Use a credential which is authorized to interact with a particular GRAM job. |
| 166 | the provided RSL 'user_name' parameter is invalid. | Check that the RSL user_name attribute evaluates to a string. |
| 167 | the job is not running in the account named by the 'user_name' parameter. | Ask the GRAM system administrator to add an authorization entry to allow your credential to run jobs as the specified user account. |
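
Client-side tools that report GRAM failures may find it convenient to keep part of this table as a lookup structure. The following is a minimal sketch in Python, not part of GRAM itself: the names `GRAM_ERRORS` and `describe_gram_error` are hypothetical, and the entries are copied from a few rows of the table above purely for illustration.

```python
# Hypothetical helper: these names are illustrative and are not a GRAM API.
# Each entry is (error message, suggestion), copied from the table above.
GRAM_ERRORS = {
    42: ("the provided RSL is invalid (an empty string)",
         "Include a non-empty RSL string in your job submission request."),
    48: ("the provided RSL could not be properly parsed",
         "Check that the RSL string can be parsed by using globusrun -p RSL."),
    67: ("the job manager failed on the system call pipe()",
         "OBSOLETE IN GRAM5"),
    131: ("the user proxy expired (job is still running)",
          "Generate a new proxy and then submit a restart request "
          "to monitor the job."),
}


def describe_gram_error(code):
    """Return a one-line description for a GRAM error code."""
    entry = GRAM_ERRORS.get(code)
    if entry is None:
        # Only a subset of the table is stored in this sketch.
        return f"GRAM error {code}: not in the local table; see the full table"
    message, suggestion = entry
    if suggestion.upper().startswith("OBSOLETE"):
        # Obsolete codes carry a marker instead of a suggestion.
        return f"GRAM error {code}: {message} [{suggestion}]"
    return f"GRAM error {code}: {message}. Suggestion: {suggestion}"


if __name__ == "__main__":
    for code in (42, 67, 999):
        print(describe_gram_error(code))
```

Keeping the obsolete marker separate from ordinary suggestions lets a caller distinguish codes that current services should no longer return from codes that the user can act on.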
