Globus Compute Multi-User Endpoints
June 25, 2024 | Kevin Hunter Kesling
We recently announced the option to install a Globus Compute agent and configure it as either a single-user or multi-user compute endpoint. A multi-user compute endpoint installation streamlines the user’s experience by automating the configuration and management of the compute endpoint agent. Further, while this work is geared toward administrators, non-administrators can also benefit. Let’s dig in.
To understand the multi-user compute endpoint, let’s briefly review what a single-user Globus Compute endpoint agent workflow looks like. Typically, a user will use pip to install the globus-compute-endpoint package from PyPI into their user space. This may be on their personal workstation, their research group’s cluster, in the home directory on a leadership-class machine, or any other system where they have access to compute resources. When they start the compute endpoint, the agent will run on the local machine as the locally authenticated user (i.e., $USER), register with the Globus Compute service via Globus Auth, and then emit an endpoint UUID to the console. The user uses this UUID to submit tasks to the Globus Compute service.
The model of installing Compute endpoint agents wherever a user has access is easy to understand, accessible, and works, but comes with some limitations. A few common points that come up:
- A desire to share the compute resource – understandable from the user’s perspective, but can be challenging for some administrators.
- Configuration proliferation – different computational workloads often require different configurations, even within the same site (e.g., workstation, cluster, HPC).
- Command line knowledge – SSH and the command line interface are a powerful combination, but for many they represent an extra hurdle to “just execute my function”.
A multi-user compute endpoint installation tackles these points by abstracting away the user-endpoint agent configuration and startup. At its core, the multi-user endpoint is “simply” a process manager, starting user endpoint agents upon request from the Globus Compute service. This is a key detail, so I’ll reiterate it: a multi-user endpoint does not run tasks for users. It starts child processes (fork()) on the host (becoming the appropriate local user and dropping privileges!), and lets the user compute endpoint agent (exec()) process tasks as normal.
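The fork, drop-privileges, exec sequence described above can be sketched in a few lines of Python. This is only an illustration of the general Unix pattern, not the agent's actual implementation; the function name spawn_as_user and its parameters are hypothetical. Changing the UID or GID only works when the parent runs with sufficient privileges, so both are optional here.

```python
import os
import sys


def spawn_as_user(argv, uid=None, gid=None):
    """Fork a child, optionally drop privileges in it, then exec argv.

    Hypothetical helper for illustration only. Returns the child's exit
    code (or the negated signal number if it was killed by a signal).
    """
    pid = os.fork()
    if pid == 0:
        # Child process. Drop privileges *before* exec so that the new
        # program never runs with the parent's privileges. Groups and
        # GID must be changed first: once the UID is dropped, the
        # process is no longer allowed to change them.
        try:
            if gid is not None:
                os.setgroups([gid])
                os.setgid(gid)
            if uid is not None:
                os.setuid(uid)
            os.execvp(argv[0], argv)  # replaces this process on success
        except BaseException:
            # exec or a privilege change failed; never return into the
            # parent's code path from the child.
            os._exit(127)

    # Parent process: it only manages the child, it does not do its work.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    # Without uid/gid the child simply runs as the current user.
    code = spawn_as_user([sys.executable, "-c", "print('child running')"])
    print("child exit code:", code)
```

The parent in this sketch never runs the child's workload itself, which mirrors the point made above: the multi-user endpoint manages processes, while the exec'd user endpoint agent handles the tasks.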
Meanwhile, the user no longer needs to manage their own compute endpoint. All they need is the multi-user endpoint UUID to submit compute tasks (we discuss the user_endpoint_config item below):
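A submission of this kind might look like the following sketch. It assumes the globus-compute-sdk package is installed and that its Executor accepts the endpoint_id and user_endpoint_config arguments; the helper names, the task function, and the ACCOUNT_ID key are illustrative assumptions, and the UUID must be supplied by the caller. The SDK import is kept inside the function so the file can be loaded on machines without the SDK.

```python
def double(x):
    """Trivial task function to run remotely."""
    return 2 * x


def build_user_endpoint_config(account_id):
    """Values handed to the administrator's template (illustrative key)."""
    return {"ACCOUNT_ID": account_id}


def run_on_multi_user_endpoint(mep_id, account_id):
    """Submit one task through a multi-user endpoint and wait for it.

    mep_id is the multi-user endpoint UUID published by the administrator.
    Requires the third-party globus-compute-sdk package and a Globus login.
    """
    from globus_compute_sdk import Executor  # third-party dependency

    config = build_user_endpoint_config(account_id)
    with Executor(endpoint_id=mep_id, user_endpoint_config=config) as gce:
        future = gce.submit(double, 21)
        return future.result()
```

Calling run_on_multi_user_endpoint with a real multi-user endpoint UUID and an account ID returns the task result once the user endpoint has started and processed the task.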
From an administrative perspective, there are a couple of details of interest in this workflow:
- Authorizing a request to start a user endpoint on the multi-user compute endpoint host
- How to let users customize the user endpoint configuration (e.g., account id for appropriately charging consumed resources)
When starting a local user endpoint, an important question is how to map it to a valid local user and UID. The answer is to use the identity mapping logic already implemented and proven in Globus Connect Server; Globus Compute uses the same code. Mapping to a local user is required: if a multi-user endpoint does not have identity mapping set up, it will fail to start. Details of how to set up identity mapping are beyond the scope of this blog article, but the teaser is that every request from the Globus Compute service to start a user endpoint includes the identity information of the user who submitted the tasks.
The other big question is user customization of the user endpoint configuration, and for this we use Jinja templates. After forking and dropping privileges, the user endpoint will apply any user-provided data to the administrator-written template via a Jinja processor. An example (incomplete!) template might be:
The ACCOUNT_ID (and braces, per Jinja syntax) would be replaced with the passed ACCOUNT_ID. From the example above, that would become account: ABC123MYID. The rendered result is then passed to the user endpoint agent process, which exec()s, and the user endpoint starts up.
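The substitution step can be illustrated with a small, standard-library-only sketch. It deliberately uses a regular expression as a stand-in for a real Jinja processor, so it handles only plain double-brace variable placeholders and none of Jinja's filters, loops, or conditionals. The template contents other than the account line are assumptions made up for the example.

```python
import re

# Illustrative template only: apart from the account line, the field
# names here are assumptions, not a site's actual configuration.
USER_CONFIG_TEMPLATE = """\
engine:
  provider:
    account: {{ ACCOUNT_ID }}
"""

# Matches a simple variable placeholder such as {{ ACCOUNT_ID }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace each {{ NAME }} with values[NAME].

    A stand-in for Jinja rendering. Unlike Jinja's default behavior
    (an undefined variable renders as an empty string), this sketch
    raises so that a missing value is noticed immediately.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for template variable {name!r}")
        return str(values[name])

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render(USER_CONFIG_TEMPLATE, {"ACCOUNT_ID": "ABC123MYID"}))
```

Rendering with ACCOUNT_ID set to ABC123MYID yields a configuration containing the line account: ABC123MYID, matching the behavior described above.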
As previously mentioned, non-administrators also benefit from this enhancement to the Globus Compute service. Rather than having to manage multiple endpoint configurations (for example, charging different HPC accounts, changing provisioned cluster size, or choosing different walltime limits), a user could write a template that allows all of these items (and more!) to be specified at task submission time through user_endpoint_config, as in the task submission example earlier.
Key Benefits of the Multi-User Configuration
- Lowering Barriers to Use: Every HPC system has its own unique configurations and tools to run tasks and, as a result, configuration of a Globus Compute endpoint can be complicated for end users. By shifting the burden of configuration to HPC admins (the experts of their own systems), end users can submit functions without managing boilerplate that is unnecessary to their research. (For example, needing to understand SSH, know the specific cluster’s firewall policy, or correctly specify the resource’s scheduler, options, and so forth.)
- Improved Access Control: With the multi-user option, administrators have granular control over user access permissions and resource usage. User access is controlled by the identity mappings to local user accounts, and can be augmented at the web service layer through authentication policies. All limits placed on users through scheduler controls as well as standard Unix limits are respected by the multi-user endpoint. Consequently, limited access can be granted to users without the need for SSH access to the machine.
- Efficient Resource Utilization: Multi-user compute allows administrators to optimize resource allocation and utilization by setting up predefined configurations for users or groups of users. For example, an admin may provide full access to a select group, while providing only limited access to a wider set of users.
Getting Started with Multi-User Compute
Getting started with the multi-user feature on Globus Compute is quick and easy. Administrators may install the Globus Compute agent for supported Linux distributions from packages in our Globus-hosted RPM and Deb repositories, or through pip with pip install globus-compute-endpoint. Once installed, creating a multi-user endpoint configuration involves the globus-compute-endpoint configure subcommand with the --multi-user flag. This generates the necessary configuration files and sets up the endpoint.
After this initial setup, it’s important to configure an identity mapping. If you already have identity mapping configured for Globus Connect Server, you can re-use it with Globus Compute.
The user_config_template.yaml file is where to place user endpoint configuration customizations. We have several example configurations for some well-known machines available in the documentation. Most administrator-installed multi-user endpoints will likely need at least one templatable field (e.g., account ID), but beyond that, this file can be as static or configurable as your site requires. Some common configuration items include partition, queue, and number of nodes to allocate.
Conclusion
We are excited to bring this new iteration of Globus Compute to our users. We feel it is a giant step forward both for administrators, who now have a new tool in their toolbox for providing access to their compute infrastructure, and for end users, who will be able to spend less time managing their Globus Compute endpoints and more time on their interesting work. We invite you to try it out and look forward to hearing about how it works for you, and how we can continue to improve it! If you have any questions, feel free to reach out to us at support@globus.org.