PEARC24

July 21 – 25, 2024 (All Day)
Providence, Rhode Island

Globus is once again a proud sponsor of PEARC24 and will be presenting as well as exhibiting at this year’s conference. Please stop by our exhibit and learn what’s new. Join one of the following sessions to learn more about our platform:

Tutorials

The Globus Platform for Research Systems and Applications

Date: Monday, July 22 - 9:00 a.m. - 12:30 p.m. ET
Room: 554 A&B
Presenters: Lee Liming and Kyle Chard
This half-day tutorial introduces the Globus Platform: a suite of APIs and cloud-hosted services designed to simplify development and maintenance of data-intensive research applications. The Globus Platform enables researchers to automate their interactions with institutional, national, and cloud services, presenting a simplified view of systems with diverse access policies and interfaces. The platform is offered as a cloud-hosted service by the University of Chicago, so it can be used by researchers anywhere without installation or maintenance. Resource providers at 1000+ universities, research laboratories, and national facilities in 80+ countries have integrated their data storage, compute systems, and instruments with the Globus Platform, so learning to use the platform provides benefits throughout a research career.
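
As a flavor of what working against these APIs looks like (not part of the tutorial materials), the sketch below uses the Globus Python SDK (globus-sdk) to authenticate and submit a transfer between two collections. The client ID, collection UUIDs, and paths are placeholders you would supply yourself.

```python
import globus_sdk

CLIENT_ID = "YOUR_NATIVE_APP_CLIENT_ID"    # placeholder: register an app at developers.globus.org
SRC_COLLECTION = "SOURCE_COLLECTION_UUID"  # placeholder collection UUIDs
DST_COLLECTION = "DESTINATION_COLLECTION_UUID"

# Log in via the native-app OAuth2 flow and obtain a Transfer access token
auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow(requested_scopes=globus_sdk.scopes.TransferScopes.all)
print("Please log in at:", auth_client.oauth2_get_authorize_url())
tokens = auth_client.oauth2_exchange_code_for_tokens(input("Paste the authorization code: "))
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

# Submit a recursive directory transfer between the two collections
tc = globus_sdk.TransferClient(authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token))
tdata = globus_sdk.TransferData(tc, SRC_COLLECTION, DST_COLLECTION, label="PEARC24 example")
tdata.add_item("/shared/dataset/", "/scratch/dataset/", recursive=True)
task = tc.submit_transfer(tdata)
print("Submitted transfer task:", task["task_id"])
```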

Scaling Instrument Science in the FAIR Age

Date: Monday, July 22 - 9:00 a.m. - 12:30 p.m. ET
Room: 550A
Presenters: Vas Vasiliadis and Greg Nawrocki
High-resolution imaging instruments such as cryogenic electron microscopes and synchrotron beamlines require automation of data flows to increase throughput and researcher productivity, as well as to ensure the instrument remains highly utilized. Combined with the increasingly collaborative nature of research, this necessitates infrastructure that makes the resulting data products more FAIR: findable, accessible, interoperable, and reusable. Over the past decade, the authors have led teams at many institutions in implementing solutions that automate instrument data management, from the point of capture through publication and re-use. In this half-day tutorial, we will present scenarios from research universities and national facilities that illustrate common use cases, highlight recurring researcher requirements, and describe the solutions that were developed in response. Attendees will have the opportunity to experiment with the services and tooling used in these solutions. The tutorial will use existing materials presented at multiple events nationwide.

Globus Compute: Federated Function as a Service for Research Cyberinfrastructure

Date: Monday, July 22 - 1:30 p.m. - 5:00 p.m. ET
Room: 550A
Presenters: Kyle Chard and Reid Mello
Growing data volumes, new computing paradigms, and increasing hardware heterogeneity are driving the need to execute computational tasks across a continuum of distributed computing resources. Such needs are motivated by the desire to compute closer to data acquisition sources, exploit specialized computing resources (e.g., hardware accelerators), provide real-time processing of data, reduce energy consumption (e.g., by matching workload with hardware), and scale simulations beyond the limits of a single computer. Globus Compute addresses these needs by delivering a hybrid cloud platform implementing the Function-as-a-Service (FaaS) paradigm. Researchers first register their desired function with a cloud-hosted service; they can then request invocation of that function with arbitrary input arguments to be executed on remote cyberinfrastructure. Globus Compute manages the reliable and secure execution of the function, provisioning resources, staging function code and inputs, managing safe and secure execution (optionally using containers), monitoring execution, and asynchronously returning results to users via the cloud platform. Functions are executed by the Globus Compute endpoint software, an agent that may be installed anywhere the researcher has access, and that effectively turns any existing resource (e.g., laptop, cloud, cluster, supercomputer, or container orchestration cluster) into a FaaS endpoint. The endpoint software is available in both single- and multi-user modes, enabling setup by users and administrators, respectively. Over the last three years, Globus Compute has been used by thousands of researchers around the world to execute tens of millions of functions across nearly 10,000 distributed computing endpoints. This timely tutorial will address the challenges of porting scientific applications to FaaS, opportunities for portable execution across endpoints, and the benefits of this approach (e.g., performance, energy efficiency). Further, it will relate directly to modern approaches in CI, for example enabling fine-grained and portable allocations in NSF ACCESS and serving as a common interface for remote computing in DOE's Integrated Research Infrastructure. The tutorial will use existing materials that have been delivered at many international venues.
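
To make the register-then-invoke flow concrete, here is a minimal sketch using the Globus Compute SDK (globus-compute-sdk). The endpoint UUID is a placeholder for a Compute endpoint you have access to, and the function is an arbitrary example, not taken from the tutorial.

```python
from globus_compute_sdk import Executor

ENDPOINT_ID = "YOUR_ENDPOINT_UUID"  # placeholder: a Globus Compute endpoint you can use

def estimate_pi(n: int) -> float:
    """Toy Monte Carlo estimate of pi; executes remotely on the endpoint."""
    import random
    hits = sum(1 for _ in range(n) if random.random() ** 2 + random.random() ** 2 <= 1.0)
    return 4.0 * hits / n

# The Executor registers the function with the cloud-hosted service and routes
# invocations to the chosen endpoint; results return asynchronously as futures.
with Executor(endpoint_id=ENDPOINT_ID) as ex:
    future = ex.submit(estimate_pi, 1_000_000)
    print("pi is approximately", future.result())
```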

Papers

Enabling Remote Management of FaaS Endpoints with Globus Compute

Date: Wednesday, July 24 - 3:15 p.m. - 3:30 p.m. ET
Track: Systems and System Software
Room: 553 A&B
Presenter: Kyle Chard
Globus Compute implements a hybrid Function as a Service (FaaS) model in which users interact with a single cloud-hosted service to manage the execution of Python functions on user-owned and -managed Globus Compute endpoints deployed on arbitrary compute resources. Here we describe a new multi-user, multi-configuration Globus Compute endpoint. This system, which administrators can deploy from a privileged account, dynamically creates user endpoints that are forked as new processes in user space. The multi-user endpoint is designed to provide the security interfaces necessary for deployment on large, shared HPC clusters by, for example, restricting user endpoint configurations, enforcing authorization policies, and supporting customizable identity-to-username mapping.
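
For a concrete picture of the user-facing side, the sketch below shows how a task might be submitted through a multi-user endpoint. It assumes a recent globus-compute-sdk in which Executor accepts a user_endpoint_config mapping; the specific keys shown (account, worker_init) are hypothetical and depend on the configuration template the site administrator exposes.

```python
from globus_compute_sdk import Executor

MEP_ID = "MULTI_USER_ENDPOINT_UUID"  # placeholder: the site's multi-user endpoint

def hostname() -> str:
    """Report which node the forked user endpoint ran this function on."""
    import socket
    return socket.gethostname()

# user_endpoint_config is merged into the administrator's configuration template
# when the per-user endpoint process is created; the keys below are illustrative only.
with Executor(
    endpoint_id=MEP_ID,
    user_endpoint_config={"account": "my_allocation", "worker_init": "module load python"},
) as ex:
    print(ex.submit(hostname).result())
```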

Zero Code and Infrastructure Research Data Portals

Date: Wednesday, July 24 - 3:15 p.m. - 3:30 p.m. ET
Track: Applications and Software
Room: Ballroom B
Presenter: Joe Bottigliero
Data portals are web applications that facilitate data discovery, access, and sharing. They are essential for meeting the FAIR data principles, advancing open science, fostering interdisciplinary collaboration, and enhancing the reproducibility of research findings. We present a novel zero-code, zero-infrastructure approach that simplifies and accelerates the creation and customization of data portals. Our data portals do not require an application server and can be served from static content hosting services, removing the need to administer infrastructure. We also present a generator approach to portal development that allows users to create highly customized and powerful data portals by modifying only a JSON document.