Exploring the Landscape of the Midwest Research Computing and Data Consortium
Interview with Lee Liming, Director of the Professional Services Team, University of Chicago - Globus
December 12, 2023 | Midwest Research Computing and Data Consortium
My role at Globus
At the University of Chicago, we have two distinct levels of research computing and data. The campus’s Research Computing Center, run by Dr. Birali Runesha, provides research computing services for the university’s researchers, students, and research staff. Then there’s Globus, a team I’m closely tied to, which offers research computing and data services to about 1,600 universities and research labs globally. We provide these primarily as cloud services, and our mission extends beyond the university, reaching other institutions worldwide.
I’m responsible for helping research teams worldwide understand Globus services and figure out how to integrate them into their systems or applications. If you’re building an application that requires a lot of data movement or analysis, my team can guide you, share best practices and documentation, and even help write your application. We often build and operate services for teams creating applications globally.
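To make that integration path concrete, here is a minimal sketch (mine, not part of the interview) of the kind of data movement such an application might automate, using the Globus Python SDK. The client ID, endpoint UUIDs, and file paths are hypothetical placeholders you would replace with your own registered values.

```python
# Minimal sketch (not from the interview): submitting a file transfer
# with the Globus Python SDK (pip install globus-sdk).
# All IDs and paths below are hypothetical placeholders.
import globus_sdk

CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"        # placeholder
SRC = "ddb59aef-6d04-11e5-ba46-22000b92c6ec"   # placeholder endpoint UUID
DST = "ddb59af0-6d04-11e5-ba46-22000b92c6ec"   # placeholder endpoint UUID

# Log in interactively with a native-app OAuth2 flow.
auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow()
print("Please log in at:", auth_client.oauth2_get_authorize_url())
code = input("Paste the authorization code: ").strip()
tokens = auth_client.oauth2_exchange_code_for_tokens(code)
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

# Describe and submit the transfer task; the Globus service then manages
# the actual data movement, including retries and integrity checking.
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
)
task = globus_sdk.TransferData(tc, SRC, DST, label="example transfer")
task.add_item("/shared/dataset.tar", "/home/me/dataset.tar")
result = tc.submit_transfer(task)
print("Submitted task:", result["task_id"])
```

Because the service runs transfers asynchronously, an application can exit after submission and check the task’s status later, which is part of what makes Globus well suited to automating large data flows.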
Research computing – Challenges and opportunities
A significant part of our work involves modernizing existing systems. A good example is our collaboration with the Earth System Grid Federation (ESGF), part of the World Climate Research Programme. ESGF provides a data network for managing access to data produced by climate observations and simulations. We’re in the midst of a three-year program to modernize the ESGF data nodes operated by the U.S. Department of Energy (DOE).
The modernization process involves leveraging institutional resources available at the labs. For instance, the U.S. DOE ESGF nodes manage more than 7 petabytes of climate data. Institutions excel at provisioning large-scale storage, and Globus excels at large-scale data access. So we’re upgrading the nodes to use lab-wide storage with Globus access rather than custom storage and data access systems purchased and operated by the climate research teams. We’re also transitioning the nodes’ search index services (formerly run by climate researchers) to Globus’s Search service, which Globus operates on Amazon Web Services for the global research community. These changes offload significant IT responsibility from the climate teams, allowing them to focus more on climate research.
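As an illustration of what that search transition means in practice, here is a sketch (mine, not part of the interview) of querying a Globus Search index with the Globus Python SDK. The index UUID and query are hypothetical placeholders; real ESGF indexes have their own identifiers and metadata schemas.

```python
# Minimal sketch (not from the interview): querying a Globus Search index
# with the Globus Python SDK. The index UUID and query string are
# hypothetical placeholders.
import globus_sdk

INDEX_ID = "00000000-0000-0000-0000-000000000000"  # placeholder index UUID

# Publicly visible indexes can be searched without authentication.
sc = globus_sdk.SearchClient()
response = sc.search(INDEX_ID, q="surface air temperature", limit=5)

# Each result is a "gmeta" record identified by a subject string.
for record in response["gmeta"]:
    print(record["subject"])
```

The point of the transition is visible here: the climate teams publish metadata into a hosted index like this rather than running their own search servers, while Globus operates the underlying index infrastructure.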
DIY to Commodity Computing
I’ve been in this space for 30 years, starting in the early ’90s at the University of Michigan. At that time, most of the university community used a central mainframe computer running an operating system written by employees of the university. Campus computing staff wrote the university’s course registration and scheduling system and the campus email system. Most people hadn’t yet heard about the web, and the majority of classrooms and labs didn’t have internet connectivity. Looking back, it seems amazing that this was the state of computing and networking at the university with the most sponsored research in the nation!
In the years since then, there’s been a massive shift from “do it yourself” (DIY) computing to systems shared by many institutions: open source, commercial, consortium-supported, and cloud-hosted. This has dramatically improved the capabilities of research computing and data (RCD) teams at universities. Instead of every university building and maintaining custom systems, we now share best-of-breed solutions across hundreds, sometimes thousands, of campuses. This enables RCD staff to focus on the unique needs of the communities they serve and outsource everyday problems to others. This long-term trend is still well underway, with RCD teams today sharing their services across campuses in ever-widening networks. I encourage all RCD professionals to look for new ways we can work together to provide high-quality, useful capabilities to researchers everywhere.