The New Google Drive Connector: Notes from Product Management
September 20, 2017 | Rachana Ananthakrishnan
We recently announced the availability of Globus for Google Drive, adding another storage system that can be used with the Globus service. Customer demand drove its development: institutions wanted Google Drive to work more effectively as a storage tier for researchers on campus. Common use cases include using Google Drive as a backup destination for research data, as intermediate storage for computation results, as a highly accessible repository to support Data Management Plans, and as a conduit for securely distributing derived data to users of a science gateway.
To enable access to Google Drive, an institution installs Globus Connect Server and adds Globus for Google Drive, which acts as a gateway to the storage. Users register their Google account with this gateway server and create a shared endpoint to access their Google Drive data. The shared endpoint then makes all Globus data management capabilities (transfer, sharing, and publication) available on Google Drive storage. This means a researcher can use federated or campus identities to share data on their Google Drive via Globus, and the groups they create on Globus can be used to manage data access on any endpoint. As with all other Globus endpoints, institutions can control access to their Google Drive environment. With support for Google Team Drive, this solution also allows enforcement of access policies where the data is owned by a team rather than an individual.
In developing Globus for Google Drive, we encountered a number of unique and interesting scenarios. For one, Google Drive allows duplicate file names in the same folder, and even allows a file and a folder to share the same name. This presented some challenges when moving data between Google Drive and traditional (POSIX-compliant) file systems, where every name within a directory must be unique. In a different vein, Google limits the number and frequency of calls to its API. Early customers have bumped up against these limits when transferring large research datasets, some of which number in the millions of files. As with other storage systems, however, the Globus reliability layer ensures that transfers complete successfully despite API rate limiting and other transient errors.
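To make these two issues concrete, here is a minimal Python sketch. It is purely illustrative and is not how Globus for Google Drive is implemented. The function names, the " (n)" renaming scheme, and the `RateLimitError` class are all assumptions invented for this example. The first function shows one possible way to map a Drive-style folder listing, where names may repeat, onto unique local names. The second shows a generic retry loop with exponential backoff, a common way to ride out rate limiting.

```python
import os
import time


def posix_safe_names(entries):
    """Map (name, is_folder) entries to names that are unique in one directory.

    Hypothetical scheme: the first occurrence keeps its name; each later
    duplicate gets " (1)", " (2)", ... added (before the extension for files).
    """
    used = set()
    result = []
    for name, is_folder in entries:
        candidate = name
        n = 0
        while candidate in used:
            n += 1
            if is_folder:
                candidate = f"{name} ({n})"
            else:
                stem, ext = os.path.splitext(name)
                candidate = f"{stem} ({n}){ext}"
        used.add(candidate)
        result.append(candidate)
    return result


class RateLimitError(Exception):
    """Stand-in for a 'too many requests' response from a storage API."""


def call_with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(), doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# A folder listing with a repeated file name and a folder sharing that name.
listing = [("data.csv", False), ("data.csv", False), ("data.csv", True)]
print(posix_safe_names(listing))
```

A production service would also need to remember which local name corresponds to which Drive object so that the mapping stays stable across transfers, and would typically add random jitter to the backoff delays. Both are omitted here for brevity.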
We are hosting a live Q&A webinar on Globus for Google Drive on October 3rd, and you can register for the event here. You can also find more information on installing and using Globus for Google Drive. We have seen tremendous interest in Google Drive support, and we look forward to hearing about your experiences.
As institutions look for more cost-effective storage with programmatic and user-friendly interfaces to manage data across the research lifecycle, and to meet Data Management Plan requirements from funding agencies, Globus storage connectors such as this one play a critical role in the campus storage ecosystem. We have requests to support many other storage systems, such as Box, and we continue to evaluate and prioritize these requests based on customer demand. If you need to connect a system that we don’t currently support, please contact us so we can start the conversation.