Globus Toolkit: the middleware of scientific grid computing

Globus Toolkit provides federated authentication (GSI), job submission (GRAM), data transfer (GridFTP) and discovery (MDS) for distributed computing across heterogeneous clusters.


The distributed computing problem in research

Scientific research produces data volumes and demands computing power that exceed the resources of any single institution. Physicists, biologists and climatologists need to aggregate computational power distributed across dozens of research centres, each with its own hardware, operating system and security policies. Globus Toolkit is the open source middleware that makes this aggregation possible, providing a common software infrastructure for grid computing.

Developed since the mid-1990s by Ian Foster and Carl Kesselman, the researchers credited with coining the term "grid computing", Globus Toolkit is now at version 4.x, which is built on a web services architecture conforming to the WSRF (Web Services Resource Framework) standard.

The core components

Globus Toolkit’s architecture is modular. Four main components address the fundamental needs of distributed computing:

  • GSI (Grid Security Infrastructure): handles federated authentication through X.509 certificates and delegated proxy certificates. A researcher authenticates once at their home institution and obtains temporary credentials recognised by every site on the grid, without needing separate accounts on each one.
  • GRAM (Grid Resource Allocation and Management): enables remote job submission to heterogeneous computational resources. GRAM translates requests into commands understood by local job schedulers such as PBS, Condor and LSF, hiding the complexity specific to each cluster.
  • GridFTP: an extension of the FTP protocol optimised for reliable transfer of large data volumes over wide-area networks. It supports parallel transfers across multiple TCP streams, automatic resumption after interruptions and authentication integrated with GSI.
  • MDS (Monitoring and Discovery System): a directory and monitoring service that publishes information about resources available on the grid (computational capacity, storage space, service status), allowing clients to dynamically discover where to run their jobs.
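
The translation role described for GRAM can be illustrated with a minimal Python sketch. This is not GRAM's implementation or its job description format: the JobRequest type, the translate function and the ".sub" file naming are hypothetical, introduced only for illustration. The commands qsub, bsub and condor_submit are the usual submission tools of PBS, LSF and Condor, but the flags shown are common forms that vary between installations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class JobRequest:
    """Scheduler-neutral job description (hypothetical, not GRAM's format)."""
    name: str
    script: str
    cpus: int


def to_pbs(job: JobRequest) -> List[str]:
    # PBS-style submission: job name plus a per-node processor request.
    return ["qsub", "-N", job.name, "-l", f"nodes=1:ppn={job.cpus}", job.script]


def to_lsf(job: JobRequest) -> List[str]:
    # LSF-style submission: job name plus a slot count.
    return ["bsub", "-J", job.name, "-n", str(job.cpus), job.script]


def to_condor(job: JobRequest) -> List[str]:
    # Condor reads a submit description file; this sketch only names that
    # file (assumed convention: <job name>.sub) and does not write it.
    return ["condor_submit", f"{job.name}.sub"]


# One translator per local scheduler; supporting another scheduler means
# adding one entry here, while callers keep using the same JobRequest.
TRANSLATORS: Dict[str, Callable[[JobRequest], List[str]]] = {
    "pbs": to_pbs,
    "lsf": to_lsf,
    "condor": to_condor,
}


def translate(job: JobRequest, scheduler: str) -> List[str]:
    """Return the argument list a site would run for its local scheduler."""
    try:
        return TRANSLATORS[scheduler](job)
    except KeyError:
        raise ValueError(f"unsupported scheduler: {scheduler}") from None
```

The caller describes the job once and never sees scheduler-specific syntax, which is the kind of hiding the GRAM description above refers to.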

Grid computing and scientific infrastructures

Projects such as EGEE (Enabling Grids for E-sciencE) and TeraGrid use Globus Toolkit as the foundation of their infrastructure. CERN, in preparation for the start-up of the Large Hadron Collider, is building the Worldwide LHC Computing Grid (WLCG) to distribute analysis of data produced by its experiments across hundreds of sites worldwide.

The grid computing model addresses a precise architectural problem: federating heterogeneous resources while preserving the administrative autonomy of each site. The goal is not to build a single supercomputer, but to create an infrastructure that allows independent institutions to share resources according to agreed policies. Globus Toolkit provides the protocols and services that make this collaboration technically possible.
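
The idea of federating resources without taking away local control can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Globus Toolkit represents sites or policies: the Site type, the "community" label, the admits rule and the example site names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Site:
    """A resource provider as a client might see it (hypothetical model)."""
    name: str
    free_cpus: int
    # Each site supplies its own admission rule; nothing central overrides it.
    admits: Callable[[str], bool]


def eligible_sites(sites: List[Site], community: str, cpus_needed: int) -> List[str]:
    """Names of sites with enough free CPUs whose own policy admits the community."""
    return [
        site.name
        for site in sites
        if site.free_cpus >= cpus_needed and site.admits(community)
    ]


# Illustrative data only: three sites with different capacities and policies.
EXAMPLE_SITES = [
    Site("site-a", 128, lambda c: c in {"physics", "climate"}),
    Site("site-b", 16, lambda c: True),
    Site("site-c", 512, lambda c: c == "biology"),
]
```

With this data, eligible_sites(EXAMPLE_SITES, "physics", 64) selects only site-a: site-b is too small and site-c's policy excludes the community. The selection logic only asks each site for its decision, which mirrors the point that sharing happens according to policies each institution keeps under its own control.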

Link: globus.org
