GA4GH Cloud Work Stream

The GA4GH Cloud Work Stream focuses on API standards, and implementations of them from our partner Driver Projects and other community members, that make it easier to “send the algorithms to the data”. Specifically, we have four API standards that allow you to share tools and workflows (Tool Registry Service, TRS), execute individual tasks on clouds through a standard API (Task Execution Service, TES), run full CWL/WDL workflows on execution platforms (Workflow Execution Service, WES), and read and write data objects across clouds in a cloud-agnostic way (Data Repository Service, DRS).
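As a rough illustration of how the four standards sit side by side, the sketch below builds request URLs for each of them. It makes no network calls. The path prefixes are assumptions taken from published versions of the specifications (TRS v2, TES v1, WES v1, DRS v1), the helper name `endpoint` is hypothetical, and a given deployment may expose a different version or prefix, so check the service's own documentation.

```python
from urllib.parse import quote

# Collection paths per standard. These are ASSUMPTIONS based on published
# spec versions; real deployments may use other versions or prefixes.
API_PATHS = {
    "TRS": "ga4gh/trs/v2/tools",    # list or look up tools and workflows
    "TES": "ga4gh/tes/v1/tasks",    # create or inspect individual tasks
    "WES": "ga4gh/wes/v1/runs",     # submit or inspect workflow runs
    "DRS": "ga4gh/drs/v1/objects",  # resolve data objects by ID
}


def endpoint(base_url: str, standard: str, resource_id: str = "") -> str:
    """Return the URL for a standard's collection, or for one resource in it.

    `endpoint` is a hypothetical helper for illustration only; it is not
    part of any GA4GH client library.
    """
    url = base_url.rstrip("/") + "/" + API_PATHS[standard]
    if resource_id:
        # Percent-encode the ID so characters such as "/" stay in one segment.
        url += "/" + quote(resource_id, safe="")
    return url
```

For example, with a hypothetical server at `https://example.org`, `endpoint("https://example.org", "DRS", "abc123")` yields the URL a client would `GET` to resolve the data object `abc123`, and `endpoint("https://example.org", "WES")` yields the collection to which a workflow run would be submitted.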


These standards are inspired by large-scale, distributed compute projects such as PCAWG. Efforts like these are characterized by data living in many different cloud environments, compute that must run across those cloud locations, and a need to work with disparate clouds through common, consistent APIs. The net effect is highly portable analysis code that ultimately enables “FAIR” science, i.e. findable, accessible, interoperable, and reusable tools, workflows, and datasets.

Presentation

See this presentation from March of 2019 for an overview of what we do.

Standards

Our API standards are defined in Swagger YAML/OpenAPI 2.0 (with an eye toward OpenAPI 3.0) in repositories within the GA4GH GitHub organization:

Standards Maintenance

Security Considerations

If you discover a security issue with any of the above specifications, please send an email to security-notification@ga4gh.org detailing your concerns.

HubFlow and Contributions

The way to contribute development effort and code to the project is via GitHub pull requests. GitHub provides a nice overview of how to create a pull request. See also the CONTRIBUTING.md document in each schema repo. We follow HubFlow, which means we use a feature-branch strategy, with pull requests always going to develop and releases happening from master.

Voting Process for PRs

See individual schema repos for more details. In short, the GA4GH has a number of Driver Projects, and each of those associated with the Cloud Work Stream nominates a representative. When a PR is up for a vote, the Driver Project representatives are notified by email and a note is made in the PR. This starts a timer that runs until the PR is accepted or rejected. For a proposed change to proceed, no Driver Project representative may vote against it. In addition, the change must not be overridden by the Cloud Work Stream Leads, Brian O’Connor and David Glazer.

Our Multi-year Plan

See this document that describes our plans for the next couple of years.

Meetings

Weekly Work Stream Meeting

We have a weekly call on Mondays at variable times to accommodate various timezones:

We invite anyone interested in these standards and/or the systems that implement them to join us on these calls.

Dial-in details for the calls are available on request from Rishi Nag.

Past Meetings

GA4GH Connect in Hinxton, April 2019

See Cloud @ GA4GH Connect 2019 Hinxton

GA4GH 6th Plenary in Basel, Switzerland

See this site for more info.

GA4GH 5th Plenary in Orlando

This meeting took place October 15-17, and the Cloud Work Stream held a breakout session for most of the 15th.

For our agenda, see here.

For the general conference agenda, see here.

If you were not able to attend, you can watch the Plenary recording here.

GA4GH-DREAM Infrastructure Challenges, API Test Beds, and Registries

We are working closely with the DREAM Challenges to test our API standards and workflow sharing process. Essentially, we are attempting to demonstrate that both our APIs and our processes are FAIR-compliant. This is a multi-phase effort, with the first two challenges focusing on tool and workflow portability and reproducibility.

We also work closely with our Driver Projects through “Test Beds”, which are focused efforts to demonstrate that API implementations conform to the specifications.

Ultimately, we want to have regular API verifications running and a registry where users can see which available services conform to the specifications. Researchers can use the registry to find services, and implementers can use it to find deployable software that supports the GA4GH APIs.
