The GA4GH Cloud Work Stream focuses on API standards (and implementations from our partner Driver projects and other community members) that make it easier to “send the algorithms to the data”. Specifically, we have 4 API standards that allow you to share tools/workflows (TRS), execute individual jobs on clouds using a standard API (TES), run full CWL/WDL workflows on execution platforms (WES), and read/write data objects across clouds in an agnostic way (DRS).
These standards are inspired by large-scale, distributed compute projects such as PCAWG. Such efforts are characterized by data living in many different cloud environments, compute that must run across those cloud locations, and a motivation to work with disparate clouds through common, consistent APIs. The net effect is highly portable analysis code that ultimately enables “FAIR” science, i.e. findable, accessible, interoperable, and reusable tools, workflows, and datasets.
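To make the division of labour between the four APIs more concrete, here is a minimal Python sketch of the kind of request each one expects. It makes no network calls and only builds URLs and payloads. The base URLs and helper names are hypothetical. The endpoint paths and field names follow our reading of the published specifications and may differ between spec versions, so check the relevant repository before relying on them.

```python
import json
from urllib.parse import quote

# Hypothetical service locations; real deployments publish their own.
TRS_BASE = "https://trs.example.org"
DRS_BASE = "https://drs.example.org"


def trs_tools_url(base):
    """URL for listing shared tools/workflows on a TRS server (assumed v2 path)."""
    return f"{base.rstrip('/')}/ga4gh/trs/v2/tools"


def drs_object_url(base, object_id):
    """URL for fetching a data object's metadata from a DRS server (assumed v1 path)."""
    return f"{base.rstrip('/')}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


def tes_task(name, image, command):
    """Minimal TES-style task body: one container image running one command."""
    return {
        "name": name,
        "executors": [{"image": image, "command": list(command)}],
    }


def wes_run_request(workflow_url, workflow_type, workflow_type_version, params):
    """Form fields for a WES-style run request; workflow_params is sent as a JSON string."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(params),
    }


if __name__ == "__main__":
    print(trs_tools_url(TRS_BASE))
    print(drs_object_url(DRS_BASE, "abc/1"))
    print(json.dumps(tes_task("hello", "alpine", ["echo", "hi"]), indent=2))
    print(wes_run_request("https://example.org/wf.cwl", "CWL", "v1.0", {"input": "x"}))
```

In this sketch, TRS and DRS are plain lookups by URL, TES describes a single containerized job, and WES wraps a whole CWL/WDL workflow plus its parameters.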
See this presentation from March of 2019 for an overview of what we do.
Our API standards are defined in Swagger YAML/OpenAPI 2.0 (with an eye toward OpenAPI 3.0) in repositories within the GA4GH GitHub organization:
If you find a security issue with any of the above specifications, please send an email to security-notification@ga4gh.org detailing your concerns.
The way to contribute development effort and code to the project is via GitHub pull requests. GitHub provides a nice overview on how to create a pull request. See the CONTRIBUTING.md document in each schema repo. We follow HubFlow, which means we use a feature branch strategy with pull requests always going to develop and releases happening from master.
See individual schema repos for more details. In short, the GA4GH has a number of Driver Projects, and each one associated with the Cloud Work Stream nominates a representative. When a PR is up for a vote, the Driver Project representatives are notified by email and a note is made in the PR. This starts a timer that runs until the PR is accepted or rejected. For a proposed change to proceed, no Driver Project representative may vote against it. In addition, the change must not be overruled by the Cloud Work Stream Leads, Brian O’Connor and David Glazer.
See this document, which describes our plans for the next couple of years.
We have a weekly call on Mondays at variable times to accommodate various timezones:
We invite anyone interested in these standards and/or the systems that implement them to join us on these calls.
Dial-in details for the calls are available on request from Rishi Nag.
See Cloud @ GA4GH Connect 2019 Hinxton
See this site for more info.
This meeting took place October 15-17, and the Cloud Work Stream held a breakout session for most of the 15th.
For our agenda, see here.
For the general conference agenda, see here.
If you were not able to attend, you can watch the Plenary recording here.
We are working closely with the DREAM Challenges to test our API standards and workflow sharing process. Essentially, we are attempting to demonstrate API and process FAIR-compliance. This is a multi-phase effort with the first two challenges focusing on tool and workflow portability and reproducibility.
We also work closely with our Driver Projects through “Test Beds” that are focused efforts to demonstrate conformance of API implementations with the specifications.
Ultimately we want to have regular API verifications running and a registry where users can see the available services that are conforming to the specifications. This can be used by researchers to see what services are available and used by implementers to see what software is available for deployment that supports the GA4GH APIs.