A Petabyte-scale Scientific Community Cloud

The OSDC enables scientific researchers to easily manage, share, and analyze large datasets.

What is the OSDC? >> Watch a Video

OSDC in brief

The Open Science Data Cloud provides the scientific community with resources for storing, sharing, and analyzing terabyte and petabyte-scale scientific datasets. The OSDC is a data science ecosystem in which researchers can house and share their own scientific data, access complementary public datasets, build and share customized virtual machines with whatever tools are necessary to analyze their data, and perform the analysis to answer their research questions. It is a one-stop shop for making scientific research faster and easier.

Why is there a need?

With datasets growing larger and larger, researchers are finding that the bottleneck to discovery is no longer a lack of data but an inability to manage, analyze, and share their large datasets. Individual researchers can no longer download and analyze the important datasets in their scientific fields on their own computers. The goal of the Open Science Data Cloud is to remove the bottleneck to discovery by providing researchers with access to a variety of key datasets across scientific disciplines and the computing infrastructure to allow scientists to easily manage and share their data and analysis.

Featured on the OSDC

Bionimbus
Project Matsu
Tukey
What is the OSDC?



OCC at the AGU Fall Meeting

As part of the scientific program at the American Geophysical Union (AGU) Fall Meeting, Zac Flamig, a postdoc at the Center for Data Intensive Science at the University of Chicago and scientific lead for the OCC NOAA Big Data Project, will present on "The OCC NOAA Data Commons: First Year Experiences." This invited talk will take place Tuesday afternoon in the "Enabling Cloud Applications for Earth Science Data II" session, which features eight speakers discussing how cloud computing is being used to enable science. AGU’s Fall Meeting is billed as the largest Earth and space science meeting in the world, with more than 24,000 attendees in 2015. This talk provides a vital opportunity for the OCC to share what it is working on within the NOAA Big Data Project and to receive feedback on future directions.

OCC and CDIS @ SC16

We're pleased to announce our presence at the annual Supercomputing conference (SC16). This year's conference will be in Salt Lake City, Utah, where the OCC and the Center for Data Intensive Science will showcase: innovations in data science applications in biology, medicine, health care, and the environment; new releases of data commons and data peering technology that support research communities, including specialized commons for cancer genomic data, weather data, and satellite imagery; data intensive computing systems; high performance analytics; and a Tuesday Birds-of-a-Feather session on Data Commons led by Dr. Robert Grossman, Dr. Allison Heath, and Dr. Zachary Flamig.

University of Illinois / NCSA Joins the OCC

We’re pleased to announce that the University of Illinois is now a member of the Open Commons Consortium! The OCC and the National Center for Supercomputing Applications (NCSA) are working together on a data peering pilot that will make environmental data available across multiple commons. The pilot will peer resources in both the Open Science Data Cloud environmental data commons and the Resourcing Open Geospatial Education and Research (ROGER) supercomputer. "Data peering will enable data commons service providers to share large datasets using persistent digital IDs over high speed research networks, effectively expanding the amount and quality of data transparently accessible to researchers associated with each service provider. We think of it as a kind of instantaneous inter-library loan system for complex data," said Dr. Robert Grossman, Director of the OCC. "It is very exciting to collaborate with NCSA as founding partners in this effort." "We're looking forward to working with NCSA, particularly on data peering. Data peering between NCSA and existing OCC resources will go a long way towards enabling sustainable access to large valuable datasets for more researchers and extending resources like the Environmental Data Commons," said Dr. Zac Flamig of the Center for Data Intensive Science at the University of Chicago, Scientific Lead for the OCC NOAA Working Group. Learn more about how your organization can become a member of the OCC here.

MED-C and OCC to Establish a Biomedical Data Commons

We are very pleased to announce that the Molecular Evidence Development Consortium (MED-C) and the OCC have signed a memorandum of understanding to create the MED-C Biomedical Data Commons (BDC), which will allow the sharing of genetic and clinical data in a unified manner. Within a few years, the MED-C BDC is expected to be one of the largest genomic data commons, a key step toward the next stage of personalized medicine. “We are very pleased to be collaborating with MED-C to build this initiative. Not only will the MED-C BDC be one of the largest genomic data commons, but it will share a common architecture and interfaces with other OCC projects so that researchers will have transparent access to a critical mass of genomic and associated clinical and outcome data,” says Dr. Robert Grossman, director of the OCC. Read the full press release here.

Harnessing the Power of Big Data in Fight Against Cancer

Dr. Grossman appeared on the Chicago Tonight program on WTTW PBS to discuss the Genomic Data Commons (GDC) effort at the Center for Data Intensive Science (CDIS) at the University of Chicago.

How can I get involved?


Access the Public Data Commons

The OSDC has 1 PB of publicly accessible data in a wide variety of disciplines. Interested researchers can freely access and download these data to their own machines or apply for resources to compute over the data within the cloud.

Contribute to OSDC


All of the software developed as part of the OSDC is open source and hosted on GitHub. You can directly help the scientific cloud computing community by contributing to the open source OSDC software stack.

Apply for Compute and Storage

Fill out a short proposal for an OSDC resource allocation. Allocations start at 16 dedicated cores and 1 TB of storage, and scale with project needs and level of organizational partnership.


Partner with us and add your own racks to the OSDC (we will manage them for you). Organizations can also join the Open Commons Consortium (OCC), which is made up of working groups, including the OSDC.

Contact Us

Questions? Comments? Suggestions? Contact us at info@occ-data.org.