A Petabyte-scale Scientific Community Cloud

The OSDC enables scientific researchers to easily manage, share, and analyze large datasets.

What is the OSDC? >> Watch a Video

OSDC in brief

The Open Science Data Cloud provides the scientific community with resources for storing, sharing, and analyzing terabyte- and petabyte-scale scientific datasets. The OSDC is a data science ecosystem in which researchers can house and share their own scientific data, access complementary public datasets, build and share customized virtual machines with whatever tools are necessary to analyze their data, and perform the analysis to answer their research questions. It is a one-stop shop for making scientific research faster and easier.

Why is there a need?

With datasets growing larger and larger, researchers are finding that the bottleneck to discovery is no longer a lack of data but an inability to manage, analyze, and share their large datasets. Individual researchers can no longer download and analyze the important datasets in their scientific fields on their own computers. The goal of the Open Science Data Cloud is to remove the bottleneck to discovery by providing researchers with access to a variety of key datasets across scientific disciplines and the computing infrastructure to allow scientists to easily manage and share their data and analysis.

Featured on the OSDC

Bionimbus
Project Matsu
Tukey
What is the OSDC?



OCC and CDIS @ SC15

We're pleased to announce our presence at the annual Supercomputing conference. This year's conference will be in Austin, TX, and the OCC and the Center for Data Intensive Science will showcase innovations in data science applications in biology, medicine, health care, and the environment; data commons and data peering supporting numerous research communities, including specialized commons for cancer genomic data, weather data, and satellite imagery; data intensive computing systems; and high performance analytics.

HDF Group Joins the OCC

We’re pleased to announce that The HDF Group is now a member of the Open Commons Consortium! The HDF Group will participate in the NOAA Data Alliance Working Group on the WG committee that will: refine and implement the full activities of the working group, establish and reinforce alliance partnerships, and build a sustainable framework for the alliance; work with OCC staff and OCC member staff to implement hardware, software, and networking requirements; and meet with the user community and determine the datasets to be hosted in the NOAA data commons and the tools to be used in the computational ecosystem surrounding the NOAA data commons.

From Clouds to Commons

Robert L. Grossman, Director, Open Commons Consortium (OCC)

The Open Cloud Consortium began operations seven years ago in 2008, which is a long time in technology. Our mission has remained the same: to provide research infrastructure to facilitate discoveries and insights over large datasets. In 2008, the use of cloud computing to support discovery over large research datasets was new, as was technology for data intensive computing, as distinguished from the compute intensive computing used by the high performance computing community. Support by NSF for big data was still 3 to 4 years in the future. The Open Science Data Cloud (OSDC), which began operations in 2010, is today still one of the largest-scale science clouds in operation. The OSDC has provided millions of core hours to hundreds of investigators who have made discoveries in science, medicine, health care, and the environment.

OSDC Griffin - A New Public Resource

The OCC is very pleased to announce the availability of our newest public resource, OSDC Griffin. OSDC Griffin is a compute resource named after the Chicago architect Marion Mahony Griffin. Griffin is an OpenStack cluster that uses ephemeral storage in VMs, with access to a separate S3-compatible storage system for persistent data storage. OSDC Griffin has 610 cores, 2391 GiB of RAM, and 369664 GiB of VM/ephemeral storage. Allocations to all users and projects are managed at the “tenant” level. To apply for a resource allocation on OSDC Griffin, use the OSDC Resource Allocation Application and select "Public Cloud". Applications are considered on a quarterly basis as part of our allocation process. Recipients of OSDC resource allocations are expected to: make appropriate use of OSDC resources and use good social behavior (i.e., terminating VMs when not in use). ...
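To put the Griffin figures quoted above in more familiar terms, here is a minimal Python sketch that converts the ephemeral storage total to TiB and works out an even per-core share of RAM and storage. The even split is purely an illustrative assumption: allocations are managed at the tenant level, and the announcement does not say how resources are divided. The helper name `per_core` is hypothetical.

```python
# Cluster-wide figures as quoted in the OSDC Griffin announcement.
CORES = 610
RAM_GIB = 2391
EPHEMERAL_GIB = 369664

GIB_PER_TIB = 1024


def per_core(total, cores=CORES):
    """Return an even per-core share of a cluster-wide total.

    Assumption for illustration only: real allocations are made per
    tenant and need not be proportional to core count.
    """
    return total / cores


ephemeral_tib = EPHEMERAL_GIB / GIB_PER_TIB
ram_per_core_gib = per_core(RAM_GIB)
ephemeral_per_core_gib = per_core(EPHEMERAL_GIB)

print(f"Ephemeral storage: {ephemeral_tib:.0f} TiB")
print(f"RAM per core: {ram_per_core_gib:.2f} GiB")
print(f"Ephemeral storage per core: {ephemeral_per_core_gib:.0f} GiB")
```

Under these assumptions the cluster works out to 361 TiB of ephemeral storage, or roughly 3.9 GiB of RAM and about 606 GiB of ephemeral storage per core.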

NOAA Big Data CRADA at AMS Meeting

On August 6th, during the American Meteorological Society conference in Raleigh, NC, there will be a special panel discussion on "The NOAA Big Data CRADA and You!" Discussing the OCC's efforts as part of NOAA's Big Data project will be Dr. Mohan Ramamurthy from Unidata. You can follow all the latest news using the #AMSsummer hashtag.

How can I get involved?


Access the Public Data Commons

The OSDC has 1 PB of publicly accessible data in a wide variety of disciplines. Interested researchers can freely access and download these data to their own machines or apply for resources to compute over the data within the cloud.

Contribute to OSDC


All of the software developed as part of the OSDC is open source and hosted on GitHub. You can directly help the scientific cloud computing community by contributing to the open source OSDC software stack.

Apply for Compute and Storage

Fill out a short proposal for an OSDC resource allocation. Allocations start at 16 dedicated cores and 1 TB of storage, but scale depending on project needs and level of organizational partnership.


Partner with us and add your own racks to the OSDC (we will manage them for you). Organizations can also join the Open Commons Consortium (OCC), which is made up of working groups, including the OSDC.

Contact Us

Questions? Comments? Suggestions? Contact us at info@occ-data.org.