Open Scientific Computing: How can state-of-the-art open-source data management technology ensure efficient and equitable access to science and engineering data, now and forever?

This workshop introduces key tools and platforms for open scientific computing and data management. It includes a hands-on demo of HDF5, HSDS, and other open technologies.


Imagine you are a climatologist sifting through thousands of rows of ice core readings to learn what atmospheric carbon dioxide levels were 500,000 years ago, how they affected climatic conditions, and what that could mean for our future. Before you can begin your analysis, you face the arduous task of cleaning and organizing the data. Given the sheer size of the data sets, this alone can take an enormous amount of time and effort. Then you still have to sift through the data to isolate the specific time periods you want to examine.

Across the world, other climatologists are conducting similar work, and some of the data you are working tirelessly to organize has already been compiled. They have written code and built systems that would support your work, but you do not have access to them. At the same time, you know that when your own work is done, it will likewise be inaccessible to other scientists doing similar work. What are the major barriers slowing this important work down?

Scientists and researchers all over the world analyze large datasets to deepen our understanding of the natural world in the past, present, and future. How data is managed and accessed is an important factor in how efficiently and effectively they can work. Open science, at its core, is about data accessibility and transparency. This module will explore HDF5, a data storage and management technology built to address data science bottlenecks like the one described above. After completing this module, you will:
