Session Title: Growing a global education in Research Data Science

Session Organisers: Hugh Shanahan, Simon Hodson, Anelda van der Walt

Session Description:

Contemporary research – particularly when addressing the most significant, transdisciplinary research challenges – cannot effectively be done without a range of skills relating to data handling. These include the principles and practice of Open Science, research data management and curation, the use of a range of data platforms and infrastructures, large-scale analysis, statistics, visualisation and modelling techniques, and software development. We define ‘Research Data Science’ as the ensemble of these skills.

Research Data Science skills are common to all disciplines and training in research needs to take this into account. For example, all disciplines need to ensure that research is reproducible and that provenance is documented reliably. This requires a transformation in practice and the promotion of the necessary culture, practice and skills. 

The need for a consistent education in Research Data Science is increasingly pressing for many stakeholders, from scientists to funders to policy makers, in many nations. We wish to focus discussions among many parties so as to produce an inclusive solution that benefits all involved. It is important that Open Data and Open Science benefit research in Low and Middle Income Countries and that the unequal ability to exploit these developments does not become another lamentable aspect of the ‘digital divide’. The current ‘Data Revolution’ offers countries at different levels of technological development an equal opportunity to benefit from data-intensive technologies, but this requires a common set of data-related competences and skills.

A strategic priority shared by CODATA and the Research Data Alliance is to build capacity and to develop skills, training young researchers in the principles of Research Data Science. We are growing a global educational network of schools to provide a coherent and consistent education in Research Data Science, utilising the expertise of many disciplines and the efforts of many parties from around the globe. We are also aligning these efforts with activities on curriculum development and the training of new teachers around the globe.

The purpose of this session is to discuss these and similar initiatives, to identify lessons that can be learnt and opportunities for collaboration and coordination. 

The focus of the SciDataCon workshop will be on:

  1. Training requirements (perspectives from different disciplines and geographies);
  2. Creating a common curriculum: approaches and models;
  3. Sustainability;
  4. Online communities as a means of developing materials.

The session will bring together the thoughts and ideas from existing efforts happening in this area. These different perspectives will help to focus minds on the most natural solutions that are starting to take shape around the world, as well as to shine light on any outstanding problems that still need to be solved.


Hugh Shanahan, Royal Holloway University of London

Anelda van der Walt, Talarify

Simon Hodson, CODATA

Session One: Global Initiatives for Data Skills 

11:30-11:45: Hugh Shanahan, Introductory Research Data Science – closing the skills gap

11:45-12:00: Ciira Maina, Data Science Africa – An Initiative to Bridge the Data Science Skills Gap in Africa.

12:00-12:15: Clement Onime, Growing skills and competences for Data Science Research in developing World: experiences from HPC

12:15-12:30: Jonah Duckles, Software Carpentry, more than just training workshops

12:30-12:45: Tracy Teal, Addressing the bottleneck to data-driven discovery: scaling data skills training for researchers

12:45-13:00: Anelda van der Walt, A “Whole Village” Approach to Developing Research Data Scientists

Session Two: Data Education Requirements, Curricula and the Role of Institutions

14:00-14:15: Amy Nurnberger, Finding a firm foundation in data information literacy

14:15-14:30: Alisa Surkis, Extending the BD2K Training Initiative to Biomedical Librarians

14:30-14:45: Michael Witt and Natasha Simon, 23 Things for research data management

14:45-15:00: Yuri Demchenko, Defining Customisable Model Curriculum for Research Data Management Training 

15:00-15:30: Panel Discussion   


This session has 11 papers.


This session has 0 posters.