The eSSENCE and SciLifeLab Uppsala nodes start a new interdisciplinary graduate school to address challenges in data-intensive science.
The school will train a generation of PhD students capable of working efficiently at the interface of eScience and data-driven applications. Through a core curriculum of joint courses and activities, it will create an environment where experts in data-intensive methodology work together with PIs in the sciences (including industrial collaborations). A typical PhD project will have at least one supervisor with expertise in computational science, data science or data engineering and one with expertise in the application area.
Mission. The school addresses the challenge of data-intensive science both from the foundational methodological perspective and from the perspective of data-driven science applications. The school should be an arena where experts in computational science, data science and data engineering (systems and methodology) work closely together with researchers in (data-driven) sciences, industry and society to accelerate data-intensive scientific discovery. The school should work actively to create synergies between the involved Strategic Research Initiatives (SRAs), complement related strategic initiatives at the university and nationally, and encourage collaboration with industry and society.
A PhD project in the school should involve both application and methods research, but the main weight can be either on the data-driven application or on the methodology development.
Methodology and system research
eScience tools and techniques, with a particular focus on methods development in data-intensive data science and data engineering, form the core of the school. Data engineering sciences as defined here encompass several core areas in eScience, including distributed computing, high-performance computing, analysis of large, diverse, and/or high-velocity data, large-scale distributed machine learning, and methodology and systems challenges directly related to data-driven science. They also include research on systems to support such analysis. Theory, methods and systems for data science using private, sensitive and decentralized data, as well as security, trust and data quality, are also core challenges addressed by the school.
Data-driven and data-intensive science in all scientific areas is at the core of the school. We anticipate that all PhD projects with a primary focus on the data-driven application side will by their nature involve one or more of the methodology challenges listed above.
About some of the challenges motivating this initiative
Data-driven life science
SciLifeLab’s 10-year strategy is built on the idea of transforming into data-driven life science. This will necessitate a broad build-out of capabilities and competencies in machine learning and associated challenges in data-intensive computing, including analysis of rapid data streams (generated by experimental facilities) and big-data issues in general. Data quality is a major concern, and methodology for working efficiently with sensitive data is urgently needed. The roadmap towards technology- and data-driven life science can be found here: https://www.scilifelab.se/roadmap-2020-2030/ .
Materials science, data-intensive physics and chemistry
There is a need for methods and systems research on managing and analyzing large datasets from large experimental and observational facilities. This is of particular relevance for materials research, high-energy and nuclear physics, and astronomy. Many areas in materials chemistry are becoming highly data intensive and require novel methods and data analysis tools. Within physics, several areas of astronomy, astrophysics and cosmology necessitate novel statistical and computational techniques to handle very large data flows and observational samples from future astronomical surveys.
What happens now?
There will be an open call for PhD projects in the fall of 2021, open to main applicants at Uppsala University. Selected projects will recruit PhD students to start in late 2021 or early 2022. A call for project proposals will be published separately.