eSSENCE scientist Ola Spjuth publishes manuscript on cross-biobank data integration

Enabling integrative cross-biobank research: the SAIL method for harmonizing and linking biomedical and clinical data across disparate data archives
O. Spjuth, J. Hastings, J. Dietrich, J. Heikkinen, N. Pedersen, J. Hottenga, S. Ripatti, P. Burton, I. Fortier, C. van Duijn, E. Wichmann, J. Rung, M. McCarthy, M. Allen, E. Raulo, I. Prokopenko, J. Karvanen, M. Perola, M. Kolz, E. J.C. de Geus, G. Willemsen, P. Magnusson, J-E. Litton, J. Palmgren, M. Krestyaninova, and J. Harris.
European Journal of Human Genetics advance online publication 26 August 2015; doi:10.1038/ejhg.2015.165

Abstract

A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, created harmonised variables and then annotated and made searchable information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values and show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase.

Link to article full text: http://www.nature.com/ejhg/journal/vaop/ncurrent/full/ejhg2015165a.html

eSSENCE researchers publish article on workflows for automating data-intensive bioinformatics

In the EU COST Action SeqAhead, researchers led by eSSENCE researcher Ola Spjuth have conducted a large survey on the use of scientific workflows in data-intensive bioinformatics.

Experiences with workflows for automating data-intensive bioinformatics
Ola Spjuth, Erik Bongcam-Rudloff, Guillermo Carrasco Hernández, Lukas Forer, Mario Giovacchini, Roman Valls Guimera, Aleksi Kallio, Eija Korpelainen, Maciej M Kańduła, Milko Krachunov, David P Kreil, Ognyan Kulev, Paweł P. Łabaj, Samuel Lampa, Luca Pireddu, Sebastian Schönherr, Alexey Siretskiy and Dimitar Vassilev
Biology Direct 2015, 10:43 doi:10.1186/s13062-015-0071-8

ABSTRACT
High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution.

Link to full text: http://www.biologydirect.com/content/10/1/43

SoftwareX

SoftwareX is a new open-access journal from Elsevier for scientific software. The journal publishes peer-reviewed software and also allows for post-publication updates and metadata.

http://www.journals.elsevier.com/softwarex/

UPPMAX publishes report on Hadoop for NGS data analysis

Researchers affiliated with UPPMAX and financed by the Strategic Research Program eSSENCE have published a report on the applicability of Hadoop for analyzing next-generation sequencing data.

Siretskiy A, Sundqvist T, Voznesenskiy M, Spjuth O.
A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.
Gigascience. 2015 Jun 4; 4:26. doi: 10.1186/s13742-015-0058-5
http://www.gigasciencejournal.com/content/4/1/26

Swedish e-Science Education graduate school (SeSE) courses: Autumn 2015

The Swedish e-Science Education graduate school (SeSE) now announces its graduate courses for autumn 2015.

The courses are listed at the SeSE website:

http://sese.nu/courses-autumn-2015/

SeSE gives basic training in fields where the use of e-Science is emerging and where such education can have an immense impact on the research, as well as advanced training for students in fields that are already computer-intensive. The course curriculum is tailored to meet a broad set of prerequisites and will foster collaborations between Swedish researchers, possibly opening up new research fields that use e-Science tools and methods and making interdisciplinary collaborations possible on a Nordic level.

Swedish e-Science Education graduate school (SeSE) courses: spring 2015

The Swedish e-Science Education graduate school (SeSE) now announces its graduate courses for spring 2015.

The courses are listed at the SeSE website:

http://sese.nu/courses-spring-2015/

SeSE gives basic training in fields where the use of e-Science is emerging and where such education can have an immense impact on the research, as well as advanced training for students in fields that are already computer-intensive. The course curriculum is tailored to meet a broad set of prerequisites and will foster collaborations between Swedish researchers, possibly opening up new research fields that use e-Science tools and methods and making interdisciplinary collaborations possible on a Nordic level.

2015 IEEE 11th International Conference on e-Science

Do you want to be an eSSENCE ambassador for Swedish e-Science?
The eSSENCE programme council encourages researchers in the eSSENCE community to submit papers to this international multidisciplinary e-Science conference. Persons whose papers are accepted will be eligible for funding, provided that eSSENCE is acknowledged (see http://essenceofescience.se/publications/). Funding includes:
* conference fee
* travel
* accommodation

Call for papers
Submission deadlines: March 8, 2015 for abstracts and March 15, 2015 for papers
11th IEEE International Conference on e-Science
31 August – 4 September, 2015
Munich, Germany


——————–
The IEEE international e-Science conference series is a yearly event that takes place in different cities around the world.

IEEE-2011 in Stockholm
IEEE-2012 in Chicago
IEEE-2013 in Beijing
IEEE-2014 in Sao Paulo
IEEE-2015 in Munich

Swedish Theoretical Chemistry Meeting, Uppsala, 27-29 Oct 2014

We invite you to participate in the following conference:

Swedish Theoretical Chemistry Meeting 2014 – New Horizons
at the Ångström Laboratory, Uppsala University, Sweden
27–29 October 2014 (lunch to lunch)

Room: Häggsalen at the Ångström Lab, Uppsala University
Website: https://sites.google.com/site/stkmuu14
Deadline for abstracts and registration: 26 September 2014
Among the invited speakers are:

  • Prof. Roberto Car (Princeton U, USA)
    “Water, ice, dynamics and hybrids”
  • Prof. Martin Head-Gordon (UC Berkeley, USA)
    “The next generation of electronic structure theories”
  • Prof. Sally Price (UCL, UK)
    “Crystal-polymorphs prediction for pharmaceutical development – a challenge for force-field development and our understanding of crystallization”
  • Prof. Joost VandeVondele (ETH Zurich, CH)
    “Exploring the frontiers in sampling, large-scale models, and electron correlation using petascale computing and CP2K.”
  • Dr. Horst Weiss (BASF, DE)
    “Driving innovation for materials in the chemical industry – modelling needs and opportunities”

For this meeting we define theoretical chemistry in a broad sense, encompassing theory and computation from all fields of chemistry and its interfaces with physics, biology, materials science, and so on. The meeting is subtitled “New Horizons”, and the presentations will discuss new methods and directions, from fundamental concepts of bonding to calculations for industrial applications.

We welcome your participation and your oral or poster contributions!

/Kersti Hermansson and co-organisers