“Recommendations on e-infrastructures for next-generation sequencing”
O. Spjuth, E. Bongcam-Rudloff, J. Dahlberg, M. Dahlö, A. Kallio, L. Pireddu, F. Vezzi, and E. Korpelainen,
GigaScience, vol. 5, no. 1, pp. 1-9, 2016.
With ever-increasing amounts of data being produced by next-generation sequencing (NGS) experiments, the requirements placed on supporting e-infrastructures have grown. In this work, we provide recommendations based on the collective experiences of participants in the EU COST Action SeqAhead for the tasks of data preprocessing, upstream processing, data delivery, and downstream analysis, as well as long-term storage and archiving. We cover demands on computational and storage resources, networks, software stacks, automation of analysis, and education, and we also discuss emerging trends in the field. E-infrastructures for NGS require substantial effort to set up and maintain over time. With sequencing technologies and best practices for data analysis evolving rapidly, it is important to prioritize both processing capacity and e-infrastructure flexibility when making strategic decisions to support the data analysis demands of tomorrow. Due to increasingly demanding technical requirements, we recommend that e-infrastructure development and maintenance be handled by a professional service unit, be it internal or external to the organization, and that emphasis be placed on collaboration between researchers and IT professionals.