Assessment of eukaryotic communities in environmental samples: A workflow comparison for next-generation sequencing data
To understand the function and stability of ecosystems, it is crucial to gain insight into their species composition, particularly in the face of global warming. Next-generation sequencing (NGS) is the method of choice for obtaining rapid overviews of species diversity across a large number of samples. There is currently lively discussion about bioinformatic techniques for improving the quality of sequencing outputs and about how to post-process these data in order to estimate the “real” diversity as precisely as possible. In this study, we analyzed the protist composition of three water samples collected in the Fram Strait in 2010. We compared datasets with and without correction for potential sequencing errors, generated with widely used open-source software: QIIME, mothur and PhyloAssigner. The relative abundance of protist phyla was hardly affected by the choice of software, quality filtering or error correction. However, the outputs differed strongly in the relative abundance of diatom genera and were not comparable to the dominant diatoms observed with light microscopy. Our main findings can help improve study design, data preparation and interpretation, and give insights into the optimization potential of NGS experiments in general.