Join the EU Horizon MARCO-BOLO project in comparing bioinformatics pipelines for analysing eDNA data by participating in the DATA ANALYSIS CHALLENGE.
In doing so, you will help improve recommendations for choosing among the many available eDNA metabarcoding pipelines and contribute to the development of indicator workflows for reporting on biodiversity monitoring.
Find a more detailed description of the challenge and the datasets below.
What you do
- 🔧 Come with your favourite pipeline
- 💻 Run it on one of our plankton or fish eDNA datasets
- 🌐 Send us your resulting OTU/ASV/ZOTU and Taxonomy tables
What we do
- 💾 Provide the eDNA datasets (18S plankton time series and 12S/16S/COI aquarium samples) along with standardised reference libraries
- 📊 Compile and compare the resulting tables across participants’ pipelines
- 📝 Report and share these results back with the participants and on the MARCO-BOLO website
Timeline of challenge
- 13th of September 2024 (extended): Registration deadline. Datasets are shared with participants who have expressed their interest in joining.
- 30th of September 2024: Deadline for the submission of analysis results.
- 7th and 8th of October 2024: Data comparison and indicator workflow development by MARCO-BOLO internally.
- Early November 2024: MARCO-BOLO initial report.
Registration form
Register your interest in the challenge using the Registration form before 13th September 2024.
Why register to participate
- You want your pipeline to be included in the comparison
- You like a challenge and want to see how your pipeline performs on new data
- You want to contribute towards better guidelines on pipeline choice and improved workflows for biodiversity monitoring
- You will be acknowledged for your contribution to eventual outputs
The Challenge
Read on for detailed information
In the context of the MARCO-BOLO project, we are testing different bioinformatic pipelines to optimise the pelagic habitat indicator workflows used to report on biodiversity monitoring under the European Marine Strategy Framework Directive (MSFD). Currently, there is considerable variation in how environmental DNA (eDNA) data are analysed with bioinformatic pipelines, and there is no clear assessment of how these different pipelines, and the parameter choices within them, affect the calculation of indicators.
We are therefore inviting you to take part in our DATA ANALYSIS CHALLENGE, where we aim to compare different bioinformatics pipelines currently used for eDNA analysis across the eDNA community. In a follow-up analysis, this will allow us to compare the effect of pipeline choice on the MSFD pelagic habitat indicator calculations.
The goal of this challenge is to analyse the same datasets with different pipelines. We will provide small datasets that can be run with your personal/favourite analysis method, and ask that you only return the final OTU/ASV tables and taxonomic information resulting from your analysis.
Your participation will be acknowledged in the MARCO-BOLO project reports and on the MARCO-BOLO website. The final comparison will provide clear evidence of how robust and comparable current eDNA practices are, with the aim of increasing our understanding of the method and trust in the results. The datasets will also remain available for further comparisons whenever new workflows are developed.
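As an illustration of what comparing taxon lists across pipelines could look like, the minimal Python sketch below computes the Jaccard similarity between the sets of taxa reported by each pair of pipelines. It is not the comparison method MARCO-BOLO will use, which is not specified here. The pipeline names, the taxon labels and the function names `jaccard` and `pairwise_jaccard` are all hypothetical.

```python
from itertools import combinations


def jaccard(taxa_a, taxa_b):
    """Jaccard similarity of two taxon collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(taxa_a), set(taxa_b)
    union = a | b
    if not union:
        # Two empty lists are treated as identical (an assumption of this sketch).
        return 1.0
    return len(a & b) / len(union)


def pairwise_jaccard(results):
    """Return {(name_1, name_2): similarity} for every pair of pipelines."""
    return {
        (name_1, name_2): jaccard(results[name_1], results[name_2])
        for name_1, name_2 in combinations(sorted(results), 2)
    }


# Hypothetical pipelines and placeholder taxon labels, for illustration only.
example_results = {
    "pipeline_x": ["taxon_1", "taxon_2", "taxon_3"],
    "pipeline_y": ["taxon_2", "taxon_3", "taxon_4"],
}

for pair, score in pairwise_jaccard(example_results).items():
    print(pair, round(score, 2))
```

A presence/absence measure like this ignores read abundances and taxonomic rank, so it only gives a first impression of how much two pipelines agree.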
The Datasets
Based on your preference, you may choose to analyse one or more of the following datasets. The proposed datasets cover an 8-year time series of plankton protist communities in the Western English Channel, identified with the 18S V4 region, and a one-off multi-marker (12S, 16S, COI) assessment of the fish communities inhabiting the Lisbon aquarium.
Plankton 18S time series
This time series was collected to analyse patterns of plankton protist community succession off the French coast in the Western English Channel. Over the course of 8 years (2009-2016), twice-monthly samples of 5 L of seawater were collected at 60 m depth at the SOMLIT-Astan station (Roscoff, Western English Channel). The primers TAReuk454FWD1 and TAReukREV3 were used to amplify the V4 region of the 18S rRNA gene, targeting most eukaryotic groups. This long-term, comprehensive dataset will allow us to explore how biodiversity indicators reflect the temporal structure of community changes.
- 185 samples collected over 8 years
- One primer pair: V4 18S rRNA: TAReuk454FWD1 & TAReukREV3 ~380 bp (Stoeck et al., 2010)
- Reference library: please use the PR2 reference sequence database above version 5.0.0
[Data ownership: This dataset was published and made publicly available by Caracciolo et al. 2022]
Fish 12S/16S/COI aquarium dataset
This dataset consists of triplicate 5 L water volumes collected from the Lisbon Aquarium (Oceanário de Lisboa). Extracted DNA was amplified with multiple primer pairs to target fish species using the 12S, 16S and COI genes. Because a reference list of the species inhabiting the aquarium at the time of sampling exists, primer and pipeline efficiency and accuracy can be compared not only across marker genes but also against the list of species actually present.
- 9 samples (triplicate samples for each primer pair)
- 3 primer pairs:
  - COI: Leray-Lobo ~313 bp (Leray et al. 2013; Lobo et al. 2013)
  - 12S rRNA: MIFISHU-E ~170 bp (Miya et al. 2015)
  - 16S rRNA: Fish16sF/D & 16s2R ~200 bp (Berry et al. 2015)
- Reference libraries: will be provided in fasta format
[Data ownership: A-Fish-DNA-Scan and ME-BARCODE group at the University of Minho. By participating in this challenge, you agree not to use the dataset beyond this exercise. Contact: Filipe Costa]
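To make the idea of checking detections against the aquarium's reference species list concrete, here is a minimal Python sketch that scores a pipeline's detected species against a known list. It assumes both lists are plain species names matched after trimming whitespace and lowercasing. The function name `detection_scores` and the species names are placeholders, not the actual aquarium inventory or the organisers' scoring method.

```python
def detection_scores(detected, reference):
    """Score detected species names against a reference species list."""
    # Normalise names so that case and stray whitespace do not cause mismatches.
    detected_set = {name.strip().lower() for name in detected if name.strip()}
    reference_set = {name.strip().lower() for name in reference if name.strip()}

    true_positives = detected_set & reference_set   # present and detected
    false_positives = detected_set - reference_set  # detected but not on the list
    false_negatives = reference_set - detected_set  # on the list but missed

    # Precision: share of detections that are on the reference list.
    precision = len(true_positives) / len(detected_set) if detected_set else 0.0
    # Recall: share of reference species that were detected.
    recall = len(true_positives) / len(reference_set) if reference_set else 0.0

    return {
        "true_positives": sorted(true_positives),
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
        "precision": precision,
        "recall": recall,
    }


# Placeholder names for illustration only (not real aquarium data).
reference_list = ["Genus alpha", "Genus beta", "Genus gamma", "Genus delta"]
detected_list = ["Genus alpha", "genus gamma ", "Genus omega"]

scores = detection_scores(detected_list, reference_list)
print(scores["false_negatives"], scores["false_positives"])
```

Running the same function separately for each marker (12S, 16S, COI) would give per-primer scores that can be set side by side. Real data would likely also need synonym handling and a decision on how to treat genus-level assignments.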
Contact and help
This challenge is jointly organised by WP2 and WP5 of MARCO-BOLO, with efforts led by Hanneloor Heynderickx (WP5, VLIZ), Saara Suominen (WP2, UNESCO), Daniel Morais (WP2, UIT) and Emilie Boulanger (WP2, UNESCO).
You can direct your questions regarding the challenge to Saara Suominen at s.suominen@unesco.org.
Please reach out if you get stuck at any stage of the challenge and we will help you as best we can.