
Bioinformatics and ‘omics data science

Molecular approaches, such as next-generation DNA sequencing, are being used in many fields of the life sciences. Technological advancements enable the generation of data of steadily increasing volume and complexity. While this development brings challenges in data analysis and management, it also offers opportunities for the application of data-driven approaches that rely on the integration of large amounts of data. To tackle both these challenges and opportunities, in this working group we focus on: (i) bioinformatic workflow optimization and automation, (ii) sequence data management and archiving, (iii) large-scale sequence data integration and reuse.

There are petabytes of data, billions of sequences, and millions of samples for which data have been collected and deposited in public archives. Very often, the study that collected the data only scratches the surface of the information contained therein, leaving vast amounts of data unexplored. Therefore, there is a large and so far mostly untapped potential in reusing such data sets for purposes other than those for which they were originally generated. Here, we work closely with other groups within the department, providing the methodological toolbox and bioinformatic expertise to mine existing sequence data resources to address relevant research questions in biological oceanography. Current collaborations include:

  • Assessment of the global distribution of *Vibrio vulnificus* in coastal areas by the integration and reuse of tens of thousands of publicly available 16S amplicon sequencing data sets (in collaboration with WG Environmental Microbiology)
  • Global biogeography of the toxic dinoflagellate *Alexandrium pseudogonyaulax*, which is invasive in the Baltic Sea, using publicly available marker gene sequences as well as long-read amplicon and metagenomic data (in collaboration with WG Phytoplankton Ecology)
  • Pangenome analysis and ecological differentiation of sulfur-oxidizing bacteria of the family Beggiatoaceae via the analysis of publicly available genome assemblies (in collaboration with WG Geomicrobiology)

Furthermore, we are dedicated to FAIR research data management as a prerequisite for the successful integration and reuse of DNA sequencing data. In this context, we are active in regional (ORDS-MV, FDM-MV) and national (NFDI4Biodiversity, GFBio e.V.) networks and work closely with the IT and data management group at IOW.

As a partial infrastructure unit, we further offer the following services:

  • Consulting on DNA/RNA sequencing projects (sampling/experimental design, sequencing approaches, bioinformatic and statistical data analysis)
  • Support of sequencing projects by running routine bioinformatic sequence analysis
  • Provision of scripts and workflows (and tutorials) for common sequencing approaches
  • Regular teaching and training workshops
  • Coordination of sequence data management
  • Maintenance of the software and computing environment at IOW (in coordination with and supported by the IT group and the Department of Physical Oceanography)

Further material:

Documentation and training materials can be found on the IOW gitea (disclaimer: some of the repositories are for IOW-internal use only).

Additionally, courses and workshops are offered on demand. To request one, please leave a suggestion in a comment (or vote for existing suggestions) on this issue tracker.

Open coding sessions:

The open coding sessions take place every Wednesday from 2 to 4 pm.

They are an informal space to discuss any topics related to bioinformatics, coding, data management, statistical analyses, visualization, etc. The idea is that if you join the open coding sessions, you continue to work on your data just as you would anyway, so that no time is lost by joining. However, if you encounter an issue, there will be other people around to help you out immediately. Even if you do not have a problem yourself, please consider joining the open sessions, as your expertise may help others. They are also a great platform for networking and getting a peek into what others are doing at IOW.

Alternatively, you can of course also collect questions to put to the community beforehand. For this, you can use the rocket.chat channel for the open coding sessions. Announcements about upcoming coding sessions will also be shared there.

Accompanying the weekly open coding sessions, there is also an IOW gitea repository to document the code solutions to the problems discussed in the sessions and the chat.

The open coding sessions are open to all, regardless of level and expertise. No preparation is required. The open coding sessions are again taking place online via Zoom or, if there is interest, in a hybrid format (room TBD for each session).

 

 

Team