Bioinformatics Class Project Day 1

Author

Solomon Chak

Published

April 17, 2023

Background

Transposable elements (TEs) are mobile genetic elements that can copy or cut and paste themselves within the genome (Bourque et al. 2018). TEs are an important contributor to genome size differences among eukaryotes. They are also an important source of genetic mutation: a TE insertion can break up genetic elements or even shuffle them around the genome. TEs are also increasingly documented to fuel adaptive evolution (Schrader & Schmitz 2018).

TEs are more abundant in eusocial species of snapping shrimps in the genus Synalpheus (Chak et al. 2021). However, it is unclear whether TEs are involved in the evolution of eusociality, although this has been suggested in other eusocial groups such as termites (Korb et al. 2015).

One of the most characteristic features of eusocial societies is the reproductive division between queens and workers (Wilson 1971). Queens are usually the only reproductive individuals in a colony, while workers do not reproduce. Instead, depending on the species, workers perform tasks such as colony defense, foraging, feeding the young or the queen, and colony maintenance. In some eusocial species, workers even become sterile for life.


Objective

The objective of this class project is to determine whether TEs contribute to the evolution of eusociality, especially to the differentiation between workers and queens.

Raw Data

We have raw RNA-seq data from RNA extracted from three workers and three queens of a eusocial shrimp species.


Discussion

Think and share about:

  1. Based on the introductory text above, what is the hypothesis that we are trying to test?

  2. How can the raw data be used to test the above hypothesis?

  3. What software is needed to process the raw data?


How to process the raw data?

We can run a differential gene expression analysis on the RNA-seq data from three workers and three queens. This will allow us to test which genes are up- or down-regulated in queens (vs. workers).
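To make "up- or down-regulated in queens (vs. workers)" concrete, here is a minimal Python sketch of the underlying comparison. It is not the Galaxy protocol used in this project: the transcript names, the read counts, and the helper functions `counts_per_million` and `log2_fold_change` are all hypothetical and chosen only for illustration. The sketch normalizes counts by library size and reports a log2 fold change. A real differential expression analysis would also apply a statistical test that accounts for variation among replicates.

```python
import math

# Hypothetical raw read counts per transcript (toy values, not real data).
# Column order: three queens, then three workers.
counts = {
    "transcript_A": [120, 135, 110, 30, 28, 35],
    "transcript_B": [50, 45, 55, 52, 48, 50],
    "transcript_C": [10, 12, 8, 80, 95, 70],
}
groups = ["queen", "queen", "queen", "worker", "worker", "worker"]


def counts_per_million(counts):
    """Scale each sample's counts by its library size (total reads in that sample)."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        name: [row[i] / lib_sizes[i] * 1e6 for i in range(n_samples)]
        for name, row in counts.items()
    }


def log2_fold_change(cpm, groups, numerator="queen", denominator="worker",
                     pseudocount=1.0):
    """Return log2(mean CPM in numerator group / mean CPM in denominator group)."""
    result = {}
    for name, row in cpm.items():
        num = [v for v, g in zip(row, groups) if g == numerator]
        den = [v for v, g in zip(row, groups) if g == denominator]
        mean_num = sum(num) / len(num)
        mean_den = sum(den) / len(den)
        # The pseudocount avoids division by zero for unexpressed transcripts.
        result[name] = math.log2((mean_num + pseudocount) / (mean_den + pseudocount))
    return result


cpm = counts_per_million(counts)
lfc = log2_fold_change(cpm, groups)
for name, value in sorted(lfc.items()):
    direction = "up in queens" if value > 0 else "down in queens"
    print(f"{name}\t{value:+.2f}\t{direction}")
```

A positive value means the transcript is more highly expressed in queens, and a negative value means it is more highly expressed in workers. With only three replicates per caste, a fold change alone does not show that a difference is statistically supported.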

The raw data are slightly different from those used in Day 14 & 15. The key difference is that in the project we are mapping the RNA-seq data to an annotated transcriptome assembly, instead of a well-annotated genome (as in Day 14/15). Therefore, the Galaxy protocol is also different from before.

Files                  Day 14 & 15                               Project
Samples (sequences)    Infected vs. non-infected (single-end)    Queen vs. worker (paired-end)
Reference              Genome assembly                           Transcriptome assembly
Annotation             Genome annotation                         None

Go to the “S.eliz_RNAseq_tutorial” on Brightspace or here.