Lab 5: Bioinformatics

This lab is divided into three standalone modules. Prior Sanger sequencing is not required – example sequences are provided for each module. Allow approximately one class period for each Lab Activity.

Required resources: DNA Sequence Files

I. Sanger Sequence Analysis

Goal: To analyze and interpret the quality of Sanger sequences, and generate a consensus DNA sequence for bioinformatics analyses.


  • Lab Activity: Bioinformatics I: Sanger Sequence Analysis (10/21) (.docx / .pdf)
  • Quick Sheet: A quick reference sheet for students who are already familiar with NCBI, Sanger sequencing, and SnapGene Viewer. (10/21) (.docx / .pdf)


II. NCBI Taxonomy & BLAST Searching

Goals: (1) To show the ways in which the NCBI online database classifies and organizes information on DNA sequences, evolutionary relationships, and scientific publications. (2) To identify an unknown nucleotide sequence from an arthropod endosymbiont using the NCBI search tool BLAST.



  • Lab Activity: Bioinformatics II: NCBI Taxonomy & BLAST Searching (6/21) (.pptx / .pdf)



III. Phylogenetics

NEW! These lab activities have not yet been widely tested. We appreciate your feedback on classroom implementation.

Goals: (1) To generate a Wolbachia phylogenetic tree. (2) To determine the relatedness of an unknown sequence to those of known Wolbachia strains and identify a putative Supergroup designation. (3) To generate an arthropod phylogenetic tree using the barcoding gene COI.


  • Lab Activity (optional): Bioinformatics III: Wolbachia Identification and Naming (optional introductory activity for advanced Wolbachia studies) (.docx / .pdf)
  • Lab Activity: Bioinformatics III: Wolbachia Phylogenetics (.docx / .pdf)
  • Lab Activity: Bioinformatics III: Arthropod Phylogenetics (.docx / .pdf)



Sanger Sequencing

Q: What causes a low-quality run?

A: Many variables might contribute to a low-quality Sanger run. These include, but are not limited to:

  • Non-specific primer binding
  • Contamination with other samples during DNA extraction and/or PCR
  • Arthropod-specific: Amplification of both the COI gene and nuclear mitochondrial pseudogenes (numts)
  • Wolbachia-specific: The arthropod is infected with more than one Wolbachia strain (co-infection)
  • Not enough DNA template
  • DNA degradation
  • Inhibitory contaminants

If both directions are low-quality, we recommend searching the literature for your specific arthropod to check whether: (i) numts are a known issue; (ii) optimized primers have been developed; (iii) Wolbachia co-infections have been described.
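As a rough illustration of what "low-quality" means numerically, the sketch below summarizes a list of per-base Phred quality scores. A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a 1-in-100 chance that the base call is wrong. The Q20 cutoff and the example scores are assumptions chosen for illustration; they are not taken from the lab activity, and chromatogram viewers may display quality differently.

```python
def error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def summarize_quality(quals, threshold=20):
    """Return (mean quality, fraction of bases at or above the threshold)."""
    if not quals:
        raise ValueError("no quality scores supplied")
    mean_q = sum(quals) / len(quals)
    frac_high = sum(q >= threshold for q in quals) / len(quals)
    return mean_q, frac_high


# Hypothetical scores: noisy ends with a clean middle, a common Sanger pattern.
example_quals = [8, 12, 35, 40, 42, 38, 41, 15, 9, 6]
mean_q, frac_high = summarize_quality(example_quals)
print(f"mean Q = {mean_q:.1f}, fraction >= Q20 = {frac_high:.2f}")
```

A read in which most bases fall below the chosen threshold is a candidate for the troubleshooting steps listed above.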

Q: Are both forward and reverse sequences necessary?

A: This depends on the overall goal of your project.

  • If the sequences are intended for an online data repository (such as GenBank) or publication, both forward and reverse reactions are highly recommended – but not required – to validate quality.
  • If the sequences are part of an informal and/or pilot study, sequence data from one direction is recommended to reduce costs.
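To show how a reverse read can validate a forward read, here is a minimal sketch. It assumes both reads have already been trimmed to exactly the same region and are the same length, which real reads rarely are without an alignment step. The function names and example sequences are hypothetical, and this is not the procedure that SnapGene Viewer or any other program uses.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def simple_consensus(forward, reverse_read):
    """Compare a forward read with the reverse complement of a reverse read.

    Positions where the two reads agree keep that base; disagreements
    are marked 'N' so they can be checked against the chromatograms.
    """
    rc = reverse_complement(reverse_read).upper()
    forward = forward.upper()
    if len(forward) != len(rc):
        raise ValueError("reads must cover the same region (equal length)")
    return "".join(f if f == r else "N" for f, r in zip(forward, rc))


# Hypothetical reads that disagree at one position.
forward_read = "ATGCCGTA"
reverse_read = "TACTGCAT"  # reverse complement is ATGCAGTA
consensus = simple_consensus(forward_read, reverse_read)
print(consensus)
```

With only one direction sequenced, there is no second read to flag such disagreements, which is why two directions are preferred for data intended for a repository or publication.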


Q: Why doesn’t my sequence fall within a clade on the phylogenetic tree?

A: First, review the original Sanger sequence (chromatogram) file. If the peaks are low-quality, the base calls may not reflect the actual sequence of your organism, and the tree-building program may be unable to place it. If the peaks are high-quality, there are a few possible explanations:

  • Your sequence may genuinely not belong to any clade on the tree. The reference files used for this activity include a small subset of the total Wolbachia/Arthropod biodiversity. We recommend performing a BLASTN search with your sequence to find the most closely related organisms. You can copy-paste the top hits into your FASTA file to generate a more robust tree.
  • With such a small DNA sequence, there may not be enough information to properly resolve your organism. In this case, it is best practice to sequence additional genes (or the complete genome) to generate a more robust phylogenetic tree.
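For classes that prefer to script the copy-paste step, the sketch below appends BLAST hits saved in FASTA format to an existing set of reference sequences, skipping any record whose header is already present. The record names and sequences are hypothetical placeholders, and the function names are our own. The combined file still needs to go through the same alignment and tree-building steps as the original reference file.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_fasta(reference_text, hits_text):
    """Append hit records whose headers are not already in the reference set."""
    merged = parse_fasta(reference_text)
    seen = {header for header, _ in merged}
    for header, seq in parse_fasta(hits_text):
        if header not in seen:
            merged.append((header, seq))
            seen.add(header)
    return "".join(f">{header}\n{seq}\n" for header, seq in merged)


# Hypothetical inputs: one BLAST hit duplicates an existing reference record.
reference = ">ref_strain_A\nATGCATGC\n>my_unknown\nATGCTTGC\n"
blast_hits = ">hit_1\nATGC\nTTGA\n>ref_strain_A\nATGCATGC\n"
combined = merge_fasta(reference, blast_hits)
print(combined)
```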