Maximum Parsimony tutorial using PAUP

Step 1: Downloading and installing software

For this tutorial the programs we will use are SeaView, PAUP, and the text editor of your choice. SeaView has many uses, including:

  • Viewing molecular sequences
  • Algorithmic alignment of molecular sequences
  • Manual editing and alignment of molecular sequences
  • Estimating phylogenetic trees from molecular sequences
  • Viewing phylogenetic trees

If you are running Windows or macOS, you can download the latest version of SeaView from the SeaView web site. If you are running Ubuntu, then SeaView is available from the package manager. You can install it from the “Ubuntu Software” GUI, or manually using apt install seaview.

PAUP is used to infer trees from molecular data, and incorporates many different methods and models for doing so, including the maximum parsimony method used in this tutorial.

If you are running macOS or Linux, please download the latest command line version of PAUP for your platform from the PAUP test-version downloads web site. Extract PAUP, and make sure the program is executable by opening the command line, navigating to the directory it was stored in, and running chmod +x paup4a164_ubuntu64 on Ubuntu or chmod +x paup4a164_osx on macOS. If you are running Windows, download the Windows GUI version from the same web site.

If you do not have a favorite text editor already, I recommend Sublime Text or Visual Studio Code. You can download and install either program from their respective web sites.

After downloading the software, download the workshop materials archive to your computer, and extract its contents.

Step 2: Exploring the true tree and sequence data

Launch SeaView, and then open the fz.tree file in the phylogenetics-workshop folder. This will show you an ultrametric tree that was randomly generated for this workshop (using a coalescent model).
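"Ultrametric" means that every tip is the same distance from the root. As a rough illustration only (this does not read fz.tree, and the tiny tree and the function names below are made up for the example), the property can be checked like this in Python:

```python
def tip_depths(node, depth=0.0):
    """Yield the root-to-tip distance of every tip at or below this node.

    A node is a pair (label_or_children, branch_length): tips carry a
    string label, internal nodes carry a list of child nodes.
    """
    label_or_children, branch_length = node
    depth += branch_length
    if isinstance(label_or_children, str):
        yield depth
    else:
        for child in label_or_children:
            yield from tip_depths(child, depth)


def is_ultrametric(root, tolerance=1e-9):
    """Return True if all tips are equally far from the root."""
    depths = list(tip_depths(root))
    return max(depths) - min(depths) <= tolerance


# Hypothetical three-taxon trees, not the workshop tree.
clock_like = ([([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)], 0.0)
not_clock_like = ([([("A", 1.0), ("B", 1.0)], 2.0), ("C", 2.5)], 0.0)

print(is_ultrametric(clock_like))
print(is_ultrametric(not_clock_like))
```

In SeaView you can see the same property visually: all the tips of the true tree line up at the right-hand edge.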

Still in SeaView, open the multiple sequence alignment file. This is a 100,000 character alignment generated based on the tree you just opened, and using a Jukes-Cantor model of molecular evolution.
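To give a feel for what the Jukes-Cantor model does, here is a minimal Python sketch that evolves a short sequence along a single branch. It is not the procedure used to generate the workshop alignment; the function names, the seed, and the branch length of 0.1 substitutions per site are assumptions made for illustration. Under Jukes-Cantor, a site has the same base at both ends of a branch of length t with probability 1/4 + 3/4 exp(-4t/3), and otherwise shows one of the three other bases with equal probability.

```python
import math
import random

BASES = "ACGT"


def jc69_same_probability(branch_length):
    """Probability that a site has the same base at both ends of a branch.

    branch_length is measured in expected substitutions per site.
    """
    return 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)


def evolve_sequence(sequence, branch_length, rng):
    """Evolve a sequence along one branch under the Jukes-Cantor model."""
    p_same = jc69_same_probability(branch_length)
    evolved = []
    for base in sequence:
        if rng.random() < p_same:
            evolved.append(base)
        else:
            # Under Jukes-Cantor the three other bases are equally likely.
            evolved.append(rng.choice([b for b in BASES if b != base]))
    return "".join(evolved)


rng = random.Random(42)  # arbitrary seed, used only for reproducibility
ancestor = "".join(rng.choice(BASES) for _ in range(20))
descendant = evolve_sequence(ancestor, 0.1, rng)
print(ancestor)
print(descendant)
```

Simulating a full alignment repeats this step down every branch of the tree, starting from a random sequence at the root.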

Step 3: Inferring the maximum parsimony tree with PAUP

We will use PAUP to infer a phylogenetic tree. Open the command line on your computer, and navigate to the extracted phylogenetics-workshop folder. On Windows, run paup. On macOS or Linux, replace paup with the path to the PAUP executable on your computer. For example, if you saved it to the Downloads folder on a Mac, this might be ~/Downloads/paup4a164_osx. Run the following lines of PAUP code:

  Set Criterion=Parsimony;

This tells PAUP that the parsimony score of a tree should be used to judge its goodness of fit.
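The parsimony score of a tree is the minimum number of character changes needed to explain the alignment on that tree. As a sketch of the idea (this is Fitch's algorithm written in Python for illustration, not PAUP's own code, and the four-taxon alignment and tree shapes are made up), the score can be computed like this:

```python
def fitch_site(tree, bases):
    """Return (possible_states, changes) for one site on a binary tree.

    A tree is either a taxon name (a tip) or a pair of subtrees.
    bases maps each taxon name to its base at this site.
    """
    if isinstance(tree, str):
        return {bases[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, bases)
    right_set, right_changes = fitch_site(right, bases)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no extra change is needed.
        return common, left_changes + right_changes
    # The children disagree, so at least one change happened here.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over every site."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        bases = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, bases)[1]
    return total


# Hypothetical toy alignment, not the workshop data.
alignment = {"A": "ACGT", "B": "ACGA", "C": "GCTA", "D": "GCTT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, alignment))  # 4
print(parsimony_score(tree_2, alignment))  # 6
```

Here tree_1 needs fewer changes than tree_2, so it is the better fit under the parsimony criterion. The trees are written with a root only for convenience; the score does not depend on where the root is placed.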

  BandB;

This command identifies the best-fitting tree according to the parsimony criterion. Normally we have to use some kind of heuristic or stochastic search, such as hill-climbing or MCMC, to infer trees, because the number of possible trees is so large. Because this data set has relatively few taxa (12, with 100,000 sites), we can instead use an exact “branch-and-bound” algorithm.
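To see how quickly the number of possible trees grows, the short Python sketch below counts the unrooted, fully bifurcating trees for a given number of taxa using the standard formula (2n - 5)!!, the product of the odd numbers from 3 to 2n - 5. The function name is made up for this example.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted binary tree topologies for n_taxa tips."""
    if n_taxa < 3:
        raise ValueError("an unrooted binary tree needs at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 8, 12, 20):
    print(n, count_unrooted_trees(n))
```

For 12 taxa this gives 654,729,075 topologies. An exact search is feasible at this size mainly because branch-and-bound can discard large groups of trees without scoring each one, and the count grows much faster than that as taxa are added.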

  SaveTrees file=mp.tree replace=yes;

This saves the inferred tree to a file named mp.tree, replacing any existing file with that name.

  Quit;

This exits PAUP.

Step 4: Exploring the inferred tree

Open the inferred tree in SeaView. Make sure the true tree is still open. The kind of inference we used produces an unrooted tree without branch lengths, so you may have to reroot it or rotate nodes in SeaView. Experiment with the “Swap” and “Re-root” options in SeaView so that the trees match.

Which nodes, if any, differ between the true tree and the estimated tree topology?
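If you would like to check your answer by a method other than eye, one common approach is to compare the splits (bipartitions of the taxa) implied by each tree, since these do not depend on where the tree is rooted or how nodes are rotated. The Python sketch below illustrates the idea on small made-up trees written as nested tuples; it does not read fz.tree or mp.tree, and the function names are assumptions for this example.

```python
def leaf_names(tree):
    """Return the set of taxon names at or below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_names(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits of a tree, ignoring rooting.

    Each split is stored as the side that does not contain a fixed
    reference taxon, so both rootings of a branch give the same set.
    """
    all_leaves = leaf_names(tree)
    anchor = min(all_leaves)  # arbitrary but consistent reference taxon
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if anchor not in below else all_leaves - below
        # Skip splits that separate a single taxon (or nothing) from the rest.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


# Hypothetical five-taxon trees, not the workshop trees.
true_tree = ((("A", "B"), "C"), ("D", "E"))
rerooted = ("A", ("B", ("C", ("D", "E"))))
different = ((("A", "C"), "B"), ("D", "E"))

print(splits(true_tree) == splits(rerooted))
print(len(splits(true_tree) ^ splits(different)))
```

The first comparison shows that re-rooting leaves the set of splits unchanged. The second counts the splits found in one tree but not the other, which is the idea behind the Robinson-Foulds distance.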