Cladistics is a quantitative method of classification, applied here to plants, that attempts to recover evolutionary relationships from observable characters.

Since the dawn of history, humans have classified plants. In early cultures, plants were classified by economic use, such as food, clothing, medicine, and shelter. Later the form (morphology) of a plant became important, with plants grouped as trees, shrubs, or herbs.

Carolus Linnaeus considered the similarity of floral parts to be critical, and this formed the basis of his classification system. Each of these systems is said to be “artificial.” That is, the classification was solely for a human purpose and did not attempt to indicate genetic relationships between plants.

Since Charles Darwin, the goal of plant systematics has been to develop a “natural,” phylogenetic classification, one that represents the natural relationships of each species to all others. Cladistics was developed as a method to construct phylogenetic classifications.

A Brief History

Three systems have evolved to aid systematists (scientists who study the phylogenetic relationships of organisms) in their work. Traditional phylogenetics was based on intuition and involved the “art and science” of character weighting.

The scientist studied a group of plants and decided which characters he or she thought were important. Evolutionary relationships were then based on these characters. Individual bias led to disagreements that could not be resolved objectively.

Computer-assisted numerical approaches permitted systematists to employ a more objective methodology and analyze large quantities of data, gathered from a variety of sources that range from traditional morphology to the most sophisticated molecular techniques.

The earliest attempt, phenetics, used computers to determine the degree of total similarity between taxa. Unfortunately, overall similarity cannot distinguish resemblance inherited from a common ancestor from resemblance produced by parallel or convergent evolution.

The methods of cladistics were first formalized in the 1950s and 1960s by Willi Hennig. This approach requires three assumptions to be met: evolution occurs; evolution is monophyletic (that is, lineages derive from a common ancestor); and characteristics passed from generation to generation are either modified or not.

Although phylogenetics is concerned with genealogical relationships, the latter cannot be observed; rather, they must be inferred from observable characters (morphological, biochemical, behavioral, and so on) in much the same way as one infers genotypes when constructing a family pedigree.

Cladistics is a quantitative method that attempts to recover evolutionary relationships, based on observable characters, and presents the resulting phylogeny in the form of a treelike diagram called a cladogram.

When many different organisms are being classified and when many different characters are being analyzed simultaneously, alternative cladograms may result.

The most parsimonious tree (the cladogram requiring the fewest evolutionary changes) is generally preferred, because it is assumed that the simplest pathway is the one most likely to reflect the evolutionary history of the plants being examined.
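The idea of "fewest evolutionary changes" can be made concrete with a short sketch. The code below counts changes with Fitch's method for unordered characters, which is one standard way to do this and not necessarily the one the text has in mind. The taxon names, the coded matrix, and the nested-pair tree representation are all hypothetical and chosen only for illustration.

```python
def fitch(tree, states):
    """Return (possible ancestral states, changes) for one character.

    tree   -- a taxon name (leaf) or a pair (left, right) of subtrees
    states -- dict mapping each taxon name to its coded state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    (lset, lcost), (rset, rcost) = fitch(tree[0], states), fitch(tree[1], states)
    shared = lset & rset
    if shared:
        # The two subtrees can agree on a state: no extra change needed.
        return shared, lcost + rcost
    # The subtrees disagree: at least one change occurred on this level.
    return lset | rset, lcost + rcost + 1


def tree_length(tree, matrix):
    """Total number of changes over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical coded matrix: 0 = state found in the outgroup, 1 = modified state.
matrix = {"Out": (0, 0, 0), "A": (1, 1, 0), "B": (1, 1, 1), "C": (1, 0, 0)}
candidates = [
    ("Out", ("C", ("A", "B"))),
    ("Out", ("A", ("B", "C"))),
]
lengths = [tree_length(t, matrix) for t in candidates]
best = min(candidates, key=lambda t: tree_length(t, matrix))
```

With this toy matrix the first candidate needs three changes and the second needs four, so the first would be preferred under parsimony.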

Constructing a Cladogram

The most important decision to make before beginning construction of a cladogram to represent the relationships among a group of plants is the choice of an appropriate outgroup.

The outgroup cannot belong to the group of plants being analyzed, but it should be closely related. Much of the work of a phylogenetic study is determining an appropriate outgroup to be used for comparisons.

The next step involves construction of a character matrix. A character is any feature of a plant. It may be an observable morphological or biochemical feature or an ecological or physiological attribute.

Every useful character will have more than one character state. For instance, the character “root type” may have the character states “taproot,” “fibrous root,” or “adventitious root.”
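A character matrix can be pictured as a table with one row per taxon and one column per character. The sketch below stores such a table as nested dictionaries and keeps only the characters that show more than one state. The taxa, the extra "growth form" character, and the function names are assumptions made for illustration.

```python
def character_states(matrix):
    """Collect the set of states observed for each character."""
    states = {}
    for row in matrix.values():
        for character, state in row.items():
            states.setdefault(character, set()).add(state)
    return states


def useful_characters(matrix):
    """Characters with more than one state across the taxa."""
    return [c for c, s in character_states(matrix).items() if len(s) > 1]


# Illustrative matrix: every taxon is a tree, so "growth form" cannot
# separate them and is not a useful character here.
matrix = {
    "conifer": {"growth form": "tree", "reproductive structure": "cone",
                "root type": "taproot"},
    "oak":     {"growth form": "tree", "reproductive structure": "flower",
                "root type": "taproot"},
    "palm":    {"growth form": "tree", "reproductive structure": "flower",
                "root type": "fibrous root"},
}
useful = useful_characters(matrix)
```

Here `useful` holds "reproductive structure" and "root type"; "growth form" is dropped because it has a single state.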

Characters having a common origin are called homologous. Cladistic analysis recognizes two types of homologies: plesiomorphies and apomorphies. Plesiomorphies are considered to be the primitive state of a character; that is, the character is unchanged from the ancestral condition.

Plesiomorphies are determined by comparison of the character states in the members of the taxa being investigated with the character state in the outgroup. A character state found in both the outgroup and the taxa being examined is considered to be plesiomorphic.

Any modification of the character state is considered to be apomorphic; thus, apomorphies are derived from plesiomorphies. Apomorphies shared by two or more taxa are called synapomorphies. Identification of synapomorphies, assumed to be derived from increasingly recent common ancestors, provides the basis for constructing cladograms.
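Outgroup comparison as described above can be sketched in a few lines: the outgroup's state is treated as plesiomorphic, any other state as apomorphic, and an apomorphy found in two or more ingroup taxa as a synapomorphy. The taxa and function names are hypothetical, and the sketch assumes a single outgroup taxon with one state per character.

```python
def polarize(matrix, outgroup):
    """Map each character to {apomorphic state: [ingroup taxa showing it]}."""
    ingroup = [taxon for taxon in matrix if taxon != outgroup]
    result = {}
    for character, ancestral in matrix[outgroup].items():
        derived = {}
        for taxon in ingroup:
            state = matrix[taxon][character]
            if state != ancestral:          # differs from the outgroup
                derived.setdefault(state, []).append(taxon)
        result[character] = derived
    return result


def synapomorphies(polarized):
    """Keep only apomorphies shared by two or more taxa."""
    return {
        (character, state): taxa
        for character, states in polarized.items()
        for state, taxa in states.items()
        if len(taxa) >= 2
    }


# Illustrative matrix; the conifer is the outgroup.
matrix = {
    "conifer":      {"reproductive structure": "cone",   "root type": "taproot"},
    "oak":          {"reproductive structure": "flower", "root type": "taproot"},
    "maple":        {"reproductive structure": "flower", "root type": "taproot"},
    "date palm":    {"reproductive structure": "flower", "root type": "fibrous root"},
    "coconut palm": {"reproductive structure": "flower", "root type": "fibrous root"},
}
shared = synapomorphies(polarize(matrix, "conifer"))
```

In this example "flower" groups all four ingroup taxa and "fibrous root" groups the two palms, while "taproot" is plesiomorphic and defines no group.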

Qualitative Approach

The first step toward a qualitative approach to constructing a cladogram is to examine the character matrix and list groupings of taxa according to the apomorphic trait for each character. Next, one character to begin the tree is chosen.

Any character will do, but it is simplest to begin with a character in which only the outgroup has the plesiomorphic state and all ingroup taxa share the same apomorphy. For instance, a conifer might be the outgroup for classifying flowering trees.

The plesiomorphic reproductive structure would be a cone, and the synapomorphy shared by all ingroup members would be flowers. The tree would have the conifer at the base, with a single line extending to the right to a branch point (node) from which all ingroup members diverge.

The character state “flower” would be placed on the line between the conifer and the node, indicating that the shared character state, flowers, evolved prior to the divergence of ingroup taxa from one another.

Next, a second character is added to the existing tree. For instance, the conifer and dicot trees would all share the plesiomorphic character of a taproot, but monocot trees, such as palms, would have fibrous roots.

The tree should now be extended to the right to form a second node with the character state “fibrous roots” added to the new stem segment and the monocot trees branching off the second node.

The monocot taxa diverged from each other after fibrous roots evolved. The dicots do not have fibrous roots, so they are diagramed at the node to the left of “fibrous roots.” The tree is continued by the addition of one character at a time until all have been used.
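The conifer example can be written out as a small sketch that nests groups of taxa inside one another, largest group first. It assumes the groups are either nested or disjoint (no conflicting characters), uses hypothetical taxon names, and prints the tree as a parenthesized string rather than a drawing.

```python
def build_tree(taxa, groups):
    """Nest taxa according to groups defined by shared apomorphies.

    taxa   -- set of taxon names to arrange
    groups -- list of frozensets, each the taxa sharing one apomorphy
    """
    taxa = frozenset(taxa)
    if len(taxa) == 1:
        return next(iter(taxa))
    # Largest groups strictly inside this set become its branches.
    inner = sorted((g for g in groups if g < taxa), key=len, reverse=True)
    children, covered = [], set()
    for g in inner:
        if not (g & covered):               # skip groups already placed
            children.append(g)
            covered |= g
    parts = [build_tree(g, groups) for g in children]
    parts += sorted(taxa - covered)         # taxa lacking any further apomorphy
    return "(" + ",".join(sorted(parts)) + ")"


all_taxa = {"conifer", "oak", "maple", "date_palm", "coconut_palm"}
groups = [
    frozenset({"oak", "maple", "date_palm", "coconut_palm"}),  # flowers
    frozenset({"date_palm", "coconut_palm"}),                  # fibrous roots
]
tree = build_tree(all_taxa, groups)
```

The result, `(((coconut_palm,date_palm),maple,oak),conifer)`, places the two palms on their own branch within the flowering group, leaves the dicots at the earlier node, and keeps the conifer at the base.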

Quantitative Approach

The qualitative approach becomes increasingly difficult as the number of taxa and number of characters are increased. The advantages of the quantitative approach are that the process can be automated and human bias can be minimized. The following example illustrates “by hand” the way computers can be programmed to produce a cladogram.

The first step is to code the character matrix to produce a numerical matrix for analysis. Plesiomorphic character states in the data matrix are coded as 0; different apomorphic character states are coded as successive integers, 1, 2, 3, and so on.
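The coding step might look like the following sketch. It treats the outgroup's state as plesiomorphic (0) and numbers other states in order of first appearance; that ordering is an assumption, as are the taxon labels A, B, and C and the function name.

```python
def code_matrix(matrix, outgroup):
    """Convert character states to integers, with the outgroup state as 0."""
    characters = list(matrix[outgroup])
    coded = {taxon: [] for taxon in matrix}
    for character in characters:
        codes = {matrix[outgroup][character]: 0}    # plesiomorphic state
        for taxon in matrix:
            state = matrix[taxon][character]
            if state not in codes:
                codes[state] = len(codes)           # next unused integer
            coded[taxon].append(codes[state])
    return coded


# Hypothetical taxa A, B, C compared with an outgroup.
matrix = {
    "outgroup": {"reproductive structure": "cone",   "root type": "taproot"},
    "A":        {"reproductive structure": "flower", "root type": "taproot"},
    "B":        {"reproductive structure": "flower", "root type": "fibrous root"},
    "C":        {"reproductive structure": "flower", "root type": "adventitious root"},
}
coded = code_matrix(matrix, "outgroup")
```

Each taxon becomes a row of integers: the outgroup is all zeros, and taxon C, for example, is coded 1 for flowers and 2 for adventitious roots.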

The simplest cladogram consists of a Y-shaped diagram representing three taxa, two from the group being studied and a third being the outgroup. The outgroup is placed at the bottom and serves to “root” the tree. The two ingroup species are located at the top of each arm.

The point where the two arms diverge is a node; it represents the common ancestor of the two ingroup taxa. A numerical algorithm infers what the character states of this ancestral species must have been.
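For the three-taxon tree, one simple rule for the node is to take, for each character, the state shown by at least two of the three taxa attached to it, since that choice requires the fewest changes on the three branches. The sketch below applies that rule and falls back on the outgroup state when all three differ. The fallback, the function name, and the coded rows are assumptions for illustration; the text does not specify the algorithm.

```python
from collections import Counter


def ancestral_states(outgroup, taxon_a, taxon_b):
    """Infer node states for a Y-shaped tree from three coded rows."""
    node = []
    for o, a, b in zip(outgroup, taxon_a, taxon_b):
        state, count = Counter([a, b, o]).most_common(1)[0]
        # A state seen at least twice needs only one change; if all three
        # differ, every choice needs two, so keep the outgroup state.
        node.append(state if count >= 2 else o)
    return node


# Hypothetical coded rows (0 = plesiomorphic state).
node = ancestral_states([0, 0, 0], [1, 1, 0], [1, 0, 1])
```

Here the ancestor is inferred to have the derived state only for the first character, which both ingroup taxa share.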

Additional taxa can now be added to the cladogram, one at a time. A series of new trees is constructed in which the new taxon is attached to each existing branch in turn, that is, between each existing taxon or node and the node below it.

There are three places a fourth taxon could be added to the simple tree: between the root and the node, between the node and the first ingroup taxon, and between the node and the second ingroup taxon.
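The possible positions can be enumerated mechanically. In this sketch the ingroup part of the tree is a nested pair, the outgroup is understood to sit at the root below it, and the new taxon is attached to each branch in turn. The representation and the names are hypothetical.

```python
def placements(subtree, new_taxon):
    """Yield each tree made by attaching new_taxon to one branch of subtree."""
    # Attach on the branch leading down from this subtree.
    yield (subtree, new_taxon)
    if isinstance(subtree, tuple):
        left, right = subtree
        for alt in placements(left, new_taxon):
            yield (alt, right)
        for alt in placements(right, new_taxon):
            yield (left, alt)


# Two ingroup taxa above the root: three places for a third ingroup taxon.
three = list(placements(("A", "B"), "C"))

# After one of those trees is chosen, the next taxon has five places.
five = list(placements((("A", "C"), "B"), "D"))
```

The first call yields the placements on the root branch, on the branch to A, and on the branch to B. Scoring each candidate with a tree-length function and keeping the shortest gives one step of the procedure.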

An algorithm computes which of the three possible trees is the most parsimonious, and this tree is used as the basis for adding the fifth taxon, which can now occupy any of five positions, one on each branch of the four-taxon tree. This process is continued until all taxa have been added to the tree and the cladogram is complete.