Smith Woodward embarked on the production of a catalogue of fossil fishes half a century after Louis Agassiz began a similar exercise. These two palaeontological goliaths remain the only authorities who saw all relevant fossil fish material in all important collections. Between their works there was a substantial increase in the number of species recognized, reflecting the nineteenth-century passion for collecting, the rise of museums, as well as an acceptance that species change through time. Agassiz was working in pre-evolutionary days but Smith Woodward’s view on fish diversity was strongly influenced by the theory of evolution and specifically the writings of Thomas Henry Huxley as well as those of Edward Drinker Cope and Ramsay Heatley Traquair and their ideas of grades of evolution. Many of Smith Woodward’s generic and species descriptions survive today as his lasting legacy. Higher classification has changed considerably with new discoveries and differing methods of classification.
The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but a rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic wastebasket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree.
The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing diversity of fishes.
“…With the variety of both primitive and advanced teleosts living today, we are most emphatically of the opinion that approaches other than morphological ones would be exceedingly fruitful in the investigation of teleostean interrelationships…”
Greenwood et al. (1966)1
Our view of the phylogeny and classification of bony fishes is rapidly changing under the influence of molecular phylogenetic studies based on larger and more taxonomically comprehensive datasets. Classification schemes displayed in widely used textbooks on fish biodiversity (e.g.,2,3) have been based on loosely formulated syntheses (supertrees) and community consensus views of largely disconnected studies. The phylogenetic structure underpinning such classifications has many areas that are notably unresolved and poorly known, providing weak or no justification for many groups that, although formally recognized, are implicitly known to be polyphyletic (e.g., percoids, perciforms, scorpaeniforms). A comprehensive phylogenetic tree for all major groups of fishes has been elusive because explicit analyses including representatives across their diversity have never been accomplished. Detailed morphological cladistic investigations of fish relationships have typically focused on lower taxonomic scales, and the few attempts to synthesize morphology at higher taxonomic levels proved challenging and met with limited success (e.g.,4). A recent effort to systematically collect morphological synapomorphies from published records for all currently recognized groups resulted in the first teleost classification based on monophyletic groups5. This effort, however, did not produce a global phylogenetic hypothesis. Similarly, molecular analyses have been limited in genetic coverage and taxonomic sampling, and their results have often conflicted.
As predicted by Greenwood et al.1, development of molecular markers, especially sequences of mitochondrial DNA (mtDNA) genes or complete mitochondrial genomes, catalyzed new views of bony fish relationships by providing a common yardstick of phylogenetic information across vast taxonomic scales6,7,8. Studies based on mitogenomic data proliferated to methodically probe conflicting hypotheses of relationship for several groups at diverse taxonomic levels, often proposing alternative arrangements supporting new clades unsuspected by previous classifications10,11,12,13. In spite of their powerful new insights, mitogenomic hypotheses were not universally embraced because they represent information from a single locus, prompting corroboration from additional genomic regions. Several nuclear DNA markers were subsequently developed and applied to infer bony fish relationships. The most popular ones include the 28S ribosomal subunit14,15,16, tmo4c417,18, rhodopsin19,20, rag1 and rag221,22, mll20, irbp23, and rnf21324. A systematic approach to scanning genomic databases made a larger set of nuclear markers available in 2007 25, opening a new window to obtaining large multilocus datasets25,26,27,28,66. Recent studies using between 10 and 20 of these nuclear markers for a few hundred taxa27,28,29,30,31,66 have shown improved resolution of phylogenetic relationships at higher and lower taxonomic levels. Many but not all of the mitogenomic hypotheses received support from nuclear gene data, but the discovery of new clades continued with increasing taxonomic sampling. Initially identified by letters (e.g., clades A, B, C, etc.19,23,32), new names were recently proposed for many groupings supported by molecular evidence, such as Stiassnyiformes, Zeioigadiformes, Carangimorpha, Cottimorpha, Ovalentaria, Gobiiformes, etc.24,31,33. Validation of these groups (and their proposed names) is pending until a comprehensive study including all taxa is produced.
Proliferation of new names is useful for identification of the newly discovered groups, but may create confusion if not systematically organized into a global classification.
Molecular phylogenetic methods (e.g., BEAST34) in combination with fossil evidence also opened a new temporal window to understand bony fish diversification. Attempts to estimate divergence dates among crown-group lineages using this approach (e.g., 35,36,37) frequently produced views that conflict with the paleontological literature38,39,40, sometimes implying large gaps in the fossil record. The discrepancy is larger when divergence estimates for crown teleost lineages are based on mitogenomic data (e.g.,37,41,115). Nucleotide saturation, compressing basal branch lengths for mtDNA, and the specific approaches used to apply fossil constraints to calibrate the molecular phylogeny may explain this discordance43. Other studies based on several nuclear genes and larger sets of fossil calibration points produced divergence dates more consistent with the fossil record29,66, but a comprehensive time-tree for osteichthyan diversification is not yet available.
The shape of the bony fish tree of life is currently better resolved for the early-branching lineages than for the more apical acanthomorph groups, in particular the percomorphs, a large and diverse group of spiny-finned fishes with uncertain affinities that came to be known as the “bush at the top”44. Few basal branching events among osteichthyans remain problematic, for example, the relationships among lungfishes, coelacanths, and tetrapods45,46,47,66. In contrast, the basal branching pattern for early extant actinopterygians (involving polypteriforms, chondrosteans, lepisosteids, Amia and teleosts) has been resolved with confidence based on morphological and DNA sequence evidence66. Similarly, recent molecular studies based on several nuclear genes25 consistently support relationships among major teleost groups: Elopomorpha, Osteoglossomorpha and Euteleostei29,66. The deeper nodes among euteleosts and percomorphs also could be resolved with confidence with this new set of nuclear markers, but a comprehensive phylogeny including all groups is lacking. In this study we report phylogenetic results based on a taxonomically comprehensive dataset with DNA sequences for 21 molecular markers (one mitochondrial and 20 nuclear genes). A dataset with 1416 taxa was assembled, including four tetrapod and two chondrichthyan outgroups. Bony fish diversity is represented by 1093 genera (of ca. 4300), 369 families (of 502), and all traditionally recognized orders5, making this the most comprehensive dataset ever compiled in systematic ichthyology. Phylogenetic results corroborate many previously established hypotheses, but also provide unprecedented resolution among percomorphs. The uncertain relationships involving most of the extant diversity of percomorphs are resolved into several well-supported groups and, for the first time, we offer a monophyletic definition for Perciformes. Using a set of 60 calibrations, we also provide the most comprehensive hypothesis to date about the tempo of osteichthyan diversification.
Considering the new clades obtained in this study and previously published well-supported clades, we propose a new classification for bony fishes based on the nomenclatural scheme recently proposed by Wiley and Johnson5. Our hope is that this explicit proposal will facilitate communication among ichthyologists attempting to chart the rapidly changing landscape of phylogeny and classification of fishes.
Materials and Methods
Molecular data and taxonomic sampling
This study is the main product of the Euteleost Tree of Life Project (EToL). A total of 21 molecular markers with a genome-wide distribution were examined, the majority of which were developed by EToL using a genomic screen pipeline25. This pipeline compared the Danio rerio and Takifugu rubripes genomes to identify single-copy genes with long exons (>800 bp) and divergence levels suggesting they evolve at rates appropriate for phylogenetic resolution among distantly related taxa. Exon markers were sequenced from 11 nuclear genes previously published by our group (kiaa1239, ficd, myh6, panx2, plagl2, ptchd4 (=ptr), ripk4, sidkey, snx33 (=sh3px3), tbr1b (=tbr1), and zic1), and three additional markers, including one intron (hoxc6a) and two exons (svep1 and vcpip), were newly developed for this study using the same approach. Sequence data from seven additional markers, including EToL markers (enc1, gtdc2 (=glyt), and gpr85 (=sreb2)) or markers developed by others (16S mtDNA, rag1, rag2, and rh), were generated for our previous studies (e.g., 25,26,27,28,66,96) or obtained from NCBI, Ensembl, or other genomic databases.
A total of 1184 bony fish taxa were initially targeted for this study and samples were primarily obtained from the tissue repository of the Ichthyology Collection at the University of Kansas (1129 samples) or other collections. Of the initial list, samples for 18 taxa either failed to amplify or belonged to duplicate species that were ultimately combined or discarded. Sixty taxa that produced sequence data for one or two genes only were also discarded. Twenty-five additional taxa were excluded from the final matrix because they had low genetic coverage and highly variable phylogenetic placement in preliminary analyses, as identified using bootstrap trees obtained with RAxML v7.349 and the RogueNaRok server50. Our final sampling thus included 1081 taxa; sequence data from 335 additional taxa were obtained from previous EToL studies (e.g., 25,26,27,28,66,96) or public databases (Table S1). In order to minimize missing data, some sequences retrieved from public databases were combined as genus-level composite taxa (52 taxa). DNA extraction, amplification protocols via nested PCR, and primers followed previous studies (e.g., 25,26,27,28,66,96). Primer sequences and optimized PCR conditions used for the three new markers are presented in Table 1. The PCR amplicons obtained were submitted for purification and sequencing in both directions to High Throughput Sequencing Solutions (HTSeq.org) or other core facilities.
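The sample accounting in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative and not part of the original workflow.

```python
# Sample accounting using the counts reported in the text.
# Variable names are illustrative only.
initially_targeted = 1184
failed_or_duplicate = 18   # failed to amplify, or duplicates combined/discarded
too_few_genes = 60         # sequence data for only one or two genes
rogue_taxa = 25            # low coverage and unstable placement

retained = initially_targeted - failed_or_duplicate - too_few_genes - rogue_taxa
from_previous_or_public = 335
total_terminals = retained + from_previous_or_public

print(retained)         # 1081
print(total_terminals)  # 1416
```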
Fish diversity is represented in the phylogenetic data matrix by a sample of 1410 bony fish species (of ca. 3100051) plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). The taxonomic sampling of bony fishes consists of 1093 genera (of ca. 4300), 369 families (of 502; see below), and all traditionally recognized orders (e.g.,5). Our taxonomic sampling emphasizes representation of percomorph groups, with 1037 (of >15000) species in 201 families. All scientific names were checked against the Catalog of Fishes51. A complete list of material examined is given in Table S1.
Sequence alignment and phylogenetic analyses
Contigs were assembled from forward and reverse sequences using CodonCode Aligner v3.5.4 (CodonCode Corporation), Sequencher v4 (Gene Codes Corporation), or Geneious Pro v4.5 (Biomatters Ltd.). Exon markers were aligned individually based on their underlying reading frame in TranslatorX52 using the MAFFT aligner53. The hoxc6a and 16S sequences were aligned with MAFFT v6.953 using 1000 iterations and the genafpair algorithm. Because nested PCR is highly prone to cross-contamination, we vetted the data by visually inspecting individual gene trees estimated with the Geneious Tree Builder algorithm in Geneious. To qualitatively assess gene-tree congruence, the final gene alignments were analyzed under maximum likelihood (ML) in RAxML using ten independent runs for each; exon alignments were partitioned by codon position. Alternative approaches to analyze combined data based on species-tree methods that account for gene-tree heterogeneity due to lineage sorting (e.g.,54,55,56,57) could not be applied to this dataset owing to the high proportion of missing data (see Results).
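Partitioning an exon alignment by codon position amounts to taking every third column starting at offsets 0, 1, and 2. The following is a minimal sketch, assuming the alignment starts at the first codon position and that gaps preserve the reading frame (as expected from a translation-guided alignment); the function name and the toy sequence are hypothetical.

```python
def split_by_codon_position(aligned_seq):
    """Return the 1st, 2nd, and 3rd codon-position columns of an in-frame sequence."""
    return tuple(aligned_seq[offset::3] for offset in range(3))

# Toy in-frame sequence (hypothetical), with one gapped codon.
first, second, third = split_by_codon_position("ATGGCC---TTA")
print(first, second, third)  # AG-T TC-T GC-A
```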
Individual genes were concatenated using SequenceMatrix v1.7.858 or Geneious. Two datasets were assembled and analyzed separately, one including all 1416 taxa with sequence data from three genes or more (3+ dataset) and a subset including 1020 taxa with sequence data from seven genes or more (7+ dataset). Analyses of the 3+ dataset were performed under maximum likelihood (ML) using two partitioning schemes, a simple one determined arbitrarily with 5 data partitions (3 codon positions across all exons plus 16S and hoxc6a), and a more complex scheme with 24 partitions (a combination of codon positions and individual genes plus 16S and hoxc6a) indicated by PartitionFinder59. To make the PartitionFinder analysis scalable, a representative subset of 201 taxa was run under the Bayesian Information Criterion59. The 7+ dataset was analyzed with the 24-partition scheme only. Analyses for both datasets and partition schemes were conducted in RAxML using 30 independent replicates under the GTRGAMMA model. Nodal support was assessed using the rapid bootstrapping algorithm of RAxML with 1000 replicates estimated under the GTRCAT model60, and the collection of sample trees was used to draw the bipartition frequencies on the optimal tree. All RAxML analyses were conducted in the CIPRES portal v3.1.
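The 3+ and 7+ datasets differ only in the minimum number of genes required per taxon. A minimal sketch of that filter is given below, assuming a simple mapping from taxon to the set of markers with sequence data; the taxon labels, the marker assignments, and the function name are hypothetical and do not reproduce the authors' scripts.

```python
def filter_by_coverage(presence, min_genes):
    """Keep taxa that have sequence data for at least `min_genes` markers."""
    return {taxon: genes for taxon, genes in presence.items()
            if len(genes) >= min_genes}

# Hypothetical presence map: taxon label -> markers with sequence data.
presence = {
    "taxon_A": {"rag1", "rag2", "rh", "zic1", "myh6", "enc1", "plagl2", "16S"},
    "taxon_B": {"rag1", "zic1", "16S"},
    "taxon_C": {"rag1", "16S"},
}

three_plus = filter_by_coverage(presence, 3)  # analogue of the 3+ dataset
seven_plus = filter_by_coverage(presence, 7)  # analogue of the 7+ dataset
print(sorted(three_plus))  # ['taxon_A', 'taxon_B']
print(sorted(seven_plus))  # ['taxon_A']
```

By construction, every taxon in the 7+ set is also in the 3+ set, which is why the text describes the 7+ dataset as a subset.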
For comparison purposes, the 3+ dataset was also analyzed under implied-weighted parsimony61. The optimal tree search and bootstrap trees were set to run independently. Gaps were treated as missing characters and all parsimony-uninformative characters were ignored. A relatively mild value of k (20) was chosen arbitrarily because computational limitations precluded exploring the sensitivity of the nodes to other weighting functions. Tree searches were performed in TNT 1.163 using a driven-search strategy combining the following tree-search algorithms: ratchet, drift, sectorial searches and tree fusion. The exhaustiveness of the search parameters was self-adjusted every 2 hits of the current best score. To maximize tree-space exploration, the final searches implemented tree-bisection-reconnection (TBR). A strict consensus of nine equally optimal trees (length 407187 steps; fit 7309.19) was computed. Bootstrap search strategies were relaxed to ten random addition sequences and TBR, saving only one tree per replicate (1000 replicates); bootstrap bipartition frequencies were drawn on the consensus tree.
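To illustrate why k = 20 is a mild setting, the sketch below uses the standard concave implied-weighting cost, e / (e + k), where e is the number of extra (homoplastic) steps of a character. This is a generic illustration of the weighting function, not a reproduction of the TNT run; that the "fit" value reported above is a sum of exactly this quantity is an assumption.

```python
def homoplasy_cost(extra_steps, k=20.0):
    """Concave implied-weighting cost of one character: e / (e + k)."""
    return extra_steps / (extra_steps + k)

def total_cost(extra_steps_per_character, k=20.0):
    """Sum of per-character costs; lower means a better weighted fit."""
    return sum(homoplasy_cost(e, k) for e in extra_steps_per_character)

# With k = 20, ten extra steps cost far less than ten times one extra step,
# so highly homoplastic characters are down-weighted, but only gently.
print(homoplasy_cost(1))   # 1/21
print(homoplasy_cost(10))  # 10/30
```

Smaller values of k penalize homoplasy more strongly, while larger values approach equal weighting.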
Divergence time estimates
Time-tree estimation in a Bayesian framework using the complete dataset was computationally infeasible. Thus, we selected a subset of 202 taxa for 18 genes that had representation of: (i) all major bony fish lineages, (ii) lineages encompassing the nodes in which the assignment of fossil calibrations is most informative, and (iii) taxa with the highest genetic coverage to minimize missing data in the data matrix (the markers vcpip, svep1, and hoxc6a, which had a high proportion of missing data, were also excluded). Divergence times were estimated in BEAST v1.7 using the uncorrelated log-normal (UCLN) clock-model34. Sixty calibration points were selected as priors for divergence time estimates, of which 58 are based on previous studies29,64,65,66 and two (calibrations 45 and 60) are proposed here (Appendix 1). However, the actual BEAST analysis conducted for this study included 59 calibrations only (see details under calibration 60, Appendix 1). A starting chronogram that satisfied all priors (e.g., monophyly and initial divergence times) was generated under penalized likelihood in r8s v1.7167 using the RAxML tree. To model branching rates on the tree, a birth-death process was used for the tree prior with initial birth rate = 1.0 and death rate = 0.5. The substitution model was GTR+G with 4 rate classes and the data were partitioned into 4 categories with independent parameter estimation: three codon positions across exons of protein-coding genes plus 16S. Clock and tree priors were linked across partitions. Five replicates of the Markov chain Monte Carlo (MCMC) analyses were each run for 200 million generations, with the topology constrained to that recovered in the phylogenetic analyses of the 3+ dataset (pruned for taxa not included in the subset). MCMC log files were assessed using Tracer v. 1.568 and mixing was considered complete if the effective sample size of each parameter was >20034,68.
Tree files from the five runs were combined in LogCombiner v1.7.468 with the first 10% of trees from each run discarded as burn-in. The maximum clade credibility tree, with means and 95% highest posterior density of divergence times, was estimated with TreeAnnotator v1.6.168.
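The two post-run steps described above, discarding the first 10% of each run and checking that the effective sample size (ESS) exceeds 200, can be sketched as follows. The ESS estimator here is a simple autocorrelation-based one (chain length divided by one plus twice the summed positive-lag autocorrelations); it is not Tracer's exact algorithm, and the function names are hypothetical.

```python
def discard_burnin(samples, fraction=0.10):
    """Drop the first `fraction` of a run's samples as burn-in."""
    return samples[int(len(samples) * fraction):]

def effective_sample_size(samples):
    """Simple ESS estimate: n / (1 + 2 * sum of initial positive autocorrelations)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

def combine_runs(runs, fraction=0.10):
    """Concatenate post-burn-in samples from several independent runs."""
    combined = []
    for run in runs:
        combined.extend(discard_burnin(run, fraction))
    return combined

def mixing_ok(samples, threshold=200):
    """Apply the ESS > 200 criterion stated in the text."""
    return effective_sample_size(samples) > threshold
```

A strongly autocorrelated trace yields an ESS far below its length, which is why long chains and several replicates were needed to pass the threshold.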
The complete tree with 1416 taxa was time-calibrated under penalized likelihood (PL67) with treePL69. The PL model, which assumes rate autocorrelation, has been shown to perform poorly in simulation studies, resulting in high stochastic error of divergence time estimates70. To ameliorate this problem, mean highest posterior density estimates of clade ages obtained with the subset in BEAST were imposed as fixed secondary calibrations for the PL analysis, rather than using primary calibrations with minimum and maximum age constraints. A total of 126 secondary calibrations were used for this analysis, including the ages obtained for all major groups in the tree as well as the nodes near which primary calibrations were defined. The rate smoothing parameter was set to 10 based on the cross-validation procedure and the χ2 test in treePL (four smoothing values between 1 and 1000 were compared).
Results and Discussion
The final concatenated alignments included 21 markers with 20853 sites for 1416 taxa in the 3+ dataset and 1020 taxa in the 7+ dataset. The average presence of data (number of sequences per taxon) across the alignments was 41.0% for the 3+ dataset and 48.2% for the 7+ dataset. A summary of dataset features, including data presence, alignment length, and sequence variation for each marker is given in Table 2 (see also Table S1). The new sequences have been deposited in GenBank under accession numbers KC825360-KC831391. The sequence alignment (nexus format), ML tree (newick format), and Table S1 are available from the Dryad repository (DOI:10.5061/dryad.c4d3j). The main phylogenetic hypothesis is summarized in Fig. 1 (24-partition RAxML tree, 3+ dataset, time-calibrated under PL). Fig. 2 provides measures of congruence among alternative analyses (concatenation and gene trees) for all major clades and provides discrete tests for traditional hypotheses in ichthyology. Figs. 3–10 provide more detail on the relationships within selected percomorph clades based on the tree in Fig. 1. The time-calibrated (BEAST) tree for the subset (202 taxa and 18 genes) and 59 calibration points are shown in Fig. 11 (see also Appendix 1); Fig. 12 compares the results of divergence times estimated for major groups with those obtained by other recent multi-locus studies. The complete phylogeny with bootstrap values and taxonomic annotations is depicted in Fig. S1 as a cladogram and can also be visualized online as a time-tree using a fractal explorer and zooming interface at OneZoom73 (also posted at DeepFin).
The basal nodes of the tree and relationships among early branching groups of bony fishes have been well established and thoroughly discussed by recent molecular systematic studies based on similar sets of genes29,66, albeit with reduced taxonomic sampling. Because our results corroborate these hypotheses (e.g. monophyly of Actinopterygii and Holostei, branching order of elopomorphs and osteoglossomorphs; Fig. 1), we refer the reader to those papers for discussion on relationships among lineages from the root of the tree up to the Euteleosteomorpha. The most significant new results involve crown acanthomorph lineages, in particular the unprecedented resolution among percomorphs, represented in this study by 1037 species in 201 families. The proverbial “bush at the top” is now disambiguated into several well-supported clades at the ordinal or supraordinal level, with well-resolved relationships amongst them (Fig. 1). We also provide for the first time a monophyletic definition of Perciformes, sinking into this clade components of Scorpaeniformes, Gasterosteiformes, and Cottiformes (Fig. 10; see also16,74). Among the euacanthomorphs, we find the non-monophyly of Beryciformes (including Stephanoberyciformes) and a sister-group relationship between holocentrids and percomorphs, first recognized by Stiassny and Moore75 and Moore76, but challenged by Johnson and Patterson4.
Based on the topology obtained (Figs. 1-10, S1) we propose a new classification for ordinal and subordinal groups of bony fishes and subsequently discuss some of the most significant findings.
Revised Classification for Bony Fishes
The nomenclatural arrangement presented in Appendix 2 builds on the existing classification by Wiley and Johnson5 and intends to preserve names and taxonomic composition of groups whenever possible. However, adjustments are made to recognize new well-supported molecular clades, many of which also have been obtained by previous molecular studies (several examples discussed below). Order-level or supraordinal taxa are erected (new) or resurrected on the basis of well-supported clades only (>90% bootstrap values). Current taxon names supported by previous molecular or morphological studies are retained if congruent with our results, even if bootstrap support is low (e.g., Osteoglossocephalai sensu Arratia79 with only 38% bootstrap). In some cases, ordinal or subordinal taxa that were not monophyletic in our analysis are also validated, as long as the incongruence is not supported by strong bootstrap values. Examples include the suborder Blennioidei (not monophyletic here but monophyletic in Wainwright et al.31) and the order Pleuronectiformes (not monophyletic here but monophyletic in Betancur-R. et al.28).
Family names for bony fishes are based on Eschmeyer and Fong89 and van der Laan et al.90, with minor modifications. Consult van der Laan et al.90 for authorship of family names and Wiley and Johnson5 for authorship of ordinal and subordinal names. Our list is not intended as a comprehensive revision of valid family names; instead, it is simply an adaptation of their list based on published studies that we know validate or synonymize family groups using explicit phylogenetic evidence. Unlike Eschmeyer and Fong89 and van der Laan et al.90, we do not recognize the family status of Anotopteridae, Omosudidae (synonyms of Alepisauridae91) or Latidae (synonym of Centropomidae27,92). Also, we recognize the following families, listed in Eschmeyer and Fong89 and van der Laan et al. 90 as synonyms or subfamilies of other families: Botiidae (following Chen et al.93), Diplophidae (following Nelson2; apparently omitted by Eschmeyer and Fong89), Horabagridae (following Sullivan et al.94), Sinipercidae (following Li et al.96), Steindachneriidae (following Roa-Varon and Ortí98), Zanclorhynchidae, the aulopiform Bathysauropsidae and Sudidae (following Davis91), and the pleuronectiform Paralichthodidae, Poecilopsettidae, and Rhombosoleidae (following Chapleau97, Munroe99, Betancur-R. et al.28). A total of 502 families are recognized here, of which 369 (73.5%) were examined. Of these, 146 families included only one representative (39.6%) and 40 (17.9%) of the remaining 223 were rendered non-monophyletic in our analysis (non-monophyletic families are indicated below). For each order/suborder we list all families examined as well as the unexamined families whose taxonomic affinity is expected on the basis of traditional taxonomy or phylogenetic evidence. The list of unexamined families is also intended as a resource that may help fish systematists to direct future sequencing efforts.
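Because the percentages in the preceding paragraph use different denominators, a short calculation makes them explicit. The sketch below uses only the counts stated in the text; the helper name `pct` is illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

recognized = 502         # families recognized in the classification
examined = 369           # families with at least one sampled species
single_rep = 146         # examined families with only one representative
non_monophyletic = 40    # families recovered as non-monophyletic

multi_rep = examined - single_rep  # families whose monophyly could be tested

print(multi_rep)                         # 223
print(pct(examined, recognized))         # 73.5 (of all recognized families)
print(pct(single_rep, examined))         # 39.6 (of examined families)
print(pct(non_monophyletic, multi_rep))  # 17.9 (of testable families)
```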
A total of 66 orders are classified, three of which are new (Holocentriformes, Istiophoriformes, and Pempheriformes), and 15 are resurrected or validated under a new circumscription. Some ordinal or subordinal names may appear to be new, but most can be found in the literature at various hierarchical levels. As examples, Spariformes is a Bleeker name and Centrarchiformes is a Weber and de Beaufort name. Because priority is not applied to names above the family level, we have not made a thorough attempt to establish first use. Only those three for which no reference could be found are listed as “new.” New infraorders are named in Suborder Cottioidei to circumscribe well-corroborated clades and may conserve the rank of superfamily in subsequent revisions. The ordinal status of 50 percomorph families examined (as well as many others unexamined) belonging to Carangimorphariae, Ovalentariae, and Percomorpharia remains uncertain (i.e., incertae sedis) due to poor phylogenetic resolution. Percentages in parentheses following names indicate bootstrap support (no bootstrap values shown for redundant groups or monotypic taxa). The complete phylogenetic tree with annotated classification is illustrated in Fig. S1. The new classification scheme presented here should be considered a work in progress (version 1; Appendix 2), as any other hypothesis. It is likely to include involuntary errors and omissions in addition to the many unexamined, sedis mutabilis, and incertae sedis taxa. Updates should be forthcoming as new evidence becomes available and feedback from experts helps refine it. For the most updated version visit DeepFin.
Comparison of classifications
Our results (Appendix 2) invite comparison to the recent classification of Wiley and Johnson5 based on morphological evidence gleaned from many investigators. Of 123 clades recognized by them, 70 (56.9%) are congruent with bootstrap values >95% obtained in this study. Five of these 70 clades are represented in our sample by only one family and thus their monophyly is not critically tested. Another six clades (4.9%) are congruent but are supported by lower bootstrap values; seven additional clades (5.7%) are monotypic. Forty clades (32.5%) are incongruent, with some being grossly polyphyletic in our tree. Notable examples are Protacanthopterygii, Smegmamorpharia, and Labriformes. Others are incongruent based on exclusion of subclades and are rendered monophyletic in our classification by the addition or removal of smaller clades. Examples include Stomiatii (inclusion of Osmeriformes sensu stricto), Otomorpha (inclusion of Alepocephaliformes), Neoteleostei (removal of Stomiatiformes), and Lampridiformes (removal of Stylephorus).
There is considerable consensus between morphology and the interrelationships of major clades. For example, the major cohorts of living teleosts and their interrelationships are congruent with the listing convention employed by Wiley and Johnson5; this is also true within many of the major clades (e.g. relationships within Elopomorpha). But there is also incongruence. For example, relationships among early-branching acanthomorph groups differ considerably from previous morphological hypotheses (e.g., Johnson and Patterson4) with lampridiforms, percopsiforms, zeiforms and gadiforms branching off basally relative to polymixiiforms. More explicit tests of new and alternative phylogenetic hypotheses based on multiple analyses of our dataset are presented in Fig. 2.
Novel Clades of Teleost Fishes
The following sections highlight some of the salient features of this global phylogeny and classification of bony fishes, especially in reference to well-established relationships and newly found clades among the euteleosts. We do not attempt to provide a complete account of all taxonomic issues, but to give some perspective and contrast to discuss the evidence supporting novel and established taxa.
Early euteleost lineages: tenuous relationships (Fig. 1)
Our analyses support several recent hypotheses based on molecular data that contradict the consensus based on morphology2,5 relative to the composition of “protacanthopterygians.” Although our results fall short of resolving with confidence circumscription and relationships among taxa in this group (hence Protacanthopterygii is a sedis mutabilis taxon in our proposed classification), some relationships are well supported and consistent with previous studies (Fig. 1). First is the hypothesis that alepocephalid fishes (slickheads) have affinities within Otomorpha, instead of Argentiformes, as proposed by Johnson and Patterson4. This result was first proposed on the basis of mitogenomic data10,41,100,101 and recently corroborated with a subset of the nuclear markers used in this study29. Second is the sister-group relationship of Osmeriformes and Stomiatiformes (=Stomiiformes), first proposed by López et al.21 based on mtDNA and rag1 sequence data. Finally, the position of Lepidogalaxias at the base of the euteleosts, rendering Galaxiidae non-monophyletic, was also proposed previously102,29 and is supported by our data (see also Fig. 2).
Paracanthomorphacea: mitogenomics dixit (Fig. 1)
This name was first introduced as superorder Paracanthopterygii (sensu Greenwood et al.1) to refer to a large group of spiny-finned fishes that included Batrachoidiformes, Gadiformes (with Ophidioidei and Zoarcoidei), Gobiesociformes, Lophiiformes, and Percopsiformes. Many other taxa were added and also removed on the basis of conflicting evidence ever since Paracanthopterygii was conceived, but a conservative stance persisted in classifications supporting the original circumscription, with the exclusion of Gobiesociformes2. More recently, mitogenomic data7,8 revealed a sister-group relationship between Zeiformes and Gadiformes, a result also obtained with nuclear genes19,24,103; the name Zeioigadiformes24 was coined for this new grouping. Miya et al.11 redefined the Paracanthopterygii to include Polymixiidae, Percopsiformes, Gadiformes, and Zeioidei and subsequently Miya et al.13 added to this group the lampridiform genus Stylephorus, which was unexpectedly found to form the sister group of Gadiformes. Analysis of four nuclear markers in addition to mtDNA confirmed this result103, supporting a monophyletic taxon Paracanthopterygii that includes percopsiforms, gadiforms, Stylephorus (placed in its own order Stylephoriformes) and zeiforms, in agreement with our results (Figs. 1, 2). A review of published morphological characters by Borden et al.105 also found significant congruence between this arrangement and morphological character-state distributions for many of the proposed relationships.
Euacanthomorphacea: holocentrids sister to percomorphs (Fig. 1)
Johnson and Patterson4 included polymixiids, percopsids and crown acanthomorphs in their Euacanthopterygii, a taxon not classified by Wiley and Johnson5. We adopt the name but modify the circumscription to recognize a well-supported clade (99% bootstrap) that includes beryciforms, holocentrids and percomorphs. The main issue at this level is delimitation of Beryciformes and relationships of its proposed components to Percomorphaceae. Most classifications2,4 accept separate orders Stephanoberyciformes and Beryciformes, each monophyletic and placed as successive sister-groups of the percomorphs. Molecular data (mitogenomic and smaller subsets of nuclear genes), in contrast, have supported the inclusion of Stephanoberyciformes in the same clade as Beryciformes8,29 and consistently include holocentrids within this clade. Our results, however, reject this hypothesis in favor of recognizing a separate holocentrid clade (proposed here as a new order, Holocentriformes) that is sister to percomorphs (Fig. 1), a result first obtained by Stiassny and Moore75 and Moore76 but subsequently challenged by Johnson and Patterson4. Despite relatively low support for our holocentrid-percomorph clade (57-69% bootstrap), proportionally more individual gene trees support this relationship (47%) relative to the alternative molecular hypothesis uniting holocentrids with the remaining beryciform groups (20%; Fig. 2). Our new circumscription of Beryciformes is also most similar to that of the order Trachichthyiformes described by Moore76, except that the latter excludes the berycids.
Percomorphaceae: no longer an unresolved bush (Figs. 1-10)
A major contribution from our study has been the disambiguation of the percomorph bush into nine well-supported supraordinal groups (six Series and three Subseries; Fig. 1; Appendix 2): Ophidiimorpharia, Batrachoidimorpharia, Gobiomorpharia (Fig. 3), Scombrimorpharia (Figs. 4 and 5), Carangimorpharia (with three Subseries: Anabantomorphariae, Fig. 6; Carangimorphariae, Fig. 7; and Ovalentariae, Fig. 8), and Percomorpharia (Fig. 9). Furthermore, increased phylogenetic resolution within Percomorpharia allowed the definition of a monophyletic Perciformes (Figs. 9 and 10), for the first time recovered from a vast taxonomic sample. With the exception of the cusk-eels (Ophidiimorpharia) and the toadfishes (Batrachoidimorpharia), whose monophyly has been recognized in most classifications (i.e., 2,5; but see 106,107), the remaining seven supraordinal clades (four Series and three Subseries) have never been discovered by examination of anatomical features. Under different combinations of taxa, however, and based on diverse genetic markers, several of these clades have been obtained, in one form or another, by previous molecular studies (e.g., 7,8,11,12,19,20,24,27,28,29,30,31,32,33).
A corollary of the increased resolution of percomorph relationships is the demise of the Smegmamorpharia sensu Johnson and Patterson4 (see also Wiley and Johnson5; Fig. 2). Elements included in this supraordinal taxon are now scattered throughout the molecular phylogeny, placed within many of the newly found clades with high bootstrap support. For example, the pygmy sunfishes (Elassoma) are back with the other sunfishes (centrarchids), as suggested by earlier classifications and recently confirmed by molecular data30. Centrarchids plus elassomatids are placed here in the resurrected order Centrarchiformes (within Percomorpharia, Fig. 9). Mugiliforms (mullets) and atherinomorphs (silversides, needlefishes, halfbeaks, guppies and allies) are placed within Ovalentariae (Fig. 8). The swamp eels and spiny eels (order Synbranchiformes, suborders Synbranchoidei and Mastacembeloidei) are placed with confidence in Anabantomorphariae (Fig. 6), together with armored sticklebacks (Indostomidae), one of the 11 families previously included in the order Gasterosteiformes. The polyphyly of Gasterosteiformes (another large clade assigned to Smegmamorpha) was first pointed out by mitogenomic evidence12. Our results place the sticklebacks, tubesnouts and sand eels (previously assigned to Gasterosteoidei) in our newly defined Perciformes (suborder Cottioidei; Fig. 10), and the rest of the families previously assigned to the suborder Syngnathoidei are relocated to our newly defined order Syngnathiformes within the Scombrimorpharia (Fig. 4, see below).
Phylogenetic resolution within five newly discovered clades, however, will require additional study. Relationships within Syngnathiformes, Scombriformes, Carangimorphariae, Ovalentariae, and Percomorpharia may be challenging to recover given the rapid radiation and diversification of these clades.
Gobiomorpharia: sweepers are out (Fig. 3)
Based on a phylogeny estimated with four mitochondrial markers, Thacker33 resurrected the order Gobiiformes to accommodate three suborders: Gobioidei (gobies and sleepers), Kurtidoidei (nurseryfish), and Apogonoidei (including apogonids and pempherids). Previous molecular studies have shown affinities between gobioids, apogonids, kurtids and, to some extent, pempherids and dactylopterids8,11,16. There is also morphological evidence supporting a close relationship between gobiids and apogonids108,109 as well as between kurtids and apogonids110. Our results provide partial support for the Gobiiformes sensu Thacker33 but we treat it here as a supraordinal group (Gobiomorpharia). A major difference is that our hypothesis segregates the family Pempheridae (sweepers) to its own order (Pempheriformes, together with Glaucosomatidae), within Percomorpharia (Figs. 1, 3, 9).
Scombrimorpharia: sea horses and tunas are close relatives (Figs. 1, 4 and 5)
One of the most unanticipated new percomorph clades is the Scombrimorpharia, grouping such disparate fishes as seahorses and tunas. This clade includes the newly circumscribed orders Syngnathiformes (Fig. 4) and Scombriformes (Fig. 5). Not surprisingly, a close relationship among taxa contained within this group, including syngnathids, mullids, callionymids, dactylopterids, scombrids, stromateids, and others, has never been proposed on morphological grounds. The Syngnathiformes, as defined here (Fig. 4), comprises mostly tropical marine reef-dwellers, traditionally placed in three distinct percomorph orders, including Gasterosteiformes (syngnathids), “Perciformes” (mullids and callionymids) and “Scorpaeniformes” (dactylopterids). Recent molecular studies have emphasized the non-monophyly of Scorpaeniformes74. We have noted above the dissolution of Gasterosteiformes12 and, as discussed below, we provide a restricted definition for Perciformes that includes many scorpaeniform taxa (Fig. 10).
Our new order Scombriformes (Fig. 5) includes most of the families previously grouped in the perciform suborder Scombroidei2 or the order Scombriformes5, except for the barracudas (Sphyraenidae) and the billfishes and swordfishes (here placed in their own order, Istiophoriformes). Sphyraenidae and Istiophoriformes are now firmly placed within Carangimorphariae (Fig. 7) together with disparate taxa such as remoras (Echeneidae), archer fishes (Toxotidae), jacks (Carangidae), flatfishes (Pleuronectiformes), and others (see below). Because billfishes and tunas are not closely related as previously suggested by anatomical studies83 (Fig. 2), the new hypothesis implies that endothermy has evolved at least twice independently in teleosts111,112. This new circumscription of Scombriformes also comprises families belonging to multiple orders in previous classifications, such as Stromateiformes (Centrolophidae, Nomeidae, Ariommatidae, Stromateidae), Trachiniformes (Chiasmodontidae), Icosteiformes (Icosteidae), and Perciformes (Bramidae, Pomatomidae, and Caristiidae). Despite the disparate morphology among members of Scombriformes, most are offshore fishes that inhabit pelagic and/or deep-sea waters.
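The reasoning behind the inference of repeated evolution of endothermy can be sketched with a small parsimony count. The example below applies Fitch's algorithm to two toy four-taxon trees and returns the minimum number of character-state changes each requires. Everything in it is an assumption made for illustration: the two topologies, the choice of exemplar genera, the binary coding of endothermy (1) versus ectothermy (0), and the names ENDOTHERMY, SISTER_TREE, SEPARATE_TREE, and fitch. It is not the analysis performed in this study.

```python
# ASSUMPTION: simplified binary coding, 1 = endothermic, 0 = ectothermic.
ENDOTHERMY = {"Thunnus": 1, "Scomber": 0, "Xiphias": 1, "Caranx": 0}

# ASSUMPTION: two toy rooted binary topologies written as nested tuples.
# Tuna and swordfish as sister taxa (the earlier anatomical view):
SISTER_TREE = (("Thunnus", "Xiphias"), ("Scomber", "Caranx"))
# Tuna with mackerel, swordfish with jacks (the molecular view, simplified):
SEPARATE_TREE = (("Thunnus", "Scomber"), ("Xiphias", "Caranx"))


def fitch(node, states):
    """Return (candidate state set, minimum changes) for a binary tree node."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change is needed.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one additional change.
    return left_set | right_set, left_cost + right_cost + 1


for label, tree in (("sister", SISTER_TREE), ("separate", SEPARATE_TREE)):
    _, changes = fitch(tree, ENDOTHERMY)
    print(label, changes)
```

On these toy trees, a single change suffices when the two endothermic lineages are sister taxa, whereas two changes are required when they fall in separate clades. The count is of state changes only; distinguishing independent gains from losses would require outgroup information that this sketch omits.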
Anabantomorphariae: freshwater and air breathing (Fig. 6)
Another major percomorph group proposed here is the series Carangimorpharia, including three subseries: Anabantomorphariae, Carangimorphariae, and Ovalentariae (Fig. 1). Species in Anabantomorphariae include representatives placed in three separate orders by Wiley and Johnson5: Synbranchiformes (swamp eels), Gasterosteiformes (Indostomus, the armored stickleback), and Anabantiformes (gouramis) (Fig. 6). While the first two orders belonged to the Smegmamorpharia4,5, the Anabantiformes were placed as incertae sedis in Percomorphacea5. The monophyly of Anabantomorphariae has also been supported on the basis of mitogenomics8,11,12 and nuclear markers28. A remarkable condition shared by members of this novel grouping is their mostly freshwater origin and restriction to Africa and South East Asia (although some members in the family Synbranchidae occur in Mexico, and Central and South America). Most are able to occupy marginal, stagnant waters due to their capacity to tolerate anoxia and to obtain oxygen directly from the air. Anabantiforms have a suprabranchial organ and synbranchids have suprabranchial pouches with respiratory function.
Carangimorphariae: flatfishes and unlikely relatives (Fig. 7)
A close affinity between other seemingly disparate groups, including barracudas, swordfishes, jacks, flatfishes, and others, has been well established by recent molecular studies10,16,19,24,27,28,112 (Fig. 7). This higher-level group has been referred to as “clade L” sensu Chen et al.19 or Carangimorpha by Li et al.24 (see also 27,28). In looking for possible anatomical synapomorphies uniting flatfishes, billfishes, and carangids, Little et al.112 found that most taxa share a relatively low number of vertebrae, have multiple dorsal pterygiophores inserting before the second neural spine, and lack supraneurals, among others. However, according to Friedman113, some of these characters are symplesiomorphies while others are absent in the remaining carangimorph groups. It thus seems paradoxical that despite the apparent lack of morphological synapomorphies for carangimorphs there is a strong molecular signal supporting their monophyly, whereas the opposite is true for pleuronectiforms28. For additional insights and discussion on Carangimorphariae we refer the reader to recent studies24,27,28,112,113.
Ovalentariae: sticky eggs (Fig. 8)
Ovalentariae is one of the most spectacular percomorph radiations, including more than 5000 species in some 44 families and uniting seemingly distinct groups such as cichlids, mullets, blennies, and atherinomorphs (atheriniforms, beloniforms, and cyprinodontiforms). This clade was first found on the basis of mitogenomic evidence8,12 and later confirmed with nuclear sequence data23,24,26,31. Our results suggest that this group can be divided into four subgroups (superorders), two of which already existed (Atherinomorphae and Mugilomorphae) and two that are new: (i) Cichlomorphae (Cichlidae plus Pholidichthyidae) and (ii) Blennimorphae (blennioids plus clingfishes, jawfishes and basslets). Many families in Ovalentariae, however, remain incertae sedis (e.g., Embiotocidae and Pseudochromidae). Two different studies have coined a name for this group; first Stiassnyiformes by Li et al.24 and, more recently, Ovalentaria by Wainwright et al.31 for their characteristic demersal, adhesive eggs with chorionic filaments (lost secondarily in some groups). An interesting implication of this phylogenetic hypothesis is that the pharyngeal jaw apparatus (pharyngognathy), present in many members of this clade (e.g., Cichlidae, Pomacentridae, Hemiramphidae), has evolved multiple times in percomorphs31. We refer the reader to Wainwright et al.31 for additional discussion on Ovalentariae.
Percomorpharia: the new bush at the top (Fig. 9)
Percomorpharia is by far the largest percomorph clade, comprising 11 orders, among them some of the most prominent, such as Perciformes, Labriformes, Lophiiformes, and Tetraodontiformes. At least 151 families (105 examined) belong in Percomorpharia, including three of the top ten most diverse families of fishes (i.e., Labridae, Serranidae, and Scorpaenidae)2. More than one third (514) of the species in our bony fish phylogeny are placed in this clade. Previous molecular studies obtained monophyletic groups with a combination of taxa here assigned to Percomorpharia, but with far more limited sampling (e.g., 8,11,16,74). Although most family-level and ordinal groups within Percomorpharia receive high bootstrap support, interrelationships among them are largely unresolved (hence, the new bush at the top; Fig. 9). Several of these groups are newly proposed or resurrected orders under new circumscription (e.g., Uranoscopiformes, Ephippiformes, Pempheriformes). Our new arrangement removes anglerfishes (Lophiiformes) from Paracanthomorphacea, as was suggested by previous classifications78