A time-calibrated, multi-locus phylogeny of piranhas and pacus
(Characiformes: Serrasalmidae) and a comparison of species tree methods
Andrew W. Thompson, Ricardo Betancur-R., Hernán López-Fernández, Guillermo Ortí
Department of Biological Sciences, The George Washington University, 2023 G St. NW, Washington, DC 20052, USA
Department of Biology, University of Puerto Rico – Rio Piedras, P.O. Box 23360, San Juan, PR 00931, USA
Department of Natural History, The Royal Ontario Museum, 100 Queens Park, Toronto, ON M5S 2C6, Canada
Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, ON M5S 3B2, Canada
Received 12 September 2013
Revised 17 June 2014
Accepted 18 June 2014
Available online 28 September 2014
The phylogeny of piranhas, pacus, and relatives (family Serrasalmidae) was inferred on the basis of DNA
sequences from eleven gene fragments that include the mitochondrial control region plus 10 nuclear
genes (two exons and eight introns). The new data were obtained for a representative sampling of 53
specimens, collected from all major South American rivers, accounting for over 40% of the valid species
and all genera excluding Utiaritichthys. Two fossil calibration points and relaxed-clock Bayesian analyses
were used to estimate the timing of diversiﬁcation. The new multilocus dataset also is used to compare
several species-tree approaches against the results obtained using the concatenated alignment analyzed
under maximum likelihood and Bayesian inference. Individual gene trees showed substantial topological
discordance, but analyses based on concatenation and Bayesian and maximum likelihood-based species
trees approaches converged onto a single phylogeny. The resulting phylogenetic hypothesis is robust and
supports a division of the family into three major clades, consistent with previous results based on mitochondrial
DNA alone. The earliest branching event separated a “pacu” clade (Colossoma, Mylossoma and
Piaractus) from the rest of the family in the Late Cretaceous (over 68 Ma). The other two clades, that
contain most of the diversity, are formed by the “true piranhas” (Metynnis, Pygopristis, Pygocentrus,
Pristobrycon, Catoprion, and Serrasalmus) and the Myleus-like pacus (the Myleus clade). The “true” piranha
clade originated during the Eocene (~53 Ma) but the most recent diversification of flesh-eating piranhas
within the genera Serrasalmus and Pygocentrus did not start until the Miocene (~17 Ma). A comparison of
species tree approaches indicates that most methods tested are consistent with results obtained by con-
catenation, suggesting that the gene-tree incongruence observed is mild and will not produce misleading
results under simple concatenation analysis. Non-monophyly of several genera (Pristobrycon, Tometes,
Myloplus, Mylesinus) and putative species (Serrasalmus rhombeus) was obtained, suggesting that further
study of this family is necessary.
© 2014 Elsevier Inc. All rights reserved.
Piranhas are among the most notorious South American ﬁshes.
Their widespread reputation stems from their predatory habits and
remarkable feeding adaptations, including sharp, triangular teeth
and the strongest bite force measured in any ﬁsh to date
(Grubich et al., 2012). Piranhas and their relatives, commonly
known as “pacu” or “tambaqui,” are classified in the well-defined
family Serrasalmidae, order Characiformes (Calcagnotto et al.,
2005; Oliveira et al., 2011). This family is widely distributed in
all major South American river systems east of the Andes. Serrasalmids
occupy a diverse array of habitats, including lowland floodplains,
flooded forests, and upstream headwater regions of rivers (Géry,
1977, 1984; Lowe-McConnel, 1975) where they perform unique
ecological functions and sustain important continental ﬁsheries
and aquaculture (Araujo-Lima and Goulding, 1997). Several species
are carnivorous predators that bite off pieces of ﬂesh from other
ﬁshes, whereas other predatory piranhas are specialized in nipping
ﬁns and even scraping scales off other ﬁshes (lepidophagy)
(Goulding, 1980; Winemiller, 1989). The majority of serrasalmids
1055-7903/© 2014 Elsevier Inc. All rights reserved.
Corresponding author. Fax: +1 202 994 6100.
E-mail address: firstname.lastname@example.org (G. Ortí).
Molecular Phylogenetics and Evolution 81 (2014) 242–257
commonly known as pacu (common names vary by country and in
the aquarium trade), however, are herbivores or omnivores that
eat seeds, fruits, leaves, and various invertebrate and vertebrate
prey. Seed eating appears to have evolved repeatedly among herbivorous
serrasalmids (Correa et al., 2007), and species in the pacu
genera Colossoma, Mylossoma and Piaractus are effective seed dispersers
that may have a fundamental role in maintaining tree
diversity in Amazonian lowland forests (Anderson et al., 2009,
2011; Goulding, 1980). The fossil record of Serrasalmidae, dating
back to the Cretaceous-Paleocene, reveals dentition patterns simi-
lar to those of living taxa, suggesting that diversiﬁcation of trophic
ecology occurred early in the evolution of serrasalmids (Gayet and
The Serrasalmidae includes about 87 valid species (Eschmeyer
and Fong, 2013; Jégu, 2003) in 16 genera of fossil and extant taxa:
Pygocentrus, Pygopristis, Tometes, and Utiaritichthys. Serrasalmus
and Pygocentrus species (“piranhas”) have one row of serrated tricuspid
teeth on the upper and lower jaw, while other species
(“pacu”) have two rows of incisors or molariform teeth on the premaxilla,
one row of teeth on the dentary, and often one pair of
symphyseal teeth. Teeth of lepidophagous species aid in removing
scales from other ﬁshes (Leite and Jégu, 1990; Nico and Taphorn,
1988). This variability in tooth arrangement and morphology has
traditionally been used to infer morphological phylogenies and to
propose classiﬁcation schemes.
Eigenmann (1915) classiﬁed serrasalmids into two subfamilies:
Serrasalminae (characterized by having a single row of teeth on
each jaw) and Myleinae (including the lepidophagous Catoprion;
diagnosed by the presence of two rows of premaxillary teeth).
Later classiﬁcations (Géry, 1977; Gosline, 1951; Norman, 1929) dif-
fered only by assignment of taxonomic rank. Machado-Allison
(Machado-Allison, 1982, 1983, 1985; Machado-Allison et al.,
1989) performed the ﬁrst cladistic analysis on serrasalmids using
morphological characters (Fig. 1A). His resulting hypothesis
divided the family into two major clades, one with frugivorous spe-
cies, and another that corresponds to carnivorous species with a
single row of tricuspid teeth. This hypothesis is congruent with
Eigenmann’s except that the genera Catoprion and Metynnis are
placed in the Serrasalminae (or “piranha clade”).
Ortí et al. (1996) inferred the ﬁrst molecular phylogeny of Ser-
rasalmidae using mitochondrial DNA (mtDNA) 12S and 16S rRNA
markers (Fig. 1B). They found three major lineages: (1) a “pacu
clade” of herbivores (Colossoma, Mylossoma, Piaractus); (2) a “Myleus
clade” with the other pacu species (Myleus, Mylesinus, Tometes,
Ossubtus); and (3) a “piranha clade” (Serrasalmus, Pygocentrus,
Pygopristis, Pristobrycon, Catoprion, Metynnis). The genus Acnodon
was placed as the sister taxon of clades 2 and 3. The slowly evolv-
ing mitochondrial rRNA markers did not provide enough resolution
to conﬁdently resolve relationships within these clades. A subse-
quent study (Ortí et al., 2008) based on an augmented mtDNA
dataset with sequences of the highly variable mitochondrial Control
Region (D-loop) also supported this hypothesis (Fig. 1B), but
obtained non-monophyletic assemblages for the genera Pristobrycon,
Myloplus, and Tometes. The phylogeny of Serrasalmidae
resolved into three clades also was obtained with a new analysis
of 102 morphological characters that included the recently
described Miocene fossil Megapiranha paranensis (Fig. 1C; Cione
et al., 2009; Dahdul, 2007). Although the latter two hypotheses
(Fig. 1B and C) are generally congruent, several conﬂicting relation-
ships, low resolution (speciﬁcally in the Myleus clade), and the
indication of non-monophyletic genera remain problematic. Addi-
tional molecular studies addressed relationships among species
within the piranha clade, focusing mostly on the ichthyofauna of
the Orinoco (Venezuela) or Madeira basin in the Bolivian Amazon
(Freeman et al., 2007; Hubert et al., 2007), but as before, these
were based solely on mtDNA sequences. A time-calibrated phylog-
eny for piranhas was proposed using a crude estimate of mtDNA
substitution rate based on the age of separation of the Amazon
and Orinoco basins (Hubert et al., 2007).
Development of new molecular markers (especially nuclear
loci) and advances in genomics have allowed for the expansion
of molecular datasets to increase accuracy in phylogenetic estima-
tion. But larger and more complex datasets bring new methodolog-
ical challenges. Perhaps the most studied of these issues has been
the relationship between discordant gene trees and their contain-
ing species tree, a prosperous new research area in systematics
that has spawned a diversity of new methods to estimate species
trees (Degnan and Rosenberg, 2006, 2009; Edwards, 2008; Heled
and Drummond, 2010; Huang and Knowles, 2009; Knowles,
2009; Knowles and Carstens, 2007; Kubatko et al., 2009; Larget
et al., 2010; Liang and Pearl, 2007; Liu, 2008; Maddison, 1997;
Maddison and Knowles, 2006; Wu, 2011). It is widely accepted that
three biological processes can give rise to gene tree discordances:
(1) xenology (horizontal gene transfer via hybridization or intro-
gression), (2) gene duplication and extinction (hidden paralogy),
and (3) incomplete lineage sorting (ILS, or deep coalescence)
(Maddison, 1997). ILS has been addressed in phylogenetic models
to analyze multi-gene data to infer species trees (Maddison,
1997; Maddison and Knowles, 2006), and a number of computer
programs currently are available that implement these so-called
‘‘species tree methods’’. Although the confounding effect of ILS
on phylogenetic inference has been shown using simulations
(Carstens and Knowles, 2007; Chung and Ane, 2011; Heled and
Drummond, 2010; Wu, 2011), recent studies including angio-
sperms, insects, mammals, birds and ﬁshes (Carstens and
Knowles, 2007; Hobolth et al., 2011; Hollingsworth and Hulsey,
2011; Willis et al., 2007) demonstrate that this process also can
be pervasive in empirical datasets, and may even generate discor-
dance on estimates of ancient divergences (Oliver, 2013). Although
the use of species tree methods has grown extensively in recent
years, analysis via concatenation of multiple loci still is the most
common method of molecular phylogenetic inference. Many simulation
studies have addressed the efficacy of diverse species tree methods,
but only a few studies have compared them or their results
versus concatenation using an empirical dataset (Betancur-R.
et al., 2013; Camargo et al., 2012; Faircloth et al., 2012, 2013;
Huang et al., 2010; McCormack et al., 2012; Sen et al., 2012).
Two main types of species-tree methods exist to reconcile gene-
alogical discordances of genes under coalescent theory: tree-based
and sequence-based methods. Tree-based methods estimate the
species tree topology with branch lengths using as input individual
gene trees inferred independently (and by any criterion). These
approaches use neighbor joining, parsimony, or maximum likeli-
hood optimization to either minimize deep coalescence or to aver-
age ranks of coalescent events in the species tree (Kubatko et al.,
2009; Liu et al., 2009b; Maddison, 1997; Than and Nakhleh,
2009; Wu, 2011). Sequence-based approaches use likelihood mod-
els derived from the coalescent theory to simultaneously maximize
species tree and gene tree probability in a Bayesian framework
(Edwards et al., 2007; Heled and Drummond, 2010; Liu, 2008).
Here, we explore both types of analyses using a new empirical
dataset and compare results obtained by each approach.
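As a rough illustration of what topological discordance among gene trees means in practice, the sketch below counts the clades that two rooted gene trees do not share. It is not one of the species-tree methods cited above; the nested-tuple tree format, the function names, and the four-taxon toy trees are assumptions made only for this example.

```python
from itertools import combinations


def clades(tree):
    """Return the non-trivial clades (frozensets of tip names) of a rooted
    tree written as nested tuples, e.g. (("A", "B"), "C")."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a leaf is just its own name
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                 # record the clade below this node
        return members

    all_tips = tips(tree)
    found.discard(all_tips)                # the root clade is shared by every tree
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical gene trees (toy data, not results from this study)
gene_trees = {
    "locus1": ((("Serrasalmus", "Pygocentrus"), "Metynnis"), "Piaractus"),
    "locus2": ((("Serrasalmus", "Pygocentrus"), "Metynnis"), "Piaractus"),
    "locus3": ((("Serrasalmus", "Metynnis"), "Pygocentrus"), "Piaractus"),
}

for a, b in combinations(gene_trees, 2):
    print(a, b, clade_distance(gene_trees[a], gene_trees[b]))
```

A distance of zero means the two loci support the same rooted topology; larger values indicate more conflicting clades, which is the kind of signal species-tree methods attempt to reconcile.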
The present study investigates the phylogeny of serrasalmids on
the basis of a new multilocus dataset with DNA sequences of 10
nuclear genes plus the mtDNA control region, including representatives
of all extant genera with the exception of Utiaritichthys. We
compare results of analyses of a concatenated dataset versus ﬁve
methods of species tree inference and estimate a time-calibrated
phylogeny for this group based on two fossil calibrations. Our goal
is to use this new dataset to test previous hypotheses of relation-
ships among serrasalmids (so far only based on morphology and
mtDNA), to assess monophyly of problematic genera, and to estab-
lish a time frame for major divergences within this family. Despite
a well-known fossil record, no previous study has produced esti-
mates of serrasalmid divergence times based on fossil-calibrated
molecular phylogenies. Finally, comparison of species tree
Fig. 1. Previous phylogenetic hypotheses for genera in the family Serrasalmidae. (A) Morphological hypothesis by Machado-Allison (Machado-Allison, 1982, 1983, 1985;
Machado-Allison et al., 1989), (B) mtDNA phylogeny by Ortí et al. (1996), and (C) morphological hypothesis by Cione et al. (2009) after Dahdul (2007). Major lineages are
colored as follows: red = Pygocentrus; blue = Serrasalmus; cyan = Metynnis; green = Catoprion, Pristobrycon, and Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes,
Ossubtus; orange = Acnodon; purple = Mylossoma, Colossoma, and Piaractus.
methods is intended to explore the consistency of these
approaches to explain gene tree discordance using empirical
2. Materials and methods
2.1. Taxon sampling
Tissue samples were obtained from existing collections at the
Royal Ontario Museum, Auburn University, The Smithsonian Trop-
ical Research Institute, Academy of Natural Sciences in Philadel-
phia, and the personal tissue collection of G. Ortí (Table 1). With
the exception of the genus Utiaritichthys, all extant genera were
sampled for this study. Taxonomic names used follow M. Jégu’s
recommendations (Jégu, 2003; Jégu et al., 2004). A total of 38 nom-
inal species are represented in the sample (about 44% of all valid
species). Specimens were collected from all major rivers in South
America, ranging from the Orinoco basin in Venezuela, through
the Amazonas, to the Paraná basin in Argentina (Table 1).
2.2. Molecular markers
For specimens listed in Table 1, we sequenced fragments of 10
nuclear genes and the mitochondrial DNA (mtDNA) control region
(Dloop). The latter was sequenced to augment mtDNA datasets
generated by previous studies on serrasalmids (Freeman et al.,
2007; Ortí et al., 1996, 2008). Two of the nuclear markers (SREB
and PTR) are conserved exons (Li et al., 2007), seven (4174E20,
25073E1, 25073E2, 35692E1, 36298E1, 14867E1, 55305E1) are
new exon-priming intron-crossing (EPIC) markers (Li et al.,
2010), and one EPIC marker (GPD) has been reported before
(Hassan et al., 2002). We chose to sequence non-coding intronic
alleles as they can be more variable and contain more phylogenetic
signal relative to protein-coding exon markers. Exons were chosen
for their prevalent use in higher taxonomic level studies (ETOL:
Euteleost Tree of Life project (Betancur-R. et al., 2013; Li et al.,
Total genomic DNA was extracted from fin clips or muscle tissue
stored in 95% ethanol using a Qiagen DNeasy Blood and Tissue
extraction kit. Primers used for DNA amplification are listed in
Table 2. All markers were amplified by PCR in 30 μL reactions containing
[…] μL dNTPs (1 mM each), 3 μL reaction buffer (10×), 1 μL
of each primer (10 μM each), 0.15 μL of Takara Ex Taq DNA polymerase,
[…] μL of template DNA and 20.45 μL H2O. PCR conditions
for Dloop were as follows: 94 °C (30 s), 34 cycles of 94 °C (30 s),
54 °C (1 min), 72 °C (1:30 min), followed by 72 °C (5 min). All EPIC
markers were amplified by touchdown PCR as follows: 94 °C
(1 min), 16 cycles of 98 °C (15 s), 58 °C (30 s), 72 °C (1:30 min), followed
by 16 cycles of 94 °C (45 s), 54 °C (30 s), 72 °C (1:30 min), followed
by 72 °C (10:00 min). Exon markers were amplified using
nested PCR as follows: 95 °C (30 s), 28 cycles of 98 °C (10 s),
55 °C (30 s), 72 °C (45 s), followed by 72 °C (5 min) for primer set
one and then 95 °C (30 s), 28 cycles of 98 °C (10 s), 62 °C (30 s),
72 °C (45 s), followed by 72 °C (5 min) for the nested primer set
2. M13 clones (see below) were amplified with PCR for sequencing.
Amplicons obtained were submitted for puriﬁcation and sequenc-
ing to High Throughput Sequencing Solutions (HTSeq.org), at the
University of Washington, Seattle, Washington.
Non-coding nuclear alleles often contain length variant hetero-
zygosity (LVH). This is problematic with Sanger sequencing
approaches of PCR amplicons as LVH muddles chromatograms
when allelic variants are not isolated before sequencing. We
observed LVH in EPIC sequences by initial sequencing and subsequently
isolated the allelic variants by cloning the PCR products using a Promega
pGEM T Vector System kit, following the manufacturer’s
protocol. Four to eight clones isolated from each PCR product were
sequenced to conﬁrm LVH between the two allelic sequences. In
the few cases where cloning failed, we used the readable, high
quality portion of the sequence. When two alleles that differed
by two or more base pairs were present in un-cloned samples (evi-
dent by two or more double peaks with similar intensity in chro-
matograms), we created a strict consensus sequence using IUPAC
standard code and used this sequence for both alleles in phyloge-
netic analyses. We used tandem repeat ﬁnder (Sanders and Lee,
2007) to ﬁnd and remove tandem repeat regions in Dloop
sequences to ensure positional homology during alignment (see
Table S1 in Supplementary material for removed regions).
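The strict consensus with IUPAC codes described above for un-cloned heterozygous samples can be sketched as follows. This is a minimal illustration, assuming two aligned allele strings of equal length that contain only A, C, G and T; the function name and the example sequences are hypothetical, and gaps or missing data are deliberately not handled.

```python
# IUPAC ambiguity codes for one- and two-nucleotide states
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def strict_consensus(allele_1, allele_2):
    """Collapse two aligned alleles into one sequence, writing an IUPAC
    ambiguity code wherever the alleles differ."""
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles must be aligned to the same length")
    return "".join(
        IUPAC[frozenset((a, b))]           # KeyError for gaps, N, etc.
        for a, b in zip(allele_1.upper(), allele_2.upper())
    )


# Hypothetical alleles differing at two sites
print(strict_consensus("ACGTAC", "ACATGC"))
```

Each heterozygous site becomes a single ambiguity symbol (R for A/G in this toy case), so the consensus keeps the information that the site is polymorphic without asserting which allele carries which base.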
Sequences for each molecular marker were aligned separately
using 50 iterations in SATe (Liu et al., 2009a) via the MAFFT aligner
(Katoh et al., 2005). Problematic alignments with poor homology
regions were then reﬁned in Geneious v5.6 (Geneious) by visual
inspection or using the MUSCLE plug-in (Edgar, 2004) with default
2.3. Phylogenetic analyses
Data partitions by gene and by codon position (where applica-
ble) were deﬁned for phylogenetic analysis. Each gene and codon
position of exon markers was assigned a substitution model using
jModelTest (Posada, 2008) or Treefinder (Jobb et al., 2004). These
models were used for all analyses (see Table 3), except when
implementing the software RAxML (Stamatakis, 2006) for which
we used GTRCAT for the bootstrapping phase and GTRGAMMA
for the ﬁnal ML optimization. Branch support was assessed using
RAxML’s rapid bootstrapping algorithm, with automatic halting
of bootstrap searches.
A total of 53 individuals (38 nominal species) assigned to 40
operational taxonomic units (OTUs) were sampled to represent
the diversity in the family Serrasalmidae and sequenced for 11 loci
(see below and Table 1). This dataset is 99% complete (most indi-
viduals have sequences for all 11 loci) and 100% complete for 40
OTUs (all species have sequences for all 11 loci).
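The completeness figures quoted above are simple proportions of filled cells in a specimen-by-locus matrix. The sketch below shows the arithmetic on a toy matrix; the specimen labels, locus names and sequences are hypothetical and do not reproduce the study's data.

```python
def completeness(matrix):
    """Fraction of specimen-by-locus cells that contain a sequence.

    `matrix` maps specimen name -> {locus name: sequence or None}.
    """
    total = sum(len(loci) for loci in matrix.values())
    filled = sum(1 for loci in matrix.values() for seq in loci.values() if seq)
    return filled / total


# Toy matrix: three specimens, four loci, one missing sequence
toy = {
    "specimen_1": {"locA": "ACGT", "locB": "ACGT", "locC": "ACGT", "locD": "ACGT"},
    "specimen_2": {"locA": "ACGT", "locB": None,   "locC": "ACGT", "locD": "ACGT"},
    "specimen_3": {"locA": "ACGT", "locB": "ACGT", "locC": "ACGT", "locD": "ACGT"},
}

print(f"{completeness(toy):.1%} complete")
```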
2.3.1. Comparison of concatenation and species tree methods
The dataset was analyzed to compare results of concatenation
versus species-tree methods, and to estimate a time-calibrated
phylogeny for serrasalmids. Analyses of the concatenated dataset
were conducted using maximum likelihood (ML) and Bayesian
approaches, as implemented in RAxML and BEAST (Drummond
and Rambaut, 2007), respectively. Allelic variation at each gene
was collapsed to a single consensus sequence (using ambiguity
codes when necessary) to represent the individual in the concatenated
file. The RAxML analyses were partitioned, with unlinked
GTRCAT models assigned to each partition. Partitions were defined
a priori by gene type and by codon position for exons, as follows:
nine partitions of nine individual non-coding markers and three
partitions for all exon genes, resulting in twelve partitions for the
concatenated dataset (see details in Table 3). We chose a maxi-
mum likelihood tree obtained with 50 independent RAxML runs
to compare with trees obtained with other methods. Details for
the concatenated BEAST analyses are provided below under the
divergence time estimates (see also individual gene tree inference
We implemented coalescent-based species tree methods using
both the sequence-based method *BEAST (Heled and Drummond,
2010) and tree-based methods such as STAR (Liu et al., 2009b),
STEM (Kubatko et al., 2009), STELLS (Wu, 2011), and MDC
(Maddison, 1997; Than and Nakhleh, 2009). Whereas the former
takes a DNA alignment as input and computes gene trees and spe-
cies trees simultaneously, the latter directly uses gene trees as
input (obtained independently by any criterion) to infer the under-
Table 1
Specimens and molecular markers used in this study. For each specimen, its taxonomic designation (species name with collection number), collection locality, locus name, and GenBank accession numbers are given. Loci with two
accession numbers for a single specimen represent heterozygote genotypes for which both alleles were sequenced. Asterisks highlight loci for which alleles were determined by cloning or phased out due to one polymorphism.
ROM: Royal Ontario Museum. GOlab: Ortí lab tissue collection. ANSP: Academy of Natural Sciences of Philadelphia. AUM: Auburn University Museum. STRI: Smithsonian Tropical Research Institute. INPA: Instituto Nacional de
Pesquisa Amazonica. N. Lujan: specimens and tissues collected by Nathan Lujan.
Taxon Source Tissue voucher
Collection locality 55305-E1 Gpd 25073-E2 14867-E1 25073-E1 35692-E1 36298-E1 4174-E20 SREB PTR D-loop
Acnodon normani 243 GOlab 243 Xingu River, Brazil KC132301
KC131776 KC131695 KC131703 KC131628 KC132212 KC131867 KC132373 KC132239 AF284460
Acnodon oligacanthus 403 GOlab 403 Rio Maroni, Guyana KC132356 KC131846 KC131684 KC131705 KC131585 KC132233 KC131864 KC132374 KC132241 KC131907
Catoprion mento 80 GOlab 80 Unknown KC132304 KC132075 KC131781 KC131689 KC131769 KC131622 KC132205 KC131898 KC132379 KC132245
Catoprion mento 2096 ROM 2096/6600 Rupununi River, Guyana KC132303 KC132074 KC131780 KC131688 KC131768 KC131621 KC132204 KC131899 KC132377 KC132244 KC131912
Colossoma macropomum 216 GOlab 216 Rio Solimoes, Brazil KC132360 KC132073 KC131778 KC131697 KC131701 KC131625 KC132210 KC131904 KC132375 KC132243 AF283963
Metynnis aff. argenteus 1471 ROM 1471 Iriri River-Xingu, Brazil KC132348 KC132076 KC131848 KC131690 KC131773 KC131567 KC132165 KC132381 KC132246 KC131915
Metynnis cf luna 2055 ROM 2055/6559 Rupununi River, Guyana KC132350 KC132080 KC131849 KC131691 KC131775 KC131568 KC132162
KC132384 KC132249 KC131920
Metynnis hypsauchen 238 ROM 238 Río Cinaruco, Venezuela KC132351 KC132079 KC131850 KC131692 KC131774 KC131569 KC132163 KC132383 KC132248 KC131918
GOlab BR1018 Brazil KC132338
KC131826 KC131679 KC131711 KC131578 KC132226 KC132386 KC132252 KC131926
GOlab BR1006 Brazil KC132336
KC131825 KC131680 KC131709 KC131577 KC132228 KC131853 KC132385 KC132251 KC131925
Myleus setiger 1385 ROM 1385 Iriri River-Xingu, Brazil KC132362 KC132085 KC131839 KC131672 KC131708 KC131579 KC132218 KC131860 KC132391 KC132259 KC131948
Myloplus cf. planquettei ANSP No tag Essequibo River,
KC132339 KC132081 KC131828 KC131683 KC131718 KC131573 KC132220 KC131857 KC132387 KC132253 KC131927
Myloplus rhomboidalis 1555 ROM 1555 Iriri River-Xingu, Brazil KC132345 KC132083 KC131841 KC131694 KC131707 KC131575 KC132214 KC131862 KC132388 KC132254 KC131929
Myloplus rubripinnis 2090 ROM 2090/6594 Rupununi River, Guyana KC132358 KC131837 KC131682 KC131723 KC131583 KC132231 KC131861 KC132389 KC132255 KC131936
Myloplus schomburgkii 233 INPA 233 Rio Urubu, Brazil KC132346 KC132084 KC131842 KC131671 KC131724 KC131571 KC132216 KC131863 KC132390 KC132257 AF283968
Myloplus torquatus P4661 AUM P4661 Unknown KC132340 KC132086 KC131831 KC131677 KC131714 KC131581 KC132224 KC131854 KC132392 KC132260 KC131963
Mylossoma duriventre 203 INPA 203 Rio Solimoes, Brazil KC132359 KC132077 KC131844 KC131699 KC131702 KC131629 KC132161 KC132382 KC132247 AF283961
Ossubtus xinguense 1636 ROM 1636 Iriri River-Xingu, Brazil KC132352 KC132087 KC131840 KC131681 KC131706 KC131582 KC132215 KC131866 KC132393 KC132261 KC131964
Piaractus brachypomus 200 GOlab 200 Rio Solimoes, Brazil KC132365 KC131847 KC131696 KC131700 KC131627 KC132211 KC131903 KC132394 KC132262 AF283958
Pristobrycon calmoni T05407 ROM T05407 Waini River, Guyana KC132312 KC132099 KC131818 KC131631 KC131762 KC131595 KC132182 KC131891 KC132406 KC132274 KC131967
Pristobrycon stiolatus 248 ROM 248 Río Cinaruco, Venezuela KC132334 KC132096 KC131785 KC131685 KC131766 KC131619 KC132202 KC131896 KC132402 KC132269 KC131983
Pristobrycon stiolatus 400 ROM 400 Rio Maroni, Guyana KC132335 KC132097 KC131786 KC131693 KC131767 KC131620 KC132203 KC131897 KC132403 KC132271 KC131984
Pygocentrus nattereri T06576 ROM T06576 Pirara River, Guyana KC132306 KC132092 KC131787 KC131667 KC131743 KC131612 KC132168 KC131892 KC132398 KC132266 KC131979
Pygocentrus nattereri 3245 STRI 3245 Amazon River, Peru KC132307 KC132090 KC131788 KC131668 KC131745 KC131613 KC132199 KC131893 KC132397 KC132265 KC131976
Pygocentrus piraya 2595 ROM 2595/7151 Rio Sao Francisco, Brazil KC132332 KC132093 KC131790 KC131666 KC131728 KC131614 KC132166 KC131894 KC132399 KC132267 KC131980
Pygocentrus piraya 2596 ROM 2596/7152 Rio Sao Francisco, Brazil KC132316 KC132095 KC131791 KC131665 KC131726 KC131615 KC132167 KC131895 KC132400 KC132268 KC131981
Pygopristis denticulata T05397 ROM T05397 Moruka River, Guyana KC132354 KC132089 KC131783 KC131687 KC131771 KC131624 KC132201 KC131900 KC132396 KC132264 KC131970
Pygopristis denticulata PNG 141 GOlab PNG 141 Unknown KC132353 KC132088 KC131782 KC131686 KC131770 KC131623 KC132200 KC131902 KC132395 KC132263 KC131969
Serrasalmus altuvei 246 ROM 246 Rio Cinaruco, Venezuela KC132308 KC131804 KC131653 KC131741 KC131586 KC132179 KC131856 KC132404 KC132272
Serrasalmus altuvei 247 ROM 247 Rio Cinaruco, Venezuela KC132313 KC132098 KC131806 KC131652 KC131731 KC131589 KC132180 KC131871 KC132405 KC132273 KC131986
GOlab 221 Rio Solimoes, Brazil KC132309 KC132100 KC131817 KC131640 KC131733 KC131593 KC132189 KC131875 KC132407 KC132275 KC131989
Serrasalmus eigenmanni 2121 ROM 2121/6625 Rupununi River, Guyana KC132325 KC132101 KC131813 KC131641 KC131761 KC131610 KC132176 KC131876 KC132408 KC132276 KC131990
Serrasalmus eigenmanni BR928 GOlab BR928 Brazil KC132328 KC132102 KC131814 KC131660 KC131748 KC131606 KC132170 KC131890 KC132409 KC132277 KC131991
Serrasalmus gouldingi V5270 AUM V5270 Venezuela KC132327 KC132105 KC131810 KC131645 KC131760 KC131601 KC132174 KC131889 KC132411 KC132279 KC131993
Serrasalmus gouldingi P4660 AUM P4660 Unknown KC132319 KC132104 KC131797 KC131643 KC131754 KC131598 KC132175 KC131887 KC132410 KC132278 KC131992
Serrasalmus humeralis 6938 ANSP 6938 Lawa River, Suriname KC132326 KC132107 KC131816 KC131644 KC131765 KC131607 KC132172 KC131884 KC132413 KC132281 KC131995
Serrasalmus humeralis 6928 ANSP 6928 Lawa River, Suriname KC132329 KC132106 KC131815 KC131646 KC131764 KC131608 KC132171 KC131885 KC132412 KC132280 KC131994
Serrasalmus manueli V5269 AUM V5269 Venezuela KC132324 KC132110 KC131823 KC131634 KC131750 KC131597 KC132198 KC131882 KC132416 KC132285 KC132001
Serrasalmus manueli 4331 ANSP 4331 Rio Orinoco, Venezuela KC132321 KC132109 KC131796 KC131635 KC131755 KC131600 KC132197 KC131881 KC132415 KC132283 KC131998
Serrasalmus manueli 1380 ROM 1380 Iriri River-Xingu, Brazil KC132320 KC132108 KC131795 KC131658 KC131749 KC131599 KC132187
KC132414 KC132282 KC131997
Serrasalmus marginatus 2550 STRI 2550 Rio Parana, Argentina KC132317 KC132111 KC131807 KC131655 KC131729 KC131591 KC132184 KC131878 KC132417 KC132286 KC132003
Serrasalmus medinai 413 GOlab 413 Unknown KC132323 KC132112 KC131812 KC131636 KC131737 KC131605 KC132179 KC131880 KC132418 KC132287 KC132004
Serrasalmus rhombeus P6307 ANSP P6307 Rio Nanay, Peru KC132330
KC131809 KC131642 KC131735 KC131590 KC132209 KC131877 KC132423 KC132290 KC132010
Serrasalmus rhombeus 220 GOlab 220 Rio Negro-Solimoes,
KC132364 KC132114 KC131803 KC131637 KC131738 KC131603 KC132183 KC131879 KC132422 KC132289 AF283951
Serrasalmus rhombeus 1353 ROM 1353 Iriri River-Xingu, Brazil KC132310 KC132113 KC131799 KC131649 KC131734 KC131592 KC132190 KC131873 KC132420 KC132288 KC132006
Serrasalmus spilopleura 604 STRI 604 Amazon River, Peru KC132315 KC132116 KC131793 KC131632 KC131763 KC131618 KC132193 KC131872 KC132426 KC132292 KC132019
Serrasalmus spilopleura 2326 STRI 2326 Rio Parana, Argentina KC132333 KC131792 KC131633 KC131753 KC131617 KC132195 KC131874 KC132425 KC132291 KC132016
Tometes sp. 246 GOlab 246 Rio Xingu, Brazil KC132363 KC131835 KC131673 KC131717 KC131576 KC132230 KC131856 KC132427 KC132294 KC132021
Tometes sp. 7022 ANSP 7022 Litanie River, Suriname KC132343 KC132117 KC131830 KC131670 KC131720 KC131574 KC132219 KC131858 KC132428 KC132296 KC132024
Serrasalmus sp. 1 N.
UID 1 Unknown KC132318
KC131801 KC131661 KC131616 KC132207 KC131869 KC132372 KC132297 KC132029
Myloplus sp. 11 N.
UID 11 Unknown KC132347 KC132118 KC131833 KC131674 KC131721 KC131572 KC132221 KC131851 KC132429 KC132298 KC132031
Serrasalmus sp. 19 N.
UID 19 Unknown KC132314 KC132119 KC131798 KC131656 KC131752 KC131611 KC132185 KC131870 KC132430 KC132299 KC132039
Myloplus sp. 4 N.
UID 4 Unknown KC132342 KC132120 KC131832 KC131676 KC131713 KC131580 KC132222 KC131852 KC132431 KC132300 KC132062
lying species tree. To estimate individual gene trees, we imple-
mented the uncorrelated lognormal model (UCLN) of rate variation
(Drummond et al., 2006) in BEAST v1.7.0 with a gamma prior dis-
tribution of the mean to construct trees for each of the 11 genes.
For each gene alignment, the Markov chain Monte Carlo (MCMC)
was run for 5.0 × 10[…] generations. All individual exon alignments
were partitioned by codon position (see Table 3 for models). Each
gene and codon position of exon markers was assigned a well-fitted
substitution model using jModelTest or Treefinder. Because
some species-tree methods require rooted gene trees (e.g., STEM),
they were all rooted with Piaractus, arbitrarily chosen from the earliest
branching pacu clade (Colossoma, Mylossoma, Piaractus)
among serrasalmids. A basal branching of this clade is supported
by phylogenetic results from previous molecular studies
(Calcagnotto et al., 2005; Javonillo et al., 2010; Oliveira et al.,
2011; Ortí et al., 1996, 2008). Convergence of BEAST runs was assessed
using likelihood plots and effective sample sizes (ESS) estimated
in Tracer v1.5 (Rambaut and Drummond, 2007), and
the first 20% of the posterior distribution was discarded as burn-in.
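The two post-processing steps mentioned above, discarding a 20% burn-in and checking the effective sample size, can be sketched as follows. The ESS function is a generic autocorrelation-based estimator (sum of autocorrelations truncated at the first non-positive lag) and is only an assumption for illustration; it is not claimed to reproduce the calculation implemented in Tracer, and the function names and toy chains are hypothetical.

```python
def discard_burn_in(samples, fraction=0.2):
    """Drop the first `fraction` of an MCMC trace."""
    start = int(len(samples) * fraction)
    return samples[start:]


def effective_sample_size(samples):
    """Generic ESS estimate: n / (1 + 2 * sum of positive autocorrelations),
    with the sum truncated at the first non-positive lag."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(
            (samples[i] - mean) * (samples[i + lag] - mean)
            for i in range(n - lag)
        ) / n
        rho = cov / var
        if rho <= 0:                       # stop once correlation has decayed
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Toy traces (not output from this study)
sticky_chain = [0.0] * 50 + [1.0] * 50     # strongly autocorrelated
kept = discard_burn_in(list(range(10)))    # keeps the last 80% of the states
print(kept, effective_sample_size(sticky_chain))
```

A chain that moves slowly through parameter space yields an ESS far below the number of recorded states, which is the usual signal that a run should be extended or sampled less densely.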
*BEAST implements a sequence-based coalescent approach that
estimates the posterior probability of all gene trees and the species
tree simultaneously. The *BEAST species-tree analyses were
conducted using the two alleles from each individual, separately for
each gene. Homozygous sequences of nuclear genes were duplicated
for species tree methods (the two alleles must be present).
If only one polymorphism was present, each allele was assigned
a different nucleotide at the polymorphic character. Sequences with
more than one polymorphism were phased by cloning. An
asterisk in Table 1 highlights successfully phased sequences, and
each phased heterozygous allele received a different GenBank
accession number. When cloning failed, ambiguities
were used to create a consensus sequence that was duplicated
to represent two alleles. The MCMC was run for 2.0 × 10^[?] generations
(see concatenated BEAST analysis for rate and node density
priors) and the first 20% of the recorded state parameters were
discarded as burn-in. The output of *BEAST is an ultrametric tree that
depicts species or OTUs (not individual organisms or sequences) as
terminals, requiring an a priori assignment of individuals to spe-
cies. In our analysis, we did not constrain the monophyly of Serra-
salmus rhombeus and Tometes sp. because preliminary results and
previous work (Ortí et al., 2008) suggested that these may be spe-
cies complexes that require further taxonomic treatment. Instead,
we used a conservative approach to test whether these nominal
species are monophyletic. Each individual S. rhombeus and Tometes
sp. was treated as a separate “species” or OTU in *BEAST as well as
in other species-tree methods. This resulted in an analysis with 38
nominal species and 53 individuals, which were assigned to 40
operational taxonomic units. All other individuals were assigned
to their nominal species. Because *BEAST and BEST are currently
the only programs that infer a species tree using sequence data
in a Bayesian framework, we attempted to compare the output
from both programs.
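The allele-construction rules described above (duplicate homozygotes, split a single polymorphism into two alleles, and send anything more complex to cloning) can be sketched as follows. This is an illustration only: it assumes that heterozygous sites are coded with two-base IUPAC ambiguity symbols, and the function name and the use of None to flag sequences that need cloning are hypothetical choices, not the procedure actually scripted for this study.

```python
# Two-base IUPAC ambiguity codes and the nucleotides they represent.
IUPAC_HET = {
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}


def make_alleles(genotype):
    """Return two allele sequences for one diploid genotype sequence.

    - no heterozygous site: the sequence is duplicated
    - exactly one heterozygous site: each allele gets one of the two bases
    - more than one: None, meaning the phase must be resolved by cloning
    """
    seq = genotype.upper()
    het_sites = [i for i, base in enumerate(seq) if base in IUPAC_HET]
    if not het_sites:
        return seq, seq
    if len(het_sites) == 1:
        i = het_sites[0]
        first, second = IUPAC_HET[seq[i]]
        return seq[:i] + first + seq[i + 1:], seq[:i] + second + seq[i + 1:]
    return None


# Usage on toy sequences (not real data).
homozygote = make_alleles("ACGT")      # duplicated
one_site = make_alleles("ACRT")        # split at the R (A/G) site
needs_cloning = make_alleles("AYRT")   # two heterozygous sites
```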
Table 2. PCR primers used in this study.

Marker name         Primers          Sequence (5′–3′)              Reference
4174-E20            4174E20f         CTYTCGCTGGCTTTGTCTCAAATCA     Li et al. (2010)
                    4174E20r2        CTTTTACCATCKCCACTRAAATCCAC    Li et al. (2010)
25073-E1            25073E1f2        CGTYTCCCAGCTSAGGAAGATGAA      Li et al. (2010)
                    25073E1r2        GTACTCTCKGTACATGTTGTGRGTKCC   Li et al. (2010)
35692-E1            35692E1f2        CCAAGAAGGACTGGTAYGATGTCAAGG   Li et al. (2010)
                    35692E1r2        ACTTCTTVACCATGGAGCACATCTTGT   Li et al. (2010)
36298-E1            36298E1f2        GATCCTGAGGGAYTCCCAYGGTGT      Li et al. (2010)
                    36298E1r2        GGGCCAGGACTCTCYTGGTCTTGTAGT   Li et al. (2010)
25073-E2            25073E2f2        GAAGGTGAARAACTTTGGBATCTGG     Li et al. (2010)
                    25073E2r2        ATGACCTGSACCTTCATGATYTGG      Li et al. (2010)
14867-E1            14867E1f2        CCACAARTACAAGGCCAAGAGRAACTG   Li et al. (2010)
                    14867E1r2        GTTCTCCTTSTCCTGSACGGTCTT      Li et al. (2010)
55305-E1            55305E1f2        CCTAGTGGACTGTARTAACGCCCCYCT   Li et al. (2010)
                    55305E1r2        AAGCCATCCAGTTTGCATAAACACTATC  Li et al. (2010)
GPD                 Gpd2F            GCCATCAATGACCCCTTCATCG        Hassan et al. (2002)
                    Gpd3R            TTGACCTCACCCTTGAAGCGGCCG      Hassan et al. (2002)
PTR                 PTR_F458-1st     AGAATGGATWACCAACACYTACG       Li et al. (2007)
                    PTR_R1248-1st    TAAGGCACAGGATTGAGATGCT        Li et al. (2007)
                    PTR_F463-2nd     GGATAACCAACACYTACGTCAA        Li et al. (2007)
                    PTR_R1242-2nd    ACAGGATTGAGATGCTGTCCA         Li et al. (2007)
SREB                SREB2_F10-1st    ATGGCGAACTAYAGCCATGC          Li et al. (2007)
                    SREB2_R1094-1st  CTGGATTTTCTGCAGTASAGGAG       Li et al. (2007)
                    SREB2_F27-2nd    TGCAGGGGACCACAMCAT            Li et al. (2007)
                    SREB2_R1082-2nd  CAGTASAGGAGCGTGGTGCT          Li et al. (2007)
Dloop               FTTF             GCCTAAGAGCATCGGTCTTGTAA       Ortí et al. (2008)
                    F12R             GTCAGGACCATGCCTTTGTG          Ortí et al. (2008)
                    FTTF2            CTAACTCCCAAAGCTAGTATT         Ortí et al. (2008)
                    F12R2            CTACACTAGCTACAACTATATAA       Ortí et al. (2008)
                    FDLR3            GTTTTGGGGTTTGACAGGA           Ortí et al. (2008)
                    PMDLR3           TAATGCATATTATCCTTGAT          Ortí et al. (2008)
M13 cloning vector  M13-F            GGTTTTCCCAGTCACGAC

In addition to sequence-based methods, we explored four
methods that use individual gene trees as input, for which we used
the gene trees obtained from BEAST, as explained above. The STAR
approach (Species Tree estimation using Average Ranks of coalescences)
uses a neighbor-joining algorithm to estimate a species
tree from the average ranks of coalescence taken from the gene
trees. STAR estimates the underlying organismal topology ignoring
branch length information, so the resulting tree is a cladogram. All
gene trees must be rooted consistently with a single sequence, and
we did so by re-rooting the BEAST gene trees with P. brachypomus
allele 2 as a representative of the Piaractus clade. All other species
trees also were rooted with this clade.
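The rank-averaging step behind STAR can be illustrated with a small sketch. It follows the commonly described ranking scheme, in which the root of each rooted gene tree receives a rank equal to the number of taxa and ranks decrease by one toward the tips, and the pairwise distance is taken as twice the average rank of the node where two taxa coalesce. Several things are assumptions made for illustration: gene trees are written as nested tuples, each species is represented by a single sequence, all trees contain the same taxa, and the function names are hypothetical. The neighbor-joining step that STAR then applies to the distance matrix is omitted.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def pair_ranks(tree, rank, ranks):
    """Record, for each pair of leaves, the rank of the node where they join."""
    if isinstance(tree, str):
        return
    child_leaf_sets = [leaves(child) for child in tree]
    for left, right in combinations(child_leaf_sets, 2):
        for a in left:
            for b in right:
                ranks[frozenset((a, b))] = rank
    for child in tree:
        pair_ranks(child, rank - 1, ranks)


def star_distances(gene_trees):
    """Twice the average coalescence rank for every pair of taxa."""
    totals = {}
    for tree in gene_trees:
        ranks = {}
        pair_ranks(tree, len(leaves(tree)), ranks)  # root rank = number of taxa
        for pair, rank in ranks.items():
            totals[pair] = totals.get(pair, 0) + rank
    return {pair: 2.0 * total / len(gene_trees) for pair, total in totals.items()}


# Usage: two toy rooted gene trees that disagree about A, B and C.
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
distances = star_distances(gene_trees)
```

Because only node ranks are used, branch lengths in the input gene trees have no effect on the result, which is why the output is a cladogram.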
STELLS (Species Tree inference with Likelihood for Lineage Sorting)
starts with a population of possible species trees and searches
for the most likely species tree or trees using likelihood heuristics
and only the tree topologies estimated for individual genes. We
constrained our search of tree space by deﬁning the following
parameters in the command terminal: k = 1, Kx = 150, Kc = 15,
and Ka = 150. The –S and –g coarse-mode switches also were
needed to further constrain the search. Resulting branch lengths
are in standard coalescent units (g/2Ne).
STEM also is a likelihood-based method that uses gene tree
topologies to search for the maximum likelihood species tree. For
this analysis, we used the following parameter values:
beta = 0.0005 to increase cooling time, theta = 0.015 to condition
branch lengths, and 2000 iterations. We tried a number of different
theta values. The default theta value of 0.001 resulted in near-zero
internal branch lengths. When this value was increased closer to
0.010, internal branch lengths became longer. Increasing theta
beyond 0.015 only seemed to change the scale of all branches
and, while their proportions remained reasonable, the tree topol-
ogy did not change. Here, the tree with the highest likelihood is
reported given that STEM can output multiple trees of high likeli-
hood. The result is an ultrametric tree.
The most parsimonious species tree that implies the lowest
number of deep coalescences (Maddison, 1997) was obtained by
the MDC method (minimizing deep coalescence) using an algo-
rithm (Than and Nakhleh, 2009) implemented in the software
PhyloNet (Than et al., 2008). This approach implements a search
of species tree clades not found in the gene trees. However, our
search included only clusters inferred by our gene trees, which
greatly reduces computational demand. The resulting tree is a
2.3.2. Divergence time estimates
A relaxed clock phylogenetic model was implemented in BEAST
to infer a time-calibrated phylogeny for serrasalmids using all loci
concatenated. We used the UCLN clock model, as explained above,
with tree priors linked across partitions, and used IUPAC ambiguity
codes to collapse the variation from heterozygous alleles into a
consensus sequence. Partitioning schemes and substitution models
are given in Table 3. One individual per species and a birth–death
tree prior were used. Divergence times were estimated using two
fossil calibrations with hard lower bounds and 95% soft upper
bounds that followed exponential distributions. A starting chrono-
gram that satisﬁed all priors (e.g., monophyly and initial diver-
gence times) was generated under penalized likelihood in r8s
v1.71 (Sanderson, 2003) using the RAxML tree of the concatenated
dataset. The MCMC ran for 10^[?] generations and the first 20% of
trees were discarded as burn-in. Fossil calibrations were as follows:
(1) Serrasalmidae gen. et sp. indet. MRCA: Serrasalmus, Piaractus.
Hard lower bound: isolated pacu-like teeth similar to those
of extant Colossoma, Mylossoma, Piaractus or Myleus, originating
from the El Molino Formation (70–61 Ma) of Bolivia, constitute
the oldest serrasalmid fossils (Dahdul, 2007; Gayet
and Meunier, 1998). Absolute age estimate: 61 Ma; 95% soft
upper bound: 69 Ma, based on secondary calibration for the
mean divergence of Serrasalmidae + Hemiodontidae (see
Oliveira et al., 2011), estimated on the basis of a multi-locus
time-calibrated phylogeny of bony fishes (Betancur-R. et al.,
2013). Prior setting: exponential distribution, mean = 2.67.
Table 3. Partitioning schemes and substitution models used for ML and Bayesian (BEAST)

Partition name          Model for
4174-E20                GTR + G        1    1
25073-E1                GTR + G        2    2
35692-E1                GTR + G        3    3
36298-E1                HKY + I + G    4    4
25073-E2                GTR + I + G    5    5
14867-E1                HKY + I + G    6    6
55305-E1                HKY + G        7    7
GPD                     HKY + G        8    8
Dloop                   GTR + I + G    9    9
PTR codon position 1    HKY + G        10   10
PTR codon position 2    HKY + G        11   11
PTR codon position 3    HKY + G        12   12
SREB codon position 1   HKY + G        10   13
SREB codon position 2   HKY + G        11   14
SREB codon position 3   HKY + G        12   15
Table 4. Dataset attributes. Sequence length, variation and other characteristics are shown for each gene partition.

Parameter          4174   14867  25073 E1  25073 E2  35692  36298  55305  Gpd    SREB   PTR    Dloop  (2 Alleles-
                   975    542    1236      643       1179   1175   1176   229    939    711    1084   –
% Identical sites  30.2   43.2   16.0      49.3      31.6   20.0   78.7   52.4   91.4   88.6   26.0   –
                   86.9   88.8   78.6      91.7      86.3   62.0   97.7   94.3   99.0   98.1   94.3   –
                   102    106    106       106       106    106    106    102    106    106    44     1102
# Seq. from
                   7.5    24.5   24.5      24.5      7.5    24.5   7.5    7.5    0      0      0      11.9
                   18.9   30.2   43.4      41.5      15.1   37.7   22.6   22.6   13.2   17.0   0      12.6
                   96.2   100    100       100       100    100    100    96.2   100    100    96.2   99.1
(2) Carnivorous piranha clade. MRCA: Serrasalmus and Pygocentrus.
Hard lower bound: compressed and triangularly shaped
teeth with a cutting edge have been assigned to this
node. Fossil teeth collected from the Villavieja formation
(La Venta) in Colombia (16.3–15.5 Ma) are the oldest fossil
remains belonging to true, carnivorous piranhas (Lundberg,
1997). Absolute age estimate: 15.5 Ma; 95% soft upper
bound: 61 Ma, based on isolated serrasalmid teeth from
the Molino formation (Cione et al., 2009; Dahdul, 2007;
Gayet and Meunier, 1998). Prior setting: exponential distri-
bution, mean = 15.19.
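The prior means listed for the two calibrations follow from the stated bounds if the exponential distribution is offset at the hard lower bound and its 95% quantile is placed at the soft upper bound. For an exponential distribution with mean m, the 95% quantile lies at m × ln(20) above the offset, so m = (upper − lower) / ln(20). The sketch below reproduces this arithmetic; the function name is hypothetical, and the offset-exponential reading of the prior is an assumption that happens to match the reported values.

```python
import math


def exponential_prior_mean(hard_lower, soft_upper, upper_prob=0.95):
    """Mean of an exponential prior offset at `hard_lower` whose
    `upper_prob` quantile falls at `soft_upper` (ages in Ma)."""
    return (soft_upper - hard_lower) / -math.log(1.0 - upper_prob)


# Calibration 1: hard lower bound 61 Ma, 95% soft upper bound 69 Ma.
mean_calibration_1 = exponential_prior_mean(61.0, 69.0)
# Calibration 2: hard lower bound 15.5 Ma, 95% soft upper bound 61 Ma.
mean_calibration_2 = exponential_prior_mean(15.5, 61.0)
```

Rounded to two decimals, these give 2.67 and 15.19, the means listed under "Prior setting" for calibrations (1) and (2).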
3.1. Sequence data and alignments
We obtained a total of 1102 DNA sequences (including both
alleles for diploid genes) for 11 markers and 53 individuals. Gen-
Bank accession numbers for all newly obtained sequences are
listed in Table 1. Tandem repeats, described in Table S1, were excised
from the Dloop alignments. In total, 68 sequences were determined
from cloned PCR products, and 72 that had only one polymorphism
were phased manually. Four different introns
required ~25% of their sequences to be cloned, and in total we
cloned ~12% of all sequences used for the species tree analyses.
Asterisks in Table 1 denote phased alleles, and each phased allele
was assigned its own GenBank accession number. Six markers
had ~30% or fewer identical sites. The concatenated dataset resulted
in a 9889 bp alignment, had only 1% missing data, and represented
all 40 OTUs. The markers used in this study include intron loci that
are highly variable and result in strong phylogenetic signal (some
dataset attributes are shown in Table 4). Alignment ﬁles are avail-
able from the authors upon request.
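As a consistency check, the length of the concatenated alignment can be recovered from the per-locus values in Table 4. The sketch assumes that the first, unlabeled numeric row of Table 4 gives alignment lengths in bp and that its columns follow the order of the table header; both points are inferred from the table layout rather than stated in it.

```python
# Per-locus alignment lengths (bp), read from the first numeric row of Table 4.
# Assumption: that row reports alignment length and follows the header order.
locus_lengths_bp = {
    "4174": 975, "14867": 542, "25073 E1": 1236, "25073 E2": 643,
    "35692": 1179, "36298": 1175, "55305": 1176, "Gpd": 229,
    "SREB": 939, "PTR": 711, "Dloop": 1084,
}

concatenated_length_bp = sum(locus_lengths_bp.values())
```

The sum equals the 9889 bp reported for the concatenated dataset.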
3.2. Phylogenetic trees
3.2.1. Comparison of concatenation and species tree methods
Fig. 2 summarizes topologies for individual genes obtained with
BEAST. Gene trees for loci 55305E1 and 36298E1 are the only ones
that are topologically congruent with the hypothesis obtained by
Fig. 2. Individual gene trees obtained with BEAST. Major lineages are colored as follows: red = Pygocentrus; blue = Serrasalmus; cyan = Metynnis; green = Catoprion,
Pristobrycon, and Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes, Ossubtus; orange = Acnodon; purple = Mylossoma, Colossoma, and Piaractus. (For interpretation of the
references to color in this figure legend, the reader is referred to the web version of this article.)
concatenation under maximum likelihood (with respect to the
composition and relationships among major clades) as shown in
Fig. 3. Species trees inferred by *BEAST and STAR (Figs. 4 and 5)
resulted in the same topology with respect to major clades shown
in Fig. 3 (see Fig. 6 for numbered clades discussed in the text). The
BEST analysis did not approach stationarity after several million
generations, suggesting that it may not scale well to the size of
our dataset or that the software may contain an error (L. Liu, pers.
comm.). Other species trees generated by MDC, STELLS, and STEM
(shown in Figs. 4 and 5) have incongruent topologies with respect to
these clades (summarized in Table 5). In these cases, clade 6, the
“Myleus clade,” is not monophyletic. STEM (Fig. 4) produced a tree
topology most discordant with all other results: clades 3, 6, and 7
were not monophyletic, while no other species tree analysis had
more than one difference between its result and the concatenation
tree. Interestingly, clades 4, 5, and 8 are the only major clades
(besides the constrained Piaractus clade) found in every tree.
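The comparison among multilocus methods can be tabulated directly from the clade presence/absence rows of Table 5. The sketch below encodes those rows as given in the table (clades 1 to 11, 1 = present, 0 = absent) and lists, for each method, the clades that differ from the concatenated RAxML tree. The variable and function names are hypothetical.

```python
# Presence (1) or absence (0) of clades 1-11, copied from Table 5.
clade_presence = {
    "RAxML (concatenation)": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "BEAST":                 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "STAR":                  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "STEM":                  [1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1],
    "STELLS":                [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    "MDC":                   [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
}


def differing_clades(method, reference="RAxML (concatenation)"):
    """Clade numbers whose presence differs between `method` and `reference`."""
    pairs = zip(clade_presence[method], clade_presence[reference])
    return [number for number, (a, b) in enumerate(pairs, start=1) if a != b]


differences = {method: differing_clades(method) for method in clade_presence}
```

Under this encoding, STEM differs at clades 3, 6 and 7, STELLS and MDC differ only at clade 6, and STAR shows no differences.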
3.2.2. Divergence time estimates
Fig. 6 depicts the time-calibrated phylogeny inferred by BEAST.
Major clades in this tree are congruent with the maximum likeli-
hood tree (Fig. 3). According to this result, serrasalmids began to
diversify in the Late Cretaceous (65–75 Ma), but carnivorous pira-
nhas did not appear until the Miocene (15–20 Ma). The most
recent cladogenetic events involve the genus Serrasalmus, which
began to radiate in the mid to late Miocene (9–13 Ma).
Most tree topologies obtained in this study are highly congruent
with previously published molecular and morphological phyloge-
nies (Cione et al., 2009; Ortí et al., 1996, 2008) shown in Fig. 1.
Here, we will focus on the concatenated RAxML and BEAST trees and the
*BEAST tree shown in Figs. 3, 4 and 6 when discussing serrasalmid
interrelationships because they are time calibrated, contain high
posterior probability values, and are based on different inference
methods (concatenation and species tree approaches).
Machado-Allison’s hypothesis (Machado-Allison, 1982, 1983, 1985;
Machado-Allison et al., 1989) (Fig. 1A) also shows similar relationships,
but with different rooting. Extensive evidence from other
morphological and molecular studies, in contrast to this hypothesis,
places the root at the base of the Piaractus clade (Calcagnotto
et al., 2005; Cione et al., 2009; Oliveira et al., 2011; Ortí et al.,
An interesting result is the placement of Acnodon. This taxon
has been assigned to different clades in previous studies (Fig. 1),
but we show strong support (100% posterior probability in the
BEAST trees) suggesting this genus is sister to the her-
bivorous Myleus clade. The genus Acnodon typically contains her-
bivorous ﬁshes (Correa et al., 2007) but lepidophagy has been
reported for Acnodon normani (Leite and Jégu, 1990). Lepidophagy
among serrasalmids also is found in Catoprion (in the piranha
clade), implying that this trait may have evolved independently
in clades 5 and 7 (Fig. 6).
The new data also corroborate the non-monophyly of several
genera and species. For instance, Pristobrycon calmoni is found
nested in the Serrasalmus clade in every single concatenation-
based tree and species tree. Pristobrycon striolatus, however, is
always grouped within clade 8 (Fig. 5). Previous studies (Ortí
et al., 1996) also obtained the non-monophyly of this genus, with
P. striolatus being part of a clade containing Catoprion. The non-
monophyly of Pristobrycon also was suggested by Machado-
Fig. 3. Maximum likelihood tree inferred by RAxML using the concatenated data matrix for 11 genes. Major lineages are colored as follows: red = Pygocentrus;
blue = Serrasalmus; cyan = Metynnis; green = Catoprion, Pristobrycon, and Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes, Ossubtus; orange = Acnodon; purple = Mylossoma,
Colossoma, and Piaractus. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Allison et al. (1989) based on the pre-anal spine, which is present in
Serrasalmus, Pygocentrus and the other Pristobrycon species but
absent in P. striolatus. We also do not recover the monophyly of
Serrasalmus rhombeus, a nominal species with a wide-ranging distribution
(e.g., Amazon, Orinoco, and the Guianas). Because this species
was polyphyletic in the BEAST tree we did not constrain the mono-
Fig. 4. Comparison of topologies obtained with two species tree methods, *BEAST and STEM. Analyses were based on DNA sequence data from 11 genes. Major lineages are
colored as follows: red = Pygocentrus; blue = Serrasalmus; cyan = Metynnis; green = Catoprion, Pristobrycon, and Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes,
Ossubtus; orange = Acnodon; purple = Mylossoma, Colossoma, and Piaractus. (For interpretation of the references to color in this figure legend, the reader is referred to the web
version of this article.)
phyly in the species tree analyses to test whether ILS could explain
the pattern and render the species monophyletic. However, the
lack of monophyly was consistently found in all gene trees, species
trees, and concatenation analyses, suggesting that ILS may not
explain this pattern. We found the same situation for the genera
Tometes, Myloplus, and Mylesinus. Given that no clear pattern of
incomplete lineage sorting emerges for these clades, it is likely that
lack of species-level monophyly reﬂects cryptic species and genus-
level taxonomic diversity within these widely distributed groups.
These results are not necessarily surprising considering that spe-
cies such as Serrasalmus rhombeus and Mylesinus schomburgkii are
widely distributed throughout the range of the Serrasalmidae.
Much taxonomic work, in conjunction with phylogeographic studies
at finer geographical scales, is needed before the evolutionary
relationships within these taxa can be resolved.
Phylogenies based on mtDNA (Freeman et al., 2007; Hubert
et al., 2007; Ortí et al., 1996, 2008) generally resolved a clade of
Pygocentrus nested within Serrasalmus or as its sister group, but
with relatively weak support. In contrast, the nuclear gene evi-
dence compiled here supports reciprocal monophyly of the two
clades of ﬂesh-eating piranhas with high support in all species
and concatenation trees. Lack of resolution (due to low support
or single-locus trees) or perhaps introgression in the mitochondrial
genome may explain the discrepancy among the different datasets.
4.1. Comparison of concatenation and species tree methods
Until recently, it has been common practice to use concatena-
tion of loci to obtain phylogenies without explicitly attempting
to account for biological processes that may cause gene tree discor-
dances. The implicit assumption of concatenation is that additive
phylogenetic signal will prevail over noise caused by random or
systematic error that arises as a result of mutational (homoplasy)
or coalescent (or hemiplasy; Avise and Robinson, 2008) variance
(Huang et al., 2010). Fig. 2 illustrates that gene trees obtained with
11 different genes are quite discordant from each other, perhaps
due to ILS, and justiﬁes the use of species tree methods. Another
factor that may explain gene tree discordance that is frequently
overlooked is sampling error given weak genealogical signal often
contained in individual gene fragments (Bayzid and Warnow,
2013; Betancur-R. et al., 2014; Rasmussen and Kellis, 2007). None-
theless, the question is whether concatenation of all genes will
result in the same topology obtained with species tree methods.
BEST and *BEAST infer a species tree using sequence data
directly in a Bayesian framework. BEST did not converge and
*BEAST needed at least 2.0 × 10^[?] generations to converge across
all parameters. While this is computationally intensive, the tree
is estimated directly from the sequence data and priors can be used
to inform the analysis. In our case, the tree obtained with *BEAST
(Fig. 4) matched the topologies obtained with concatenated data
using BEAST (Fig. 6) and RAxML (Fig. 3). This implies that, in the
case of serrasalmids, despite the incongruence among gene trees,
the effects of possible ILS are not pervasive enough to impede
accounting for this source of error by concatenating the signal from
STEM and STELLS both estimate the most likely species tree by
searching for that in which gene tree topologies are most likely.
When running STELLS, we may have constrained the analysis to a
very greedy heuristic search (see parameters used in methods).
When these parameter values were increased, the software
became non-responsive. Consequently, our most likely tree may
be a result of the analysis terminating after reaching a local opti-
Fig. 5. Phylogenies obtained by other species tree methods based on 11 genes. (A) MDC, (B) STAR, and (C) STELLS. Major lineages are colored as follows: red = Pygocentrus;
blue = Serrasalmus; cyan = Metynnis; green = Catoprion, Pristobrycon, and Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes, Ossubtus; orange = Acnodon; purple = Mylossoma,
Colossoma, and Piaractus. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
mum, thus providing a different topology than those obtained
using sequence-based methods. Nonetheless, the tree obtained
with STELLS (Fig. 5) is congruent with those generated by *BEAST
and with concatenation except that the Myleus clade is not monophyletic.
STEM, however, gave a result (Fig. 4) highly discordant
from both the *BEAST and BEAST trees and all previous hypotheses.
In this tree, Metynnis is sister to the Myleus clade and Acnodon is a
paraphyletic genus nested within the Myleus clade. In contrast to
our ﬁndings, a recent study comparing methods (Wu, 2011) using
simulated datasets with both high and low numbers of genes
found STEM to be more accurate than STELLS, MDC, and STAR.
However, another recent study (Leache and Rannala, 2011) also
found that STEM performs poorly when compared to BEST, BUCKy
and two different concatenation approaches. These authors
hypothesize that poor performance may be due to errors in esti-
mating gene trees that STEM uses as input to infer the species tree,
Fig. 6. Time-calibrated phylogeny obtained with the concatenated dataset for 11 genes analyzed using BEAST. Fossil calibrations constrain the ages of nodes indicated by F1
(oldest pacu-like teeth) and F2 (oldest piranha-like teeth). Black circles indicate nodes with posterior probability PP > 99% and gray circles nodes with 90% < PP < 99%.
Horizontal bars are 95% highest posterior density (HPD) intervals for the inferred age of the node. Numbered clades are
discussed in the text and in Table 5. Major lineages are colored as follows: red = Pygocentrus; blue = Serrasalmus; cyan = Metynnis; green = Catoprion, Pristobrycon, and
Pygopristis; teal = Myloplus, Myleus, Mylesinus, Tometes, Ossubtus; orange = Acnodon; purple = Mylossoma, Colossoma, and Piaractus. Illustrations by Machado-Allison
(Machado-Allison and Fink, 1995). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Table 5. Topological congruence among gene trees and species trees. Support for the monophyly of selected clades (numbered 1–11 on the top row as shown in Fig. 4), and color-coded in
all tree figures. Support is indicated by the gene support frequency (GSF: how many gene trees out of the 11 genes analyzed contain the clade) and by its presence in each of the
trees obtained from multilocus analyses (concatenation or species tree methods). 1 = clade present; 0 = clade absent.

Clade number            1      2      3     4     5      6     7      8      9      10    11
GSF                     11/11  11/11  7/11  9/11  10/11  5/11  10/11  11/11  11/11  4/11  9/11
RAxML (concatenation)   1      1      1     1     1      1     1      1      1      1     1
BEAST                   1      1      1     1     1      1     1      1      1      1     1
STAR                    1      1      1     1     1      1     1      1      1      1     1
STEM                    1      1      0     1     1      0     0      1      1      1     1
STELLS                  1      1      1     1     1      0     1      1      1      1     1
MDC                     1      1      1     1     1      0     1      1      1      1     1
an observation also supported in another empirical study
(Betancur-R. et al., 2014).
STAR generated a cladogram (Fig. 5) that agrees with the highly
supported BEAST and *BEAST trees. Furthermore, this program is
very fast to run, perhaps providing the best trade-off between
accuracy and running time among the software that we tested.
The MDC approach also ran extremely fast, but resulted in a tree
where the Myleus clade was polyphyletic (Fig. 5). This same result
is observed with the STELLS tree. While some methods such as
STAR or MDC may be preferred for their seemingly accurate infer-
ence of topology and fast calculation, these methods do not inte-
grate over gene tree uncertainty. The gene trees themselves are
assumed to depict the true genealogical history of genes, an
assumption that may be unrealistic due to issues of systematic or
sampling error inherent to gene tree estimation (Betancur-R.
et al., 2014; Huang et al., 2010). The species trees obtained by these
methods also have no support values for clades. Brumﬁeld et al.
(2008) showed that MDC and BEST perform similarly on their data-
set, but BEST may be preferred because it provides posterior prob-
ability estimates of node support.
Linnen and Farrell (2008) inferred a genus level phylogeny of
sawﬂies and found that some clades are robust to choice of species
tree method and taxon sampling. Additionally, they found that
BEST priors and species tree methods both impacted results. Here
we ﬁnd similar trends as species tree topology varies with the
method employed (Table 5). Incomplete lineage sorting did not
disrupt resolution of major clades of serrasalmid ﬁshes highlighted
in the ﬁgures. Given the short internodes obtained among species
of Serrasalmus (Figs. 3 and 6), it is possible, however, that ILS may
have played a signiﬁcant role at shallower scales within this genus.
Broader taxonomic sampling is required to test whether ILS may
affect resolution of shallow nodes within Serrasalmus and other
clades of serrasalmids with broad distributions. Thorough analyses
within Serrasalmus and other serrasalmid genera may also help
reveal cryptic species diversity.
4.2. Divergence time estimates
Previous attempts to calibrate a molecular phylogeny for serrasalmids
(Hubert et al., 2007) were based on an inferred rate of substitution
for the mtDNA control region of 0.58% per million years.
This rate was calculated using mtDNA divergence values observed
among species pairs of Serrasalmus and Pygocentrus in the Orinoco
and the Amazon basins and dividing this value by the estimated
date of basin separation (8 Ma). According to this method, the split
between Serrasalmus and Pygocentrus occurred 8.7 Ma (Hubert
et al., 2007), about half the estimated age of this clade (~17 Ma)
obtained with our BEAST analysis (Fig. 6). The discrepancy may
be explained because the previous result was based on a single,
fixed calibration point of uncertain value, since geological separation
of these basins is not complete (they are currently still connected
via the Casiquiare River; Willis et al., 2010). The problem
can be compounded by the use of a single molecular marker unable
to resolve major nodes in the phylogeny with conﬁdence. Such
strong biases may be prevented by using several prior distributions
informed by fossil ages, speciﬁed under soft bounds and probabi-
listic models with multiple phylogenetic markers. The approach
used in our study (albeit using only two fossil calibrations) circum-
vents these problems and provides a more robust hypothesis to
assess the temporal axis of diversiﬁcation of Serrasalmids. Our
results coincide with other studies that place most of the higher-
level diversiﬁcation of Neotropical freshwater ﬁshes in the late
Cretaceous and Paleogene (Hrbek et al., 2007; Lopez-Fernandez
et al., 2013; Sullivan et al., 2013) and lend further credence to
the idea that lineages of ﬁshes that dominate modern Neotropical
freshwater environments have a prolonged evolutionary history
that dates back to the Cretaceous (Albert and Reis, 2011; Lopez-
Fernandez et al., 2013; Lundberg et al., 1998). Our estimates, com-
bined with an early fossil record showing unambiguous serrasal-
mid attributes (Gayet and Meunier, 1998), also suggest that the
main ecological and morphological attributes of modern serrasal-
mids evolved early in the history of the family and may have con-
tributed to structuring Neotropical ﬁsh assemblages for an
extended period of time.
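The fixed-rate calculation attributed to Hubert et al. (2007) at the start of this section amounts to two divisions: a rate obtained as pairwise divergence over the assumed separation date, and an age obtained as pairwise divergence over that rate. The sketch below illustrates the arithmetic only. The function names are hypothetical, and the divergence percentages are back-calculated from the rate and dates quoted in the text rather than taken from the original data.

```python
def rate_from_vicariance(pairwise_divergence_pct, separation_ma):
    """Substitution rate (% per My) from divergence across a dated barrier."""
    return pairwise_divergence_pct / separation_ma


def age_from_fixed_rate(pairwise_divergence_pct, rate_pct_per_my):
    """Divergence time (Ma) under a single fixed rate."""
    return pairwise_divergence_pct / rate_pct_per_my


RATE_PCT_PER_MY = 0.58   # rate quoted in the text
BASIN_SEPARATION_MA = 8.0

# Illustrative divergence implied by the quoted rate and the 8 Ma separation.
implied_basin_divergence_pct = RATE_PCT_PER_MY * BASIN_SEPARATION_MA
# Illustrative divergence implied by the quoted 8.7 Ma split.
implied_split_divergence_pct = RATE_PCT_PER_MY * 8.7

fixed_rate_age_ma = age_from_fixed_rate(implied_split_divergence_pct, RATE_PCT_PER_MY)
ratio_to_relaxed_clock = fixed_rate_age_ma / 17.0   # ~17 Ma from the BEAST analysis
```

Because the age scales linearly with the assumed separation date, any error in the 8 Ma calibration propagates directly to the inferred split, which is one reason the fixed-rate estimate can differ so much from the fossil-calibrated one.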
The new evidence collected for this study collectively produced
a robust backbone phylogeny for the family Serrasalmidae that
agrees with most previous hypotheses based on morphology and
mtDNA. The data also confirm that a great deal of taxonomic work
is needed in this group; i.e., the genera Myloplus, Mylesinus,
Tometes, and Pristobrycon, and the species Serrasalmus rhombeus, are not
monophyletic. More phylogeographic studies with higher sampling and
collection of morphological vouchers from a variety of ontogenetic
stages will help reveal species boundaries in Serrasalmidae and
test biogeographic hypotheses. Our time-calibrated hypothesis
indicates that serrasalmids began to diversify in the Late Creta-
ceous (65–75 Ma), while the infamous carnivorous piranha clade
originated much later, during the Miocene (15–20 Ma).
Different species tree methods produced different hypotheses,
but some are preferred due to their speed (STAR), accuracy (STAR,
*BEAST), and inference of support values (*BEAST). Methods that
use gene tree topologies as inputs to estimate the species tree
assume their underlying gene trees are accurate, and violations
of this assumption can lead to the inference of an incorrect
species tree topology. Concatenation methods can still be a good
and efﬁcient alternative for species tree inference, although the
effects of ILS biasing these results should be explored on a case-
by-case basis. Here we obtained a dataset that seems informative
and robust to ILS, but we argue that testing multiple methods is
a necessary requirement to reach a trustworthy phylogenetic
We thank Mark Sabaj, Biff Bermingham, Jonathan Armbruster,
and Nathan Lujan for tissues. We also thank Stuart Willis for help
with DNA cloning and Bryan Carstens, Liang Liu, and Yufeng Wu for
help with the use of their species tree programs. Antonio Machado-
Allison authorized use of his scientiﬁc drawings in Fig. 6 and the
graphical abstract. Fieldwork associated with this project
was partially funded by grants from National Geographic, Royal
Ontario Museum Governors, and an NSERC Discovery Grant to
HLF. Other funding came from GWU (startup funds to GO). We
are grateful to the Guyana Environmental Protection Agency and
Brazilian IBAMA for granting specimen collection and export permits.
Fieldwork partially associated with this project was greatly facili-
tated by help from Calvin Bernard and Elford Liverpool (University
of Guyana, Georgetown, Guyana), Izeni Farias (Universidade Fed-
eral do Amazonas, Manaus, Brazil), Lucia Rapp Py-Daniel and Jan-
sen Zuanon (Instituto Nacional de Pesquisas da Amazonia,
Manaus, Brazil). Lastly, we thank William Fink and one anonymous
reviewer for their comments that helped improve this manuscript.
Appendix A. Supplementary material
Supplementary data associated with this article can be found, in
the online version, at http://dx.doi.org/10.1016/j.ympev.2014.
Albert, J.S., Reis, R.E., 2011. Historical Biogeography of Neotropical Freshwater
Fishes. Univ. of California Press.
Anderson, J.T., Saldaña Rojas, J., Flecker, A.S., 2009. High-quality seed dispersal by
fruit-eating ﬁshes in Amazonian ﬂoodplain habitat. Oecologia 161, 279–290.
Anderson, J.T., Nuttle, T., Saldaña Rojas, J., Pendergast, T.H., Flecker, A.S., 2011.
Extremely long distance seed dispersal by an overﬁshed Amazonian frugivore.
Proc. R. Soc. B: Biol. Sci. 278, 3329–3335.
Araujo-Lima, C., Goulding, M., 1997. So Fruitful a Fish: Ecology, Conservation, and
Aquaculture of the Amazon’s Tambaqui. Columbia University Press, New York.
Avise, J.C., Robinson, T.J., 2008. Hemiplasy: a new term in the lexicon of
phylogenetics. Syst. Biol. 57, 503–507.
Bayzid, M.S., Warnow, T., 2013. Naive binning improves phylogenomic analyses.
Bioinformatics 29, 2277–2284.
Betancur-R., R., Broughton, R.E., Wiley, E.O., Carpenter, K., López, J.A., Li, C., Holcroft,
N.I., Arcila, D., Sanciangco, M., Cureton II, J.C., Zhang, F., Buser, T., Campbell, M.A.,
Ballesteros, J.A., Roa-Varon, A., Willis, S., Borden, W.C., Rowley, T., Reneau, P.C.,
Hough, D.J., Lu, G., Grande, T., Arratia, G., Ortí, G., 2013. The tree of life and a new
classiﬁcation of bony ﬁshes. PLoS Curr., 1–41.
Betancur-R., R., Naylor, G., Ortí, G., 2014. Conserved genes, sampling error, and
phylogenomic inference. Syst. Biol. 63, 257–262.
Brumﬁeld, R.T., Liu, L., Lum, D.E., Edwards, S.V., 2008. Comparison of species tree
methods for reconstructing the phylogeny of bearded manakins (Aves: Pipridae,
Manacus) from multilocus sequence data. Syst. Biol. 57, 719–731.
Calcagnotto, D., Schaefer, S.A., DeSalle, R., 2005. Relationships among characiform
ﬁshes inferred from analysis of nuclear and mitochondrial gene sequences. Mol.
Phylogenet. Evol. 36, 135–153.
Camargo, A., Avila, L.J., Morando, M., Sites Jr., J.W., 2012. Accuracy and precision of
species trees: effects of locus, individual, and base pair sampling on inference of
species trees in lizards of the Liolaemus darwinii group (Squamata,
Liolaemidae). Syst. Biol. 61, 272–288.
Carstens, B.C., Knowles, L.L., 2007. Estimating species phylogeny from gene-tree
probabilities despite incomplete lineage sorting: an example from Melanoplus
grasshoppers. Syst. Biol. 56, 400–411.
Chung, Y., Ane, C., 2011. Comparing two Bayesian methods for gene tree/species
tree reconstruction: simulations with incomplete lineage sorting and horizontal
gene transfer. Syst. Biol. 60, 261–275.
Cione, A.L., Dahdul, W.M., Lundberg, J.G., Machado-Allison, A., 2009. Megapiranha
paranensis, a new genus and species of Serrasalmidae (Characiformes, Teleostei)
from the upper Miocene of Argentina. J. Vert. Paleon. 29, 350–358.
Correa, S.B., Winemiller, K.O., Lopez-Fernandez, H., Galetti, M., 2007. Evolutionary
perspectives on seed consumption and dispersal by ﬁshes. Bioscience 57, 748–
Dahdul, W.M., 2007. Phylogenetics and Diversiﬁcation of the Neotropical
Serrasalminae (Ostariophysi: Characiformes). Doctoral Dissertation, University
of Pennsylvania, pp. 1–115.
Degnan, J.H., Rosenberg, N.A., 2006. Discordance of species trees with their most
likely gene trees. PLoS Genet. 2, e68.
Degnan, J.H., Rosenberg, N.A., 2009. Gene tree discordance, phylogenetic inference
and the multispecies coalescent. Trends Ecol. Evol. 24, 332–340.
Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by
sampling trees. BMC Evol. Biol. 7.
Drummond, A.J., Ho, S.Y.W., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics
and dating with conﬁdence. PLoS Biol. 4, 1–12.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and
high throughput. Nucleic Acids Res. 32, 1792–1797.
Edwards, S.V., 2008. Is a new and general theory of molecular systematics
emerging? Evolution 63, 1–19.
Edwards, S.V., Liu, L., Pearl, D.K., 2007. High-resolution species trees without
concatenation. Proc. Natl. Acad. Sci. USA 104, 5936–5941.
Eigenmann, C.H., 1915. The Serrasalminae and Mylinae. Ann. Carnegie Mus. 9, 226–
Eschmeyer, W.N., Fong, J.D., 2013. Species by Family/Subfamily. <http://
Faircloth, B.C., McCormack, J.E., Crawford, N.G., Harvey, M.G., Brumﬁeld, R.T., Glenn,
T.C., 2012. Ultraconserved elements anchor thousands of genetic markers
spanning multiple evolutionary timescales. Syst. Biol. 61, 717–726.
Faircloth, B.C., Sorenson, L., Santini, F., Alfaro, M.E., 2013. A phylogenomic
perspective on the radiation of ray-ﬁnned ﬁshes based upon targeted
sequencing of ultraconserved elements (UCEs). PLoS ONE 8, e65923.
Freeman, B., Nico, L.G., Osentoski, M., Jelks, H.L., Collins, T.M., 2007. Molecular
systematics of Serrasalmidae: deciphering the identities of piranha species and
unraveling their evolutionary histories. Zootaxa 1484, 1–38.
Gayet, M., Meunier, F.J., 1998. Maastrichtian to early late Paleocene freshwater
Osteichthyes of Bolivia: additions and comments. In: Malabarba, L.R., Reis, R.E.,
Vari, R.P., Lucena, Z.M., Lucena, C.A. (Eds.), Phylogeny and Classiﬁcation of
Neotropical Fishes. Edipucrs, Porto Alegre.
Geneious. Version 5.6 Created by Biomatters. <http://www.geneious.com>.
Géry, J., 1977. Characoids of the World. T.F.H. Publications Inc., Neptune City, New Jersey.
Géry, J., 1984. The ﬁshes of Amazonia. In: Sioli, H. (Ed.), The Amazon, Limnology and
Landscape Ecology of a Mighty Tropical River and its Basin. Junk Publishers,
Dordrecht, pp. 343–370.
Gosline, W., 1951. Notes on the characoid ﬁshes of the subfamily Serrasalminae.
Proc. Cal. Acad. Sci. 4, 17–64.
Goulding, M., 1980. The Fishes and the Forest: Explorations in Amazonian Natural
History. University of California Press, Berkeley, California.
Grubich, J.R., Huskey, S., Crofts, S., Ortí, G., Porto, J., 2012. Mega-bites: extreme jaw
forces of living and extinct piranhas (Serrasalmidae). Sci. Rep. 2, 1009.
Hassan, M., Lemaire, C., Fauvelot, C., Bonhomme, F., 2002. Seventeen new exon-
primed intron-crossing polymerase chain reaction ampliﬁable introns in ﬁsh.
Mol. Ecol. Notes 2, 334–340.
Heled, J., Drummond, A.J., 2010. Bayesian inference of species trees from multilocus
data. Mol. Biol. Evol. 27, 570–580.
Hobolth, A., Dutheil, J.Y., Hawks, J., Schierup, M.H., Mailund, T., 2011. Incomplete
lineage sorting patterns among human, chimpanzee, and orangutan suggest
recent orangutan speciation and widespread selection. Genome Res. 21, 349–
Hollingsworth Jr., P.R., Hulsey, D.C., 2011. Reconciling gene trees of eastern North
American minnows. Mol. Phylogenet. Evol. 61, 149–156.
Hrbek, T., Seckinger, J., Meyer, A., 2007. A phylogenetic and biogeographic
perspective on the evolution of poeciliid ﬁshes. Mol. Phylogenet. Evol. 43,
Huang, H., Knowles, L.L., 2009. What is the danger of the anomaly zone for empirical
phylogenetics? Syst. Biol. 58, 527–536.
Huang, H., He, Q., Kubatko, L.S., Knowles, L.L., 2010. Sources of error inherent in
species-tree estimation: impact of mutational and coalescent effects on
accuracy and implications for choosing among different methods. Syst. Biol.
Hubert, N., Duponchelle, F., Nunez, J., Garcia-Davila, C., Paugy, D., Renno, J.F., 2007.
Phylogeography of the piranha genera Serrasalmus and Pygocentrus:
implications for the diversiﬁcation of the Neotropical ichthyofauna. Mol. Ecol.
Javonillo, R., Malabarba, L.R., Weitzman, S.H., Burns, J.R., 2010. Relationships among
major lineages of characid ﬁshes (Teleostei: Ostariophysi: Characiformes),
based on molecular sequence data. Mol. Phylogenet. Evol. 54, 498–511.
Jégu, M., 2003. Subfamily Serrasalminae (pacus and piranhas). In: Reis, R.E.,
Kullander, S.O., Ferraris, C.J., Jr. (Eds.), Checklist of the Freshwater Fishes of
South and Central America. Edipucrs.
Jégu, M., Hubert, N., Belmont-Jegu, E., 2004. Réhabilitation de Myloplus asterias
(Müller & Troschel, 1844), espèce-type de Myloplus Gill, 1896 et validation du
genre Myloplus Gill (Characidae: Serrasalminae). Cybium 28, 119–157.
Jobb, G., Von Haeseler, A., Strimmer, K., 2004. TREEFINDER: a powerful graphical
analysis environment for molecular phylogenetics. BMC Evol. Biol. 4.
Katoh, K., Kuma, K.-I., Toh, H., Miyata, T., 2005. MAFFT version 5: improvement in
accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518.
Knowles, L.L., 2009. Estimating species trees: methods of phylogenetic analysis
when there is incongruence across genes. Syst. Biol. 58, 463–467.
Knowles, L.L., Carstens, B.C., 2007. Delimiting species without monophyletic gene
trees. Syst. Biol. 56, 887–895.
Kubatko, L.S., Carstens, B.C., Knowles, L.L., 2009. STEM: species tree estimation using
maximum likelihood for gene trees under coalescence. Bioinformatics 25, 971–
Larget, B.R., Kotha, S.K., Dewey, C.N., Ane, C., 2010. BUCKy: gene tree/species tree
reconciliation with Bayesian concordance analysis. Bioinformatics 26, 2910–
Leache, A.D., Rannala, B., 2011. The accuracy of species tree estimation under
simulation: a comparison of methods. Syst. Biol. 60, 126–137.
Leite, R.G., Jégu, M., 1990. Food habits of two species of Acnodon (Characiformes,
Serrasalmidae) and scale-eating habits of Acnodon normani. Cybium 14, 353–
Li, C., Ortí, G., Zhang, G., Lu, G., 2007. A practical approach to phylogenomics: the
phylogeny of ray-ﬁnned ﬁsh (Actinopterygii) as a case study. BMC Evol. Biol. 7,
Li, C., Lu, G., Ortí, G., 2008. Optimal data partitioning and a test case for ray-ﬁnned
ﬁshes (Actinopterygii) based on ten nuclear loci. Syst. Biol. 57, 519–539.
Li, C., Riethoven, J.J., Ma, L., 2010. Exon-primed intron-crossing (EPIC) markers for
non-model teleost ﬁshes. BMC Evol. Biol. 10, 90.
Liang, L., Pearl, D.K., 2007. Species trees from gene trees: reconstructing Bayesian
posterior distributions of a species phylogeny using estimated gene tree
distributions. Syst. Biol. 56, 504–514.
Linnen, C.R., Farrell, B.D., 2008. Comparison of methods for species-tree inference in
the sawﬂy genus Neodiprion (Hymenoptera: Diprionidae). Syst. Biol. 57, 876–
Liu, L., 2008. BEST: Bayesian estimation of species trees under the coalescent model.
Bioinformatics 24, 2542–2543.
Liu, K., Raghavan, S., Nelesen, S., Linder, C.R., Warnow, T., 2009a. Rapid and accurate
large scale coestimation of sequence alignments and phylogenetic trees. Science
Liu, L., Yu, L., Pearl, D.K., Edwards, S.V., 2009b. Estimating species phylogenies using
coalescence times among sequences. Syst. Biol. 58, 468–477.
Lopez-Fernandez, H., Arbour, J.H., Winemiller, K.O., Honeycutt, R.L., 2013. Testing
for ancient adaptive radiations in neotropical cichlid ﬁshes. Evolution 67, 1321–
Lowe-McConnell, R.H., 1975. Fish Communities in Tropical Freshwaters: Their
Distribution, Ecology and Evolution. Longman, London.
Lundberg, J.G., 1997. Fishes of the Miocene La Venta Fauna: additional taxa and
their biotic and paleoenvironmental implications. In: Kay, R.F., Madden, R.H.,
Cifelli, R.L., Flynn, J.J. (Eds.), Vertebrate Paleontology in the Neotropics: The
Miocene Fauna of La Venta, Colombia. Smithsonian Institution Press,
Washington, DC, pp. 67–91.
Lundberg, J.G., Marshall, L.G., Guerrero, J., Horton, B., Malabarba, L., Wesselingh, F.T.,
1998. The stage for neotropical ﬁsh diversiﬁcation: a history of tropical South
American rivers. Phylog. Classif. Neotrop. Fish., 13–38.
Machado-Allison, A., 1982. Studies on the Systematics of the Subfamily
Serrasalminae (Pisces-Characidae). Biological Sciences. The George
Washington University, Washington, DC, p. 267.
Machado-Allison, A., 1983. Estudios sobre la sistemática de la subfamilia
Serrasalminae (Teleostei, Characidae). Parte II. Discusión sobre la condición
monoﬁlética de la subfamilia. Acta Biol. Venez 11, 145–195.
Machado-Allison, A., 1985. Estudios sobre la Subfamilia Serrasalminae. Parte III:
Sobre el estatus generico y relaciones ﬁlogeneticas los generos Pygopristis,
Pygocentrus, Pristobrycon, y Serrasalmus (Teleostei-Characidae-Serrasalmidae).
Acta Biol. Venez 12, 19–42.
Machado-Allison, A., Fink, W.L., 1995. Sinopsis de las Especies de la Subfamilia
Serrasalminae Presentes en la Cuenca del Orinoco. Universidad Central de
Venezuela, Caracas, Venezuela.
Machado-Allison, A., Fink, W.L., Antonio, M.E., 1989. Revisión del género
Serrasalmus Lacepede, 1803 y géneros relacionados en Venezuela: I. Notas
sobre la morfología y sistemática de Pristobrycon striolatus (Steindachner,
1908). Acta Biol. Venez 12, 140–171.
Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523–536.
Maddison, W.P., Knowles, L.L., 2006. Inferring phylogeny despite incomplete lineage
sorting. Syst. Biol. 55, 21–30.
McCormack, J.E., Faircloth, B.C., Crawford, N.G., Gowaty, P.A., Brumﬁeld, R.T., Glenn,
T.C., 2012. Ultraconserved elements are novel phylogenomic markers that
resolve placental mammal phylogeny when combined with species tree
analysis. Genome Res. 22, 746–754.
Nico, L., Taphorn, D.C., 1988. Food habits of piranhas in the low llanos of Venezuela.
Biotropica 20, 311–321.
Norman, J.R., 1929. The South American characid ﬁshes of the subfamily
Serrasalmoninae with a revision of the genus Serrasalmus Lacepede. Proc. Zool.
Soc. Lond. 52, 661–1044.
Oliveira, C., Avelino, G.S., Abe, K.T., Mariguela, T.C., Benine, R.C., Ortí, G., Vari, R.P.,
Correa e Castro, R.M., 2011. Phylogenetic relationships within the speciose
family Characidae (Teleostei: Ostariophysi: Characiformes) based on multilocus
analysis and extensive ingroup sampling. BMC Evol. Biol. 11, 275.
Oliver, J.C., 2013. Microevolutionary processes generate phylogenomic discordance
at ancient divergences. Evolution 67, 1823–1830.
Ortí, G., Petry, P., Porto, J.I.R., Jégu, M., Meyer, A., 1996. Patterns of nucleotide change
in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol.
Evol. 42, 169–182.
Ortí, G., Sivasundar, A., Dietz, K., Jégu, M., 2008. Phylogeny of the Serrasalmidae
(Characiformes) based on mitochondrial DNA sequences. Genet. Mol. Biol. 31,
Posada, D., 2008. JModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25,
Rambaut, A., Drummond, A.J., 2007. Tracer v1.4. <http://beast.bio.ed.ac.uk/Tracer>.
Rasmussen, M.D., Kellis, M., 2007. Accurate gene-tree reconstruction by learning
gene- and species-speciﬁc substitution rates across multiple complete
genomes. Genome Res. 17, 1932–1942.
Sanders, K.L., Lee, M.S., 2007. Evaluating molecular clock calibrations using Bayesian
analyses with soft and hard bounds. Biol. Lett. 3, 275–279.
Sanderson, M.J., 2003. r8s: inferring absolute rates of molecular evolution and
divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302.
Song, S., Liu, L., Edwards, S.V., Wu, S., 2012. Resolving conﬂict in eutherian mammal
phylogeny using phylogenomics and the multispecies coalescent model. PNAS
Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–
Sullivan, J.P., Muriel-Cunha, J., Lundberg, J.G., 2013. Phylogenetic relationships and
molecular dating of the major groups of catﬁshes of the neotropical superfamily
Pimelodoidea (Teleostei, Siluriformes). Proc. Acad. Nat. Sci. Philadelphia 162, 89–110.
Than, C., Nakhleh, L., 2009. Species tree inference by minimizing deep coalescences.
PLoS Comput. Biol. 5. http://dx.doi.org/10.1371/journal.pcbi.1000501.
Than, C., Ruths, D., Nakhleh, L., 2008. PhyloNet: a software package for analyzing
and reconstructing reticulate evolutionary relationships. BMC Bioinform. 9.
Willis, S.C., Nunes, M.S., Montaña, C.G., Farias, I.P., Lovejoy, N.R., 2007. Systematics,
biogeography, and evolution of the Neotropical peacock basses Cichla
(Perciformes: Cichlidae). Mol. Phylogenet. Evol. 44, 291–307.
Willis, S.C., Nunes, M., Montaña, C.G., Farias, I.P., Ortí, G., Lovejoy, N.R., 2010. The
Casiquiare river acts as a corridor between the Amazonas and Orinoco river
basins: biogeographic analysis of the genus Cichla. Mol. Ecol. 19, 1014–1030.
Winemiller, K.O., 1989. Ontogenetic diet shifts and resource partitioning among
piscivorous ﬁshes in the Venezuelan Llanos. Environ. Biol. Fish. 26, 177–199.
Wu, Y., 2011. Coalescent-based species tree inference from gene tree topologies
under incomplete lineage sorting by maximum likelihood. Evolution 66, 763–