Stockdale and Benton 2021 explain why to avoid super trees and super matrices

…then Nature publishes their supertree/supermatrix
analysis of crocodilians and their ancestors using 175 source trees published since 2010.

Stockdale and Benton presented their charts and graphs without a valid phylogenetic context due to massive outgroup taxon exclusion. Their in-group appears to be okay.

Stockdale and Benton report, 
“There is no published phylogenetic hypothesis that encompasses all Pseudosuchia, as well as molecular data from living taxa. Therefore, we estimated a new phylogenetic hypothesis for this study.”

Molecular studies rarely, if ever, replicate trait-based studies. It’s no exaggeration to report that everyone knows this. So, why even waste time with molecular studies when dealing with deep time taxa?

“A matrix based approach was also ruled out, because collecting character data for such a large matrix from the literature and vetting characters for redundancy would have been impractical.”

Here’s a suggestion: Add all the taxa from the various source cladograms to one study and use only the characters from that one study until resolution starts to falter.

Or start with fewer taxa and fewer characters (about 200 each) to establish the tree topology. Then add more taxa. I can tell you from experience, this works.

“In addition, such a large matrix would have introduced a significant fraction of missing data, which could undermine the quality of a finished tree.”

Here’s a suggestion: employ relatively complete taxa to figure out the tree topology. Then add less complete taxa until resolution starts to falter.
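The 'most complete taxa first' idea is easy to sketch in Python. This is only an illustration under stated assumptions: the toy matrix, the taxon names and the functions completeness and addition_order are hypothetical, not taken from the LRT or from Stockdale and Benton, and '?' is assumed to mark a missing score.

```python
# Hypothetical toy matrix: taxon name -> string of character scores.
# Assumption: '?' marks missing data; any other symbol is a real score.
matrix = {
    "taxon_A": "0110210110",
    "taxon_B": "01?0??0???",
    "taxon_C": "011021?110",
}

def completeness(scores):
    """Fraction of characters that are actually scored (not '?')."""
    return sum(1 for s in scores if s != "?") / len(scores)

def addition_order(matrix):
    """Taxon names sorted from most complete to least complete."""
    return sorted(matrix, key=lambda name: completeness(matrix[name]), reverse=True)

order = addition_order(matrix)
print(order)  # ['taxon_A', 'taxon_C', 'taxon_B']
```

The ordering only decides who goes in first. Whether resolution 'starts to falter' still has to be judged by running the analysis after each addition.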

“The phylogenetic hypothesis used in this study is based on a formal supertree analysis. Formal supertrees use a systematic approach to assimilating multiple smaller topologies into a single tree. Liberal formal supertree methods enable a well-resolved consensus topology to be estimated from source trees that are incongruent.”

Both supertree methods enable workers to trust prior studies, rather than examining specimens, photos and engravings. That’s antithetical to standards established by paleontologists for the rest of us.

“The supertree was estimated from a sample of 175 source trees published since 2010, each re-analysed from their original source matrices using Bayesian inference and the MK model. The supertree was dated using the equal method; the dated supertree contained a total of 579 archosauromorph taxa, including 24 extant species.”

And yet, despite their large taxon list, Stockdale and Benton managed to exclude a long list of outgroup taxa found to be pertinent by the LRT (subset Fig. 1). Missing taxa include many basal crocodylomorphs and outgroup poposaurs. Stockdale and Benton also omitted members of the only other clade in the Archosauria (by definition, as recovered in the LRT): Dinosauria. In other words, Herrerasaurus and Junggarsuchus should have been included. And where was Benton’s Scleromochlus?

“This tree was then trimmed to match the 280 pseudosuchian taxa included in the body size data. This phylogenetic approach was implemented to eliminate as many sources of error as possible.”

Trimmed? Sounds subjective. Don’t gloss over this point.

Source errors? The whole idea of using a super tree analysis is to avoid looking at taxa, photos of taxa or the literature. Why not at least peek at the taxa to double-check for possible source errors?

Figure 1. Subset of the LRT focusing on Euarchosauriformes and Crocodylomorpha. Fewer derived crocs here, but a wider gamut of outgroup taxa validate the LRT.

Stockdale and Benton (SuppData) report on
the “limitations of informal super trees”

“All supertrees, informal or otherwise, share a common drawback that they are dependent on the accuracy of the source trees from which they are estimated. This is especially true of informal trees where topology is copied from older publications, where the data or methodology may be outdated. Informal supertrees are also entirely subjective, and by definition bias analyses in favour of the author’s own views.”

True. So why didn’t Stockdale and Benton listen to themselves? Why did referees and editors permit this to be published?

Add taxa and the traditional clade ‘Pseudosuchia’ becomes invalid, polyphyletic.

“If there is controversy about the evolutionary relationships within a clade, the number of possible informal supertree topologies may become excessive.”

No. This is professional baloney. There is only one tree and it models actual evolutionary events. It’s our job to recover that one tree (subset Fig. 1).

“For example, the positions of several member clades within the Pseudosuchia differ between analyses. The Thalattosuchia have been resolved as a derived clade within the Neosuchia, a basal sister clade to the Crocodyliformes, or an intermediate clade within the Mesoeucrocodylia but outside the Neosuchia.” 

That means someone or several someones made a mistake. Fix the mistakes. See the Stockdale and Benton 2016 text for other examples they cite.

“It would be difficult to draw meaningful conclusions from so many trees if they are considered equally likely; it is therefore necessary to develop a consensus of these different viewpoints based on the strongest evidence.”

More subjective professional baloney. Consensus of viewpoints? How about just taking a quick peek at some specimens, photos of specimens or even drawings of specimens.

“Supermatrix approaches avoid many of the subjectivity issues associated with informal supertrees. Supermatrices lack a specific technical definition, however the term is broadly used to describe phylogenetic analysis of a single, comprehensive matrix. A supermatrix of the Pseudosuchia would require in excess of 500 taxa. Estimating such a large phylogenetic tree from a single matrix represents a formidable challenge, either in the sheer number of fossils to be examined and their characters scored, or by the integration of existing matrices.”

Quit whining! Do the work. The LRT passed 500 total taxa eight years ago and now includes 3-4x that number (including pterosaurs and therapsid skulls).

Here’s a suggestion: Start with 150 to 200 taxa. That will get you well started with a rough estimate of the final tree topology. Later, adding taxa one or two at a time will slowly fill in the gaps and solidify the tree topology.

If two taxa are nearly identical in every detail, they are probably related. Drop one. Pick it up later if you really need to.
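One way to flag such near-twins before dropping one is a simple pairwise comparison. The Python sketch below is a rough illustration, not how the LRT does it: the matrix, the 5% threshold and the function names are all hypothetical, '?' is assumed to mean unscored, and only characters scored in both taxa are compared.

```python
from itertools import combinations

# Hypothetical toy matrix; '?' is assumed to mean "not scored".
matrix = {
    "taxon_A": "0110210110",
    "taxon_B": "01102101?0",
    "taxon_C": "1001120001",
}

def mismatch_fraction(a, b):
    """Share of characters scored in BOTH taxa that differ; None if no overlap."""
    shared = [(x, y) for x, y in zip(a, b) if x != "?" and y != "?"]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)

def near_duplicates(matrix, threshold=0.05):
    """Pairs of taxa whose shared scores differ by no more than the threshold."""
    pairs = []
    for t1, t2 in combinations(sorted(matrix), 2):
        d = mismatch_fraction(matrix[t1], matrix[t2])
        if d is not None and d <= threshold:
            pairs.append((t1, t2, d))
    return pairs

print(near_duplicates(matrix))  # [('taxon_A', 'taxon_B', 0.0)]
```

A flagged pair is only a candidate. Which one to drop, and when to pick it up again, remains a judgment call.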

First attend to the basic problems. The Stockdale and Benton study has basic issues based on taxon exclusion among outgroup taxa. The ingroup taxon list appears to be just fine.

“Very large morphological character matrices present a significant problem in the accumulation of inapplicable characters. For example, a matrix of crocodile-line archosaurs would likely contain characters relating to the morphology of osteoderms, despite osteoderms being absent in some members.”

That’s no problem for the LRT. For example, the score ‘absent’ is available where appropriate. Consider this paper an example and cautionary tale showing how to get published while whining about what not to do.

“Therefore, it is not reasonable to assume that the time invested in building a very large supermatrix will be rewarded with a high quality phylogenetic analysis.”

This statement was falsified by the LRT with over 2000 taxa. Just do the work. Show your work. Repair bad scores. Report results. If you don’t get one tree go back in there and figure out what went wrong. If a skull-only taxon nests with a skull-less taxon, eliminate one of them.
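The skull-only versus skull-less problem comes down to two taxa sharing no scored characters, so nothing in the matrix can tell them apart. A minimal Python sketch of that check follows, assuming a toy matrix in which '?' marks unscored characters. The taxon names and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy matrix: the first four characters are cranial,
# the last four postcranial. '?' is assumed to mean "not scored".
matrix = {
    "skull_only": "0110????",
    "skull_less": "????1021",
    "complete":   "01101021",
}

def shared_scored(a, b):
    """Number of characters scored in both taxa."""
    return sum(1 for x, y in zip(a, b) if x != "?" and y != "?")

def non_overlapping_pairs(matrix):
    """Pairs of taxa with no characters scored in common."""
    return [
        (t1, t2)
        for t1, t2 in combinations(sorted(matrix), 2)
        if shared_scored(matrix[t1], matrix[t2]) == 0
    ]

print(non_overlapping_pairs(matrix))  # [('skull_less', 'skull_only')]
```

Any pair this turns up can nest together by default, which is why one of the two has to go.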

“Conservative approaches handle incongruences between source trees by presenting them as unresolved nodes in the final topology.”

If something is wrong, fix it. Do the work. Don’t let bad data infiltrate your matrix.

“The MRP method is an example of a liberal supertree approach, where incongruences between source trees are resolved democratically, with the better-supported topology being retained in the final supertree. The MRP method is a pragmatic choice, since it can be implemented using readily available software without consuming excessive computer processing power.”

If something is wrong, fix it. Do the work. Don’t let bad data infiltrate your matrix.
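For readers unfamiliar with the MRP (matrix representation with parsimony) method named in the quote above, here is a minimal Python sketch of its encoding step only. Each clade in each source tree becomes one binary character: 1 for taxa inside the clade, 0 for taxa in that source tree but outside the clade, '?' for taxa absent from that source tree. The nested-tuple tree format, the two toy trees and the function names are assumptions made for illustration. They are not Stockdale and Benton's actual pipeline. The all-zero outgroup row that MRP analyses usually add is left out, and so is the parsimony search itself.

```python
def clades(tree):
    """Return (all_taxa, clades) for a nested-tuple tree whose leaves are strings."""
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.append(members)
        return members

    all_taxa = walk(tree)
    # Drop the root clade: it holds every taxon, so it groups nothing.
    return all_taxa, [c for c in found if c != all_taxa]

def mrp_matrix(source_trees):
    """Encode source trees as one binary character per clade."""
    parsed = [clades(t) for t in source_trees]
    taxa = sorted(set().union(*(all_taxa for all_taxa, _ in parsed)))
    rows = {name: [] for name in taxa}
    for all_taxa, tree_clades in parsed:
        for clade in tree_clades:
            for name in taxa:
                if name in clade:
                    rows[name].append("1")   # inside this clade
                elif name in all_taxa:
                    rows[name].append("0")   # in this tree, outside the clade
                else:
                    rows[name].append("?")   # not sampled by this tree
    return {name: "".join(cols) for name, cols in rows.items()}

# Two hypothetical source trees with partly overlapping taxon lists.
tree1 = (("A", "B"), ("C", "D"))
tree2 = ((("A", "B"), "E"), "C")
encoded = mrp_matrix([tree1, tree2])
for name, row in encoded.items():
    print(name, row)
# A 1011
# B 1011
# C 0100
# D 01??
# E ??01
```

Note what goes into this matrix: clades copied from prior studies, not traits scored from specimens. Any error in a source tree is carried straight through.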

“Studies sceptical of supertrees, such as Gatesy et al., have concluded that these issues are insurmountable and that supertree methods should be avoided altogether.”

Just do the work. Don’t rely on, or trust the work of others.

“A rebuttal by Bininda-Emonds et al. concluded that these problems could be mitigated through careful source tree selection protocols and stated that supertrees are a necessity due to the inherent impracticality of super matrices.”

Stockdale and Benton don’t want to do the necessary work.

From the Stockdale and Benton Discussion Section:
“The supertree identifies the Phytosauria as a monophyletic group within Pseudosuchia, closer to extant crocodilians than to Avemetatarsalia.”

Adding missing taxa (as in the LRT) separates Phytosauria from all other included taxa. “Pseudosuchia” becomes an invalid polyphyletic clade when missing taxa are included. “Avemetatarsalia” is a junior synonym for the older clade Reptilia when missing taxa are included. Professor Benton is infamous for cherry-picking taxa. Better to let a wide gamut analysis tell you which taxa to include and exclude.

As described early in 2012,
adding pertinent taxa separates Pararchosauriformes (Proterosuchus is the last common ancestor) from the Euarchosauriformes (Euparkeria is the last common ancestor). Neither of these taxa is in the Stockdale and Benton taxon list. Their last common ancestor is Younginoides. The clade Archosauriformes begins there. The rest follow (Fig. 1).

Stockdale and Benton attempted to describe
environmental drivers of body size in crocs. Unfortunately, without a valid phylogenetic context, and omitting so many pertinent taxa, the rest of the information they so carefully prepared is hobbled by their own self-confessed lack of effort.

Don’t whine about doing the necessary work.
Get the broad basics right. Then you’ll have that powerful cladogram for the rest of your career. Only then do the more focused work.

Bininda-Emonds ORP et al. (7 co-authors) 2003. Supertrees are a necessary not-so-evil: a comment on Gatesy et al. Syst. Biol. 52:724-729. [not sure how this 2003 comment precedes the Gatesy et al. 2004 paper].
Gatesy J, Baker RH and Hayashi C 2004. Inconsistencies in arguments for the supertree approach: supermatrices versus supertrees of Crocodylia. Syst. Biol. 53:342-355.
Stockdale MT and Benton MJ 2021. Environmental drivers of body size evolution in crocodile-line archosaurs. Communications Biology 4:38.

Some news sources took the bait.
Since Scleromochlus and other basal bipedal crocs were not included, the headline in The Conversation is bogus.






