Four more basal tetrapods added to the LRT

Posted on February 1, 2017 by davidpeters1954

Spoiler alert:
No basic changes to the large reptile tree topology (LRT, Fig. 1, 938 taxa). The biggest difference from traditional trees continues to be the separation of dissorophoids, including Cacops, from temnospondyls. Cacops and kin are still nesting with the lepospondyls, including all microsaurs and extant amphibians.

Figure 1. Subset of the LRT showing basal tetrapods. Four more are added here with no change in tree topology.

Tomorrow or shortly thereafter
I’ll start reporting some numbers and describing some interesting taxa. For those interested, whenever I add taxa, I revisit and update old taxa including their scores. So if you want an updated .nex file, now is a better time than ever, with errors minimized.

23 thoughts on “Four more basal tetrapods added to the LRT”

David Marjanović on February 3, 2017 at 6:15 am said:

Thank you for the .nex file. Scrolling through the character list and glancing at the matrix, I notice the following points:

1) As I thought, many of your 229 characters have several states. This explains how you get a resolved tree with so many taxa and so few characters.
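As a back-of-envelope sketch of that point: a fully resolved unrooted tree of n taxa has n - 3 internal branches, and a character with k observed states changes at least k - 1 times. Assuming (purely for illustration) that every internal branch needs one homoplasy-free change, the figures quoted here give a lower bound on the mean number of states per character. The function names are hypothetical, and real analyses with homoplasy do not obey this bound.

```python
def internal_branches(n_taxa: int) -> int:
    """Internal branches in a fully resolved unrooted tree with n_taxa tips."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    return n_taxa - 3


def min_mean_states(n_taxa: int, n_chars: int) -> float:
    """Smallest mean number of states per character that could give every
    internal branch one homoplasy-free change (k states -> k - 1 changes)."""
    return internal_branches(n_taxa) / n_chars + 1


N_TAXA, N_CHARS = 938, 229  # figures quoted in the post and in this comment
branches = internal_branches(N_TAXA)
mean_states = min_mean_states(N_TAXA, N_CHARS)
print(branches, round(mean_states, 2))
```

Under these assumptions, binary characters alone could not resolve the tree without homoplasy; only multistate characters can make up the difference.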

2) The way the characters are constructed artificially increases the robustness of the tree (and with it the resolution). I’m not saying you did this deliberately! I’m saying you did this because you have no idea how to make characters for phylogenetic analysis. In fact, the characters read a lot like those in Carroll’s two phylogenetic analyses (1995, 2007), which found clearly spurious topologies, too, and have been quietly ignored in the literature.

Here goes:

9 prf_pof / separated contact prf_pof_fused no_prf_no_pof _ pof_absent _ ju_po_pof_fused

(In case anyone not familiar with NEXUS files reads this, this is character 9 from the matrix, called “prf_pof”; “separated” is state 0, “contact” is state 1, “prf_pof_fused” is state 2, and so on. And yes, states 4 and 6 are both called “_”, which is odd, but not a recurring problem, and has no effect on the tree.)
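For readers who want to see the numbering spelled out, here is a minimal sketch that splits one such line into its character number, name and zero-based state list. The function name is hypothetical, it is not a NEXUS parser, and it assumes labels contain no spaces (quoted labels with spaces would need real tokenizing).

```python
def parse_charstate(line: str):
    """Split 'NUM name / state0 state1 ...' into (number, name, states)."""
    head, _, tail = line.partition(" / ")
    number, name = head.split(maxsplit=1)
    return int(number), name.strip(), tail.split()


LINE = ("9 prf_pof / separated contact prf_pof_fused no_prf_no_pof "
        "_ pof_absent _ ju_po_pof_fused")

number, name, states = parse_charstate(LINE)
for index, label in enumerate(states):
    # e.g. "9(2) = prf_pof_fused": character 9, state 2
    print(f"{number}({index}) = {label}")
```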

From looking at this character alone (don’t worry, I look at a lot more below), we learn several things.

First, you try to pack everything related to the prefrontal and the postfrontal into a single character. Clearly you’re trying to avoid having to score characters as inapplicable. (Like Carroll. I blasted him for it in the 2013 paper which you’ve blogged about, but clearly only skimmed.) But there’s hardly any reason for this avoidance. Missing data is nothing to be afraid of; there’s plenty of literature on this, and there’s plenty of literature on how to deal with inapplicability. I cite some in my preprint; there’s a whole section on inapplicability there.

By trying to pack everything into one character, you routinely end up with cases where you code two or more independent characters as a single character. More on this below.

Further, we learn that you have a magic fusion detector: state 2 is when the postfrontal is fused to the prefrontal (a condition never described in the literature, BTW, but never mind, that’s beside the point), state 3 is when the postfrontal and the prefrontal are both absent, state 5 is when the postfrontal is absent (but the prefrontal is still there, apparently), and state 7 is when the postfrontal is fused to the postorbital and (!) to the jugal. How can you distinguish these conditions without a huge sample of different ossification stages of the skull, which we have for Apateon and practically nothing else in the fossil record?

Here, too, you’re in good company. Many people, especially in earlier decades (Carroll again comes to mind), thought they had magic fusion detectors and could distinguish such states by the positions of the sutures. It makes sense to believe that a lateral rectangular process on a parietal that occupies the position where you’d expect an intertemporal is in fact an intertemporal fused to the parietal – until you find a skull where such a process and a narrow intertemporal lie next to each other. Oopsie.

Be very, very, very careful with assuming fusion.

Scrolling on…

29 antorbital_fenestra / absent with_mx_fossa without_mx_fossa between_prefrontal_and_lacrimal vestige_tiny w_web_of_bone

We’re looking at three independent characters, not one: the size of the fenestra (which should be ordered because it’s a potentially continuous character – again I refer to several papers cited in my preprint); the presence or absence of a fossa around it; and whether the prefrontal participates in the rim of the fenestra. I don’t understand what you mean by the last state.

(I’ve counted absence of the fenestra as a size of zero. If we don’t do that, we’re looking at four characters instead of three.)
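A rough sketch of what such a split could look like, with "-" for inapplicable and "?" for unknown. Every assignment in the table is an illustrative assumption (including the three size classes and the names `SPLIT` and `split_state`), not a score from the matrix; the last state is left entirely unknown because its meaning is unclear.

```python
# Hypothetical split of composite character 29 into three characters.
# "-" = inapplicable, "?" = unknown. All assignments are assumptions
# made for illustration only.
SPLIT = {
    "absent": {"size": "0", "fossa": "-", "prf_in_rim": "-"},
    "vestige_tiny": {"size": "1", "fossa": "?", "prf_in_rim": "?"},
    "with_mx_fossa": {"size": "2", "fossa": "1", "prf_in_rim": "?"},
    "without_mx_fossa": {"size": "2", "fossa": "0", "prf_in_rim": "?"},
    "between_prefrontal_and_lacrimal": {
        "size": "2", "fossa": "?", "prf_in_rim": "1"},
}
UNKNOWN = {"size": "?", "fossa": "?", "prf_in_rim": "?"}


def split_state(label: str) -> dict:
    """Return the three split scores for one composite state label."""
    return dict(SPLIT.get(label, UNKNOWN))


for label in ("absent", "with_mx_fossa", "w_web_of_bone"):
    print(label, split_state(label))
```

The "?" entries mark what the composite coding cannot record: a taxon scored "with_mx_fossa" may or may not have the prefrontal in the rim, and the single character has no way to say which.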

Interestingly, you haven’t coded whether the nasal and the jugal participate in the fenestra. Are you sure these states don’t carry phylogenetic signal? They very much seem to in archosaurs.

Now compare these three characters:

10 ‘preorb/postorb’ / subequal preorb_longer postorb_longer
30 orbit_vs._rostrum / orbit_≥_rostrum_length not
31 orbit_vs_postorb_skull / ‘orbit_<_po_skull_at_temples' not

You divide the skull into three parts, and then you take all possible ratios and code them all. That’s overdetermined. It’s a circle. If the postorbital part of the skull is longer than the eye (character 31), and the preorbital part is longer than the postorbital part (character 10), then the preorbital part is automatically, logically, inevitably longer than the eye (character 30).
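The circularity can be checked by brute force. The sketch below assumes the rostrum is the preorbital part, treats "subequal" as exactly equal, and uses hypothetical scoring functions that follow the state order of characters 10, 30 and 31. It collects every state of character 30 seen among skulls that have 10(1) and 31(0).

```python
from itertools import product


def char10(pre: int, orb: int, post: int) -> int:
    """0 = subequal, 1 = preorbital longer, 2 = postorbital longer."""
    if pre > post:
        return 1
    if post > pre:
        return 2
    return 0


def char30(pre: int, orb: int, post: int) -> int:
    """0 = orbit >= rostrum length, 1 = not."""
    return 0 if orb >= pre else 1


def char31(pre: int, orb: int, post: int) -> int:
    """0 = orbit < postorbital skull, 1 = not."""
    return 0 if orb < post else 1


observed = set()
for pre, orb, post in product(range(1, 11), repeat=3):
    if char10(pre, orb, post) == 1 and char31(pre, orb, post) == 0:
        observed.add(char30(pre, orb, post))

print(observed)  # only one state of character 30 ever occurs
```

Whatever lengths are tried, the pair 10(1) and 31(0) leaves character 30 no freedom, which is what makes it redundant for those taxa.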

If a state of a character is predictable from the states of other characters, it is redundant. And if you have redundant states in your matrix, that means your matrix counts the same thing twice (or more often as the case may be). I don’t think I need to explain why that’s an unequivocally bad thing.

It’s easy to fall into such traps, and there’s no one-size-fits-all method to deal with them. In this case I strongly recommend simply dropping one of these characters.

BTW, character 10 is potentially continuous and has 3 states; it really should be ordered. (Obviously with “subequal” as state 1.)
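A minimal sketch of what ordering changes, assuming the states are first rearranged so that "subequal" sits in the middle. The function and constant names are hypothetical; parsimony programs apply this cost rule internally once a character is declared ordered.

```python
# State order assumed for the ordered version of character 10.
PREORB_LONGER, SUBEQUAL, POSTORB_LONGER = 0, 1, 2


def step_cost(a: int, b: int, ordered: bool) -> int:
    """Steps counted for a change from state a to state b."""
    if a == b:
        return 0
    return abs(a - b) if ordered else 1


# Going from one extreme to the other passes through "subequal":
print(step_cost(PREORB_LONGER, POSTORB_LONGER, ordered=True))   # 2
print(step_cost(PREORB_LONGER, POSTORB_LONGER, ordered=False))  # 1
```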

33 naris_vs._aof / naris_larger not_larger

A perfectly innocent character, right? What could go wrong?

The scoring could go wrong, and a look at the matrix immediately shows that it has gone wrong. I just quoted character 29; it turns out that all taxa which have state 29(0) are scored as having 33(0). Absence of the antorbital fenestra is thus counted twice. Again, when you can predict a state of one character (33(0)) from a state of another character (29(0)), you have redundant characters. And that’s terrible.

The following solutions are available, off the top of my head:
– score all taxa with state 29(0) as unknown (specifically as inapplicable) for character 33;
– take presence and size out of character 29, which must be split up anyway as I said, and put them into character 33.

Which one is better depends on things I haven’t looked at in your matrix.
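The check and the first solution can be sketched on a toy matrix. The taxon names and scores below are invented for illustration, and `implies` and `rescore_inapplicable` are hypothetical helpers, not functions from PAUP* or any other package.

```python
def implies(matrix, char_a, state_a, char_b, state_b):
    """True if every taxon with state_a for char_a has state_b for char_b."""
    rows = [scores for scores in matrix.values() if scores[char_a] == state_a]
    return bool(rows) and all(scores[char_b] == state_b for scores in rows)


def rescore_inapplicable(matrix, char_a, state_a, char_b):
    """Copy the matrix, scoring char_b as '-' wherever char_a has state_a."""
    fixed = {}
    for taxon, scores in matrix.items():
        new_scores = dict(scores)
        if new_scores[char_a] == state_a:
            new_scores[char_b] = "-"
        fixed[taxon] = new_scores
    return fixed


# Toy data: made-up taxa, scored for characters 29 and 33 only.
toy = {
    "taxon_A": {29: "0", 33: "0"},
    "taxon_B": {29: "0", 33: "0"},
    "taxon_C": {29: "1", 33: "1"},
    "taxon_D": {29: "2", 33: "0"},
}

print(implies(toy, 29, "0", 33, "0"))    # redundancy present
fixed = rescore_inapplicable(toy, 29, "0", 33)
print(implies(fixed, 29, "0", 33, "0"))  # redundancy removed
```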

20 major_axis_of_naris_vs_jaw / horiz_to_30° '30-90°' no_major_axis_indicated '>_90°' secondary_naris '30-90°_but_dorsal'
23 naris_displacement / snout_tip displaced_or_post_elong snout_tip_but_dorsal secondary_naris lateral

“Secondary naris” is a state in both of these characters. Every time you believe you’ve found a secondary naris (whatever that is), you’re counting it twice, and PAUP* counts it twice.
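Duplicated state labels of this kind can be found mechanically. The sketch below copies the state lists of characters 20 and 23 (angle symbols written as degree signs) and reports every label that occurs in more than one character; `shared_labels` is a hypothetical helper.

```python
from collections import defaultdict

CHARACTERS = {
    20: ["horiz_to_30°", "30-90°", "no_major_axis_indicated", ">_90°",
         "secondary_naris", "30-90°_but_dorsal"],
    23: ["snout_tip", "displaced_or_post_elong", "snout_tip_but_dorsal",
         "secondary_naris", "lateral"],
}


def shared_labels(characters):
    """Map each label used by 2+ characters to its (character, state) hits."""
    where = defaultdict(list)
    for number, states in characters.items():
        for index, label in enumerate(states):
            where[label].append((number, index))
    return {label: hits for label, hits in where.items()
            if len({number for number, _ in hits}) > 1}


print(shared_labels(CHARACTERS))
```

Exact string matching only catches identically named states. Duplicates that are worded differently in each character still have to be found by reading the list.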

56 postorbital / ovate_to_square triangular fused_to_pof absent fused_with_sq strip_pof_overlap truncated_triangle small_crescent fused_to_ju arch

Hm. So state 56(2) is fusion of postorbital and postfrontal. State 9(7) is also fusion of postorbital and postfrontal. More redundancy.

And that’s nothing against this work of art:

80 opisthotic_connected / tab st sq tab_sq_fused st_tab_fused qu no_lateral_connection prootic_or_petrosal st_sq_fused postparietal
81 supratemporals / present absent_or_fused
82 Sq_vs_St / St_tiny Sq_tiny St_absent_or_fused Sq_absent both_absent both_large both_long St_long_and_Sq_tiny Sq_long_and_St_tiny_to_absent Sq_large_St_long
83 ST_vs_PO / contact no_contact ‘st-tab_fused’ st_sq_fused st_po_fused st_pa_fused

Well, if you count everything three times, you’re bound to get an unshakeable tree, no matter how spurious the topology is. Have you noticed that fusion of supratemporal and squamosal is a state in each of these four characters? You’re counting it four times!

Also, magic fusion detector and failure to recognize independent characters in the desperate attempt to avoid having to score anything as inapplicable. And something I haven’t brought up yet, the failure to quantify: how long is “long”, how large is “large”? Sure, it’s possible that you scored these characters in consistent ways, but there’s no way for anyone else to figure out what exactly these terms mean, and that means other people can’t add taxa to your matrix, other people can’t tell if your matrix contains typos, other people can’t use your matrix. If it’s not reproducible, it’s not science.

49 Tabulars / present absent fused_to_sq fused_to_st fused_to_pp

Sounds familiar, doesn’t it. Also, four different kinds of absence… wow, your fusion detector must be really finely tuned!

59 posterior_parietal_angle / transverse '20-40°_posteriorly' '>_40°_posteriorly' transverse_w_tab_horns '_inverted_B-shape' transverse_w_st_horns ant_oriented Postparietal_surrounded fused_to_postparietal

Lots of different things crammed into one character, plus magic fusion detector.

62 par_skull_table / broad weakly_constricted strongly_constricted sag_crest constricted_posteriorly nuchal_and_sag_crest nuchal_crests on_occiput elevated_bulla

What do these crests have to do with the 2D shape of the parietals in dorsal view? Nothing, that’s what. Indeed, what do nuchal and sagittal crests have to do with each other? You can have one without the other – happens all the time.

Independent characters.

Also, what’s that bulla on the parietals?

4 parietal_table / convex flat concave concave_w_pineal_volcano lateral_and_medial_crests depressed_terrace_medial_crest flat_w_pineal_volcano transverse_crest

Similar problems as in 62.

14 pmx_mx_notch / less_than_25° '>_45°' 25_to_45° notch_for_fang no_pmx_palatal_surface pmx_ventral_to_mx

Similar again.

228 overall_size / ‘>_30_cm_tall_x_60_cm_long’ not

Body size carries a pretty strong phylogenetic signal. However, it’s correlated to so many other things – some obvious, some not – that I would immediately throw it out because it’s almost certainly in the matrix five times already.

211 prox_metatarsals / subequal_width 4_is_the_widest 2_is_the_widest I_and_or_5_reduced 1_and_5_widest 5_is_widest 3_and_4_widest 1_widest 3_is_the_widest only_5_reduced

Several independent characters. I mean, what single quirk of developmental genetics could possibly cause reduction of metatarsal I or V but not both? How could reduction of I and reduction of V be homologous when they occur independently?

203 tarsus_type / crocodile_normal croc_reversed adv_mesotarsal fenestrated_meso non_fenestrated_meso aquatic_round_elements poorly_ossified fused_ast_calc doublebend large_intermedium

So many independent characters… even the “types” are a wrong-headed approach; code the joints separately.

3) The third point I notice is that, with so many states counted so many times in so few characters, there’s not much left. Many characters that are well known to carry phylogenetic signal are absent from your matrix.

…as I distinctly remember telling you on a hot summer day in 2005. There’s a reason size is never included in anyone else’s matrices, you know; did it never occur to you to wonder?

156 cleithrum / present absent

That’s the only character that mentions the cleithrum. Postbranchial lamina? Fusion to the scapula (which can be detected because the cleithrum is a plate-like dermal bone while the scapula is an endochondral lump… also, it happens in the ontogeny of Eryops)? Cranial extension of the dorsal end? Caudal extension of the dorsal end? Shape of the attachment surface for the clavicle? Not considered in your matrix. And that’s all just off the top of my head.

========================

That’s it, I give up. Do ask if you have any further questions!

The examples I quote above are just that; if I don’t quote something or don’t address something, that doesn’t imply I agree with it. Also, I have barely glanced at the taxon sample, where it seems that you haven’t seen much recent literature and haven’t noticed when newer papers contradict older ones. Importantly, I haven’t checked if your scores are correct, or if the things you score even exist (“secondary naris”…?); that’s entirely beside the point – Carroll understood the anatomy of the animals in his matrices very well, what with all those decades of firsthand study, but he didn’t know how to construct characters for phylogenetic analysis, and so his trees are worthless no matter how good his morphological descriptions are.

davidpeters1954 on February 3, 2017 at 7:23 am said:

Dave, you’re overthinking this. And I appreciate your efforts. We could have had more of a conversation if you had just done, say, three topics at a time. So, I’ll choose three random issues to keep this manageable.

1. Overall size. Yes, I know the potential dangers, but since juveniles and hatchlings are rare in the fossil record, I threw it in to see if there were size patterns. There are. Ultimately, this is a statistical study, with each character providing one ‘vote’, so one character, in or out, should not make much difference over 930 taxa. If you don’t like it, take it out and rerun the analysis to see if the topology changes. I think you’ll see my point.

2. Revisiting the same character more than once. Yes, that happens. It gives weight, whether appropriate or not. If you don’t like that, take one character out and rerun the analysis to see if the topology changes.

3. Multiple scoring opportunities for single characters, like nasal shape, dorsal view. Hey, it works. There are lots of shapes and at the fringes I have to make a choice. I do and it seems to work out… usually.

And I’ll throw this tidbit in to pay homage to one of your firsthand observations:
The only instance I scored prefrontal/postfrontal fused is in Brachydectes, one of the original 200 or so taxa. That’s why that trait is in there. That trait does not show up again… as you noted, anywhere else! …but if it does show up again among the lepospondyls, we’ll have a good candidate for a Brachydectes sister taxon, if valid.
See: http://reptileevolution.com/eocaecilia.htm for the cyan bone I thought looked like a fused prefrontal/postfrontal as it appears to extend on both sides of the orbit. If it’s not valid, I’ll get rid of it. No problem. Like the umpire sez, “I calls ’em as I sees ’em.” But unlike an umpire, I’m more than willing to make …some… corrections, as warranted.

Final thought, everyone’s character list has to be flawed, so like family members, we love them the way they are IF they recover trees that make sense and provide insight. Your issues are all valid. My character list has pimples, boils, scabies and ticks, but still, it works.

- David Marjanović on February 3, 2017 at 5:19 pm said:
  
  Dave, you’re overthinking this.
  
  :-D :-D :-D :-D :-D
  
  You’re underthinking this. To an enormous extent!
  
  2. Revisiting the same character more than once. Yes, that happens. It gives weight, whether appropriate or not. If you don’t like that, take one character out and rerun the analysis to see if the topology changes.
  
  That’s your job. Don’t have redundant characters in the first place.
  
  I don’t even have time to untangle all the redundancy! I’m working on my own matrix. Your matrix, your publication, your job.
  
  Hey, it works.
  
  No, it does not; and the fact that you haven’t even noticed this speaks volumes.
  
  Again, there’s lots of literature on this out there. It’s your responsibility to read it.
  
  See: http://reptileevolution.com/eocaecilia.htm for the cyan bone I thought looked like a fused prefrontal/postfrontal as it appears to extend on both sides of the orbit.
  
  Ah, you’ve overlooked the redescription of the skull of Brachydectes by Pardo & Anderson that came out in PLOS ONE last year.
  
  Final thought, everyone’s character list has to be flawed
  
  False equivalence. All are flawed – but some are a lot more flawed than others, and that in ways that can be fixed!
  
  And frankly, if you love your character list, you’re doing it wrong.
  
  My character list has pimples, boils, scabies and ticks, but still, it works.
  
  No, it does not; and the fact that you haven’t even noticed this speaks volumes.
  
  “It gives me a tree” is not the same as “it works”. I’ve tried to explain why spurious topologies can be strongly supported by a flawed matrix. Your eyes glazed over, and you scrolled to the end.
  
  I don’t need a reply within the hour! Take the time to read my long comment in full and follow the leads in it.
  
davidpeters1954 on February 3, 2017 at 7:46 pm said:

You can’t just say the analysis doesn’t work. You have to be specific. Given the present taxon list, which taxa should not nest together? And where should they go instead?

- Squiddhartha on February 3, 2017 at 11:11 pm said:
  
  This is your standard reply. It’s akin to saying, “Look at my beautiful treehouse,” being told (and shown!) the tree you’ve built it in is rotten and about to fall down, and replying, “Oh, which of my boards is in the wrong place?”
  
  - David Marjanović on February 5, 2017 at 6:14 am said:
    
    Pretty much, yeah.
    
    On what topologies I think are more likely, I’ll just say “read my preprint”. (Not “pick a tree from there”, but “read the whole thing”; some parts of the problem are hard enough that I can’t form a hard-and-fast opinion.)
Jason Pardo on February 3, 2017 at 10:52 pm said:

I’m just going to point out that this is what it means for David Marjanovic to take your character matrix seriously and treat it with scientific rigor. I’ve received more scrutiny, not less, when he’s reviewed my papers. Being treated like a scientist means taking scientific criticism seriously.

David Marjanović on February 5, 2017 at 7:08 am said:

Oh, I kept forgetting to mention: your tree flatly contradicts the molecular data by putting Mammalia inside the diapsid crown-group (on the archosauromorph side). This part, at the very least, has to be wrong.

- davidpeters1954 on February 13, 2017 at 6:50 am said:
  
  As I’ve said before, the molecular data fails when it attempts to go beyond the smaller clades.
  
  - Squiddhartha on February 14, 2017 at 9:32 pm said:
    
    “Does not agree with my results” is not the same as “fails.”
  - David Marjanović on February 16, 2017 at 4:35 am said:
    
    Wow. That’s an amazingly arrogant and ignorant thing to say.
    
    Everything has its sources of error. But when huge analyses of nuclear and mitochondrial genomes all give you the same result, you can’t just handwave that away. Molecular phylogenetics is understood in a richness of detail that we morphologists can only dream about.
Mickey Mortimer on February 8, 2017 at 4:39 pm said:

Yeah, Marjanovic exactly echoes my points about your analysis’ flaws. Why you continue to think that you’re right but everyone else who has examined your matrix is wrong eludes me.

- David Marjanović on February 8, 2017 at 5:28 pm said:
  
  I think we’re looking at the classical mistake of drastically underestimating the unknown unknowns: the expectation that the knowledge one happens to have is all the knowledge there is, except details. It looks like it simply hasn’t occurred to our host that phylogenetics is a science with a rich literature on methods, with simulation studies, with studies of the behavior of different methods under a range of conditions…
  
davidpeters1954 on February 13, 2017 at 6:52 am said:

I’m very simply running the data and reporting it. If there were problems, they would show up with mismatched sisters and relatives. I encourage you to run your own analyses and see for yourself.

- David Marjanović on February 14, 2017 at 6:23 pm said:
  
  If there were problems, they would show up with mismatched sisters and relatives.
  
  Because you have a magic mismatch detector. Because you already know in advance what the topology ought to look like, or what?
  
  No. Your characters are coded in an incompetent way and scored in an incompetent way. You put garbage in – it is simply inevitable that you get garbage out.
  
  You know, once in a blue moon the LRT could be right on something, just as a statistical fluke. But this wouldn’t mean there’s anything good or trustworthy about the LRT. And that’s why it’s important in science to be right for the right reasons. It’s better, in fact, to be wrong for the right reasons than to be right for the wrong reasons.
  
  Now finally go and look for literature on how to construct characters and how to deal with inapplicable characters. I cited a bunch in the preprint and wrote a few little chapters about these issues; I think you should read them.
  
davidpeters1954 on February 17, 2017 at 12:03 am said:

Then we come from different cultures and mindsets. Do your thing, and I’ll do mine.

- David Marjanović on February 17, 2017 at 5:52 am said:
  
  Please try to understand my mindset. It’s not some kind of hoary tradition, it comes from actually wrestling with issues that you’ve hardly given any thought.
  
davidpeters1954 on February 17, 2017 at 12:09 am said:

re: molecular phylogenetics – paleontologists everywhere are wondering what the heck do the golden mole and elephant have in common in that molecular clade Afrotheria? And where on the archosaur line do turtles fit? Ultimately, someone is going to have to show a gradual accumulation of morphological traits to confirm the DNA results. When that happens I will buy you a beer. At present, I’ll present morphological evidence, and doubt DNA evidence when it extends over large morphological distances. That’s where the evidence now leads. Glad to change when new data shows otherwise.

- David Marjanović on February 17, 2017 at 6:14 am said:
  
  what the heck do the golden mole and elephant have in common in that molecular clade Afrotheria?
  
  What indeed! People have started looking for morphological features they have in common, and the one thing I know of – delayed tooth replacement – actually has a more complex distribution than originally thought (which could have been caricatured as “afrotherians do it later than everyone else”).
  
  Still, there are reasons not to consider the Afrotheria hypothesis completely unexpected from a paleontological point of view.
  
  First, the fact that Afrotheria is named after: golden moles and elephants have their origins in Africa during a time when Africa was isolated.
  
  Second, there is morphological evidence that elephants + sea cows + hyraxes on one side and elephant shrews on the other side are pretty closely related. Several morphological phylogenetic analyses have found such a relationship in the last 10 or so years; and the farther back you go in time, the more similar elephants and elephant shrews (or at least their teeth) become.
  
  Third, the evidence that golden moles or elephants are more closely related to something else has turned out to be very weak indeed. Golden moles used to be lumped with the other “insectivores”, and elephants with the “ungulates”, but both of these assignments are based on a very small number of characters once you account for all the redundancy.
  
  Fourth, golden moles are zalambdodont. Instead of identifiable cusps, their molars have V-shaped crests, and that’s it. There goes most of any morphological data matrix of mammal phylogeny. The golden moles have eliminated most of the morphological evidence about their relationships!
  
  (Of course I suspect that improvements are possible on the fourth point: there is phylogenetic signal in the rest of the mammal skeleton, and I’m quite sure it’s underexplored.)
  
  And then of course the golden moles have next to no fossil record, as usual for terrestrial burrowing animals, so there was lots of time for evolution to overprint all the morphological evidence.
  
  The molecular signal, on the other hand, is really robust. Use more genes or different genes, and Afrotheria just isn’t going away, not in one paper since 1998.
  
  And where on the archosaur line do turtles fit?
  
  Ah, where does anything fit in sauropsid phylogeny? Where do the millerettids fit? Where do all the other “parareptiles” fit? Where does Coelostegus fit? There are already morphological analyses where Eunotosaurus and the turtles together have come out as diapsids. We’re living in interesting times.
  
  And squamates are in some ways even weirder than turtles. Generations of herpetologists have started their studies by learning everything about squamates and have taken squamates as default sauropsids… in some ways they’re very far from that, as it’s slowly turning out. They have very fast molecular evolution, too, and quite a long branch… the tuatara genome is being sequenced, so stay tuned!
  
  When that happens I will buy you a beer.
  
  I don’t drink beer. Beer is disgusting – all beer, any beer. :-)
  
  Glad to change when new data shows otherwise.
  
  Well, most of the changes that would have to be made to your matrix and character list don’t require any new data. They require an understanding of the data we already have. And as long as you don’t have that, you won’t understand the new data either.
  
  Do you have any more comments on my critique of your character list?
  
davidpeters1954 on February 25, 2017 at 8:20 pm said:

Character lists, no matter who creates them, can always be criticized.
Scoring can also be criticized. You have done so yourself with other workers. Taxon lists can also be criticized, but solutions are readily available (either adding pertinent taxa or deleting unrelated taxa) and THAT we discover by expanding the gamut. That’s my job, expanding the gamut.

- David Marjanović on April 12, 2017 at 2:57 am said:
  
  Adding taxa without adding characters eventually exhausts the ability of the character sample to resolve the relationships of these taxa correctly. You are way past that point.
  
  Check out Caudata and Stereospondylomorpha in my preprint. In some analyses their resolution is quite good – and also obviously wrong, which was made possible by not expanding the character sample to deal with the added taxa.
  
  You still seem to believe phylogenetic analysis is magic.
  
davidpeters1954 on May 22, 2017 at 5:30 am said:

We’ve talked about this before. So far, the ability of the character sample to resolve the relationships of these taxa correctly has NOT been exhausted. I’ll let you know when that happens. Your hypothesis and current paradigm have been falsified in the LRT.

- David Marjanović on May 25, 2017 at 2:48 am said:
  
  Correctly? How do you know it’s correct?
  
  How can the LRT falsify anything when it’s so full of misscores and miscodings as I demonstrated in my first comment here?
  
  In fact, have you ever read that whole comment?
  