12 September 2010

Applying bioinformatics to the bible

As I think everyone knows (or should know), the gospels of Matthew and Luke used the gospel of Mark as one of their sources, plus some other material. Over at Irreducible Complexity (a great blog), Ian has an analysis of shared material across the synoptic gospels (as they are called), and I've commented on this before. At the time it struck me that a powerful way of analysing this material might be to approach it from the bioinformatic angle, and use the dot-plot technique to compare the source material.

Well, it's been done! John Lee (working at MIT at the time) has compared the gospels of Mark and Luke using this technique, and it makes for mighty interesting reading. Here's part of the skinny (excuse the clumsy screen grab!).
So what you're seeing is Mark on the X axis and Luke on the Y, and the dots indicate regions of high similarity shared between the gospels. See those diagonal lines? They indicate regions of very high similarity, and in fact show where Luke has derived his material virtually entirely from Mark (or ur-Mark or a slightly younger neo-Mark).
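
For anyone who hasn't met a dot-plot before, here is a minimal sketch of the idea in Python. To be clear, this is not John Lee's method; it is a plain word-window dot-plot of the kind used on sequences, and the names `tokenize`, `dot_plot`, `window` and `min_matches` are my own assumptions for illustration. A dot goes at (i, j) whenever the run of words starting at position i in one text matches the run starting at position j in the other.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def dot_plot(text_x, text_y, window=3, min_matches=3):
    """Return (i, j) word positions where a window of text_x matches text_y.

    A dot is recorded when at least `min_matches` of the `window` words
    starting at position i in text_x equal the word at the same offset
    in the window starting at position j in text_y.
    Both parameters are illustrative assumptions, not values from any paper.
    """
    xs, ys = tokenize(text_x), tokenize(text_y)
    dots = []
    for i in range(len(xs) - window + 1):
        for j in range(len(ys) - window + 1):
            matches = sum(a == b for a, b in zip(xs[i:i + window], ys[j:j + window]))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Two made-up sentences (not quotations) that share a run of five words.
    x_text = "and he went out from there and came into his own country"
    y_text = "then he went out from there quickly"
    print(dot_plot(x_text, y_text))  # [(1, 1), (2, 2), (3, 3)]
```

The shared run shows up as consecutive dots along a diagonal, which is exactly the pattern to look for in the figure. This brute-force comparison is quadratic in the length of the texts, so for whole books you would want to index the windows first.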

Now, I reckon you could do this for all the gospels, or even the whole bible. It would be interesting to see what would be thrown up, although I doubt it'll tell us much that we don't already know. For more information on what we do already know (pre dot-plot of course), I would strongly recommend Robin Lane Fox's excellent resource "The Unauthorised Version: Truth and Fiction in the Bible". If you can read that and remain an inerrantist, there is something VERY wrong with your brain. Or you don't really care about the truth.


  1. This is because they are two books telling the story of Jesus's life from different perspectives. Mark talks about the three wise men and others, while Luke is about the three farmers who were in the fields when the "angels" came and saw him. Before you reply to this: I'm an atheist and am in no way trying to defend the bible.

  2. Hi Anonymous; thanks for the comment. Actually, you'll find if you go back and read Mark and Luke that Mark does not have anything in it about Jesus's birth *at all*. You may be confused with Matthew, who has the wise men etc. Neither of them mentions three, and the two nativity stories are completely contradictory.

    But you raise a more serious point - the synoptic gospels are not "different perspectives" at all - indeed "synoptic" means "seen together"! Matthew and Luke derive the bulk of their text from Mark, and the analysis of Lee (and many others) shows very clearly where substantial chunks of the Mark *document* have been plagiarised (OK, "re-used") in the later gospels.

    None of the gospel writers was an eyewitness, and the gospels were each written with a specific audience and a specific theological shopping list in mind. And they were all altered after their initial composition, so it is not at all easy to decide what is original or not. It is even harder to work out whether there is any historical basis to *any* of it (although I think there is).

    One problem is that there is no point in defending "the bible", since in reality it is a fairly ad hoc collection of documents. There is some good stuff in there, and some complete cobblers.

  3. Well, if you can give me a definition of "inerrancy" that everyone can agree on I'll get back to you.

    On the similarities...folk did notice these before bio-informatics. So who copied who? Have all the options been exhausted?

    I'd like to make my name in NT studies by coming up with a Veale hypothesis. We've had Markan Priority and Matthean Priority. Annoyingly some Israeli scholars have touted Lukan priority. There's got to be another way of ordering things - so a la Morton Smith, I propose we invent a secret Gospel.

    This approach fooled the academy once.

  4. Yea, yea; where it's the same it's plagiarism, where it's different it's a contradiction.


  5. Hi boys,
    Some interesting misconceptions there. The best contradictions are of course the ones that occur within the bits of text that one gospel redactor (I hesitate to use the word "author") has pinched from another. As you know, I especially like Matthew's donkey gaffe; I also enjoy the withering fig-tree trick and Jairus's daughter.

    Quite why a god would wish to reveal himself to mankind through such a faulty series of documents is a poser to be sure.

    As for "inerrancy", the whole concept is ludicrous; the bible is jammed with errors, and the attempts by some misguided people to "explain them away" do no justice to the bible itself.

    Anyway, good to have you here, lads :-)

  6. Thanks for the link.

    I'm not sure I understand his figure 1 diagram. The text seems to suggest this is a cosine similarity plot, but I couldn't fathom his method (the text talks about using a constant if they 'aren't derived' - but surely that is what the cosine similarity is supposed to tell?). So I guess his plot puts a dot anywhere where the cosine metric returns greater than some threshold. It should also be possible to do the same thing by word, though, create a heat-map, and obviate the need for the cosine calculation at all.

  7. Ian, you're right; there is no need for fancy stats in this sort of analysis - a simple heat map would suffice, based on some simple similarity metric, and would obviate the need for a threshold (which would be required under a dot model).
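
The heat-map idea from the last two comments can be sketched as follows. This is only an illustration under my own assumptions: each gospel is taken to be already split into passages (how you split them is up to you), similarity is the cosine between simple word-count vectors, and the function names `word_counts`, `cosine_similarity` and `similarity_matrix` are hypothetical. It makes no claim to reproduce the calculation behind Lee's figure 1.

```python
import math
import re
from collections import Counter


def word_counts(passage):
    """Bag-of-words vector for one passage: word -> number of occurrences."""
    return Counter(re.findall(r"[a-z']+", passage.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(passages_x, passages_y):
    """Rows follow passages_y, columns follow passages_x, values lie in [0, 1]."""
    counts_x = [word_counts(p) for p in passages_x]
    counts_y = [word_counts(p) for p in passages_y]
    return [[cosine_similarity(cy, cx) for cx in counts_x] for cy in counts_y]


if __name__ == "__main__":
    # Made-up passages, purely to show the shape of the output.
    x_passages = ["the fig tree withered", "a donkey and a colt"]
    y_passages = ["the fig tree withered away", "they went into the fields"]
    for row in similarity_matrix(x_passages, y_passages):
        print(["%.2f" % value for value in row])
```

Colouring each cell by its value gives the heat map directly, with no cut-off needed; keeping only the cells above some chosen threshold turns the same matrix into a dot-plot.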