Can History Be a Science?
(Bayes' Theorem and Mythicism)

[Figure: simple form of Bayes' theorem]

1 - Quantifying History?

The most significant mythicist works of the last decade are arguably Richard Carrier's two volumes on the subject: Proving History and On the Historicity of Jesus. Carrier's most significant contribution (so far) to the field of history in general and to mythicism in particular has been his positing of Bayes' theorem as the preferred methodological basis for analysis and argumentation in determining the tenability of historical claims. The problem is that one look at even the short, simplified form of the theorem (see figure at right) almost invariably brings pained grimaces to the faces of New Testament scholars, and so the novelty of Carrier's approach is lost on those raised on the idea that history and science are apples and oranges, that never the twain shall meet. This may be especially true in New Testament studies. The idea of applying mathematical technique to biblical pericopes in order to calculate and weigh the comparative probabilities of two or more alternative historical conclusions tends to make most New Testament scholars cross-eyed and silent. At best they are humbled into this silence by an honest appraisal of their mathematical acumen, which is usually not as developed as they would wish it to be. At worst, despite this deficit (perhaps because of it), they are prone to be suspicious of what to them must seem like the sneaking-in of an idea they've always found non sequitur, i.e., historical empiricism. This is only natural to those for whom the fence between history and science is a given. By default, they are wary of that which they don't (yet) understand. Carrier may as well have suggested that we divide the square root of -1 by 0, judging by the bluster and/or incoherence of some of the responses that his methodological suggestions inspired in the blogosphere following his last book's release. Math is simply all Greek to them.

This is not to say that New Testament scholars are inherently handicapped, of course. Their deficiencies in mathematics are pretty much par for the course, culturally and academically speaking. It's really just a natural consequence of holding to such a strict separation between history and science. But it's an easily remedied handicap. It's just a bit of atrophy, not an insurmountable disability. People simply get a bit rusty in math when they seldom have to use it. It's understandable. Luckily, it's not rocket surgery that is needed (or even basic differential calculus). Bayesian calculations require only fairly simple arithmetic. The math is the least of it, really. Scholars just need to see and appreciate the relevance and appropriateness of the approach in the first place. Once the false dichotomy is dissolved, they will hopefully allow themselves to see the usefulness of these statistical and empirical analyses, and then adjust their methodologies accordingly. Math and statistics are just more tools for their toolbelt. Not scary at all. But before we can learn any given tool (e.g., Bayes' theorem), we must first recognize and conquer this underlying irrational fear of math.

Can history be a science?

Generally speaking, most people intuitively think of history and science as non-overlapping magisteria (to borrow Gould's phrase). Why? Well, because history just seems so random and chaotic prima facie. Doesn't it? It's reflexive. Our immediate reaction to the notion of studying history scientifically is to insist that human societies and their cultural artifacts are far too complex, too arbitrary, with too many variables changing over time to be studied with the kind of rigor and precision that is ordinarily required of the sciences. In fact, I used to think likewise. I was trained as a chemical engineer, and so although I came to this line in the sand from the other direction, it was with similar reservations that I did so. But a little probing into the history of science shows that the idea is actually not that new at all. We discover that there have been historians and scientists who have approached several of the humanities in a more quantitative way than intuition would suggest. The pioneering work of people like Vito Volterra and Alfred Lotka, who in the early twentieth century advanced some very useful mathematical models, (still) serves as a basis for the analysis of population dynamics. More recent scholars like Jared Diamond, Peter Turchin and Victor Lieberman have sought to find out how these kinds of mathematical models might relate to empirical data, not just in theory, but also more tangibly. Theory is good, but actually corresponding to measurable reality is even better. Empirical testing of theories is crucial if a discipline is to rise to the level of science. When these newer scholars began accumulating and testing data, to their surprise, they discovered, first, that there is a lot of data, and secondly (and more importantly), that there are very strong empirical patterns discernible in the data.
The presence of strong empirical patterns suggests that there may be some general principles (laws of history, if you will) underlying these patterns. Diamond's Guns, Germs, and Steel is a great work of scientific writing that utilizes this approach well, but Turchin's Historical Dynamics and his War and Peace and War take it one step further. Diamond forwards a single hypothesis and then brings a massive amount of data to bear on it in order to support the explanations he proposes, but he doesn't bring quite the whole power of the scientific method to it. His approach is certainly quantitative, but a more thorough approach, such as Turchin's, involves developing alternative, competing hypotheses of how societies and/or nations operate (not just a single hypothesis), and then examining the data (evidence) that allow us to distinguish between the predictions made by these alternative hypotheses. Though individual hypotheses matter, the true essence of scientific experimentation lies in the thoroughness of this kind of comparative analysis. Turchin has coined the term "cliodynamics" to describe this more thorough comparative approach.

Many historical fields, such as history, geology, evolutionary biology, paleontology and astrophysics (all legitimate branches of science), don't allow for manipulative experiments, but they do allow for mensurative experiments, in which the researcher measures the data in its natural state, without experimental manipulation. I should pause momentarily to highlight a common misconception about the meaning of "prediction" in the context of scientific research. Prediction is not necessarily about "the future." In science, it can be about events which have already occurred. Predictions can be about the discovery of particular kinds of evidence that are supportive (or not) of a particular hypothesis. Prediction is more of an intermediate step that we use to test alternative hypotheses. We extract predictions from hypotheses that clash on some aspect under examination, and we try to determine which of them conforms best to reality. Prediction thus need not happen "in the future," as the common parlance of the word implies. What is really meant here is "retrodiction" rather than "prediction." Jargon habits are hard to break, though, even for geeky scientists who pride themselves on their objectivity and the provisionality of their field, and so I doubt the use of the single word "prediction" to denote both things will be fading any time soon.

Prediction in cliodynamics begins by discerning patterns in the data, such as patterns of political instability in a given society which tend to recur with some regularity (say, every couple of centuries or so). Societies go through these long-term oscillations, which can be divided into two phases, an integrative phase and a disintegrative phase. The integrative phase, which lasts about a century (sometimes longer), is characterized by peace and stability, population growth and a generally optimistic outlook. The disintegrative phase is the other side of that coin: political unrest, stagnation, war, a pessimistic ideology, etc. Predictions can be made at several different levels. On one level we may ask whether this same pattern can be found to exist in societies other than the initial one. Because societies are not identical, neither are the patterns. They vary in period and in amplitude. There is of course a difference between quantifying and identifying similarities in patterns, on the one hand, and testing causal hypotheses, on the other. The latter requires that we compare and contrast alternative explanations for why these repeating oscillations, which may be common to many societies, occur.

An example of empirical patterns in history

Let's consider two alternative hypotheses for the causes underlying these cycles. First we have the Malthusian model, which suggests that as a population increases to a point where it starts pressing the limits of its resources, it begins to experience instability and warfare, which in turn decreases the population, at which point the cycle just starts again. In this model, the cycle is driven purely by demographic mechanisms. Another hypothesis, called "demographic structural theory," also has demographic growth as a component, but it assigns more importance to what happens to the other two components of the social system, namely, the elites, i.e. the people with power (think French medieval aristocracy), and the state. In this model, popular misery alone does not produce waves of instability, because peasants in those agrarian societies didn't have much military power, and so peasant rebellions were easily crushed by the state as long as the elites were unified and the state was strong. What tips the balance is called elite reproduction. Eventually there are too many elites, all fighting for a diminishing slice of the same pie, and the state becomes weaker as a result. By the time it is weakened to the point where it has difficulty putting down insurrection and keeping order, the elites have split off into rival factions, and they spark the civil wars that soon follow.

So, here we have two alternative theories. How can we test them? Well, we can test them by looking at the actual sequence of events. When we take a look at what happened in the late Middle Ages in England or France, what do we find? There was very strong population growth during most of the 13th century. By the beginning of the 14th century, a Malthusian situation was developing. Starvation. Popular immiseration. But notice that there was no real breakdown of the state at the time, and there were no civil wars yet. It really took the additional push of the Black Death to get the population to significantly decrease. As a result, the social pyramid was disrupted: because the elites were less affected by the plague than the masses were, the pyramid became top-heavy. There were too many elites for the supportive base underneath them. This is when society tipped and the civil wars finally got started. According to the Malthusian model, once the Black Death had reduced the population, we should have seen the pressure on resources ease and a new cycle of growth begin. But that's not what happened. The population did not begin to grow; it remained stagnant after the plague. Instead, driven by the pressures on the elite, English and French society experienced a streak of civil wars during the next hundred years. The elites and the state had an influence that the Malthusian model does not predict, but which the structuralist model does. This essentially constitutes a quantifiable falsification of the Malthusian hypothesis. The empirical data more closely follow the predictions of the demographic-structuralist hypothesis.

If we apply the same theoretical predictions to another society, say, the transitional Roman period between the republic and the empire, we find a similar pattern. Again, about a hundred years of civil war. The demographic pressure on resources wasn't very strong, but the evidence for elite reproduction is strong. If we continue to test the theory on a variety of other case studies, it turns out that these patterns are common to many of them. In his two-volume Strange Parallels, for example, Victor Lieberman finds many of these patterns in many Southeast Asian societies throughout history.

Because history has a stochastic aspect, quantifying it requires statistical methods. Statistical significance plays a crucial role. In the above example, because there have been thousands of these integrative/disintegrative cycles in the recorded history of human societies to cull from and examine, it is possible to use what is called the "frequentist" approach. There must be enough data (and there is) to allow for the quantifying of event dynamics in this method, so that we can compare one instance of such a cycle with another and draw conclusions based on the frequency of their occurrence (hence the name of the method). It can be applied whenever there are enough instances of a given phenomenon to warrant it.

But what if we are trying to ascertain the parameters of something that has never occurred, though it may well fall within the realm of possibility, such as the accidental detonation of a nuclear bomb, or of some event so rare that we simply have no actual data to examine or to compare it to, such as the birth and development of Christianity (our focus)?

2 - Enter Bayes

In such cases, a methodology regarding conditional probability is available to us that is based on something called Bayes' theorem.

What is Bayes' theorem?

If we want to quantify the probability that two things are both true (i.e. that they both "happened"), which we'll call "A" and "B", that probability is going to equal the probability of A happening multiplied by the probability of B happening. That is:
P(A and B) = P(A) X P(B)

This simple formula holds only if the two events are independent of each other. For example, the probability of rolling a "1" on the first die and a "4" on the second in a single throw of a pair of dice is 1/6 X 1/6, which is 1/36. Therefore, there is a one in thirty-six chance that a single throw of the dice will result in this particular combination of numbers.
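For anyone who would rather check the dice claim by brute force than take my word for it, here is a minimal Python sketch (my own illustration, not anything taken from Carrier). It simply lists all 36 outcomes of two dice and counts. The variable names are just ones I made up for the purpose.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Event A: the first die shows 1. Event B: the second die shows 4.
p_a = Fraction(sum(1 for first, _ in outcomes if first == 1), len(outcomes))
p_b = Fraction(sum(1 for _, second in outcomes if second == 4), len(outcomes))

# Both at once: exactly one outcome, (1, 4), out of 36.
p_a_and_b = Fraction(sum(1 for o in outcomes if o == (1, 4)), len(outcomes))

print(p_a, p_b, p_a_and_b)     # 1/6 1/6 1/36
print(p_a_and_b == p_a * p_b)  # True, because the two dice are independent
```

The last line is the whole point: counting the outcomes directly gives the same answer as multiplying the two separate probabilities, which is what independence means.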

However, we are often going to be interested in events that are relevant to each other and related in some significant way. In such cases, this simple formula is inadequate because it does not reflect the relation between the two events. We can capture that relation by replacing the second term with a conditional one:
P(A and B) = P(A) X P(B|A)

The new term, P(B|A), is read as "the probability of B given A." That is, we are no longer asking how likely B is on its own, but how likely B is once we know that A happened.

An example of this principle:
Say that you don't know much about someone, whether she is married or whether she has children, for example. You could gain partial information. You could learn that she is married. Let's say that that's A. Now, we could ask what the probability is that she is married and that she has kids. That's where the new term P(B|A) comes in, because there is a correlation involved. What are the chances of her having kids given that she is married? That is going to affect the probability. Her chance of having kids is higher given that she's married. It works the other way around as well. That is, given that someone has children, the chances are higher that they are married, so the same joint probability can be expressed symmetrically as P(B) X P(A|B).

We can thus view these as equivalent probabilities:
P(A) X P(B|A) = P(B) X P(A|B)
Because these two expressions are equal, we can solve for one of the terms, in this case P(A|B), using just simple algebra (divide both sides by P(B)), which results in:

P(A|B) = P(A) X P(B|A) / P(B)

Voilà! Behold the simple form of Bayes' theorem. [1]
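To see that the symmetry and the resulting formula really do hold, here is a small Python sketch using the married/children example. I have to stress that the head counts below are completely invented for illustration (a made-up group of 100 people); only the relationships between the probabilities matter.

```python
from fractions import Fraction

# Hypothetical counts for 100 people, invented purely for illustration.
counts = {
    ("married", "kids"): 40,
    ("married", "no kids"): 20,
    ("unmarried", "kids"): 10,
    ("unmarried", "no kids"): 30,
}
total = sum(counts.values())
married = sum(n for (m, _), n in counts.items() if m == "married")
with_kids = sum(n for (_, k), n in counts.items() if k == "kids")
both = counts[("married", "kids")]

p_a = Fraction(married, total)          # P(A): she is married
p_b = Fraction(with_kids, total)        # P(B): she has kids
p_b_given_a = Fraction(both, married)   # P(B|A): kids, given married
p_a_given_b = Fraction(both, with_kids) # P(A|B): married, given kids

# The two symmetric ways of writing the joint probability agree.
print(p_a * p_b_given_a, p_b * p_a_given_b)  # 2/5 2/5

# Bayes' theorem recovers P(A|B) from the other three quantities.
print(p_a * p_b_given_a / p_b)               # 4/5
print(p_a_given_b)                           # 4/5
```

Notice that P(B|A) is 2/3 while P(A|B) is 4/5 in this made-up group. The two conditional probabilities are related by the theorem, but they are not the same number.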

If variables such as "A" and "B" should invoke some mental block regarding math in your brain, there is another way to think about Bayes' theorem, which is called the diachronic interpretation. The expression remains the same, but we simply replace A with "H" (for hypothesis) and replace B with "D" (for data or evidence). This better reflects (for some people, at least) the essence of the theorem, which is that with every bit of new evidence we can update our belief in that hypothesis using the terms on the right side of the formula as follows:

P(H|D) = P(H) X P(D|H) / P(D)

The most important thing to keep in mind regarding this equation is that it is iterative. That is, each time a new piece of evidence arrives, we take our previous level of belief, P(H) (our prior), and update it by multiplying by the ratio of the two "likelihoods," P(D|H) / P(D). The result, P(H|D) (our posterior), then becomes our prior for the next iterative calculation should newer evidence appear. The process continues until all the available evidence has been taken into account.
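The iterative nature of the thing is easier to see in action than to describe. Below is a minimal Python sketch of my own, under two assumptions that I should state plainly: first, that the competing hypotheses (here just "H1" and "H2") are mutually exclusive and cover all the possibilities, so that P(D) can be obtained by summing over them; and second, that the likelihood values are hypothetical numbers chosen only to show the mechanics.

```python
from fractions import Fraction

def update(priors, likelihoods):
    """One Bayesian update. Both arguments are dicts keyed by hypothesis:
    priors[h] is P(h) and likelihoods[h] is P(D|h) for the new evidence D."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    p_data = sum(unnormalized.values())  # P(D), summed over all hypotheses
    return {h: value / p_data for h, value in unnormalized.items()}

# Start out undecided between two hypotheses.
beliefs = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}

# Three pieces of evidence, each with hypothetical likelihoods P(D|H).
evidence = [
    {"H1": Fraction(3, 4), "H2": Fraction(1, 2)},
    {"H1": Fraction(1, 4), "H2": Fraction(1, 2)},
    {"H1": Fraction(3, 4), "H2": Fraction(1, 2)},
]

for likelihoods in evidence:
    # The posterior from the previous round becomes the prior for this one.
    beliefs = update(beliefs, likelihoods)
    print({h: str(p) for h, p in beliefs.items()})
```

With these made-up numbers the belief in H1 goes from 1/2 to 3/5, then down to 3/7 when the second piece of evidence cuts against it, then back up to 9/17. Nothing more than multiplying and dividing is involved at any step.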

An example of how this idea is used to calculate the probability involving two alternatives follows:

The cookie problem

Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each.
Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla. What is the probability that it came from Bowl 1?

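First, let's rewrite the equation to reflect the aim of the example. We want the probability that we chose Bowl 1 (B1) given that the cookie is vanilla (V):

p(B1|V) = p(B1) X p(V|B1) / p(V)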

  • p(B1): This is the probability that we chose Bowl 1, unconditioned by what kind of cookie we got. Since the problem says we chose a bowl at random, we can assume p(B1) = 1/2. 

  • p(V|B1): This is the probability of getting a vanilla cookie from Bowl 1, which is 3/4.
    (i.e. 30 out of 40 cookies are vanilla)

  • p(V): This is the probability of drawing a vanilla cookie from either bowl. Since we had an equal chance of choosing either bowl and the bowls contain the same number of cookies, we had the same chance of choosing any cookie. Between the two bowls there are 50 vanilla and 30 chocolate cookies, so p(V) = 5/8.
    (i.e. 50 out of a total of 80 cookies in both bowls are vanilla)

Let's plug the numbers in:

p(B1|V) = (1/2) X (3/4) / (5/8) = (3/8) / (5/8) = 3/5

Doing the arithmetic, the answer to the problem therefore reduces to 3/5, which is equal to 60%. There is a 60% chance that the vanilla cookie came from Bowl 1. Pretty straightforward, right?
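If you'd like to confirm the cookie arithmetic for yourself, here is a short Python sketch of it (again, my own illustration; the function and variable names are just labels I picked). Rather than pooling all 80 cookies, it computes p(V) by adding up the chance of drawing vanilla by way of each bowl, which is a slightly more general route to the same 5/8.

```python
from fractions import Fraction

# Contents of the two bowls, as given in the problem.
bowls = {
    "Bowl 1": {"vanilla": 30, "chocolate": 10},
    "Bowl 2": {"vanilla": 20, "chocolate": 20},
}

# A bowl is chosen at random, so each has prior probability 1/2.
prior = {name: Fraction(1, len(bowls)) for name in bowls}

def likelihood(bowl, flavor):
    """p(flavor | bowl): the share of that flavor among the bowl's cookies."""
    cookies = bowls[bowl]
    return Fraction(cookies[flavor], sum(cookies.values()))

# p(V): chance of vanilla via Bowl 1 plus chance of vanilla via Bowl 2.
p_vanilla = sum(prior[b] * likelihood(b, "vanilla") for b in bowls)

# Bayes' theorem: p(B1|V) = p(B1) X p(V|B1) / p(V)
posterior_bowl1 = prior["Bowl 1"] * likelihood("Bowl 1", "vanilla") / p_vanilla

print(p_vanilla)        # 5/8
print(posterior_bowl1)  # 3/5
```

Change the counts in the bowls and rerun it, and you can watch how the answer shifts. That is really all a Bayesian calculation is.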

In fact, even when we are not aware that this is the kind of thing we are calculating, this is the kind of thing we are doing whenever we are reasoning correctly in deciding which alternative is the more probable out of two (or more) different ones.

How does this apply to mythicism?

Having thus demonstrated the basic use of Bayes' rule, we return to Richard Carrier's work again. Carrier noticed that whenever people dispute the strength of a given historicist or mythicist argument, the dispute boils down to a difference of opinion regarding either the value of the prior probability or else of the probability given some evidence. He proposes that Bayes' theorem removes any arbitrariness involved in the methodology used by anyone hoping to defend either historicity or mythicism by, first, providing a way to quantify the variables, and second, by providing a way to calculate the posterior from these values once they can be agreed upon. This is the basic thrust of the thesis of the first volume of his work on this subject, Proving History.

In the second volume, he first narrows both the mythicist case and the historicist case to what he terms "minimal" cases, meaning the easiest versions of both positions that can be argued. In essence this introduces the concept of parsimony, a kind of Occam's razor, so that no extraneous argument will bog down the issue. What's more, in order to ensure that he cannot be accused of stacking the deck in his favor (as a mythicist, that is), he even goes as far as granting to minimal historicity the greatest latitude possible in his calculations. That is, at every step, he intentionally argues against mythicism as far as his method will allow, at times granting things that are not realistically warranted by any stretch of the imagination, so that it can be fairly said that the probability of historicity must be lower than the intentionally generous estimate given by this kind of 'devil's advocate' a fortiori calculation. With each respective piece of evidence, he first calculates this a fortiori probability, and then, after that is out of the way, he also calculates where he really thinks the probability actually lies. This more realistic estimate is considerably closer to a mythicist conclusion than the a fortiori one, obviously. Doubling up on his calculations in this way precludes any potential accusation that he might be weighing the evidence in his favor, and it also shows a willingness on his part to be fair with historicists while also being honest to his own intellect and method. In addition, it serves to provide a defensible upper and lower limit to the range of probability for any given datum. It seems to me a very clever and useful technique.
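To give a feel for how such a two-track calculation produces a range, here is a rough Python sketch. I want to be very clear that the input numbers are hypothetical values I made up for the demonstration; they are NOT Carrier's figures, and the sketch handles only a single piece of evidence. It uses the form of the theorem for a hypothesis h and its negation, where the denominator is spelled out as the sum over both.

```python
from fractions import Fraction

def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(h|e) for a hypothesis h versus its negation.
    likelihood_h is P(e|h); likelihood_not_h is P(e|not h)."""
    numerator = prior_h * likelihood_h
    return numerator / (numerator + (1 - prior_h) * likelihood_not_h)

# Hypothetical inputs only; these are NOT Carrier's figures.
# "A fortiori": every input is tilted as far toward h as one could defend.
a_fortiori = posterior(Fraction(1, 2), Fraction(4, 5), Fraction(1, 2))

# "Realistic": the inputs the analyst actually believes.
realistic = posterior(Fraction(1, 4), Fraction(1, 2), Fraction(3, 4))

print(a_fortiori)  # 8/13
print(realistic)   # 2/11
```

The two results bracket the range: if even the most generous inputs yield no more than the upper figure, then anyone who wants a higher probability has to argue for different inputs, which is exactly where the debate ought to take place.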

I will close this essay by quoting Carrier himself on the challenge that this quantitative approach poses for the problem of the historicity of Jesus. The most important novelty, the most important function of his work, in my opinion, comes after he has laid out his thesis in On the Historicity of Jesus. At the very end of the last chapter of this work, Carrier directs a sober, clear, and very direct challenge at the peanut gallery of complacent "experts" who might find the whole idea an unsophisticated laughing matter:

[...] if readers object even to employing Bayes's Theorem in this case (or in any), then I ask them to propose alternative models for structuring the debate. If, instead, readers accept my Bayesian approach, but object to my method of assigning prior probabilities, then I ask them to argue for an alternative method of assigning prior probabilities (e.g. if my choice of reference class is faulty, then I ask you to argue why it is, and to argue for an alternative). On the other hand, if readers accept my method of assigning prior probabilities, but object to my estimates of consequent probability, then I ask them to argue for alternative consequent probabilities - not just assert some, but actually argue for them. Because the mythicist case hinges on the claim that these things cannot reasonably be done. It is time that claim was properly put to the test. And finally, of course, if readers object to my categories and subcategories of evidence or believe there are others that should be included or distinguished, then I ask them to argue the case.

I know many devout Christian scholars will balk and claim to find all manner of bogus or irrelevant or insignificant holes or flaws in my arguments, but they would do that anyway. Witness what many Christian scholars come up with just to reject evolution, or to defend the literal miraculous resurrection of Jesus (which they claim they can do even with the terrible and paltry evidence we have). Consequently, I don't care anymore what Christian apologists think. They are not rational people. I only want to know what rational scholars think. I want to see a helpful critique of this book by objective, qualified experts who could live with the conclusion that Jesus didn't exist, but just don't think the case can be made, or made well enough to credit. And what I want from my critics is not useless hole punching but an alternative proposal: if my method is invalid, then what method is the correct one for resolving questions of historicity? And if you know of none, how can you justify any claim to historicity for any person, if you don't even know how such a claim can be justified or falsified at all? Also correct any facts I get wrong, point out what I missed, and if my method then produces a different conclusion when those emendations are included, we will have progress. Even if the conclusion is the same, it will nevertheless have been improved. But it is the method I want my fellow historians to correct, replace or perfect above all else. We can't simply rely on intuition or gut instinct when deciding what really did happen or who really did exist, since that simply leans on unexamined assumptions and relies on impressions and instincts that are often not reliable guides to the truth. We need to make explicit why we believe what we do rather than something else, and we need this as much in history as in any other field. And by the method I have deployed here, I have confirmed our intuitions in the study of Jesus are wrong. He did not exist. I have made my case. 
To all objective and qualified scholars, I appeal to you all as a community: the ball is now in your court.





1 - I won't go into detail on the longer form of the theorem here, a form which more accurately reflects the fact that these two expressions are not in fact congruent. However, the simple form suffices for the introductory aspect of this essay, and is more than adequate for my purpose here. It should also be noted that although I derived this formula using algebra, the actual Bayesian calculations don't require any algebra at all. It's just a matter of plugging in numbers in the appropriate places and either multiplying or dividing. It's just simple arithmetic.



  1. McCullagh's denouncement of BT from the 80's, I have to admit, is rather surprising for how weak it is. In the first place he says that "nobody's doing it," which is totally irrelevant. In the second, his contention that it's "too difficult to get ahold of the inputs" is an invented problem. Trying to make the most out of inadequate documentation is the fundamental problem of history, is it not? More surprising is that I still find bloggers quoting McCullagh's arguments as though we should expect the practice of history to stagnate, and McCullagh ought to be the last word in perpetuity.

    Contrary to McCullagh, it isn't that hard to apply BT, especially once you've gotten over the initial hurdle of grasping the meaning of the expression and getting yourself set up in Excel. After that, it's relatively easy to cut and paste your setup to compare and contrast any number of input scenarios you want. Carrier's suggestion of computing a min-max envelope to trap the true number is the same technique used by financial analysts in forecasting cashflows.

    One thing that BT expresses in its division operation is the relationship and reasoning for why extraordinary claims must necessarily require extraordinary evidence to overcome their prior improbability. Apologists routinely disregard this fact, arguing instead that the flimsiest of evidence, which they would reject in any other sphere of life, is sufficient to overcome the most extravagantly unlikely claims that can be proposed. If you're not allowed to violate the rigors of the formula, then these kinds of mistakes cannot be gotten away with.

    One other thing that using BT does is keep you honest in terms of your inputs, because it makes it obvious when you're making a simple but catastrophic error such as having all your priors sum to something other than 1, which is something I've caught William Lane Craig doing because I was using BT! Using ABE or criteria, it's easy to, in fact, wind up using estimates of P(h|b)=~0%, (Now with respect to the background knowledge alone, the supernaturalist may agree with the naturalist that the resurrection hypothesis has virtually zero plausibility...) AND P(¬h|b)=~0% (...but by the same token, the hypotheses that the disciples stole the body or that Jesus was taken down from the cross alive, and so forth, also have zero plausibility...)! I single out Craig, but these are exactly the sort of errors apologists of all stripes make routinely.

    If, as Carrier suggests, history were regularly done using BT, and the inputs and the calculations were all made explicit, then it would be possible to haggle over the inputs, not just outputs, which, using other methodologies, simply emerge out of a black box that could contain almost any machinery. I see that as a huge flaw in the current practice of history. Combine dogmatism and ego with methodologies that simply aren't very methodical, and the academic will never allow himself to be shown the error of his ways.

    That being said, secular historians might have a range of opinions about using BT initially, but once prestige and credibility begin to be associated with research done using it, I am sure it would catch on like wildfire. Apologists, OTOH, I can imagine would be lockstep against it, because they will be completely unable to function without the wiggle room afforded them by the old "methodologies." I can only say I would welcome BT as de rigueur in the field of history, as it would allow consensus to be reached on multiple issues, not just those touching on religion, much more easily than is presently the case.


  2. I don't have any objection to using Bayes' theorem per se but I can't work out how you could get valid probabilities to use to determine historicity.

    Regarding "can history be a science?" History done properly is a scientific process. The problem is too many people don't understand what science is. I did a Philosophy of Science course many years ago. The textbook was titled, "What is this thing called Science?" so it seems there is a lot of disagreement.

    But for me it is all about process.

    You postulate your hypothesis.

    You then make a prediction about the real world using your hypothesis.

    You then go and look at the real world and see if your prediction is correct.

    If the prediction is correct, then your hypothesis might be correct.

    If it is wrong, then the hypothesis is falsified.

    So what we need is to state clearly our hypothesis and a testable prediction.

    I am only a beginner when it comes to mythicism but it seems to me to be a difficult problem. Large parts of the New Testament could be complete fiction or misrepresentations, but there could be large parts that refer to events that actually happened.

    Was there a real Jesus? If there wasn't a real Jesus, what might we expect to see in the Bible and out in the non-biblical evidence?

    Not sure yet.



    1. I recommend the first volume of Carrier's .... Proving History ...
      it deals with the questions you ask in great detail.

  3. Will have a look.



