# Projects do not always go exactly as planned.

**Cost and Schedule Risk Analysis**

**Some readers might have noticed that projects do not always go exactly as planned. Actually, they never go exactly as planned. What’s more, they tend to go worse than planned more often than they go better than planned. Why is this?**

Project plans are just forecasts, so the reason projects don’t go exactly to plan is the same as the reason weather forecasts are not always correct. It is because our knowledge of the future is uncertain. Yet mainstream project management systems expect project managers to give single-point estimates of how much each task will cost and how long it will take. From this they produce single-point estimates of what the project will cost and when it will be finished. The only thing certain about these estimates is that they will be proved wrong.

The first attempt to address this issue came with the introduction of PERT in the late 1950s. Unfortunately, PERT works only when exactly one path through the project network has any chance of being critical. Even more unfortunately, it is not easy to determine whether this is the case without using more sophisticated methods such as Monte Carlo simulation.

So, what is Monte Carlo simulation? Like PERT, Monte Carlo starts with the user specifying a range of values for the duration and cost of each task, typically in the form of a probability distribution based on a 3-point estimate. (I will come back later to the perceived difficulty of making these estimates.) Monte Carlo simulation samples from these distributions – that is to say, it generates a value for each duration or cost in a way that is random but reflects the relative probabilities implied by the specified distribution – and then does a critical path method (CPM) calculation, i.e. a forward and backward pass, using the sampled durations. It then stores the CPM results – dates, floats, and costs – in a summary form such as a histogram. It repeats the whole process thousands of times, each time with a different set of random samples, and so builds up a picture of the whole range of possible outcomes and their relative probabilities.
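A minimal sketch of that loop may make it more concrete. Everything specific here is an assumption for illustration: the four-task network, the 3-point estimates, the choice of a triangular distribution, and the helper names `run_trial`, `simulate` and `chance_of_finishing_by`. To stay short it does only the forward pass (so it yields finish dates but not floats) and leaves out costs.

```python
import random

# Hypothetical network: task -> (predecessors, (optimistic, most likely, pessimistic) days).
# Tasks are listed so that every predecessor appears before its successors.
TASKS = {
    "A": ([], (4, 5, 8)),
    "B": (["A"], (8, 10, 15)),
    "C": (["A"], (7, 10, 16)),
    "D": (["B", "C"], (2, 3, 6)),
}

def run_trial(rng):
    """One trial: sample every duration, then do a CPM forward pass."""
    finish = {}
    for name, (preds, (low, mode, high)) in TASKS.items():
        duration = rng.triangular(low, high, mode)
        # A task starts when its latest predecessor finishes (day 0 if it has none).
        start = max((finish[p] for p in preds), default=0.0)
        finish[name] = start + duration
    return max(finish.values())  # project finish, in days from the start

def simulate(trials=10_000, seed=1):
    """Repeat the trial many times and return the sorted project durations."""
    rng = random.Random(seed)
    return sorted(run_trial(rng) for _ in range(trials))

def chance_of_finishing_by(results, deadline):
    """Fraction of trials in which the project finished on or before the deadline."""
    return sum(r <= deadline for r in results) / len(results)

if __name__ == "__main__":
    results = simulate()
    # The most-likely values alone would give 5 + 10 + 3 = 18 days.
    print("chance of finishing within 18 days:", chance_of_finishing_by(results, 18.0))
    print("80th percentile finish (days):", round(results[int(0.8 * len(results))], 1))
```

The sorted list of results plays the role of the histogram: any "what is the chance that..." question becomes a count over the stored trials.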

This enables one to answer questions such as “what is the chance that the project will be finished by December 15th?” or “what is the chance that the project will not cost more than $15 million?” These are intrinsically more sensible and more useful questions than “when will the project finish?” or “how much will the project cost?” because we know that the answers to the latter will not be precise, and yet they give no indication of how imprecise they might be. At best we might expect to have a 50% chance of meeting the projected date and cost, and a 50% chance is hardly good enough odds for the completion of an important project. What’s more, due to a phenomenon called merge bias, which I might go into in a future post, the chance of meeting a single-point estimate of project completion is often much less than 50%.
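As a small foretaste of merge bias, here is a hedged sketch under deliberately simple assumptions: a project that consists of nothing but independent parallel paths, each with a symmetric triangular duration of 8, 10 and 12 days, and a plan that assumes 10 days for every path. The function name and the numbers are invented for illustration.

```python
import random

def chance_of_meeting_plan(paths=2, trials=20_000, seed=7):
    """Estimate the chance that all parallel paths finish by the planned day."""
    rng = random.Random(seed)
    planned = 10.0  # single-point estimate (days) used for every path
    hits = 0
    for _ in range(trials):
        # Each path on its own has a 50% chance of finishing by day 10,
        # but the project finishes only when the slowest path does.
        finish = max(rng.triangular(8.0, 12.0, 10.0) for _ in range(paths))
        hits += finish <= planned
    return hits / trials

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "path(s):", chance_of_meeting_plan(paths=n))
```

With independent paths the chance of meeting the plan is 0.5 raised to the number of paths, so the estimates should come out close to 50%, 25% and 12.5%. Real networks are messier than this, but the direction of the effect is the same.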

Each set of duration samples and the resulting CPM calculation is called a trial. It is important to do a large number of trials to get reliable results. I recommend at least 10,000. This may sound time-consuming, but it does not need to be. Software products vary widely – by a factor of at least 100 – in the speed with which they do these calculations, so a simulation that would take 10 minutes on one product may require an overnight run on another.
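One way to see why the number of trials matters is the standard error of an estimated probability. The sketch below uses the usual binomial formula, sqrt(p(1 - p) / n), for n independent trials; it is offered as supporting arithmetic rather than as the reasoning behind the 10,000 figure, and `standard_error` is a name made up for this example.

```python
import math

def standard_error(p, trials):
    """Standard error of a probability p estimated from independent trials."""
    return math.sqrt(p * (1.0 - p) / trials)

if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        # p = 0.5 is the worst case; the result is printed in percentage points.
        print(n, "trials:", round(100 * standard_error(0.5, n), 2))
```

At 100 trials a reported "50% chance" is uncertain by about five percentage points either way; at 10,000 trials that shrinks to about half a percentage point. Because the error falls only with the square root of the number of trials, each extra digit of precision costs a hundred times as many trials.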

A common objection to doing risk analysis is that project managers (or their subject matter experts) say they cannot provide the three-point estimates required. But if they cannot provide a three-point estimate, how can they possibly provide a single-point estimate? Surely it is easier to estimate a range of values than to provide a single figure? (If one were asked to estimate someone’s weight, would one not be more confident in saying “170 to 200 pounds” than in saying “185 pounds”?)

The more stubborn may further argue that they do not know enough about the task to give optimistic and pessimistic estimates of its duration, but that demonstrates a fundamental misunderstanding of what uncertainty is. Uncertainty is not a property of the task – which will actually take a definite length of time to complete – but a property of our own current knowledge. So, the fact that we do not know enough about the task is *exactly what we are trying to model* with the probability distribution. The less we know about a task, the less valid our single-point estimate would be and the more important it is to give a range. (The meaning of uncertainty is quite a philosophical subject, justifying a post of its own sometime.)

There remains the question of how one chooses a distribution shape, to which I would answer that it does not matter much. Recognizing that there *is* a distribution is the main thing. (It does matter to some degree because some distributions have much thinner tails than others, but this can be largely overcome by giving users the opportunity to frame their optimistic and pessimistic values in terms of percentiles rather than absolute limits.) As Sam Savage says (in his excellent book, “The Flaw of Averages”), in the land of the average the man with the wrong distribution is King.
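To illustrate the percentile idea, here is a sketch that treats the optimistic and pessimistic values as the 10th and 90th percentiles of a normal distribution. Both the 10/90 convention and the normal shape are assumptions chosen for simplicity (a normal can stray below zero, so a real tool would likely use something else), and `normal_from_percentiles` is a hypothetical helper.

```python
from statistics import NormalDist

def normal_from_percentiles(p10, p90):
    """Normal distribution whose 10th and 90th percentiles are the given values."""
    z90 = NormalDist().inv_cdf(0.90)       # standard normal 90th percentile, about 1.28
    mean = (p10 + p90) / 2.0               # a normal is symmetric about its mean
    sigma = (p90 - p10) / (2.0 * z90)
    return NormalDist(mean, sigma)

if __name__ == "__main__":
    # "Optimistic 8 days, pessimistic 14 days", read as percentiles, not limits.
    dist = normal_from_percentiles(8.0, 14.0)
    print("P(duration <= 8): ", round(dist.cdf(8.0), 3))
    print("P(duration <= 14):", round(dist.cdf(14.0), 3))
    print("99th percentile:  ", dist.inv_cdf(0.99))
```

Read this way, a fifth of the probability lies outside the stated range, so the tails are no longer cut off at the estimator's "pessimistic" figure, whatever shape is chosen in between.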

There is a lot more to Monte Carlo than explained above, but this is enough to get you started. And if you do just this, you will be on the path to more realistic project plans. There is just one more point I would like to make, however. I have mentioned cost and schedule several times, and ideally one would do these together in a single simulation because the uncertainties in cost and schedule are related. Some systems separate them, bringing them together only at the end of the process. This loses valuable information about how they are correlated. I hope to talk about this further in future posts, as well as about sensitivity analysis, correlations, and risk drivers.
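A rough sketch of what is lost by separating the two, under invented assumptions: a single activity whose cost is a fixed amount plus a daily rate times its duration, with the duration and the rate both sampled in the same trial. All figures and function names are hypothetical.

```python
import random

def simulate_joint(trials=10_000, seed=3):
    """Sample duration and cost together so each trial keeps them linked."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        duration = rng.triangular(80, 140, 100)            # days
        daily_rate = rng.triangular(9_000, 13_000, 10_000) # dollars per day
        cost = 500_000 + daily_rate * duration             # time-dependent cost
        outcomes.append((duration, cost))
    return outcomes

def chances(outcomes, deadline, budget):
    """Return (chance on time, chance on budget, chance of both)."""
    n = len(outcomes)
    on_time = sum(d <= deadline for d, _ in outcomes) / n
    on_budget = sum(c <= budget for _, c in outcomes) / n
    both = sum(d <= deadline and c <= budget for d, c in outcomes) / n
    return on_time, on_budget, both

if __name__ == "__main__":
    on_time, on_budget, both = chances(simulate_joint(), 105, 1_600_000)
    print("on time:", on_time, " on budget:", on_budget)
    print("both, from the joint simulation:", both)
    print("both, if treated as unrelated:  ", round(on_time * on_budget, 4))
```

Because a late finish drives the cost up in the same trial, the chance of meeting both targets is noticeably higher than the product of the two separate chances. A system that simulates cost and schedule apart and only combines the summaries cannot recover that figure.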