# Predicting the Future

Only fools and the bankers who created the GFC think the future is absolutely predictable. The rest of us know there is always a degree of uncertainty in any prediction about what may happen at some point in the future. The key question is what degree of uncertainty or, in the project management space, what is the probability of achieving a predetermined time or cost commitment.

There are essentially three ways to deal with this uncertainty:

**Option one** is to hope the project will turn out OK. Unfortunately, hope is not an effective strategy.

**Option two** is to plan effectively, measure actual progress and predict future outcomes using techniques such as Earned Value and Earned Schedule, then proactively manage future performance to correct any deficiencies. Simply updating a CPM schedule is not enough; based on trend analysis, defined changes in performance need to be determined and instigated to bring the project back on track.

**Option three** is to use probabilistic calculations to determine the degree of certainty around any forecast completion date, calculate appropriate contingencies and develop the baseline schedule to ensure the contingencies are preserved. From this baseline, applying the predictive techniques discussed in ‘option two’ plus effective risk management creates the best chance of success. The balance of this post looks at the options for calculating a probabilistic outcome.

The original system developed to assess probability in a schedule was PERT. PERT was developed in 1957 and was based on a number of simplifications that were known to be inaccurate (but were seen as ‘good enough’ for the objectives of the Polaris program). The major problem with PERT is that it only calculates the probability distribution associated with the PERT Critical Path, which inevitably underestimates the uncertainty in the overall schedule. For more on the problems with PERT see **Understanding PERT** [http://www.mosaicprojects.com.au/WhitePapers/WP1087_PERT.pdf]. Fortunately both computing power and the understanding of uncertainty calculations have advanced since the 1950s.

Modern computing allows more effective calculations of uncertainty in schedules; the two primary options are **Monte Carlo** and **Latin Hypercube sampling**. When you run a Monte Carlo simulation or a Latin Hypercube simulation, what you’re trying to achieve is convergence. Convergence is achieved when you reach the point where you could run another ten thousand, or another hundred thousand, iterations and your answer isn’t really going to change. Because of the way it selects its samples, Latin Hypercube reaches convergence more quickly than Monte Carlo. It’s a more advanced, more efficient algorithm for distribution calculations.
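As a rough illustration of what convergence means in practice, the sketch below keeps adding batches of iterations until the 80% confidence (P80) duration stops moving. Everything in it is an assumption made for the example: the three activities and their three-point estimates, the triangular distributions, the batch size and the tolerance. It also treats the activities as a simple chain, which is far simpler than a real schedule network.

```python
import random
import statistics

# Hypothetical three-point estimates (optimistic, most likely, pessimistic), in days,
# for three activities performed one after the other.
ACTIVITIES = [(8, 10, 15), (18, 20, 30), (4, 5, 9)]


def simulate_once(rng):
    """One iteration: sample every activity and add the durations together."""
    # random.triangular takes its arguments in the order (low, high, mode).
    return sum(rng.triangular(low, high, mode) for low, mode, high in ACTIVITIES)


def run_until_converged(batch_size=1000, tolerance=0.05, max_batches=200, seed=1):
    """Add batches of iterations until the P80 duration stops moving.

    Convergence is declared when one more batch changes the P80 estimate
    by less than `tolerance` days.
    """
    rng = random.Random(seed)
    results = []
    previous_p80 = None
    for _ in range(max_batches):
        results.extend(simulate_once(rng) for _ in range(batch_size))
        # quantiles(n=10) returns the nine decile cut points; index 7 is the 80th percentile.
        p80 = statistics.quantiles(results, n=10)[7]
        if previous_p80 is not None and abs(p80 - previous_p80) < tolerance:
            return p80, len(results)
        previous_p80 = p80
    return previous_p80, len(results)


p80, iterations = run_until_converged()
print(f"P80 duration of about {p80:.1f} days after {iterations} iterations")
```

A single quiet batch can occur by chance, so a more cautious test would require several consecutive batches inside the tolerance before stopping.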

Both options are going to come to the same answer eventually, so the choice comes down to familiarity. Old-school risk assessment people are going to have more experience with Monte Carlo, so they might default to that, whereas people new to the discipline are likely to favour a more efficient algorithm. It’s really just a question of which method you are more comfortable with. However, before making a decision, it helps to know a bit about both of these options:

**Monte Carlo**

Stanislaw Ulam first started playing around with the underpinning concepts in 1946. He was convalescing from an illness and played solitaire to pass the time. He wanted some way of figuring out what the probability was that he would finish his solitaire game successfully and tried many different math techniques, but he couldn’t do it. Then he came up with the idea of simply playing the game many times, counting the successful outcomes and using the resulting **probability distribution** [http://www.mosaicprojects.com.au/WhitePapers/WP1037_Probability.pdf] as a method of figuring out the answer.

Soon afterwards, Ulam and other scientists at Los Alamos, many of them veterans of the Manhattan Project, were trying to figure out the likely behaviour of neutrons within a nuclear reaction. He remembered this method and used it to calculate something that they couldn’t figure out any other way. Then they needed a name for it! Ulam had an uncle who used to gamble a lot in Monte Carlo, so his colleague Nicholas Metropolis suggested calling it the Monte Carlo method in honour of the odds and probabilities found in casinos.

Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results; that is, they run a simulation many times over to estimate probabilities empirically, much like actually playing in a real casino and recording your results.

In summary, Monte Carlo sampling uses random or pseudo-random numbers to sample from the probability distribution associated with each activity in a schedule (or cost item in the cost plan). The sampling is entirely random; that is, any given sample value may fall anywhere within the range of the input distribution, and with enough iterations the samples recreate the input distributions for the whole model. However, clustering may become a problem when only a small number of iterations is performed.
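The idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular scheduling tool works: the activity names, the three-point estimates and the choice of a triangular distribution are all assumptions, and the activities are simply added end to end rather than analysed as a network with parallel paths.

```python
import random

# Hypothetical activities in sequence: (optimistic, most likely, pessimistic) days.
ACTIVITIES = {
    "design": (8, 10, 15),
    "build": (18, 20, 30),
    "test": (4, 5, 9),
}


def monte_carlo_durations(iterations, seed=None):
    """Return one simulated project duration per iteration."""
    rng = random.Random(seed)
    durations = []
    for _ in range(iterations):
        # Every draw is independent of all earlier draws: pure random sampling.
        total = sum(
            rng.triangular(low, high, mode)  # argument order is (low, high, mode)
            for low, mode, high in ACTIVITIES.values()
        )
        durations.append(total)
    return durations


def probability_of_meeting(durations, target):
    """Fraction of iterations finishing on or before the target duration."""
    return sum(d <= target for d in durations) / len(durations)


durations = monte_carlo_durations(10_000, seed=42)
# Sum of the most likely durations: the typical deterministic schedule (35 days).
deterministic = sum(mode for _, mode, _ in ACTIVITIES.values())
chance = probability_of_meeting(durations, deterministic)
print(f"P(finish within {deterministic} days) = {chance:.0%}")
```

Because each of these assumed estimates has more room to overrun than to underrun, the chance of hitting the date built from the 'most likely' durations comes out well below 50%, which is exactly the kind of information a contingency calculation needs.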

**Latin Hypercube sampling (LHS)** [i]

LHS is a statistical method for generating a sample of plausible collections of parameter values from a multidimensional distribution. It was described by McKay in 1979; an equivalent technique had been proposed independently by Eglājs in 1977, and the method was further elaborated by Ronald L. Iman and others in 1981.

Latin Hypercube sampling stratifies the input probability distributions into equally probable intervals and takes a random value from each interval of the input distribution. The effect is that each sample (the data used for each simulation) is constrained to match the input distribution very closely. Therefore, for even modest sample sizes, the Latin Hypercube method makes all, or nearly all, of the sample means fall within a small fraction of the standard error. This is usually desirable.
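A minimal sketch of the stratification for a single activity follows. It assumes a triangular distribution with hypothetical estimates of 8, 10 and 15 days, and the function names are invented for the example. The cumulative probability range from 0 to 1 is cut into as many equal intervals as there are iterations, one random point is taken inside each interval, and each point is converted to a duration through the inverse of the cumulative distribution.

```python
import math
import random


def triangular_inverse_cdf(u, low, mode, high):
    """Map a cumulative probability u in [0, 1] to a triangular-distribution value."""
    split = (mode - low) / (high - low)
    if u < split:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1 - u) * (high - low) * (high - mode))


def latin_hypercube_sample(n, low, mode, high, rng):
    """Draw n values, exactly one from each of n equally probable intervals."""
    # Stratify: interval k covers cumulative probabilities from k/n to (k+1)/n.
    probabilities = [(k + rng.random()) / n for k in range(n)]
    # Shuffle so the intervals are used in random order; this matters when
    # several activities are combined iteration by iteration.
    rng.shuffle(probabilities)
    return [triangular_inverse_cdf(u, low, mode, high) for u in probabilities]


rng = random.Random(7)
sample = latin_hypercube_sample(10, low=8, mode=10, high=15, rng=rng)
print(sorted(round(x, 2) for x in sample))
```

With only ten iterations the sample already covers every tenth of the distribution once, whereas ten purely random draws could easily miss the tails or bunch together.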

**Different types of sampling**

The difference between random sampling, Latin Hypercube sampling and orthogonal sampling can be explained as follows:

- The **Monte Carlo** approach uses random sampling; new sample points are generated without taking into account the previously generated sample points. One does thus not necessarily need to know beforehand how many sample points are needed.
- In **Latin Hypercube** sampling one must first decide how many sample points to use and for each sample point remember that it has been used. Fewer iterations are needed to achieve convergence.
- In **Orthogonal sampling**, the sample space is divided into equally probable subspaces. All sample points are then chosen simultaneously making sure that the total ensemble of sample points is a Latin Hypercube sample and that each subspace is sampled with the same density.

In summary, orthogonal sampling ensures that the ensemble of random numbers is a very good representation of the real variability (but is rarely used in project management), LHS ensures that the ensemble of random numbers is representative of the real variability, whereas traditional random sampling is just an ensemble of random numbers without any guarantees.
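The practical difference between the first two approaches can be seen by repeating a small simulation many times and checking how much the answer wanders from run to run. The sketch below is illustrative only: it uses a single hypothetical activity estimated at 18, 20 and 30 days with an assumed triangular distribution, and compares the spread of the sample mean under purely random sampling and under Latin Hypercube sampling.

```python
import math
import random
import statistics

LOW, MODE, HIGH = 18, 20, 30  # hypothetical activity estimate, in days


def inverse_cdf(u):
    """Triangular inverse CDF for the estimate above."""
    split = (MODE - LOW) / (HIGH - LOW)
    if u < split:
        return LOW + math.sqrt(u * (HIGH - LOW) * (MODE - LOW))
    return HIGH - math.sqrt((1 - u) * (HIGH - LOW) * (HIGH - MODE))


def random_mean(n, rng):
    """Mean of n purely random draws (Monte Carlo)."""
    return statistics.fmean([inverse_cdf(rng.random()) for _ in range(n)])


def lhs_mean(n, rng):
    """Mean of n stratified draws, one per equally probable interval (LHS)."""
    return statistics.fmean([inverse_cdf((k + rng.random()) / n) for k in range(n)])


rng = random.Random(2024)
n, repeats = 50, 500
mc_spread = statistics.stdev([random_mean(n, rng) for _ in range(repeats)])
lhs_spread = statistics.stdev([lhs_mean(n, rng) for _ in range(repeats)])
print(f"Spread of the sample mean over {repeats} runs of {n} iterations:")
print(f"  random sampling: {mc_spread:.3f} days")
print(f"  Latin Hypercube: {lhs_spread:.3f} days")
```

The Latin Hypercube runs should agree with each other far more closely than the random runs, which is why fewer iterations are needed before the results settle down.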

**The Results**

Once you have a reliable probability distribution and a management prepared to recognise, and deal with, uncertainty, you are in the best position to effectively manage a project through to a successful conclusion. Conversely, pretending uncertainty does not exist is almost certainly a recipe for failure!

In conclusion, it would also be really nice to see clients start to recognise the simple fact that there are no absolute guarantees about future outcomes. I am really looking forward to seeing the first intelligently prepared tender that asks the organisations submitting a tender to define the probability of them achieving the contract date and the contingency included in their project program to achieve this level of certainty. Any tenderer that says they are 100% certain of achieving the contract date would of course be rejected based on the fact they are either dishonest or incompetent...

[i] In the context of statistical sampling, a square grid containing sample positions is a **Latin Square** if (and only if) there is only one sample in each row and each column. A **Latin Hypercube** is the generalisation of this concept to an arbitrary number of dimensions, whereby each sample is the only one in each axis-aligned hyperplane containing it.