Latin Hypercube Sampling (LHS)

Emily Foster
User offline. Last seen 14 weeks 6 days ago. Offline
Joined: 19 Aug 2011
Posts: 625
Groups: None

Here we look at what Latin Hypercube Sampling (LHS) is and whether it offers any benefits over Monte Carlo: ow.ly/ERdLw
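As a rough illustration of the difference (a sketch, not taken from the linked article): plain Monte Carlo draws each sample independently, so a small run can over- or under-sample parts of a distribution, whereas LHS splits the probability range into equal strata and draws exactly once from each. The function names below are hypothetical, and only one input variable is shown; with several inputs, each would get its own independently shuffled set of strata.

```python
import random

def monte_carlo_uniform(n, rng):
    """Plain Monte Carlo: n independent draws from U(0, 1)."""
    return [rng.random() for _ in range(n)]

def latin_hypercube_uniform(n, rng):
    """One-dimensional LHS: split [0, 1) into n equal strata,
    draw one point inside each stratum, then shuffle the order."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)
    return points

def occupied_strata(points, n):
    """Count how many of the n equal-width strata contain at least one point."""
    return len({min(int(p * n), n - 1) for p in points})

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 10
mc = monte_carlo_uniform(n, rng)
lhs = latin_hypercube_uniform(n, rng)
print("Monte Carlo strata hit:", occupied_strata(mc, n), "of", n)
print("LHS strata hit:        ", occupied_strata(lhs, n), "of", n)
```

By construction the LHS run covers every stratum, while the Monte Carlo run usually leaves some empty at this sample size. Each uniform value would then be pushed through the inverse CDF of the chosen duration distribution.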

Replies

Mike Testro
User offline. Last seen 4 days 2 hours ago. Offline
Joined: 14 Dec 2005
Posts: 4398

Hi Emily

Usually my Index finger but sometimes the middle one - right hand.

Never in an Arabic Country all ten three times.

Best regards

Mike Testro

Emily Foster

We just posted the second part of this article here: http://ow.ly/F8Ita

Good discussion, although I'm wondering which finger Mike sticks in the air when he's guesstimating :-)

Dennis Hanks
User offline. Last seen 3 years 34 weeks ago. Offline
Joined: 17 Apr 2007
Posts: 310

Mike:

I assume you are referring to commercial construction. In Oil and Gas, the problem is one of too many non-trackable activities.

That aside, I agree with most of what you present except:

1. The ten-day limitation may be ignored if an effective time tracking system is used. Ten days would be impractical in many large projects.

2. Agreed, though not always possible.

3. No exceptions here. No constraints – period.

4. Would substitute non-resource activities for lags in these examples.

5. No argument.

6. The essence of a resource loaded schedule.

Disagree, risk should not be built into the schedule. The durations should reflect the non-risked hours in the Estimate. Monitor to the estimated values.

A poorly constructed schedule has no value.

"Plan it properly at the begining (sic) and get the correct completion date first time."

Maybe, but what is your contingency? You don’t need one? You are good.

Mike Testro

Hi Everyone

I am fascinated by this discussion.

In my simple mind a construction programme is best put together Bottom Up at Level 4 with:

1. Each task representing 1 trade in 1 location and never more than 10 working days.

2. Only FS Links

3. No Constraints

4. No Lead Lag links except for curing and/or drying out, and then set for calendar days.

5. Correct interfaces between following trades

6. Each task resource modelled from resource hours extracted from the Cost Plan / BoQ - thus completing the "triangle"

When this is prepared by a Planner who knows how to build it then there is no need for any risk analysis because it has already been built in.

In my 25 years as a delay analyst I have yet to see this quality in any contractor's programme

It is usually at level 3 (sometimes even 2) with SS Links and Lead Lags because the scheduler is either too lazy or does not have the skill and experience to do it properly.

With such an imprecise programme then systems such as Monte Carlo will deliver a range of imprecise results.

Plan it properly at the begining and get the correct completion date first time.

Best regards

Mike Testro

Dennis Hanks

Tony:

LHS has lost much of its value given the current processing power of most systems.
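One way to put rough numbers on this point, as a sketch under stated assumptions: estimate the mean of a single triangular duration many times over with each method and compare how much the estimates scatter. The (8, 10, 18) day range, the iteration counts and all function names are made up for illustration. With one input, LHS scatters far less at a given iteration count, but plain Monte Carlo also tightens as iterations increase, which is the processing-power argument. In a full schedule with many activities the gap may be smaller and depends on the network.

```python
import math
import random
import statistics

def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution (maps u in [0, 1] to a duration)."""
    split = (mode - low) / (high - low)
    if u < split:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))

def mean_estimate_mc(n, rng, low, mode, high):
    """Mean duration from n independent random draws."""
    return statistics.fmean(
        [triangular_ppf(rng.random(), low, mode, high) for _ in range(n)]
    )

def mean_estimate_lhs(n, rng, low, mode, high):
    """Mean duration from n draws, one per equal-probability stratum."""
    return statistics.fmean(
        [triangular_ppf((i + rng.random()) / n, low, mode, high) for i in range(n)]
    )

def spread_of_estimates(estimator, n, trials=200, seed=1,
                        low=8.0, mode=10.0, high=18.0):
    """Standard deviation of the estimated mean across repeated runs."""
    rng = random.Random(seed)
    return statistics.pstdev(
        [estimator(n, rng, low, mode, high) for _ in range(trials)]
    )

for n in (100, 1000):
    mc = spread_of_estimates(mean_estimate_mc, n)
    lhs = spread_of_estimates(mean_estimate_lhs, n)
    print(f"{n} iterations: Monte Carlo spread {mc:.4f} days, LHS spread {lhs:.4f} days")
```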

Dennis Hanks

Steve:

Hardly a nasty crack. You brought up the non-sequitur. We’ve had the ‘critical drag’ discussion before. I regard it as irrelevant to this discussion and the execution of effective project control. We can take this off-line if you wish.

Your experience is different from mine. Risk register development seems linked to SRA. I have yet to do one without the other.

The distribution functions model reality. Most ranges have their values normally distributed. The PDFs are modeling devices reflecting ‘natural’ data – unit rates. The use of a PDF other than the triangle is an attempt to account for estimator risk bias (risk averse/accepting).

Using the PRA (Pertmaster) Templated Quick Risk is an easy way to apply distributions to resource-loaded schedules. Simple to develop and employ.

“Again, I agree that there may be value in doing thorough probabilistic scheduling via a Monte Carlo system.” So close.

“I'm just not sure (1) its benefits are worth the effort...” I think they are.

“...or (2) if anyone is doing it thoroughly!” No data, but definitely the exception, but then so is effective project controls. Just because it’s not being done does not mean it shouldn’t be.

Tony Welsh
User offline. Last seen 3 years 18 weeks ago. Offline
Joined: 10 Oct 2011
Posts: 19
Groups: None

Stephen, you say "Would that be using the triangular default or the beta distribution default? Because those two will usually give estimates that are about 10% different."

When I say you would likely be right, I do not mean that there is a single number or date which is "right," but rather that the range of projected outcomes is likely to encompass the actual outcome.  You cannot know exactly when your project will be finished, but if it is important to finish by a particular date, you probably want a 90% chance of doing so rather than just a 50% chance.  Using an average, or using a deterministic result based on unbiased duration estimates, implies that 50% is good enough.  (And in fact due to merge bias the chance of meeting the deterministic date -- or the date produced by PERT -- is often much lower.  IMHO this is the biggest single reason why projects are "late.")
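The merge bias point can be sketched numerically. The example below assumes several independent parallel paths that must all finish before the project does, each with a symmetric triangular duration of 8/10/12 days, so the deterministic date (day 10) is a fair 50/50 estimate for any single path. All names and figures are hypothetical. For k independent paths the chance of all finishing by day 10 is 0.5 to the power k, and the simulation should land close to that.

```python
import random

def simulate_finish_days(n_paths, rng, low=8.0, mode=10.0, high=12.0):
    """The merge point is reached when the slowest parallel path finishes."""
    # Note the argument order of random.triangular: low, high, mode.
    return max(rng.triangular(low, high, mode) for _ in range(n_paths))

def chance_of_meeting(date, n_paths, iterations=20000, seed=7):
    """Fraction of simulated runs that finish on or before the given date."""
    rng = random.Random(seed)
    hits = sum(simulate_finish_days(n_paths, rng) <= date for _ in range(iterations))
    return hits / iterations

def percentile(values, p):
    """Simple empirical percentile (p between 0 and 1)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[index]

deterministic_date = 10.0
for n_paths in (1, 2, 4):
    chance = chance_of_meeting(deterministic_date, n_paths)
    print(f"{n_paths} parallel path(s): chance of finishing by day 10 = {chance:.3f}")

rng = random.Random(7)
finishes = [simulate_finish_days(4, rng) for _ in range(20000)]
print("4 paths, P50:", round(percentile(finishes, 0.5), 2),
      "P90:", round(percentile(finishes, 0.9), 2))
```

The P90 figure is the kind of date one would quote when a 90% chance of finishing on time is wanted rather than a 50% chance.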

Tony Welsh

Mike, you may be enjoying it but frankly I do not understand your last post. Monte Carlo does not create uncertainty; it just recognises that uncertainty exists.

If I were to ask you how long is the Suez canal (and if you did not cheat by googling) you could say "between 50 and 200 miles" and be pretty sure you were right, but if you made a single-point estimate you would almost certainly be wrong. It is in fact _easier_ to make a range estimate than a single-point one, as pointed out by Douglas Hubbard in his excellent book "The Failure of Risk Management."

And as for the data being "guesses," this applies a fortiori to single-point estimates. To quote another excellent book (The Flaw of Averages by Sam Savage) "In the land of averages, the man with the wrong distribution is King."

Does anyone have any comments on LHS?

Stephen Devaux
User offline. Last seen 5 days 18 hours ago. Offline
Joined: 23 Mar 2005
Posts: 624

Hi, Dennis.

"Putting aside the self-promotion,.."

That's a bit of a nasty crack, isn't it? I promote the value of computing critical path drag, which is an attribute of (almost) every critical path activity/constraint. And 97.8316% (and that's not an estimate, BTW -- it's EXACT because I ran it through a Monte Carlo system!) of activities with drag have drag cost, which can be in dollars, euros or human lives. So perhaps it's of zero importance to some people to know how many dollars or lives are dependent on the extra time that could be saved on a schedule -- but I happen to think it's very important. And so I will continue to emphasize it, and post exercises and conduct free webinars on the Internet on how to compute it, and point people to more info about it, and even publicize those software products such as Spider Project that compute it, even though certainly none of these things (except my book, the revenues for which are likely to amount to under $10/hour!) redounds to my credit balance! I assure you, it ain't about the money! If it were, I'd spend my time teaching PMP prep classes!

I share with most others on this site the character flaw that I consider project management efficiency important. And I didn't invent drag -- critical path activities have had drag since the pharaohs. I consider computing it to be simple due diligence, and suggest that it is a disgrace that schedulers and most scheduling software can't be bothered to compute it. Would you buy a software package that doesn't compute total float, Dennis? Well, drag is much more critical because it is critical! So if you want to engage in snide ad hominem attacks, do it to someone who doesn't bother about doing an efficient job of scheduling.

"...and your argument is that SRA is not worth the effort. Improperly done, I agree."

Actually, it's "properly done" that I'm not sure is worth the effort or the cost. (Improperly done, which is the case more than 90% of the time, I know it's not worth the effort!) Again, it requires an accessible database of historical metrics that effort has been made to make accurate and useful. And then it requires a software package that costs money and that computes lags properly (that's important, BTW!). And then it requires the thoughtful assembly of estimates for each activity. That all takes time and costs money.

"But then, so would be the application of critical drag."

No, the computation of critical path drag takes MUCH less time and effort, either manually or for software, in that it requires NO additional input other than what you have to do anyway for your CPM schedule -- the drag calculations are arithmetic outputs, just like total float and free float, only drag calculations are critical!
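To illustrate the "no additional input" point, here is a minimal sketch that computes drag from nothing but durations and finish-to-start links. It assumes drag can be read as the amount by which the project would shorten if the activity's duration were reduced to zero; that reading, the four-activity network and the function names are all assumptions made for this example, and constraints, lags and calendars are ignored.

```python
def project_duration(durations, predecessors):
    """Forward pass over a finish-to-start network; returns the latest early finish."""
    early_finish = {}

    def finish(activity):
        if activity not in early_finish:
            start = max(
                (finish(p) for p in predecessors.get(activity, [])), default=0
            )
            early_finish[activity] = start + durations[activity]
        return early_finish[activity]

    return max(finish(a) for a in durations)

def drag(activity, durations, predecessors):
    """Time the activity adds to the project: baseline duration minus the
    duration recomputed with this activity set to zero."""
    baseline = project_duration(durations, predecessors)
    shortened = {**durations, activity: 0}
    return baseline - project_duration(shortened, predecessors)

# Hypothetical network: A precedes B and C in parallel, which both precede D.
durations = {"A": 5, "B": 10, "C": 7, "D": 4}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

print("Project duration:", project_duration(durations, predecessors))
for name in durations:
    print(name, "drag =", drag(name, durations, predecessors))
```

In this network B is critical with C in parallel carrying 3 days of float, so B's drag comes out as 3 rather than its full 10-day duration, and the non-critical C has no drag.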

"Most of the time consuming effort in SRA is devoted to risk events – what may/may not happen, and mitigation strategy - which is seldom undertaken outside of SRA."

I'm sorry, but I totally disagree with this. Risk management, including identification techniques, impact estimates and mitigation strategies, is a standard part of project planning whether Monte Carlo systems are used or not.

"I’m confining my comments to duration determination via unit rates. Here, analysis and application are relatively straightforward and immediate. The distributions I use (triangle, trigen, and betaPERT) assume a ‘normal’ distribution of values with an optimistic skew for betaPERT. A normal distribution for unit rates is not an unreasonable assumption for data collected over several projects, over time (the Estimate)."

I'm sorry, but if that "normal distribution" is the output of triangular distribution estimates for each activity, there is NO reason to believe it is any more accurate than the (yes, more optimistic) beta distribution estimates, or any other default. If someone wants to try to get accurate input, it really requires selecting among the 30 or so options that most packages offer for each of the several thousand activities. And then you likely WILL get an estimate of about 8% - 12% less than with the triangular default at the 50% confidence level for the project.

(Of course, contractors LIKE to have that extra 8% - 12% extra reserve -- and they use it all, too!) 
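The direction of the gap between the two defaults can be checked for a single activity from the textbook mean formulas: (low + mode + high) / 3 for the triangular and (low + 4 x mode + high) / 6 for the standard beta-PERT. This is only a sketch; the three-point estimates below are invented, individual tools may weight the mode differently, and a per-activity mean is not the same thing as a project-level 50% confidence date.

```python
def triangular_mean(low, mode, high):
    """Mean of a triangular distribution."""
    return (low + mode + high) / 3.0

def beta_pert_mean(low, mode, high):
    """Mean of a beta-PERT distribution with the usual weight of 4 on the mode."""
    return (low + 4.0 * mode + high) / 6.0

def percent_gap(low, mode, high):
    """How much lower the beta-PERT mean is, as a percentage of the triangular mean."""
    tri = triangular_mean(low, mode, high)
    pert = beta_pert_mean(low, mode, high)
    return 100.0 * (tri - pert) / tri

# Hypothetical three-point estimates in days: symmetric, skewed, heavily skewed.
for estimate in [(8, 10, 12), (8, 10, 18), (8, 10, 25)]:
    print(estimate, f"beta-PERT mean is {percent_gap(*estimate):.1f}% lower")
```

A symmetric estimate shows no gap at all; the gap only opens up as the pessimistic tail stretches, so the size of the difference depends heavily on how skewed the inputs are.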

BTW: How does one determine “the riskiness of the schedule”?

I get Mike Testro to wet his finger and hold it up in the air. How do you do it?

"Also, while there may be ‘zero evidence’ that estimates (we are now talking oranges and apples) based on Monte Carlo are more accurate than estimates with a guesstimate reserve/contingency; there is zero evidence that they are not. Neither position can be proven by evidence, there is none available to the public."

True. There is, however, a great deal of evidence of the effort needed, and the extra time and cost thereby caused, in order to do Monte Carlo properly. And it is much more than to not do it at all. It's also much more than to do a half-assed effort at it, which is what the vast majority of people doing it are doing. But if I'm going to spend a lot more time and money, I'd like there to be some evidence it will add value. As well as some evidence that it won't subtract value -- with this deadline-driven world that projects exist in, once there's schedule reserve in there, it can drive behaviors that turn prediction into reality. "Self-fulfilling prophecy", I believe is the term.

Again, I agree that there may be value in doing thorough probabilistic scheduling via a Monte Carlo system. I'm just not sure (1) its benefits are worth the effort or (2) if anyone is doing it thoroughly! 

Fraternally in project management,

Steve the Bajan

Dennis Hanks

Steve:

"The trouble with Monte Carlo systems is: Garbage in, gospel out. And it takes a lot of time and effort to input anything more valuable than garbage. That time and effort would be a lot better spent optimizing the schedule by using critical path drag and drag cost metrics."

Putting aside the self-promotion, your argument is that SRA is not worth the effort. Improperly done, I agree. But then, so would be the application of critical drag. Most of the time consuming effort in SRA is devoted to risk events – what may/may not happen, and mitigation strategy - which is seldom undertaken outside of SRA.

I’m confining my comments to duration determination via unit rates. Here, analysis and application are relatively straightforward and immediate.

The distributions I use (triangle, trigen, and betaPERT) assume a ‘normal’ distribution of values with an optimistic skew for betaPERT. A normal distribution for unit rates is not an unreasonable assumption for data collected over several projects, over time (the Estimate).

BTW: How does one determine “the riskiness of the schedule”?

Also, while there may be ‘zero evidence’ that estimates (we are now talking oranges and apples) based on Monte Carlo are more accurate than estimates with a guesstimate reserve/contingency; there is zero evidence that they are not. Neither position can be proven by evidence, there is none available to the public.

Dennis Hanks

Mike;

To torture your roulette analogy, let’s say analysis of the outcomes revealed less than a random distribution. One number (or group of numbers) was more likely than other numbers. Would you still guess, or play the more likely number?

SRA helps determine the more likely number. It’s a rational exercise, not a scientific one. The variables are not absolute (read are uncertain) – unit rate is uncertain, the distribution is subjective, and the schedule may not be well crafted. Given that, some analysis is better than no analysis.

If the goal is to determine contingency, then my guess will be better than yours. Nothing says they can’t be the same. Yours might be based on years of experience (you watched the wheel), or you got lucky. Given the sums sequestered for contingency, business minds prefer a rational determination.

Stephen Devaux

In general, I agree with Mike on this one. Here is a long discussion in another thread that I had with Rafael Davila et al. on the subject of the shortcomings of Monte Carlo systems back in May.

I do feel that there is potentially value to be had from such systems -- but to get that value, you have to invest a great deal of time and effort. For example, maintaining and using a historical database, carefully selecting the appropriate distribution shape for each of the (perhaps 10,000) activities, replacing any and all lags with activities -- and the vast majority of planners ain't doing any of that! The attitude tends to be that of the high profile consultant who speaks at lots of PMI's Scheduling Community of Practice Symposiums and with whom I had this discussion after his presentation at the 2009 meeting (and which I mention in my book Managing Projects as Investments):

"He talked about the great results he got by using a Monte Carlo system for his estimates. I asked him if he didn't find the effort of choosing a distribution shape for each of the 10,000 activities enormously laborious. He gave me the reply I expected: "Oh, I run it on one of the default distributions." Which one, I asked. "Oh, I think the triangular, but it doesn't make much difference." Not much difference? 8%-12% change in duration estimate is "not much difference"?? He had no idea because he'd never bothered to check out the mathematics. And remember, he is regarded as a scheduling authority! (What is the drag cost of an extra 8% - 12% on one of his project durations, I wonder?)"

Tony also made a couple of comments:

"If you use a deterministic CPM the chances of your being right are close to zero."

That is absolutely true. And people using deterministic estimates tend to understand this and incorporate schedule and cost reserve based on the riskiness of the schedule. There is zero evidence that those estimates that include the reserve are any less accurate than those that use Monte Carlos, but they usually involve a LOT less costly work.

"If you make range estimates of your inputs and get range estimates for your outputs you are very likely to be right."

Would that be using the triangular default or the beta distribution default? Because those two will usually give estimates that are about 10% different. Or do you actually go and check out the historical data and then select the appropriate distribution shape for each and every activity in your schedule? Because if you do the latter, AND you either have a system that deals with volume lags (like Spider Project) OR you replace every lag with an activity, then I agree your duration estimate will probably be a little bit more accurate. But what percentage of planners are doing all that (or even aware that they should be!)? I bet it's less than 1%.

The trouble with Monte Carlo systems is: Garbage in, gospel out. And it takes a lot of time and effort to input anything more valuable than garbage. That time and effort would be a lot better spent optimizing the schedule by using critical path drag and drag cost metrics.

Fraternally in project management,

Steve the Bajan

Mike Testro

Hi Tony

You use the words "very likely" and "Right".

The last is an absolute - the first is an approximation.

So the result may be very close to Right or Wrong or neither.

Depending on your "range estimates of your inputs"

Which is where you place your bets on the roulette wheel.

Which is why the system is called MONTE CARLO.

The truth is in the name.

Again I hold my finger up in favour of the wind of guesswork - balanced by years of experience.

Best regards

Mike Testro

PS - I am starting to enjoy this so keep them coming

Tony Welsh

Mike has things completely backwards. If you use a deterministic CPM the chances of your being right are close to zero. If you make range estimates of your inputs and get range estimates for your outputs you are very likely to be right.

Mike Testro

Hi Dennis

I had a similar discussion on LinkedIn around the word "guessing".

When you model the risk analysis at either High, Medium or Low it is the same as placing your stake at roulette.

The software spins the wheel and drops the ball.

The result is one of an infinite number of permutations - you can't model a computer to play roulette.

The odds of getting the right answer are the same as my uplifted finger.

Best regards

Mike Testro 

Dennis Hanks

Mike;

I assume you see no value in Schedule Risk Assessment (SRA), putting aside the issue at hand – Latin Hypercube v. Monte Carlo – which is moot (irrelevant) given the processing power of most systems.

“This is just computer simulated gambling software with a human setting the odds.”

This is not the case. SRA is recognition that most values in project controls are estimates. As estimates they are imprecise – they have a range of values. SRA applies those values according to the range/risk profile used – triangle, betaPERT, Trigen, or any of a number of different possible distributions.

If we’ve determined our duration via a unit rate, we know the unit rate represents an average of experience, with high, low, and most likely (average) values. Some of us think it is useful to capture this range and reflect it in our project expected finish date. SRA allows us to do this. We don’t set the odds, we model them.
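As a minimal sketch of what capturing the range might look like, assume a duration is derived as quantity x unit rate / crew hours per day, and that the unit rate is sampled from a triangular distribution built from its low, most likely and high values. The quantities, rates and function names below are invented for illustration and do not come from any particular SRA tool.

```python
import random

def simulate_durations(quantity, crew_hours_per_day,
                       rate_low, rate_likely, rate_high,
                       iterations=10000, seed=11):
    """Sample the unit rate (hours per unit) and convert each draw to a duration in days."""
    rng = random.Random(seed)
    durations = []
    for _ in range(iterations):
        # random.triangular takes its arguments in the order low, high, mode.
        unit_rate = rng.triangular(rate_low, rate_high, rate_likely)
        durations.append(quantity * unit_rate / crew_hours_per_day)
    return sorted(durations)

def percentile(sorted_values, p):
    """Simple empirical percentile of an already sorted list (p between 0 and 1)."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]

# Hypothetical activity: 500 units, an 80 hour-per-day crew,
# unit rate between 1.2 and 2.4 hours per unit, most likely 1.5.
durations = simulate_durations(quantity=500, crew_hours_per_day=80,
                               rate_low=1.2, rate_likely=1.5, rate_high=2.4)
deterministic = 500 * 1.5 / 80

print("Deterministic duration (most likely rate):", deterministic, "days")
print("P50:", round(percentile(durations, 0.5), 2), "days")
print("P90:", round(percentile(durations, 0.9), 2), "days")
```

Because this made-up range is skewed towards the high side, the simulated P50 sits later than the single-point duration based on the most likely rate.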

The distribution is largely a guess, but not without some insight. Regardless, SRA is better than ‘sucking a finger’.

Note: Omitted is any discussion of modeling event risk.

Mike Testro

Hi Emily

This is just computer simulated gambling software with a human setting the odds.

When it is possible for a computer to set the odds for its own bet then the system will go into a loop and disappear up its own output.

A consummation devoutly to be wished.

I still advocate the method of sucking a finger and judging the way of the wind.

Best regards

Mike Testro