This episode is part of a wider mini-series looking at Estimation in Software Development. In the last couple of episodes, I've looked at a number of methods that fall under the Qualitative approach to software estimation. Qualitative estimation is predominantly based on expert judgement - some might say on subjective thought processes. In this week's episode, I want to move on to discuss some Quantitative estimation approaches. While Qualitative estimation is predominantly based on expert judgement, Quantitative estimation is based on something we can count or calculate - typically statistical analysis of historical data. In this episode, I specifically want to discuss two quantitative techniques - Monte Carlo simulations and Statistical PERT (or SPERT for short).
Or listen at:
Published: Wed, 12 Feb 2025 01:00:00 GMT
Hello and welcome back to the Better ROI from Software Development podcast.
This episode is part of a wider mini-series looking at estimation in software development. I started the mini-series in episode 189 by providing the following guidelines:
Subsequent episodes take a deeper dive into specific aspects of estimation in software development, and while long-term listeners may find a degree of repetition across the series, I want each episode to be understandable in its own right - as much as practical, to be self-contained advice.
In the last couple of episodes, I've looked at a number of methods that fall under the Qualitative approach to software estimation.
Qualitative estimation is predominantly based on expert judgement - some might say on subjective thought processes.
In this week's episode, I want to move on to discuss some Quantitative estimation approaches.
While Qualitative estimation is predominantly based on expert judgement, Quantitative estimation is based on something we can count or calculate - typically statistical analysis of historical data.
In this episode, I specifically want to discuss two quantitative techniques - Monte Carlo simulations and Statistical PERT (or SPERT for short).
But, before I talk about each, let's start with some commonalities.
At a high level, both take data and produce an estimate, or a range of estimates from it.
Take, for example, having historical data for the past two years of our development team. These approaches can be used with that data to produce estimates for future work.
Both techniques anticipate that, by taking the subjective Qualitative element out of the estimation process and being data-based, they will remove the bias that can creep in from subjective judgement and potentially reduce the burden on the delivery team to produce the estimates.
Personally, I don't believe that any of the Quantitative approaches deliver on this fully hands-off dream. There are still elements of Qualitative activities involved, thus room for bias to creep in. However, more on this once I've gone through the two techniques in a bit more detail.
Let's summarise the Monte Carlo simulation.
Monte Carlo simulations help estimate software development timelines and costs by leveraging historical delivery data. This method involves running thousands of simulations of different possible outcomes, based on past project data, to predict a range of future scenarios.
Imagine you're planning a road trip and have data on travel times from past trips. Instead of guessing the travel time, you consider various factors like traffic and weather, running numerous scenarios to see a range of possible travel times.
In a similar way, in theory, Monte Carlo simulations provide a range of possible projected completion dates and costs, helping with more accurate and reliable planning by accounting for possible variations and uncertainties.
At a high level, if you want to run a Monte Carlo simulation, you'll download one of the countless Excel-based examples and run through the following steps:
The key here is the simulations. By using past data, a probability distribution and a random number generator, each simulation will generate a possible outcome. By running many simulations - say, 10,000 - we expect the spread of results to give us a reasonable estimate, including best- and worst-case scenarios.
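To make that more concrete, here is a minimal sketch of how such a simulation might look, assuming we have historical weekly throughput figures and a number of remaining work items. The numbers, and the simple resampling approach, are purely illustrative rather than taken from any particular tool:

```python
# A minimal Monte Carlo sketch: forecast how many weeks 50 remaining
# items might take, by repeatedly resampling past weekly throughput.
# All figures are hypothetical.
import random

weekly_throughput = [4, 6, 3, 5, 7, 2, 5, 6, 4, 5]  # items completed per week (past data)
remaining_items = 50
simulations = 10_000

results = []
for _ in range(simulations):
    weeks, done = 0, 0
    while done < remaining_items:
        done += random.choice(weekly_throughput)  # sample a past week at random
        weeks += 1
    results.append(weeks)

results.sort()
# Percentiles give a range of outcomes rather than a single number.
print("50th percentile:", results[int(0.50 * simulations)], "weeks")
print("85th percentile:", results[int(0.85 * simulations)], "weeks")
print("95th percentile:", results[int(0.95 * simulations)], "weeks")
```

The useful output isn't one figure but the percentiles: "half the simulated futures finished by week X, 95% by week Y" - which is exactly the kind of range the technique is meant to give you.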
Some points to consider when you're looking at Monte Carlo.
It requires some understanding of statistics and probability. I'd argue in most cases, a level of training or learning would be needed to gain valuable estimations.
The accuracy of the simulation is highly dependent on the quality of the input data and the appropriateness of the chosen probability distribution. And there are a number of places where the qualitative - the expert judgement - creeps in, such as choosing the probability distribution, along with judging whether the task to be estimated actually relates to any of the historical data.
At a conceptual level, Statistical PERT (or SPERT) is similar.
It's a tool, generally a spreadsheet or a specialised application, that can be used to estimate timelines and costs by utilising three-point estimates derived from historical data: the optimistic best case, the pessimistic worst case, and the most likely scenario for each task. The technique leverages the historical data to calculate those three points.
Imagine planning a project with a best, worst and most likely outcome for each task. SPERT then combines these estimates to calculate a weighted average, producing a prediction of the project's overall timeline and cost.
To use SPERT, you'd go through the following steps.
By utilising a three-point estimate, you are able to generate a graph of potential outcomes, normally following a bell-shaped curve, where the probability of the estimate being correct is low for the optimistic, highest for the most likely, and low again for the pessimistic. This allows us to pick a point on the graph and say, for example, that there is at least an 80% likelihood of completing by then.
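As an illustration of that weighted average and the "at least 80% likelihood" idea, here is a minimal sketch using the classic PERT formulas and a normal distribution. The optimistic, most likely and pessimistic figures are hypothetical, and Statistical PERT itself ships as an Excel workbook rather than as code:

```python
# A minimal three-point (PERT-style) estimate sketch.
# Inputs are hypothetical durations in days.
from statistics import NormalDist

optimistic, most_likely, pessimistic = 10, 18, 40  # days

# Classic PERT weighted average and standard deviation approximation.
mean = (optimistic + 4 * most_likely + pessimistic) / 6
std_dev = (pessimistic - optimistic) / 6

dist = NormalDist(mu=mean, sigma=std_dev)

print("Weighted average:", round(mean, 1), "days")
# The figure we would quote for roughly an 80% chance of finishing within it.
print("80% confidence point:", round(dist.inv_cdf(0.80), 1), "days")
```

Widening the gap between the optimistic and pessimistic inputs increases the standard deviation, which is what flattens the curve from a narrow bell towards that pancake shape.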
From a communication perspective, the graph is a really good way of representing the uncertainty of the estimate to the wider community. With something that has high certainty, you'd expect a narrower bell curve than something with less certainty - ultimately leading to something that looks like a pancake if there is so little certainty that the pessimistic estimate is far removed from the optimistic one.
Again, similar to Monte Carlo, this requires some understanding of statistics and probability. And I'm certainly not going to try and explain standard deviation on a podcast. It's something that a Google search is much more likely to make understandable.
So again, I'd argue that in most cases, a level of training, learning would be needed to gain valuable estimates.
And again, the accuracy of the simulation is highly dependent on the quality of the input data and the appropriateness of the chosen probability distribution.
And again, there are a number of places where the qualitative - the expert judgement - creeps in.
Both Monte Carlo and SPERT are interesting techniques, and I freely admit to not having any real experience using them. But neither seem likely to give you a silver bullet for software estimation. Both will be calculating estimates, but we have to remember that means they are not reality. Thus, as with any estimation technique, we should not be putting undue expectation on them, regardless of how impressive the maths or statistical modelling that goes into them.
There's probably more danger here with a Quantitative estimate that people will expect the estimate to be precise. Any form of statistical processing is built to identify a trend. It isn't expected to be right 100% of the time, and it certainly won't be. You will have outliers.
Any individual estimate can be wildly out and should be expected to be.
However, over time, we would expect statistical techniques to produce a solid trend.
I do, however, really like the visualisation that can be achieved through SPERT. I'll provide a link in the show notes to the Statistical PERT website, which provides a free Excel download if you're interested in taking a look at how it works.
It should be noted that both approaches need time and effort to understand, set up and refine. Even more if there's a limited understanding of statistics and probabilities within the team. But I'd certainly be interested in hearing if you've had any success in using either of these techniques within your own teams.
In this episode, I wanted to introduce two Quantitative techniques - Monte Carlo simulations and Statistical PERT.
While they are interesting techniques, based on applying statistics and probabilities to historical data, they are not without effort to implement.
Yes, they can seem like the golden path, removing the effort of producing estimates from the team, but they, like anything else, are not a silver bullet.
With both Qualitative and Quantitative approaches, it's easy to see that there are pros and cons to individual practices. And realistically, it feels like you need a blend of various approaches to create truly valuable estimates.
But, this all seems like a lot of work and investment - a lot of training, tools and time to achieve - and to iteratively improve to nudge towards that valuable estimate over time.
As I've said many times in this series, producing estimates costs. Producing valuable estimates costs a lot.
But it doesn't take much of a leap of logic to ask, can artificial intelligence help us with this?
And this will be a subject of exploration in the next episode.
In an ideal world, there would be an AI-powered tool that would just do the work for us. Thus, I'll explore how such a tool could come into being, and probably more importantly, why I doubt it will happen any time soon.
Thank you for taking the time to listen to this podcast. I look forward to speaking to you again next week.