Six questions to ask when buying forecasting solutions
So, you have decided that you need to improve sales forecasting in your organization. Good idea! A quick foray into the world of forecasting shows you that there is no shortage of ways to go. Consulting services, freelance data scientists, software platforms, and analytics shops abound.
How do you choose a good solution? Here are six questions to ask any vendor to avoid the most common pitfalls, along with a quick guide to the right and wrong answers. The questions to ask of your potential forecasting provider:
1. How does it work?
“Can’t tell you, it’s a secret sauce.”
“Here is all the detail you want.”
If they are not sharing the “secret sauce,” their sauce is bad.
Often when I share Concentric Market with someone, they say something to the effect of: “I know you can’t tell me the ‘secret sauce’ of how it works.” I am always amazed. How have business users gotten used to providers who do not share the details of their craft? What power must data scientists hold to sell something that no one is allowed to question? And how would their models get better, insulated from like-minded practitioners?
I am convinced that sharing does not weaken competitive advantage. A peer review and user critique only make the model better. I know I am not alone. Take Nate Silver and his team’s analysis on fivethirtyeight.com. For years, Nate has explained in detail how his forecasting algorithms work. His competitors like The Upshot have similar models, with their own differences. In the competition among forecasting methodologies, all users benefit from better analysis. This is good for the practice of forecasting.
When someone tells me that they will not divulge the “secret sauce,” I think that one of the following is likely true (none of them good):
a) they do not know how to explain it,
b) they know how to explain it but know it will not stand up to scrutiny, or
c) they are afraid you can do it on your own.
The first and second are reasons not to choose such a provider, but the third is even worse. Imagine a plumber comes to your home and you ask how they would fix the leaking pipe. If they say they won’t tell you because they are afraid you may do it on your own, would you hire them? Whatever they explain, I will be glad they are there to do it; it is unlikely I would want to roll up my sleeves and start taking the bathroom tiles off. And if the problem is so simple that I can do it on my own, shouldn’t a good provider just tell me? I will only trust them more and call them next time the garbage disposal breaks.
So, ask for the secret sauce and pay attention to the answer. It may tell you a lot more than how good their forecasting skills are. It may tell you how good of a business partner you are getting.
2. What does a forecast look like?
“It’s a number.”
“It’s a range.”
You most likely know that getting a sample of the end result or a demo of the reporting capability is a must. When you look at the reports, keep an eye out for results shown as ranges or distributions. The reports should also use phrases like “there is an x chance that” or “the probability of that happening is y.” Those are all signs of a good forecasting system.
Keep in mind the distinction between predictions and forecasts. “It will rain tomorrow” is a prediction. “There is a 70% chance of rain tomorrow” is a forecast. No one can predict the future with certainty, whether about the weather, or about your organization’s business questions. Statisticians, scientists, analysts, and modelers are all in the business of forecasting, although sometimes they forget. Predictions are best left for fortune-tellers.
What does this mean for your evaluation of forecasting solutions? If a report shows that your sales will be 142,233, be wary. The sign of a good forecasting system is one that formulates results as ranges or, even better, distributions of outcomes.
Distributions inform what is likely to happen and what risk is associated with it. See below for an example. Each blue dot on the chart on the left is one possible outcome. Let’s say that we are forecasting sales. Of the 44 forecasts, 11 show 85,000 sales. This tells us that there is a 25% chance that sales will be 85,000. We do not forecast more than 100,000 sales or fewer than 60,000, so those are the maximum and minimum sales we forecast. The distribution on the left is often summarized as a “box-and-whisker” chart (shown on the right). These charts show the minimum, maximum, and median of the forecasted distribution. The blue rectangle shows the range of outcomes that occur 50% of the time in our forecast.
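If you have the raw forecast samples, producing this kind of summary takes only a few lines. Here is a quick Python sketch; the 44 sample values are invented to mirror the example above:

```python
import statistics

# Hypothetical forecast: 44 simulated outcomes (units of sales), constructed
# so that exactly 11 of them equal 85,000, mirroring the example above.
forecasts = [85_000] * 11 + [
    60_000, 61_000, 63_000, 65_000, 67_000, 69_000, 70_000, 71_000, 72_000,
    73_000, 75_000, 76_000, 78_000, 79_000, 80_000, 82_000, 83_000, 84_000,
    86_000, 87_000, 88_000, 89_000, 90_000, 91_000, 92_000, 93_000, 94_000,
    95_000, 96_000, 97_000, 98_000, 99_000, 100_000,
]

n = len(forecasts)                                # 44 forecasts in total
p_85k = forecasts.count(85_000) / n               # share of outcomes at 85,000
low, high = min(forecasts), max(forecasts)        # the whiskers of the box plot
med = statistics.median(forecasts)
q1, _, q3 = statistics.quantiles(forecasts, n=4)  # box edges: middle 50% of outcomes

print(f"P(sales = 85,000) = {p_85k:.0%}")         # 11 / 44 = 25%
print(f"range: {low:,} to {high:,}, median: {med:,.0f}")
print(f"box (50% of outcomes): {q1:,.0f} to {q3:,.0f}")
```

The same handful of statistics (probability of an outcome, minimum, maximum, median, middle 50%) is exactly what a good forecasting report should surface.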
Without ranges and probabilities, you benefit only partly from forecasting. The range of outcomes is useful information on its own because it shows how much risk there is. Narrower ranges mean greater certainty in forecasts; wider ranges mean greater risk.
3. How do you validate the forecast?
“We fit to past data.”
Here is an easy rule of thumb: If the analysis does not have a cross-validation test, it is not predictive.
Every analytical provider has a measure of model fit. It is most likely a comparison between historical time-series data (what we call the “training set”) and the model’s time-series forecasts. So far so good. But what all modelers know, and many business users may not, is that it is easy to minimize the error of a model. In a regression-based model, for example, adding more variables, even random ones, decreases the training error. In machine learning approaches, letting the algorithm run longer minimizes the training error.
The problem with this is that a low error in the training set tells us nothing about the predictive power of the model. It does not show how good the model is at forecasting future, unknown periods. In fact, most modelers are aware that when the error is too low in the training set, the predictive power of the model is most likely low.
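To see how misleading training error can be, consider this toy Python sketch (all data invented): a model that simply memorizes its training weeks has a training error of zero, yet that number says nothing about how it will do on weeks it has never seen.

```python
import random

random.seed(7)
# Invented data: 104 weeks of sales, a flat level of 100 plus noise.
sales = [100 + random.gauss(0, 10) for _ in range(104)]
train, future = sales[:52], sales[52:]

def mae(actual, predicted):
    """Mean absolute error between two equal-length series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Model A: a sensible, simple model that forecasts the training mean everywhere.
level = sum(train) / len(train)
a_train_error = mae(train, [level] * 52)
a_future_error = mae(future, [level] * 52)

# Model B: an "overfit" model that memorizes the training weeks exactly and
# replays them as its forecast. Its training error is zero, by construction.
b_train_error = mae(train, train)
b_future_error = mae(future, train)

print("training MAE:", round(a_train_error, 1), round(b_train_error, 1))
print("future MAE:  ", round(a_future_error, 1), round(b_future_error, 1))
```

Model B “wins” on training error every time, yet it carries no signal about the future at all, which is precisely why a low training error proves nothing.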
The wrong answer to the question “how do you validate a model?” is: “we calibrate/train/fit to past data.”
The right answer is that the process entails “cross-validation”: a test of predictive power against data that the model cannot “see.” A hold-out test is the simplest and most common form of cross-validation.
Hold-out tests are named that because some amount of the historical data has been… wait for it!… held out. For example, if you have 104 weeks of data, you could split it into 52 weeks for training (or calibrating) the model and 52 weeks for the hold-out. You would use the calibration period to find the parameters that match the model to the real-world data, then let the model forecast what happens in the following 52 weeks. From the model’s perspective, this is the future.
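Here is what that split looks like in a few lines of Python, with invented data and a deliberately simple straight-line model standing in for a real forecasting method:

```python
import random

random.seed(1)
# Invented data: 104 weeks of sales with a gentle upward trend plus noise.
weeks = list(range(104))
sales = [500 + 2.0 * w + random.gauss(0, 20) for w in weeks]

# Calibration period: the first 52 weeks. Hold-out period: the last 52 weeks.
cal_w, cal_y = weeks[:52], sales[:52]
hold_w, hold_y = weeks[52:], sales[52:]

# Calibrate a straight-line model on the calibration period only
# (closed-form least squares for slope and intercept).
n = len(cal_w)
mean_w = sum(cal_w) / n
mean_y = sum(cal_y) / n
slope = (sum((w - mean_w) * (y - mean_y) for w, y in zip(cal_w, cal_y))
         / sum((w - mean_w) ** 2 for w in cal_w))
intercept = mean_y - slope * mean_w

# Forecast the hold-out weeks; from the model's perspective, this is the future.
forecast = [intercept + slope * w for w in hold_w]
holdout_mae = sum(abs(a - f) for a, f in zip(hold_y, forecast)) / len(hold_y)
print(f"calibrated slope: {slope:.2f}, hold-out MAE: {holdout_mae:.1f}")
```

The hold-out error, not the calibration error, is the number that speaks to predictive power.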
A properly executed hold-out test gets only one chance: a good modeling process does not allow re-calibrating to match the hold-out forecast better, as this would defeat the purpose. Hold-out forecasts are also useful in constraining what the modeler can do. Knowing that a test is coming, the modeler cannot take shortcuts like adding dummy variables or “torturing the data until it confesses.”
If the forecasting provider does not test their analytics with cross-validation of some sort, beware. You may be getting a model that looks great with a low error when compared to past data but is, in fact, useless in forecasting.
4. How will the forecasting process improve over time?
“It is as good as it can be already.”
“Learning is at the core of the process.”
Forecasting methodologies grow and improve through repeated testing. The most useful and most finite resource for a model is new observations of the world. You may think that in the age of Big Data, data is abundant. That may be true for the unstructured streams of Twitter, Facebook, and Snapchat. For a business user, good, testable, well-structured, forecasting-ready data is hard to come by. Every month that passes without new information used to test an existing model is a wasted month. Every new quarterly report for a sales forecast, every new week’s worth of website visits for website analytics, every new product launch for a volumetric forecast is an invaluable learning moment.
Many modelers do not go back and compare their forecasts with what actually occurred. This omission deprives the model and its user of the potential to learn. The problem is most acute for consultancies and data scientists for hire, often through no fault of their own: they rarely get paid to refine their models. In this case, it is the business users who miss that the long-term insurance of a good model is more valuable than an immediate answer.
Let me share an example of refining a model over time. For our purposes, let’s think of our forecasting model as composed of assumptions and other variables. Assumptions are things about the future we do not know but will find out in time. Our forecast may assume that interest rates will be x, market demand y, and competitive actions z. Once time has moved beyond the forecast period, we can go back to our original model and replace our assumptions with the actual values of x, y, and z. We now have two forecasts: the original and one with corrected assumptions. Comparing the two against what actually happened tells us how much of the model error is due to the assumptions we chose. Over time, we can track the error coming from assumptions and develop a rule of thumb about the theoretical floor of our model error: the minimum, unavoidable error we are likely to have due to the uncertainty of the future.
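This bookkeeping can be sketched in a few lines of Python. The linear model and every number below are invented purely for illustration (x is an interest rate in percent, y market demand, z a competitive index):

```python
# A made-up model for illustration: sales depend linearly on three assumptions,
# interest rate x (in percent), market demand y, and a competitive index z.
def forecast_sales(x, y, z):
    return 1_000 - 30 * x + 0.5 * y - 40 * z

assumed = dict(x=5, y=2_000, z=10)   # what we assumed at forecast time
actual = dict(x=7, y=1_900, z=12)    # what actually happened, observed later
actual_sales = 1_300                 # the outcome we were trying to forecast

original = forecast_sales(**assumed)          # the forecast as published
corrected = forecast_sales(**actual)          # same model, true inputs

total_error = abs(actual_sales - original)    # everything we got wrong
model_error = abs(actual_sales - corrected)   # error from the model itself
assumption_error = total_error - model_error  # error from bad assumptions

print(original, corrected, total_error, model_error, assumption_error)
```

In this made-up case, 110 of the 150 units of forecast error came from the assumptions, not the model, which is exactly the kind of decomposition worth tracking over time.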
So, pay attention when evaluating new forecasting hires or vendors. If they are not talking about learning, there most likely won’t be any.
5. How do you ensure quality?
“We don’t make mistakes, we are experts.”
“Here is our documented process.”
Here is another rule of thumb: If there is no documented quality assurance process, don’t expect any quality. (This rule applies to most domains of human endeavor.)
This is where most forecasting solutions and providers fail. To uncover the truth, catch them when they are not expecting it, not over email or in an RFP, and ask: “What is your quality assurance process?”
The terrible secret of business analytics is that often there is either no quality assurance or it is inadequate. Consultancies put too few analysts on a project. Data scientists like to work on their own, and there may be no one around to review what they are doing. Data analysts can make mistakes handling the information. Software solutions are almost never free of bugs.
Unless there is a codified quality assurance process that your solution provider can pull up at a moment’s notice, the results you are getting most likely contain errors. The best way to achieve high quality is to have a dedicated team testing day in, day out; testing a week before the report or the software release won’t do. And the quality assurance team has to be separate and independent from the team building the models or the software. Analysts often cannot see their own mistakes, and coders often cannot find their own bugs.
The right answer you are looking for is that your provider has a quality assurance process that they know by heart and keep easily accessible.
6. How do we answer new questions?
“You pay more.”
“It is very easy.”
At first, business users looking for forecasting capability may think that the benefit is in the ability to answer questions. But after using a good forecasting system, they often learn that the greater benefit is different: having a way to answer new questions, fast. The key to forecasting is not to know more, but to act better.
Here is an example. Alice is an analyst whose boss, Bob, has asked her to build a forecasting capability so that they can anticipate their sales better. Bob’s main question is “how many units will we sell next quarter?” Alice goes out and builds a forecasting system. She documents how it works. She cross-validates it, quality-assures it, and tests it over a few quarters (she has read this blog!). Finally, Alice comes to Bob and says, “next quarter there is a 90% chance we will sell between 990 and 1,010 units.” Bob is satisfied for about 10 seconds… and then he asks: “But what if we drop the price by 5%?” If at this point Alice needs six months to answer the question, she has failed. It does not matter that her forecasting system is accurate, valid, and of high quality; it is not usable. What Alice actually needs is to pull up her system, change a few parameters, and, voilà, give Bob the answer right there and then.
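What makes that possible is a forecast built as a function of its levers. Here is a deliberately tiny Python stand-in for Alice’s system; the constant-elasticity demand curve and every number are invented:

```python
# A deliberately tiny stand-in for Alice's system: the forecast is a function
# of the decision levers (here, price), so a "what if" is one function call.
# The constant-elasticity demand curve and all numbers here are invented.
def forecast_units(price, base_price=100.0, base_units=1_000, elasticity=-1.5):
    return base_units * (price / base_price) ** elasticity

baseline = forecast_units(price=100.0)  # Bob's original question
what_if = forecast_units(price=95.0)    # "what if we drop the price by 5%?"
print(round(baseline), round(what_if))  # the what-if answer arrives instantly
```

Answering Bob takes one changed parameter and one function call, not a six-month rebuild.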
It may be unreasonable to expect a forecasting system to answer anything at any point. Yet understanding what business users can act upon is important. Business users want to know how the levers they can pull in real life will affect the forecast. That is the only way a forecasting system turns into a decision-support system for continuous use.
The question your forecasting provider should ask you
And this leads us to the final thing to watch out for. You know the six questions you need to ask. But your forecasting partner needs to ask you one question to signal they will do a good job. The question is:
“What is your business objective?”
Too often analysts, data scientists, developers, and vendors fall in love with their methodologies. In doing so, they fail to deliver what the business user is expecting.
An accurate forecast has little value if it forecasts the wrong things. Business users may have a whole range of objectives that should determine the forecasting approach. They may want to explore new science; in that case accuracy, validation, and documentation should be the focus. They may want to answer questions faster; in that case automation, process, and integration should be the focus. Or they may want to increase their organization’s ROI, in which case the reports should focus on exactly that.
So, ask your six questions and pay attention. If your forecasting provider is not talking about what your business objectives are, you have a low chance of meeting them.