Holdout test challenge: If your MMM can’t predict data it hasn’t seen, move on – Mutinex chief
An Mi3 editorial series brought to you by
Mutinex
Mutinex chief Henry Innis has laid down the gauntlet to marketers and rival econometrics firms to prove their market mix models (MMMs) are delivering trustworthy insight by testing them against data they haven’t seen, AKA holdout testing. He's also called for the MMM industry to be far more transparent – and to align on model governance standards. Meanwhile Meta and Google state they have no plans to "cannibalise" MMM pure-players. But Google's MMM tool will launch before the year-end and both firms reckon marketers should try more than one model. Mutinex's Will Marks thinks marketers are better off first working out if the model they do use is actually any good.
What you need to know:
- Mutinex global boss Henry Innis urges marketers to test MMMs via holdouts – holding back data to see if they can predict accurately. If they can’t, that’s a problem.
- He thinks holdout tests should be the model governance standard and vendors should share their model governance metrics with clients – and the broader market.
- Meta and Google are in market with MMM tools. Both state they are complementary to MMM pure-plays versus cannibalistic – but they urged marketers to try more than one model.
- Mutinex’s marketing science lead suggests marketers are better off ensuring the model they are using is accurate.
- Innis also aims to cut consulting firm outlays with a new automated reporting and insight tool to free up marketing budget for working media and creative – but said it won’t plug directly into media buying systems.
There is no substitute for a model not being able to predict data it hasn't seen.
Holdout call out
“You should test your models for holdouts,” Innis told a room full of 150-plus marketers. “You should hold data back from your vendors and ask them how well can they predict data that their model hasn’t seen. If your vendor is unable to predict data that their model hasn't seen accurately, I question whether or not that model is any good in your business.”
Innis claimed Mutinex is “all-in on holdout tests” and “all-in on testing models transparently”. He said the marketing industry would benefit if firms providing MMMs aligned on model governance standards. “I believe that holdout tests are the model governance standard – and I think it’s appalling that vendors do not share model governance metrics back to their clients. That is wrong.”
No model is perfect, said Innis, with Mutinex aiming for a margin of error “well below 10 per cent”. But they should be largely on the money.
“Does the model generally fit the sales trend within data that it has seen? Do you have metrics of stability so that if you run the model again, does it get a radically different answer? These are really important things that every vendor is testing for. We believe they should be transparent. There are people that are qualified within every enterprise who know what these metrics are. They're not unique to market mix models … but as an industry, we've shied away from publicising them, and I think that's what we need to lean into a lot more.”
Either way, said Innis, “there is no substitute for a model not being able to predict data that it hasn't seen.”
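The holdout procedure Innis describes can be sketched in a few lines: hold back the most recent period of data from the model, fit on the rest, then score the model's predictions against the unseen actuals. The sketch below is purely illustrative – the "model" is a naive linear trend standing in for a real MMM, and the data is invented – but the mechanics of the test, and the "well below 10 per cent" error target Mutinex cites, are the same idea.

```python
# Illustrative holdout test: hold back the final weeks of sales data,
# fit on the rest, then score predictions against the unseen actuals.
# The "model" here is a naive linear trend -- a stand-in for any MMM.

def fit_trend(sales):
    """Fit a simple least-squares linear trend to weekly sales."""
    n = len(sales)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(sales) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sales))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return lambda t: intercept + slope * t

def holdout_mape(sales, holdout_weeks):
    """Train on all but the last `holdout_weeks`; score on the held-back data."""
    train, test = sales[:-holdout_weeks], sales[-holdout_weeks:]
    model = fit_trend(train)
    preds = [model(len(train) + i) for i in range(holdout_weeks)]
    # Mean absolute percentage error against the unseen actuals
    return sum(abs(p - a) / a for p, a in zip(preds, test)) / holdout_weeks

weekly_sales = [100, 104, 103, 108, 112, 110, 115, 119, 118, 123, 125, 128]
mape = holdout_mape(weekly_sales, holdout_weeks=4)
print(f"Holdout MAPE: {mape:.1%}")  # target: "well below 10 per cent", per Innis
```

In practice the held-back window would be withheld from the vendor entirely, not just from the fitting routine – the point of the test is that nobody building the model has seen it.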
We don’t have any intention of becoming an MMM partner or vendor ourselves, or cannibalising the business models [of companies] like Mutinex at all.
Platform MMMs
Meta and Google have their own MMM tools, but both told marketers at the event that they are not trying to compete with the likes of Mutinex, Analytic Partners, agency groups and consulting firms selling their own wares.
Meta launched Robyn, an open-sourced experimental MMM four years ago. Google plans a full-scale rollout of Meridian later this year after opening up limited access five months ago.
“We don’t have any intention of becoming an MMM partner or vendor ourselves, or cannibalising the business models [of companies] like Mutinex at all,” Google Marketing Mix Model lead, Amir Jangodaz, told the conference. His counterpart, Meta Head of Marketing Science, Carl McLean, likewise said Robyn was “complementary to the [existing] ecosystem.”
Both underlined there is no single measurement “silver bullet” and suggested marketers work with more than one model to validate findings – and to experiment.
“We have seen that different models for the same organisation actually don't work, or vice versa. It requires some test and learn,” said Google’s Jangodaz, a former data scientist with ING and Kimberley-Clark. “These models can be customised for different marketing categories, different business categories. It is up to you to decide which one you choose for your business.”
Meta’s McLean said it is “always useful to have a couple of models to interrogate and see what you are getting”, but that going too far, with too many models, risked greater confusion and complexity.
Mutinex’s Director of Marketing Science, Will Marks, suggested that is precisely why both stability testing (how consistently the model reaches the same conclusion) and holdout testing (how well the model performed in making predictions based on data kept back from it versus actual outcomes) are vital to MMM accuracy.
“It could create a lot of noise in your business if you're writing lots of different models. For me, a better approach is to really understand the quality of the model you have,” said Marks.
“By understanding it is able to predict that holdout period, you can then understand what level of trust you have in that model and how you can use it for forecasting or reviewing results … So being clued up on how to interpret the quality of the models you do have is a great skill to have.”
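The stability test Marks pairs with holdouts can be sketched the same way: re-fit the same model under small perturbations of the data and check that the key estimate does not swing radically between runs. The sketch below is illustrative only – a toy least-squares coefficient stands in for an MMM channel effect, and the bootstrap-resampling approach is one common way to probe stability, not a description of Mutinex's method.

```python
import random

# Illustrative stability check: re-fit the same (toy) model on bootstrap
# resamples of the data and see whether the key estimate -- here, a media
# channel's fitted coefficient -- stays consistent across runs.

def fit_channel_coef(spend, sales):
    """Least-squares slope of sales on spend: a toy stand-in for an MMM coefficient."""
    n = len(spend)
    mx, my = sum(spend) / n, sum(sales) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(spend, sales))
    var = sum((x - mx) ** 2 for x in spend)
    return cov / var

def stability_spread(spend, sales, runs=200, seed=0):
    """Re-fit on bootstrap resamples; return (min, max) of the coefficient."""
    rng = random.Random(seed)
    coefs = []
    for _ in range(runs):
        idx = [rng.randrange(len(spend)) for _ in range(len(spend))]
        coefs.append(fit_channel_coef([spend[i] for i in idx],
                                      [sales[i] for i in idx]))
    return min(coefs), max(coefs)

# Invented weekly channel spend and sales figures
spend = [10, 12, 9, 15, 14, 11, 16, 13, 17, 18]
sales = [52, 55, 50, 61, 60, 54, 63, 58, 65, 68]
lo, hi = stability_spread(spend, sales)
print(f"Coefficient range across re-runs: {lo:.2f} to {hi:.2f}")
```

A tight range suggests the model's answer is stable; a wide one is the "radically different answer" on re-run that Innis flags as a warning sign.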
I think that is a really dangerous thing – when we are redirecting money from marketing and working media budgets towards market research consultants to justify our jobs.
Consulting cutter
Meanwhile, Innis said the firm’s new Business Answers Modelling (BAM) tool could enable greater marketing and broader investment by reducing fees paid to consulting and market research businesses.
“I think a lot of money is spent creating not a lot of value by consulting firms generally and I think that is a really dangerous thing – when we are redirecting money from marketing and working media budgets towards market research consultants to justify our jobs,” said Innis. “I do not think that is a good world for us to live in. I want this to be a force multiplier to reduce [marketing] costs and the cost burden of consultancies [in] presenting and building all of those decks … so we can get the money back to working media and creative.”
BAM uses large language models and a chat-based interface that marketers can prompt with questions and receive near instant responses, reports, analysis and recommendations.
Innis said marketers would have “access so they can replicate what the LLM has done” and gain confidence that the results are reproducible. He said while the firm will be working to “refine the insights … we are very, very confident it doesn’t get the data wrong”.
But Innis underlined that the recommendations “will always require a degree of human judgement. We're not going to plug [BAM directly] into a media buying system. You always need someone to understand the creative idea that you're working with, the media idea that you're working with.”