‘Don’t waste millions fine-tuning LLMs for marketing and commerce’: Salesforce global CMO says firms getting bilked on AI, touts faster fix
Brands are spending “literally tens of millions of dollars with cloud providers” to train and fine-tune large language models from scratch. Wasteful, says Salesforce global CMO Ariel Kelman. “There are lots of use cases where you do need to train and fine-tune your models. But absolutely not sales, service, marketing and commerce – the models are smart enough that they can go and grab information.” Salesforce’s argument is that pointing AIs at existing repositories – catalogues and order management systems, for example – begets faster, useful results that “blur the lines” between sales, marketing, customer service and commerce. The alternative so far has been training and tuning LLMs on “thousands of hours of customer service calls” to get so-so results for massive outlay, per Kelman. He suggests cloud providers milking the LLM training and tuning teat are operating on perverse incentives, akin to a digital marketing agency being asked whether clients need branded keywords for every product: ‘Let me give you 50 white papers on why you need that…’ Next up, Salesforce’s “autonomous agents” and “large action models” are coming for sales lead development.
What you need to know:
- Salesforce CMO Ariel Kelman says firms are wasting “literally tens and tens of millions of dollars” by training and fine-tuning AI large language models when it comes to sales, customer service, marketing and commerce.
- He says just point the LLMs to existing business data systems and repositories – e.g. catalogues, order management systems – add some reasoning, rules, testing and be live within weeks.
- For retailers that means conversational bots that anticipate what a customer wants, tap inventory and logistics systems, and sort out returns – but can also recommend a larger size, a different colour and a hat to go with it, and then process and order it all.
- Salesforce calls it a ‘large action model’ approach versus ‘large language model’.
- It’s all unpacked in the podcast, along with Salesforce's development of two new attribution models, and what it's going to do with them – and why Salesforce's global CMO keeps brand and demand deliberately decoupled. Get the full download here.
We're not talking about little savings. Literally, people are spending tens and tens of millions of dollars with cloud providers to train these models. There are lots of use cases where you do need to train and fine-tune your models, but what we’ve found is absolutely not [for] sales and service and marketing and commerce.
Million dollar answer
As the saying goes, Salesforce is eating its own dog food when it comes to autonomous agents, spinning one up in a few days for its Dreamforce app. Which meant that 45,000 delegates in San Francisco were already hands-on and using it before most of the big demos, reveals and broadsides at tech rivals had kicked off.
Global CMO Ariel Kelman suggests it underlines how Salesforce’s approach to useful AI differs from competitors, which he says are raking in profits from convincing firms they need to train and refine LLMs from scratch. Salesforce is adamant that’s not always true – and usually a big waste of money.
The firm is aiming to crimp cloud providers' latest cash cow via autonomous agents powered by what Salesforce CEO Marc Benioff describes as “large action models” which are designed to reason, plan, and take concrete actions within business systems.
The key difference versus standalone LLM-based AI models and copilots is that these agents are deeply integrated into existing business processes and data systems, and so do not require the extensive training, tuning and customisation that can end up costing firms tens of millions of dollars for questionable results.
“We built Agentforce into the Dreamforce app – three weeks ago it didn’t exist, so we decided, let’s go and build this,” says Kelman.
With “a small amount of code” to integrate it with the event management platform, delegates could conversationally ask the agent about sessions, get answers, book them and be offered a map to them.
It took a little bit of refining so that the agent understood that “talks and breakout sessions and sessions were all kind of the same thing” and that delegates would refer to them in different ways, using different language. “So the programming team just told it, ‘If it’s anything about a talk or a speaking session or someone listening to content that’s spoken, consider it a breakout session.’ It’s that kind of tuning that we went through. Then we ran through our tests again: Is it ready to turn on? Okay, let’s turn it on,” says Kelman.
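Kelman doesn’t spell out how that instruction was expressed, but the rule he describes can typically be handled as plain-language guidance passed to the model rather than any retraining. The sketch below is a generic illustration only, assuming a hypothetical `call_llm` client and prompt wording; none of the names come from Salesforce.

```python
# Hypothetical sketch: a wording rule expressed as an instruction, not model retraining.
# SESSION_RULE, agent_reply and call_llm are placeholders for illustration only.

SESSION_RULE = (
    "If the delegate asks about a talk, a speaking session, or someone "
    "listening to spoken content, treat it as a question about a breakout session."
)

def agent_reply(question: str, call_llm) -> str:
    """Answer a delegate question, applying the event team's wording rule.

    `call_llm` stands in for whatever chat-completion client is in use; it is
    assumed to take (system_prompt, user_message) and return a string.
    """
    system_prompt = "You are an assistant for a conference app. " + SESSION_RULE
    return call_llm(system_prompt, question)
```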
“Of course, on the range of use cases, it’s not the most complicated in the world. But there's a lot of these sort of simple and medium complexity use cases that will take a week to get a prototype up and running and a couple weeks to test it. From our initial pilot set of customers like Wiley, Fisher & Paykel, Saks, I'd say about a two-month period [from scratch] to get it rolled out to the initial set of external users for testing.”
Finer margins
The crux is that Salesforce is pointing the AI agents to the right places rather than spending massive amounts of time and money training or tuning LLMs for little discernible improvement, per Kelman. Which is why the company, most forcefully via CEO Marc Benioff, suggests the “DIY AI” era is over.
Kelman is less punchy than his boss in blasting rivals as exploitative and their customers as suffering from “hypnosis”, but he’s no less damning about the vast sums being wasted.
“The initial approach when people start getting excited with large language models and how they could apply to business is they said, ‘I want to do customer service, so I need to go and get access to my own large language model, because I'm going to have to fine-tune it’.” In truth, “Not a lot of companies are training their own models from scratch, but they're doing fine-tuning, which still requires spending a lot of money,” says Kelman.
“So the typical approach is you spin up OpenAI on Azure. Then you get thousands of customer service calls and train it. But then what they [customers, businesses] are finding is the quality isn't working that well, and it's ridiculously expensive,” suggests Kelman. “That’s sort of ironic, because the LLMs are so smart. You don't actually need to fine-tune the models to be able to do customer support and sales and marketing, you can use approaches like retrieval augmented generation [RAG] to have the LLM look up information, or to ground the LLM with data,” he adds.
“So for example, instead of having the large language model be trained on all the different types of events and programs and products, we're going to tell it, ‘if they have a question about products, look up our product catalogue, and here it is; if they have a question about their orders, here's an API we've created to our order management system’ – and so it's smart enough that if you put a reasoning engine around it, it can go and execute these tasks,” Kelman explains.
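Kelman’s description maps onto a fairly common routing-plus-grounding pattern: classify the question, fetch only the relevant data (product catalogue or order API), and have the model answer from what comes back. The sketch below is a generic illustration of that pattern under those assumptions, not Salesforce’s implementation; `search_catalogue`, `fetch_order` and `call_llm` are hypothetical stand-ins.

```python
# Hypothetical sketch of the "point it at the right data" pattern Kelman describes:
# route the question to a data source, then ground the model's answer in the result.
# search_catalogue, fetch_order and call_llm are placeholders, not Salesforce APIs.

def answer(question: str, customer_id: str, call_llm, search_catalogue, fetch_order) -> str:
    # Step 1: a cheap routing call decides which business system is relevant.
    route = call_llm(
        "Classify the question as 'product' or 'order'. Reply with one word.",
        question,
    ).strip().lower()

    # Step 2: ground the model with data looked up from the relevant system.
    if route == "order":
        context = fetch_order(customer_id)       # e.g. order management API
    else:
        context = search_catalogue(question)     # e.g. product catalogue search

    # Step 3: the model answers only from the retrieved context; no fine-tuning involved.
    return call_llm(f"Answer using only this data:\n{context}", question)
```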
The upshot is no training, very little fine-tuning and AI that is actually useful, he claims, for a lot less outlay.
“We're not talking about little savings. Literally, people are spending tens and tens of millions of dollars with cloud providers to train these models. There are lots of use cases where you do need to train and fine-tune your models, but what we’ve found is absolutely not [for] sales and service and marketing and commerce. The models are smart enough that they can go and grab information, you can use RAG, you can use grounding of data.”
When you're selling people DIY [AI] and you're making money off of DIY… It's kind of like if I ask a digital marketing agency, ‘is it a waste of money to buy branded keywords for all my products?’ They will say, ‘oh no, no, let me give you 50 white papers on why you need that.’ They are getting a percentage of every dollar that you're spending on Google …. It's a little bit of that same situation [here].
Perverse incentives?
If that’s the case, why hasn’t business cottoned on?
“The problem is that the incentives are not aligned to tell the customer not to DIY,” suggests Kelman.
“When you're selling people DIY [AI] and you're making money off of DIY… It's kind of like if I ask a digital marketing agency, ‘is it a waste of money to buy branded keywords for all my products?’ They will say, ‘oh no, no, let me give you 50 white papers on why you need that.’ They are getting a percentage of every dollar that you're spending on Google …. It's a little bit of that same situation – there's so much infrastructure and business around making money off of people training [LLMs].
“We're trying to tell a different story – you don't need to do that. You don't need to suck up all these resources when you’re already too busy to do a bunch of other things with sales, marketing and support. We’ve built a platform around these large language models that leverages all the work our customers have already done in Salesforce, and they can get these results a lot more quickly. And they're more accurate too.” (Benioff claimed in his Dreamforce opener that its autonomous agents are beating Google, OpenAI and Microsoft on accuracy and hallucinations by “two times”.)
Point and shoot
The key difference between the rest of the market and its ‘large action models’ is in the name, says Kelman. “Think of copilots as really being passive and not taking action.” Whereas Salesforce’s approach is to let the customer use any LLM they want, but focus it on specific, useful business data.
“So if you've built something that has your orders or all your customer service cases, it can look that up. It's basically jumping ahead of what it needs to know. If you're calling about a return, the system will know what you've bought before, what part of the country you're in, and what customer service issues you've had before. It literally doesn't need to be as smart as it did before, because it's not trying to guess. It's just looking up information like a person would, and then acting on that.”
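That “look it up like a person would” step amounts to pre-fetching what is already known about the caller before the model sees the question. A minimal sketch of that grounding step follows, assuming hypothetical lookup helpers; it is illustrative only and not a Salesforce API.

```python
# Hypothetical sketch of pre-fetching known context so the agent doesn't have to guess.
# get_purchase_history, get_region and get_open_cases are illustrative placeholders.

def build_grounding(customer_id: str, get_purchase_history, get_region, get_open_cases) -> str:
    """Assemble what is already known about the caller before the model answers."""
    facts = {
        "previous purchases": get_purchase_history(customer_id),
        "region": get_region(customer_id),
        "open service cases": get_open_cases(customer_id),
    }
    # The assembled facts are prepended to the conversation, so a return query can be
    # answered from recorded data rather than from the model's general knowledge.
    return "\n".join(f"{k}: {v}" for k, v in facts.items())
```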
Early test cases suggest the approach can rapidly pay dividends. Gucci saw its customer contact centres move from being essentially a cost centre to increasing sales by 30 per cent. Luxury fashion retailer Saks Fifth Avenue is likewise using an agent called Sophie to deal with customers conversationally; it can handle questions like product sizing and shipping logistics because she – it – is plugged into Saks’ data cloud, which holds all the unstructured data required to answer questions from all parts of the retailer’s business.
So if a customer calls up with a garment to return, it can tell the customer it has already sent a box for the return, but also suggest a different size or colour that it knows is in stock, ask if the customer would like to place the order, and then suggest and order a hat to go with it.
As such, Kelman thinks “these AI agents really blur the lines between sales functionality, service functionality, marketing functionality”.
Next up in B2B is an autonomous agent for sales lead development. Dubbed SDR Agent, it already has Accenture on board. It will essentially use agents to “nurture” high-scoring but not top-scoring leads, “first over email and later in chat and voice … to start having that conversation with them until they’re ready to talk to a human”, says Kelman.
Ultimately, he says the whole 'third AI wave' program is about “just scaling the work our customers have already done”.
There's more in the podcast, including Salesforce's development of two new attribution models, and what it's going to do with them – and why Salesforce's global CMO keeps brand and demand deliberately decoupled. Get the full download here.