Data chatbots have a 0-1 problem
For them to be effective, they need lots of context. And that context can only come through heavy and consistent usage, during which customers might lose interest.
We are in an interesting place when it comes to conversational interfaces for data (also called ‘analytics copilots’ and ‘chat with your data’ - yes, I know not all of them follow the same grammatical structure). Tens of companies in this space have folded up or pivoted, but a few well-funded ones continue to build and sell.
Maybe we can say that the survivors, thanks primarily to their funding (nobody has yet hit true product-market fit - not even “big tech”), have “survived the trough of disillusionment”.
When we started Babbage in early 2024, we made a Slack channel to track competitors. Quickly there were 100+ companies in there. A year later I checked all the URLs to see who these companies’ customers were, and by then most had either pivoted or closed down. The ones that survive continue to go at it.
How data copilots have evolved
During this period, however, the industry has matured. Here are some changes I notice when I compare the state today to 2023, when such companies started springing up:
There is acceptance that we need context. And lots of it. People have accepted that better models alone will not write better SQL - the models need to understand more about the business, the metric definitions and how the company’s data is structured (and there is no uniformity in this).
On a related note, you see a slew of posts on “semantic layers” strewn around LinkedIn now (much more than you did during the good ol’ “modern data stack” days). These were initially designed and built for human users and analysts to agree on metric definitions. They didn’t really take off then, but are now being repurposed to aid agents.
On the same lines, structured and unstructured data now go together. During the Babbage days I’d say “right now we’re only handling structured data. Unstructured will come later”. With the benefit of hindsight, that was a mistake. To make real sense of structured data, you need access to unstructured data. A few examples:
The existing dashboards / GitHub repositories will have queries painstakingly built and validated by BI teams. These are a goldmine that the conversational interfaces can leverage.
Meeting documents will give valuable insight on the dominant modes of communication in the company - documents or presentations or emails - and on how data is generally presented.
Similar documents or emails might have context on “investigation trees” that agents should use when a metric feels off.
Different teams in the company might have different definitions for metrics with the same name (when I was at Delhivery I’d been asked to resolve the issue of “different teams’ measurements of the same metric don’t match”; my team dug around and found fifteen different definitions of the same metric, each as valid as the others). These definitions will sit in some documentation, which needs to be leveraged to pull the context-sensitive definition of the metric and thus the right data.
Some, but not all, providers have recognised that conversational interfaces are at best a wedge - since “every potential customer wants a chatbot” - but the real meat lies elsewhere. The chatbots are layered with things like scheduled report runs, personalised dashboards, multiplayer conversations with AI, enterprise search, etc.
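To make the point about same-named metrics a little more concrete, here is a minimal sketch of what a context-sensitive metric lookup could look like. Everything in it is an assumption for illustration: the metric name, the team names, the SQL and the `MetricRegistry` class are all hypothetical, and no vendor I know of necessarily does it this way. The idea is only that the definition is keyed on the metric and on who is asking, and that the agent fails loudly rather than guessing when it has no definition for that team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    metric: str
    team: str
    sql: str
    source: str  # the document the definition was found in


class MetricRegistry:
    """Stores one definition per (metric, team) pair. Hypothetical helper."""

    def __init__(self):
        self._definitions = {}

    def register(self, definition):
        self._definitions[(definition.metric, definition.team)] = definition

    def resolve(self, metric, team):
        """Return this team's definition, or raise naming the teams that have one."""
        key = (metric, team)
        if key in self._definitions:
            return self._definitions[key]
        known = sorted(t for (m, t) in self._definitions if m == metric)
        raise LookupError(
            f"No definition of {metric!r} for team {team!r}; known teams: {known}"
        )


# Illustrative data only: two teams, one metric name, two definitions.
registry = MetricRegistry()
registry.register(MetricDefinition(
    metric="active_users",
    team="growth",
    sql="SELECT COUNT(DISTINCT user_id) FROM events WHERE event = 'login'",
    source="growth team wiki (hypothetical)",
))
registry.register(MetricDefinition(
    metric="active_users",
    team="finance",
    sql="SELECT COUNT(DISTINCT user_id) FROM payments",
    source="finance reporting doc (hypothetical)",
))

growth_definition = registry.resolve("active_users", "growth")
finance_definition = registry.resolve("active_users", "finance")
```

Raising an error for an unknown team is a deliberate choice in this sketch: silently falling back to some other team’s definition is exactly how “the numbers don’t match” conversations start.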
Copilots still have a traction / adoption problem
All that said, none of the providers have really “cracked it”. Nobody has made it big yet building and running conversational interfaces (and related products). And in my outsider opinion, there is one major reason why they have not taken off like the vendors would like - the time to value is way too long.
Earlier today I was talking to someone at a Silicon Valley tech company that is trying out one of these conversational interfaces (“it is in UAT”, he said). He said that the company pretty much has one employee working full time to ensure the success of this new tool - giving it the necessary context, repeatedly running it and testing it, making sure the outputs are of the necessary quality for business usage, and so on.
Last week I spoke to another company that has integrated with another conversational analytics chatbot, and they mentioned that “it took at least 2-3 months to get their model working for us”. Again that is a very long time. I know reasonably well the platform that they used, and it is a fairly well built product. The reason it took 2-3 months is primarily that until the model got sufficient context, it didn’t give satisfactory results.
These are just two data points, but I expect the problem to be widespread across the industry. I also wouldn’t be surprised if a lot of pilots fall through the cracks because the models couldn’t get good enough quickly enough (at Babbage we lost a deal due to this, though context wasn’t the issue there).
In other words, the AI-for-analytics industry has a serious 0-1 problem.
Models cannot work well enough without context, and gathering context is time-consuming. Moreover, the thing with conversational interfaces is that they suffer from “too much diversity” - the range of questions asked of the model can be so large that the model’s ability to gain context on any particular aspect is limited (compared to this, if you are thinking of using AI for a particular workflow, that workflow gets repeated often enough in a short period of time for the model to become good).
How to drive better traction / adoption
There are a few things that the builders of the chatbots can do to alleviate this risk (of models taking too long to get good, thanks to insufficient context):
Be more innovative and proactive on how to use the available unstructured data to provide ballast to the models. Rather than only reacting to questions asked of the model, can the model pretrain itself based on questions that can be gleaned (no pun intended) from the unstructured data?
Deploy analytics consultants (I prefer that term for this role rather than the more trendy “forward deployed engineer”) to drive the right kind of context into the models. These consultants should have both a deep understanding of the models, and an ability to quickly get up to speed with the customer’s business and data.
For example, in one of the examples above, rather than simply one of the customer’s employees working on getting the model in shape, the process should be led by a consultant from the conversational interface provider’s side.
Be more proactive in onboarding and being able to use the right kind of context data from the customer’s organisation. This will vary significantly from customer to customer, and the consultant’s job will be to ask around and identify where the real context in the company sits. And then this consultant needs to be backed up by sufficient “backward deployed” (i.e. at HQ) engineering horsepower to be able to quickly onboard and index these documents.
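As one illustration of the first suggestion above (using what already exists before anyone asks a question), here is a rough sketch that seeds a prompt with validated queries pulled from existing dashboards. The function names, the example dashboard titles and the SQL are all hypothetical, and the word-overlap scoring is a deliberately crude stand-in for whatever retrieval a real product would use.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_examples(question, saved_queries, k=2):
    """Rank saved (title, sql) pairs by Jaccard overlap between question and title."""
    q_tokens = tokenize(question)
    scored = []
    for title, sql in saved_queries:
        t_tokens = tokenize(title)
        union = q_tokens | t_tokens
        score = len(q_tokens & t_tokens) / len(union) if union else 0.0
        scored.append((score, title, sql))
    # Highest overlap first; pairs with no overlap at all are dropped.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(title, sql) for score, title, sql in scored[:k] if score > 0]


def build_prompt(question, saved_queries, k=2):
    """Put the most relevant validated queries ahead of the user's question."""
    parts = ["Validated queries from existing dashboards:"]
    for title, sql in top_examples(question, saved_queries, k):
        parts.append(f"-- {title}\n{sql}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


# Illustrative data only: titles and SQL a BI team might already have.
saved_queries = [
    ("Weekly revenue by region",
     "SELECT region, SUM(amount) FROM orders GROUP BY region"),
    ("Daily signups",
     "SELECT signup_date, COUNT(*) FROM users GROUP BY signup_date"),
]

prompt = build_prompt("What was revenue by region last week?", saved_queries, k=1)
```

The point of the sketch is the direction of flow: the validated queries are indexed on day one, so the model starts with some context instead of earning all of it through months of usage.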
I might have shut Babbage, but I continue to maintain that we are only at the beginning of a massive AI-for-analytics wave (on which note, did I tell you that I tried learning to surf last week? I should write about that on my main blog), and by going after context, the remaining conversational analytics companies are going in the right direction.
However, unless they can really crack the speed of adoption within a customer, there is the risk that “revolutionising analytics using AI” might remain a pipe dream for a while longer.

