
Bridging the AI-Ready Data Gap

Key Points

  • A recent Salesforce survey revealed a stark perception gap: 84% of enterprise leaders say their data strategies need a complete overhaul for AI, yet 63% believe they are already data‑driven, which is a key reason many AI projects fail.
  • The first principle for an AI‑ready data architecture is to “diagnose before you deploy” by testing whether simple factual queries and a full cross‑system customer view can be answered in under five seconds, exposing performance bottlenecks early.
  • Traditional data warehouses rely on overnight batch copies, but the second principle—adopting a zero‑copy architecture—emphasizes real‑time, on‑the‑fly data access so AI agents aren’t delayed by stale or duplicated data.
  • Successful AI scaling (the top 16% of organizations) depends on redesigning data pipelines for fast, agentic retrieval rather than slow, report‑oriented queries, requiring honest self‑assessment and infrastructure upgrades.
  • The briefing outlines seven guiding principles, with the initial two focusing on rapid performance validation and zero‑copy, real‑time data handling as the foundation for building trustworthy, AI‑ready data ecosystems.

Full Transcript

**Source:** [https://www.youtube.com/watch?v=9IETDveRCQs](https://www.youtube.com/watch?v=9IETDveRCQs)
**Duration:** 00:12:53

## Sections

- [00:00:00](https://www.youtube.com/watch?v=9IETDveRCQs&t=0s) **Closing the AI‑Ready Data Gap** - The briefing exposes the perception gap where many executives think they are data‑driven while their data isn't AI‑ready, and introduces seven guiding principles—beginning with rapid diagnostic testing—to overhaul architectures and successfully scale AI.
- [00:04:23](https://www.youtube.com/watch?v=9IETDveRCQs&t=263s) **Investing in Internal AI Architecture** - Executives stress building internal system‑design capacity to enable real‑time, context‑rich AI insights and prevent misinterpretations caused by lacking business context.
- [00:08:13](https://www.youtube.com/watch?v=9IETDveRCQs&t=493s) **Governance‑Driven AI Scaling Timeline** - The speaker emphasizes that only organizations with strong data governance and accountable ownership can successfully scale AI, requiring an honest, phased 18‑ to 36‑month roadmap rather than the rapid timelines vendors often promise.
- [00:12:26](https://www.youtube.com/watch?v=9IETDveRCQs&t=746s) **Bridging the Data‑AI Execution Gap** - The speaker argues that acknowledging the honest timeline reveals a perception‑execution gap—where data doesn't yet support AI—and warns organizations to close this gap now or regret the failure.

## Transcript
This week's executive briefing is all about the glamorous subject of AI-ready data architectures. Why, you might be thinking, are we going to do that? The reason is simple. Salesforce published data this week from more than 6,000 enterprise data leaders. 84% said their data strategies needed a complete overhaul before AI works. At the same time, 63% of executives in the survey believe their companies are already data-driven. In other words, the perception gap is why most AI initiatives are failing. Leaders walk into these AI conversations assuming they are data-driven, and what they find in reality is that they're not: the data is not ready for the demands of AI.

So I ask myself, why is that? I look at my own projects, projects I've worked on with clients, and I ask: what can we learn here that will help leaders get over the perception gap and be more likely to establish a successful AI-ready data architecture? We're going to go through seven principles that separate the, call it, 16% who are successfully scaling AI from everybody else.

Principle number one: diagnose before you deploy. Before you run your AI initiatives, you need to be running tests, and two big ones come to mind. First, make sure your system can answer factual questions about your data in less than five seconds without human intervention. Am I making up the five-second figure? Kind of, yeah, but it's a reasonable proxy. Essentially, if you cannot put a system in place that can get a very simple, performant query through in less than five seconds, you're probably not ready for anything more sophisticated. And I'm talking really simple, like "What's our inventory for product X in our warehouses right now?"
Something at that scale. Second, can you assemble a complete customer view across sales, support, billing, and shipping, with no missing data, in a similar time envelope? You'll notice I'm asking about performance at the beginning; there's a reason for that. If the tests fail, what you're finding is that you need infrastructure work, because your data sets are not designed to be performant in the era of AI. They're designed for slow retrieval, retrieval that might take 30 minutes and end up as one row in a large report your data analyst prepares, not agentic retrieval that happens on the fly, very quickly. So the challenge for you is to figure out how to change your data architecture so that you can actually deploy AI agents reliably. The first step: just be honest with yourself. Diagnose before you deploy.

Principle number two: zero-copy architecture is a philosophy, not tooling. I'll get into what that means. Traditional data warehouses copy data to central locations, clean it up overnight, and then make it available for reporting. But depending on your needs, agentic AI cannot wait for overnight batch jobs. If you want real-time data today, that won't work. Agents need real-time data access to authoritative source systems if you want real-time conversations with your data. Now, if you are okay with day-delayed or older data, and most of your use cases work with last week's or last month's data, this gets easier. But even in that situation, you still have a lot of work to do to make sure that the data sets you prepare from your historical data are performant, in keeping with principle one.
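The diagnostic from principle one can be sketched as a small timing harness. This is illustrative only: the in-memory SQLite database, the `inventory` table, and its column names are stand-ins for your real systems, and the five-second budget is the speaker's proxy threshold, not a standard.

```python
import sqlite3
import time

# Hypothetical stand-in for a real warehouse; the table and its rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (product TEXT, warehouse TEXT, quantity INTEGER);
    INSERT INTO inventory VALUES ('product-x', 'east', 120), ('product-x', 'west', 45);
""")

def timed_query(conn, label, sql, budget_seconds=5.0):
    """Run one diagnostic query and report whether it fits the latency budget."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    status = "PASS" if elapsed < budget_seconds else "FAIL"
    print(f"{status} {label}: {len(rows)} row(s) in {elapsed:.3f}s")
    return status == "PASS"

# Test 1: a simple factual question, answered with no human in the loop.
# (Test 2 would time the cross-system customer view the same way.)
ok = timed_query(conn, "inventory check",
                 "SELECT warehouse, quantity FROM inventory WHERE product = 'product-x'")
```

The point is not the harness itself but the habit: run the same timed checks against the systems an agent would actually touch, and treat any miss of the budget as an infrastructure finding, not a tuning detail.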
The key shift to think about is that the behavior of the business user is also evolving. In the Salesforce survey, companies were 34% more likely to succeed if they used a zero-copy approach, which is where you don't copy data to a central location; you just tell the AI to query the data where it lives, across all of your different systems. Part of why those companies were more successful is that they were able to architect the entire system exactly the way they wanted, without trying to buy from different vendors and cobble a system together. This is in line with other surveys of executive leadership that have emphasized how important it is for executives to invest in internal capacity for architecting systems if you want the ability to build and sustain AI long term. As much as principle number two says "we're not going to have copies of our data; we're going to query in real time, and that's going to make us more likely to succeed because business user behaviors are shifting toward real-time querying," all of that makes sense. But the underlying story I find interesting is that it only works if you are investing in your ability to build those systems internally, because your own fingerprint configuration of data is going to be somewhat unique and will require that internal capacity.

Principle number three: context matters more than volume. 49% of organizations draw incorrect conclusions because their data lacks business context, again drawing from this Salesforce survey.
As an example, let's say the data says revenue is $2 million, the customer is Acme Corp, and it's the third-quarter revenue number, but that particular database row, or the particular fragment retrieved by the AI, does not specify that the $2 million was a one-time contract, not recurring revenue. Your agents are now inappropriately adding $2 million in ARR to your reporting. What I'm saying is that this is not uncommon, and it's a great illustration of how much context matters when you are trying to construct efficient agentic systems. You have to get context that works. You need semantic layers that encode business definitions, relationships, and logic. Agents should be able to understand the difference between bookings and revenue, between gross and net, and between active and churned customers, from single, universal source-of-truth definitions. If you don't have that, if you don't have a semantic layer that defines meaning across the top of your data at a high level, then the more data you add without context, the more confidently wrong answers you get. If you're wondering, "Can you give me an example of a semantic layer?", there are vendors that do this, and this might be a case, if you don't have the capacity internally, where a vendor makes sense, because here you're not asking the vendor to patchwork across all of your data; you're just trying to put a single lens across the data set and interact with it cleanly. It's up to you: Cube is one vendor out there, and there are others as well. The principle is not whether you go with a vendor. The principle is that you need to think about your data queries as needing the context that comes from executive insight to interpret well.
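To make the semantic-layer idea concrete, here is a hedged sketch of what a minimal homegrown fragment could look like. The `Metric` structure, the metric names, and the SQL filters are all hypothetical illustrations, not any vendor's API; real definitions would come from finance and executive sign-off.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """A business definition an agent consults before interpreting a raw number."""
    name: str
    definition: str   # plain-language meaning, fed into the agent's context
    sql_filter: str   # how the same definition is enforced at query time

# Hypothetical single-source-of-truth definitions.
SEMANTIC_LAYER = {
    "arr": Metric(
        name="Annual Recurring Revenue",
        definition="Recurring contract value only; one-time contracts are excluded.",
        sql_filter="contract_type = 'recurring'",
    ),
    "bookings": Metric(
        name="Bookings",
        definition="Total signed contract value in the period, recurring or not.",
        sql_filter="1 = 1",  # no restriction: bookings count everything signed
    ),
}

def context_for(metric_key: str) -> str:
    """Return the definition an agent should see alongside the raw figure."""
    metric = SEMANTIC_LAYER[metric_key]
    return f"{metric.name}: {metric.definition}"
```

The design point is that each definition is both human-readable (for the agent's reasoning) and machine-enforceable (as a query filter), so an agent consulting the layer would exclude Acme's one-time $2 million from ARR instead of confidently misreporting it.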
When you ask these kinds of sophisticated questions, most of the time the data you have encoded does not carry that context internally in the rows of the database. You have to find a way to add it. If you want to build that yourself and find a way to add it in, that's great. Or, because there's really only one point in the architecture where you inject this, maybe it's not irrational to add a vendor here.

Principle number four: governance enables speed; it does not slow you down. This is one of those things where you need governance framed as accountability rather than process. Governance councils at enterprise scale need to be accountable for quality, metadata policies, and remediation approaches. Your goal is not to push things through human processes, but to have your governance structure encoded as automated quality monitoring, so that you can route data issues through the exact same severity model you would use for production software outages. This is not bureaucracy. This is what lets you move quickly, if you automate the process. The 16% successfully scaling in the Salesforce survey are doing so with strong governance practices, and there's not really a substitute for it. I want to emphasize again: if nobody owns data quality, the data will not be quality. You need someone to care about data quality in order to make the most of the AI investment you're making. But the person who owns it needs to have an automation mindset, because otherwise they're going to slow things down. That's why I'm trying to give you the nuance and the tension there.

Principle number five: the honest timeline is not as fast as anybody wants.
The plan in most enterprises is probably 18 to 36 months, showing progress in phases. In year one, you might be fixing critical pipelines, implementing a zero-copy architecture for your top domains, and piloting agents where data is trustworthy. In year two, you may be expanding toward real-time capabilities and automating governance. In year three, you're scaling agents. This feels really slow when vendors are promising six-month transformations, but those timelines assume your infrastructure is already ready, which is exactly what Salesforce is calling out here. For most organizations, it's not. The disciplined approach is to be honest about your shortcomings, fix infrastructure while running AI pilots that show proof of concept, and then scale as you see that the foundations are actually solid.

Principle number six: close the perception gap between business and technical leaders. The fact that 63% of executives think they're data-driven underlines this business-technical gap, because at this rate, business leaders are making purchases from vendors selling AI tooling on timelines completely divorced from the reality of their data architectures. Business and technical leadership need to get on the same team. Otherwise, they're going to end up blaming each other, blaming AI, or blaming their own teams. Misaligned expectations here are a leadership issue, and it's something where technical leaders, I am convinced, need to take the lead. Step up. Educate your executives on the technical realities constraining AI development in your organization. Be really honest. Make the infrastructure work visible, not hidden beneath the technical teams.
This is not a time to hide the dirty details from leadership. You actually need that visibility to avoid writing checks you're going to regret and making promises you can't deliver on.

Principle number seven is all about strategy. The strategic choice you're facing is time-bound. I keep emphasizing this and I'm going to say it again: data runs on a clock. If you are going to have to spend 18 to 36 months fixing infrastructure and scaling AI regardless, in the middle of the AI revolution, it is better to start that clock sooner rather than later, because you are going to fall exponentially farther behind the longer you wait. This is a point I make often, but it's a point we have trouble processing as humans, so I feel the need to repeat it. We have trouble with exponentials. We are living in an exponential from an AI-capability perspective, and when we are living on that part of an accelerating curve, we have to be okay with making bets that assume the environment is going to continue to accelerate. We need to make bets assuming AI models are going to keep getting better. That means we need to front-load and really prioritize the investments that unlock AI capability. I see that willingness, but as we've discussed in this video, it's often business leaders who are overoptimistic about the boring parts of the organization, namely the data stack, who are saying, "We want AI. We're ready to invest. We're buying from vendors." And that leads to problems. The strategic choice is now, but technical and business leadership need to get on the same page to make sure the willingness to invest is aligned with where you actually need to put technical leverage to drive AI forward in the business.
The organizations that successfully scale AI aren't smarter about prompting. They're not smarter about model selection. They fixed their data infrastructure. They did the boring work first. They ran the diagnostics. They accepted an honest timeline. They did the work. That's how you close the perception gap between "we're data-driven" and "actually, our data can't support AI, and this vendor implementation is doomed from the start." That is an execution gap. That is something an organization can choose to close if it wants to. And if you don't close it, you are going to regret it. So there you go. That is how I read one of the most interesting surveys of the year out of Salesforce. Good luck with your data architecture. We all need it.