Beyond ChatGPT: The Hidden Infrastructure Powering AI with JB Baker

“We talk in the industry about this as the memory wall, and that is the gap between the processor’s ability to consume data and the data pipeline’s ability to deliver data to the GPU.”

EPISODE:

117

with guest:

JB Baker
VP of Products and Marketing

ScaleFlux

Episode Summary

On this episode of the Digital Banking Podcast, Josh DeTar spoke with JB Baker, VP of Products and Marketing at ScaleFlux, about the often unseen infrastructure behind AI. They discussed the limitations of current technology and where improvements are needed to handle AI's growth. JB highlighted the "memory wall," which limits data flow to processors, as a major hurdle.

JB explained how this challenge requires innovation in areas like memory, storage, networking, and even cooling systems. He used a simple analogy of an irrigation system needing bigger pipes to support a growing field, representing the need for faster data delivery to power-hungry GPUs.

The conversation also covered the environmental effects of rising energy demands. JB pointed out data centers’ substantial energy consumption and the need for more efficient cooling methods. They explored the ripple effect of innovation, where advancements in one area, like energy production, can unlock progress in seemingly unrelated fields like AI.

Key Insights

⚡ The Memory Wall: A Bottleneck in AI Development

The rapid advancement of AI processing power has outpaced the ability of data storage and memory to keep up. This gap, known as the “memory wall,” creates a bottleneck limiting the speed and efficiency of AI. Just as a powerful engine is useless without sufficient fuel delivery, advanced GPUs are starved without fast enough access to data. This challenge demands innovation in data pipelines, memory technology, and storage solutions to bridge the gap and unleash the full potential of AI. Improvements are needed in areas like high-bandwidth memory and more efficient data access methods to break through this wall and enable continued AI progress.

⚡ The Interconnected Web of AI Innovation

Advancements in AI don’t happen in isolation. Progress in one area can have ripple effects, accelerating development in seemingly unrelated fields. For example, the energy efficiency of processors and cooling systems is crucial for the growth of data centers, which in turn enables more powerful AI applications. This interconnectedness highlights the need for collaboration and cross-disciplinary innovation. A breakthrough in sustainable energy, for example, could drastically reduce the cost and environmental impact of data centers, unlocking faster progress in AI.

⚡ The Growing Energy Demands of AI

As AI becomes more powerful and ubiquitous, its hunger for energy grows exponentially. Data centers are already among the top consumers of electricity globally, and this demand will only increase with greater AI adoption. This presents a significant challenge for sustainable energy and power grid infrastructure. If we don’t innovate in energy production and efficiency, the potential of AI could be limited by our ability to power it. This necessitates a shift towards renewable energy sources, more efficient cooling technologies for data centers, and a focus on minimizing the energy footprint of AI algorithms themselves.

About The Guest

JB Baker
VP of Products and Marketing

ScaleFlux

Find JB On:
LinkedIn

VP at ScaleFlux, focused on data infrastructure and AI’s technical challenges.

JB Baker: [00:00:00] We're coming at it from the data pipeline perspective: your data gets created, it needs to be stored, it needs to be moved through memory and moved to the processors. Well, AI has changed how that needs to happen, because the capability of the GPUs to consume data has just far outpaced the ability of local storage and local memory to have the data there and feed it into the GPUs.


Josh DeTar: Welcome to another episode of the Digital Banking Podcast. My guest today is JB Baker, the VP of Products and Marketing at ScaleFlux. There seems to be a common trait among the types of people I'm drawn to having as guests on this podcast. Coincidentally, it's probably a similar trait you have if you're listening to this podcast, and the trait I'm referring to is that of a lifelong learner.

JB has always been curious, but not just curious. Curiosity may only take you so far or so deep. JB says his natural tendency is to be a lifelong learner, one who wants to dive all the way in, even back to childhood. For example, he loved playing with model rockets, but not just buying one, launching it, seeing it [00:03:00] come down and then doing it again.

He liked to tinker. Okay, if I change the angle of the fin, what happens? Okay, it goes up and proceeds to immediately turn and smash into the ground. Got it. That's not gonna work. Okay, what about, how can I help it go higher and then still come back safely? It was the process of learning, taking data points and evolving based on the learnings, that he really enjoyed, and that mentality has carried through life and allowed JB a unique opportunity to make massive changes in career paths.

It would be pretty easy to say, I was in chip design at Intel, and then I went into chip optimization at Intel. Right? How about: I was in corporate travel negotiation at Intel, and then went into data storage product management at Intel. That's a pretty big jump. How does he do it? Well, he says he used to be a big reader.

If JB wanted to learn about a new thing, he immersed himself in books. He grew up pre-internet, so that tells you a little something. But what I find cool is that he said, now [00:04:00] instead of reading, he just asks ChatGPT. And that should tell you something about where we're headed in this episode.

I’m really intrigued to have JB on today because his field of expertise is really far outside of mine. So, I’m leaning in with genuine curiosity today. 

JB, thanks so much for joining me and being a guest on the podcast today, man.

JB: Thank you. Appreciate the opportunity.

Josh: Okay. So I gotta start with, it was pretty cool, you were just telling me, thankfully this podcast recording wasn't yesterday, it's today, because you're coming back from being a little under the weather after being at an event with quite a few other people. You were at NVIDIA's GTC conference.

How was that?

JB: It was insane. I don't know what other way to describe it, it was a madhouse. So many people there, so many different technologies, different ways in which AI is impacting not just data centers, but what they call virtual factories, and how you design physical products.

So [00:05:00] just all over the place. And we were lucky enough to get a booth there, and the traffic was insane. I mean, there were four days of show floor, and usually by day two or three things have slowed down to kind of a trickle, because everybody's been around and seen the booths.

But we were busy all the way through until the close on Friday, when they started playing Closing Time, you know, you don't have to go home, but you can't stay here. And they're physically pushing people out of the door, because people still wanted to stay.

Josh: I mean, that is a wild story in and of itself, right? Just kind of the state of Nvidia today. And what I think is kind of interesting is, it's actually a good representation of the topic we're gonna talk about today, for the exact reason of what it is, right?

But Nvidia has been around for a [00:06:00] very long time. It's not like they were born yesterday, right? But I think a lot of times people think they came outta nowhere, especially if you're not really in the technology world. Now all of a sudden you're seeing everything from the crazy stuff Nvidia is turning out, and products hitting the market that are getting mainstream media attention, to hearing about your friends or people on the internet who made beaucoup bucks by buying Nvidia stock. They're all over the news. But like I said, they didn't just come out of nowhere, right? I mean, the company's been around for a very long time, and that's the same with AI, right?

I think a lot of folks are like, oh my gosh, AI kind of came outta nowhere. But in all reality, it's been here for a very long time, and we've been working on it for a very long time. It's just that the way technology innovation tends to manifest, it happens on a really steep [00:07:00] curve.

And so we don't see the really long tail of the really slow growth. It doubles and it doubles and it doubles, and doubling from nothing to a little bit more than nothing, you don't really see. But all of a sudden, when it goes from a hundred to 200, and 200 to 400, 400 to 800, and 800 to 1600, right?

Then all of a sudden you're like, oh my gosh, this thing came outta nowhere. So I'm just curious, when you were at GTC, what was the energy like in terms of just that, hey, this thing has really blown up and is massive? What was that like?

JB: I mean, I don't know how to describe it, really. It is a massive amount of energy and excitement around all the different ways that AI can be utilized. I think over the last couple of years, most of the focus has been on the LLMs, like the ChatGPTs [00:08:00] and DeepSeek and all of these, because that kind of became the killer app that was easy to use and brought AI to the masses. Right. And it also consumed tons and tons of compute power, and storage and networking, and drove innovation throughout the industry. And you mentioned that long tail. I recently had the opportunity to do a keynote for a conference, and they asked me to come in and talk about efficiency in the industry.

And one of the things that I found as I was prepping for it was how petaflops had grown for model training in AI over the years. And petaflops is just a measure of how many compute cycles are consumed. It took 60 years to get to using one petaflop to train a model. Okay? So 60 years from kind of nothing to one petaflop. And then from that [00:09:00] point, it took just 10 years to go from that one petaflop to over 10 billion petaflops. So this last decade, it's certainly been that hyperbolic growth, and over the last few years it just continues; the rate of growth has been a doubling every three to four months from around 2018 on.
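As a rough back-of-envelope check on the figures JB cites (roughly one petaflop to train a model after 60 years, then over 10 billion petaflops a decade later), the implied growth rate can be computed directly. The figures are his; the arithmetic below is just a sketch:

```python
import math

# JB's figures: ~1 petaflop to train a model, then ~10 billion petaflops 10 years later.
start, end, years = 1.0, 1e10, 10
annual = (end / start) ** (1 / years)                   # compound annual growth factor
doubling_months = 12 * math.log(2) / math.log(annual)   # time to double, in months

print(f"~{annual:.0f}x per year, doubling every ~{doubling_months:.1f} months")
```

Ten-billion-fold over ten years works out to roughly 10x per year, i.e. a doubling about every 3.6 months, which lines up with the "every three to four months" rate mentioned a moment later.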

Josh: So I wanna come back to that, but before we do, what was maybe your biggest takeaway from GTC? What did you go home with, thinking or feeling or learning, that really stuck out?

JB: That it's not just about the massive LLMs and the hyperscalers. The next wave for AI is moving it to the enterprise, where it's not clusters of thousands of GPUs, which [00:10:00] will continue to be built out for the hyperscalers, but enterprise customers who maybe are only using a couple of racks, or a couple dozen racks, and hundreds of GPUs, or tens of GPUs. That's the next wave, and then it will move to the edge.

Because there has to be edge processing. We've gotta be able to do inferencing closer to where things are happening, for live data. You know, things like autonomous driving, that's really an inference thing, and it has a different structure and a different workload than the training of these massive models.

In the data centers. And then storage, memory, networking, all of these are critical aspects to innovate in, to enable AI to continue to grow.

Josh: Side note, when you're talking about what it takes for autonomous driving and everything, I saw a reel on Instagram last night that was dash cam, or body [00:11:00] cam, footage from a police officer in Arizona. He pulled over an autonomous driving car and didn't realize that was what he was pulling over until he got up to the window.

And you can imagine the shock he had when there was no one in the car, but the car had driven the wrong way down a one-way. And it was because it was in a construction zone where they had done a traffic pattern change and all of this, and the car drove the wrong way. So he pulls it over, and it was pretty interesting to watch the interaction.

Basically, there was some form of audio AI agent that was like, hey, I see I've been pulled over. It talked to the police officer, and the police officer was like, hey, no, I need to talk to a person. Like, y'all just drove the wrong way down a one-way. This is a problem. And somebody eventually got on and was talking to him and everything, but it was just fascinating to see that play out.

I'm sure it's probably not the first time something like that's happened, [00:12:00] but it was just funny that I literally saw that last night before this episode.

JB: Well, the Johnny Cabs in Total Recall, they weren't perfect either.

Josh: Yeah, exactly. So, one of the reasons I said I'm really excited to have you on the podcast is we've talked a lot about AI in the last handful of episodes.

Like I make the joke, you almost can't talk about technology anymore without talking about AI. They've pretty much become synonymous with each other. But we've talked a lot about the use cases, right? And so, very specifically, we're talking to community financial institutions

about how you can leverage AI within your institution, for example, for operational efficiency, or maybe how you can leverage AI for account holder operations or efficiency. And then, how is the industry as a whole adopting AI, and those kinds of things. But that's not what we're gonna talk about today.

I mean, we'll touch on [00:13:00] some of the different use cases and how they'll play into it. But what I think is super fascinating about your area of expertise, and where you guys are focused, is: what are the architecture and infrastructure implications of all of this AI technology that's coming to market?

And like we were just talking about, this went from, yes, it was being worked on, yes, AI has been here for a long time, but it went from being something that only a group of super nerds were working on to, I mean, my 70-year-old dad is using ChatGPT now, right? Just commonly in his day-to-day life.

And so we've gotta be able to supply the compute power for that, and the energy requirements to power all of the machines that are doing it. And so, guess what, as utilization of that technology skyrockets, so does everything that needs to come along [00:14:00] with it to support it. And that's kind of your area of expertise.

So would you maybe start by just giving us a little bit of background on this topic?

JB: Sure. Yeah. As you mentioned in the intro, I've been in storage, right? Data storage and the transition to flash-based storage, and then, at ScaleFlux, looking into emerging memory technologies as well. So we're coming at it from the data pipeline perspective: your data gets created, it needs to be stored, it needs to be moved through memory and moved to the processors. Well, AI has changed how that needs to happen, because the capability of the GPUs to consume data has just far outpaced the ability of [00:15:00] local storage and local memory to have the data there and feed it into the GPUs.

So we talk in the industry about this as the memory wall, and that is the gap between the processor's ability to consume data and the data pipeline's ability to deliver data to the GPU.

And I think in the industry, what is it, every 18 to 24 months the processors have increased in data consumption capability by like three x, while the memory side has only increased by 1.4 to 1.8 x. And, you know, you apply that multiplier over time, and the gap gets wider and wider and wider. So in order to keep those power-hungry and dollar-hungry GPUs fed, [00:16:00] so that you can get work done from them, we have to innovate in that data pipeline and how we provide data to the GPUs.
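Compounding the per-generation rates JB quotes (processors roughly 3x, memory 1.4 to 1.8x every 18 to 24 months) shows how quickly the gap widens. The numbers below are illustrative midpoints, not measurements:

```python
# Compound the cited per-generation growth rates and track the widening gap.
proc_per_gen = 3.0   # processor data-consumption growth per generation
mem_per_gen = 1.6    # memory bandwidth growth (midpoint of the 1.4-1.8x range)

gpu_demand, mem_supply = 1.0, 1.0
for gen in range(1, 6):          # five generations, roughly a decade
    gpu_demand *= proc_per_gen
    mem_supply *= mem_per_gen
    print(f"generation {gen}: demand/supply gap = {gpu_demand / mem_supply:.1f}x")
```

After five generations the mismatch is already over 20x, which is the "memory wall" in miniature.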

Josh: Just outta curiosity, so what was the term you used earlier? Sorry, I need to try and commit this one to memory. The, it took us 60 years to get to one, and then...

JB: Petaflop.

Josh: Right, a petaflop. So, you talked about the data point of how it took us 60 years to get to one. Right. And then just from that point, it took 10 years to get to, what was the number?

JB: 10 billion.

Josh: 10 billion. Right. So quantify that for me, what does that mean? Is that like going from one horsepower in a horse and buggy to a Formula One car? Or, what does this...

JB: I think

Josh: ...in the sense of, what are the [00:17:00] impacts of...

JB: It's going from a horse and buggy to a Mars lander. I think in terms of power, if it was a Formula One car, they're what, maybe a thousand, a couple thousand horsepower? And we're talking going from one to 10 billion. So it's hard to fathom that growth.

Josh: So what does that mean in terms of what has to happen to the infrastructure and the pipes to support that kind of exponential growth? And again, I'm just trying to quantify it for a dummy like me, right? What does that mean? We've gotta go from having one little factory to now a hundred different data centers, from one little tiny pipe to 500,000 ginormous ones? Like, what does this actually mean?

JB: Well, what it means is, if we didn't innovate in the efficiencies around the processors, around the data storage density, around the network efficiencies or the network bandwidth, then there's no way we could sustain this, right? I mean, you couldn't go from a horse to 10 billion horses, right? You have to go from a horse to a rocket, or something like that.

Josh: That’s helpful.

JB: From an efficiency standpoint, the processors have grown tremendously in how efficient they are, in terms of how many operations per second and how many operations per watt they can deliver.

So that's a massive part of how we achieve that efficiency. Storage and memory, the same thing. I think when flash started years ago, it was Serial ATA, SATA-based, [00:19:00] and SATA could go 550 megabytes per second, right? That was the throughput you could get reading off of an SSD.

And that was 10 to 12 years ago, I don't know, time is a blur. And now we're on PCIe Gen 5, with Gen 6 coming around the corner, and at Gen 5 we're taking in 14 gigabytes per second. So that's 28x, if I'm doing my math right, 28x the throughput, just reading off of that same little two-and-a-half-inch drive.

So, a tremendous amount of improvement in efficiencies there, because we're doing that within the same power envelope too, right? And network speeds have gone from a gigabit per second to 400 gigabits per second. So, just massive increases there, and it's all of these efficiencies compounding on each other that keep it from being that we need, say, the power of the sun in order to [00:20:00] do what we're doing.
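For the SATA-to-PCIe jump JB describes, the ratio is easy to check against the two throughput figures he gives (550 MB/s for SATA, about 14 GB/s for a PCIe Gen 5 NVMe drive); his on-the-fly "28x" rounds a little high:

```python
sata_mb_s = 550    # SATA sequential read throughput JB cites, MB/s
pcie5_gb_s = 14    # PCIe Gen 5 NVMe sequential read he cites, GB/s

speedup = pcie5_gb_s * 1000 / sata_mb_s   # convert GB/s to MB/s, then divide
print(f"~{speedup:.1f}x the SATA throughput")
```

The exact ratio comes out to roughly 25.5x rather than 28x, which is still the same order-of-magnitude story.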

Josh: Yeah. Okay, that's where I was trying to get to, because that was really helpful, walking me through it that way. JB, thank you. That was what I needed to wrap my head around it, right? You can't go from one horse to 10 billion horses and then expect 10 billion horses to somehow carry a payload to Mars.

The horses are just not equipped for that, and we don't have enough real estate to put 10 billion horses on, so we have to completely shift and go to building rockets. Right. And that's kind of what we're talking about here too, right? In the way that we're using technology, it's not just going from one horse to 10 billion horses.

It's going from a horse to a rocket. And I want to say, when you and I were talking the first time we met, while we were scheduling the podcast, [00:21:00] you were talking about how, if we were to just go at the pace we're going with zero innovation, we would run out of power to power all of these things in pretty short order, right?

JB: Yeah.

Josh: So we have to have some form of innovation. I mean, we're already having things like rolling blackouts through California, and we're already talking about having to be more thoughtful about sustainable energy and all of this. And at the same time, we're creating tools that require obscene amounts of energy.

So those two are kind of butting heads a little bit.

JB: Yeah. Yeah. I missed part of that in the middle, but I think it does come into the efficiencies overall. And you have to look at not just the individual components, but how they interplay with each other, and start to look at system-level optimizations, rack-level optimizations, data [00:22:00] center-level optimizations, in order to continue down the curve of efficiency.

You know, outside of the components themselves, it's a struggle, because they're definitely vastly more efficient in terms of work they can do per watt, but the density of the power is also increasing. So, it used to be that a rack was a few hundred watts, right?

Or your enterprise-class server processor was a hundred to 200 watts, and now we're talking over a kilowatt for that processor and the GPUs. And to deal with that power density, we have to innovate in cooling. Because historically we relied upon air, right? Those whiny, high-pitched fans in your [00:23:00] data center, pushing air as fast as they could across the fins to move heat away from the chips, and then out through the rest of your HVAC system.

And there's only so much air can do, right? So now there's liquid cooling, which is really like the radiator in your car: instead of relying on air alone, you have liquid contained in a radiator attached to the chips, carrying the heat away. And then there's full immersion cooling, which is even more efficient. That's where they take vats of, I don't know the liquid they put in there, but they just stick the servers right down into the racks of liquid and move the liquid across to get the heat away from the processors. And I don't have the stats on the efficiency metrics off the top of my head, but full immersion cooling [00:24:00] is significantly more efficient, in terms of how much energy you use to cool the chips, than air, right?

In data centers, there's a metric called PUE, or Power Usage Effectiveness. That was, for every watt of compute or networking or storage power you used, how much extra did you have to use for cooling, lighting, et cetera, in your infrastructure? And it used to be that 2.6 was an awesome metric, and the hyperscalers have gotten that down to about 1.2.

Josh: Okay.

JB: Then it was threatening to come back up, because you just couldn't continue to get efficiency improvements with air. And so these other cooling technologies are helping keep that PUE pretty close to one.

I mean, one would be the [00:25:00] ultimate, right?
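PUE, as JB describes it, is total facility power divided by the power that goes to the IT equipment itself. A minimal sketch, with illustrative wattages chosen to reproduce the two figures he mentions:

```python
def pue(it_watts: float, overhead_watts: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power.

    1.0 is the ideal floor: zero cooling, lighting, or distribution overhead.
    """
    return (it_watts + overhead_watts) / it_watts

print(pue(1_000, 1_600))   # 2.6 -- the older-era "awesome" figure
print(pue(1_000, 200))     # 1.2 -- roughly where the hyperscalers got to
```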

Josh: Yeah, again, this is why this fascinates me. This is just not an area I've done a lot of research in, but I remember seeing something about, and it's probably longer ago than I think it is in my head, Google bringing some of their data centers, the ones they put into capsules and sank under the ocean, back up to check on them.

And that was a huge moment when it happened. So these are just different techniques we're trying to use to maximize our power efficiency, so that for every bit of compute power, we're using less and less energy. Right?

JB: Right, right. That is the goal. 

Josh: Well, and again, that's why this stuff is so fascinating. By the same token, I was just trying to read up on, oh my gosh, what was it, Microsoft, [00:26:00] Microsoft's new quantum computer. And I was reading less about the actual computer itself and more about the cooling system they were having to put in place just for this quantum computer to run.

Microsoft not only had to develop a quantum computer, they had to develop the coldest environment we've ever measured in history. I'm like, that's crazy. Again, you're talking two totally different areas of expertise having to come together to create this one tool, and it's all because it needs to be so darn cold to even function.

Like that’s wild to me.

JB: Yeah. You mentioned the quantum computing, and I don't know too much on that, but quantum-safe encryption needs to come along with it, right? How do we encrypt and [00:27:00] protect data such that a quantum computer can't crack the encryption? Because as a storage guy, I'm worried about how do I protect the data.

How do I make sure that it's safe, that nobody can crack into your secret data? But quantum-safe encryption, it seems like that's still a ways off, because it just takes too much energy and too much compute power to encrypt the data, so it's just not practical to deploy yet.

Josh: That's crazy. But again, these are where the innovations are happening, right? If we told people even five years ago about some of the stuff we have access to today, they'd think we were crazy. And yet here we are, and we have it, right? Probably even a large swath of tech-savvy and tech-forward people, if we had told them that I could pull up ChatGPT on my cell phone at a bus stop and do some of the things I can do with it...

Even just [00:28:00] five years ago, they would've thought I was crazy, right? And yet here we are. You don't even have to work in technology. You could be flipping burgers at McDonald's and leveraging ChatGPT to do complex computations. That's pretty cool. So it'll be interesting to see what happens in five years.

And so where do you think that, like the focus on innovation will need to happen in the next five years for that kind of stuff to actually come to fruition?

JB: So there are several aspects of the innovation that have to come into play. I mean, the software has to be there, right, for these models to continue to evolve. But I'm kind of one step away from those applications, and more on the how do we get the data there.

And from what I see, it's innovation in the file systems and [00:29:00] the data orchestration layer. How do you give your GPUs access to all of the data that you have, across all of your servers, and, for larger companies, across all of the geographically dispersed sites that you have?

And then, once you've got that data orchestration and massively parallel file system resolved, how do I get the storage and the networking fast enough? How do I get the memory fast enough? Because there are orders of magnitude more latency as you move from the cache that's within the processor itself, to local DRAM, to distributed DRAM, to storage, right? These are orders of magnitude higher latency levels, going from nanoseconds to microseconds to milliseconds, depending on how far away things are. And to deal with this growth [00:30:00] in compute capability, we've gotta be able to grow the volume and the speed of the data. We've gotta be able to have more data, faster, with greater access for those GPUs, in order to keep them from starving.
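The latency ladder JB walks through (processor cache, then local DRAM, then distributed memory, then storage) spans several orders of magnitude. The tier latencies below are typical ballpark figures for illustration, not measured values:

```python
# Ballpark latency per tier, in seconds (illustrative orders of magnitude only).
tiers = [
    ("CPU cache",         1e-9),   # ~1 ns
    ("local DRAM",        1e-7),   # ~100 ns
    ("distributed DRAM",  1e-6),   # ~1 microsecond
    ("local NVMe storage",1e-4),   # ~100 microseconds
    ("networked storage", 1e-3),   # ~1 millisecond
]

base = tiers[0][1]
for name, latency in tiers:
    print(f"{name:18s} ~{latency / base:>12,.0f}x cache latency")
```

The bottom of the ladder is roughly a million times slower than the top, which is why where the data sits dominates how fast the GPUs can be fed.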

An analogy that I used was: imagine you're a farmer, and you've got this huge field that you're planting with crops, and you've got a whole lake of water to keep them irrigated. The field is your processor, the lake is all your data. But if all you have is a little tin watering can, your field is gonna go fallow, right? Your crops are gonna starve, they're gonna die. So you need to innovate in your irrigation system, that data pipeline, a way to get data from storage and from memory into those GPUs and distributed there, so they can stay watered.

Josh: I [00:31:00] like that analogy. I want to explore it with you, because my simple brain is trying to think through how many different variables there are to what an end consumer of any technology sees, right? Let's use the simple example of: I'm just sitting at a bus stop and I'm using ChatGPT for whatever.

Right? To me, that seems really simple. I just open up the app on my phone and type stuff into the bar. That's it. But what's happening underneath the surface is so many different things that are all working together, but are all on their own separate innovation tracks, right? So using your analogy, and I'm sure there are umpteen million variables we're not even accounting for that only make it that much more complex...

But let's [00:32:00] just say it's as simple as the field and the crops, the water to water the crops, and the pipe to get the water to the crops. Well, say you start with a perfectly stable system: you have a field that's only so big and grows so many crops, and you have exactly the right amount of water for those crops, and the correct irrigation system to bring the right amount of water to them at the right time.

You’re good. You have a great stable system, but let’s say that’s to feed your family of five.

JB: Right.

Josh: But then say you want to be able to feed your neighborhood. Well, I need more crops. So you could just plant more crops, but then you would have to, in theory, perfectly add more water, and more pipe to carry that water to those crops.

Well, what if you could find ways to innovate in any one of those areas, [00:33:00] right? So now we say, okay, I'm gonna create a higher-density potato with more calories, so I need to grow fewer potatoes to feed a larger number of people. Right? Okay, that's one innovation that helped you. But what if all of a sudden that required significantly more water than you have? Well, now you've gotta find an innovation in water, because your innovation in potatoes is useless without the extra water. So you create more water. But now your pipe system can't carry it. So now you've got the more efficient potato, you've got more water, but you've got no way to transport it between the two.

So your innovation in the end user experience hasn't actually changed anything, because the whole stream hasn't caught up. And I think that's what [00:34:00] you're telling us, right? There's this constant battle where each of these groups is evolving and innovating at different speeds, and we're having to meet the lowest common denominator.

So you could have this incredible computer that could do all of this stuff, but if you don’t have the cooling for it, well, you’re kind of crap outta luck. Your computer’s useless, right? So there’s this constant push and pull of innovation that’s happening in each of those.

JB: Yeah, I mean, in the computing world it’s a constant game of whack-a-mole as to what the current bottleneck is, right? As soon as you solve one, another one pops up. So

Josh: This is probably an overgeneralization, but maybe not. What would you say is the current biggest bottleneck?

JB: I think it is feeding the GPUs the data. It is the pipeline, and the efficiency of loading the data into the GPUs and getting it back out. So we [00:35:00] talk, and NVIDIA, this is a public thing, NVIDIA’s talked about, hey, the data chunk size that we need to process on is very different from the data chunk size that storage serves.

Like, we need something much, much smaller. And when we ask storage for the smaller chunk, we actually get less total data in, because we don’t get more IOs. So, you know, bandwidth is the IO size times the number of IOs, right? Or mass times velocity. And so if you can’t increase the velocity when you decrease the mass, then you just got less. And so it’s how do we innovate in the access to data that is in storage, to better match it and increase that payload of data that’s reaching the [00:36:00] GPU so that they can stay hydrated.
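JB’s mass-times-velocity framing is easy to sanity-check with toy numbers (the figures below are hypothetical, not NVIDIA’s or ScaleFlux’s): if the IO size shrinks 32x but the drive only serves twice as many IOs, the payload reaching the GPU collapses.

```python
def bandwidth_mb_s(io_size_kb: float, iops: float) -> float:
    """Bandwidth = IO size ("mass") times IOs per second ("velocity")."""
    return io_size_kb * iops / 1024  # KB/s -> MB/s

# Storage tuned for large chunks (hypothetical numbers):
large_io = bandwidth_mb_s(io_size_kb=128, iops=8_000)   # 1000.0 MB/s
# The GPU asks for 4 KB chunks, but the drive only doubles its IO rate,
# far short of the 32x needed to keep bandwidth flat:
small_io = bandwidth_mb_s(io_size_kb=4, iops=16_000)    # 62.5 MB/s
print(large_io, small_io)
```

Decreasing the mass without a matching increase in velocity leaves the GPU with a fraction of the data per second, which is exactly the starvation JB describes.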

Josh: Hmm. Again, I like your farm analogy from earlier, this is really helping me. ’Cause in any scenario, there are possibly multiple different ways to accomplish it, right? So if we’re saying right now we’ve got enough water and we’ve got big enough crops, we’re just not able to get the water from the lake to the crops. One method would be to get more pipes of water. Another method would be to get bigger pipes. Another method may be totally different, and it may be saying, reduce the size of a water droplet but maintain the hydration benefit that it has at the end for the crop.

Or at the other end, right, reduce the amount of water the crop even needs. So maybe the potato used to need a gallon of [00:37:00] water to grow, but if we can make it so that it only needs half a gallon, then we only need to transport half a gallon. Or conversely, if one gallon of water creates the right amount of hydration for the potato to grow, what if we can genetically modify the water so that half a gallon is as good as a gallon to the same potato, even with no innovation in the potato? Right there, there are like four different ways we could solve the problem. So I’m assuming that what you’re seeing in your industry is there are probably people picking each of those different paths and trying to innovate in each of those.

Right.

JB: Yeah. Yeah. So we look at, I’ll start with memory, which is the fastest thing and the closest to the processor, but it’s also the smallest in capacity. And I mentioned earlier the cache that’s in the chip, in the processor itself. That’s the fastest, but also the smallest and the most expensive [00:38:00] per gigabyte or megabyte of capacity.

So recently you’ve seen what’s called HBM, or high bandwidth memory, hit the stage, and I think in Micron’s earnings they just said that this last year was the first year with over a billion dollars in revenue just on this new technology, HBM. And that gives a higher capacity of faster local memory that sits outside the processor.

And so that helps with solving this memory wall or data pipeline problem. But it’s not a complete solution, ’cause it’s still very expensive and still not enough capacity. So there’s this other technology called CXL, Compute Express Link is the full name.

You can look ’em up, computeexpresslink.org, I think. And that is allowing you to attach memory in a [00:39:00] different manner. Historically, you have to attach DRAM, like on your laptop, you’re probably familiar with DRAM, directly to the chip, the processor, ’cause the processor has a certain number of lanes that it can connect out to the memory.

But the volume and speed are gonna be limited by those lanes. The processor also has these other lanes that are used to connect out to storage and networking and other components, called PCI. So CXL allows you to connect memory off of the PCIe, or PCI Express, lanes. And that gives you a whole different way in which you can attach memory.

And so now you can exponentially increase the volume of that super fast memory that you can attach. And then even on the storage side, we’re looking at how do we improve the payload of data that comes off of the drives. Can you more intelligently select the data [00:40:00] that gets moved?

And/or can we align better with the size of IO chunk that the processors need? So these are innovations at all of these different stages, and dozens, probably hundreds, of players in the industry working on these different technologies.

Josh: You know what I keep thinking about in the back of my head, JB? For me, this is super fascinating, ’cause some of the stuff that you’re talking about, yeah, if you’re somebody who maybe works in technology for a community financial institution, maybe a lot of these terms make sense to you.

Very few of them make sense to me. These are not my

JB: I’m sorry. I.

Josh: right? No, I love it. I love it. That’s what I mean. Again, this is how we learn. I want to get outside the comfort zone of the acronyms I use on a daily basis and understand what is happening outside of our space. But what I keep thinking back to is, if I’m sitting there as [00:41:00] somebody who works for a community financial institution, I’m wondering, what does this all mean for me?

If I boil it down super crazy simple, and JB, I want you to keep me honest here, what this means is we’re talking about all of the millions of different factors that go into your ability to leverage the latest and greatest of technology for your specific use cases. This has everything to do with your access and cost to these things, right?

So, perfect case in point: you could have absolutely deployed an AI strategy for a specific use case at your financial institution 10-plus years ago, right? But it was probably astronomically expensive and required some really serious in-house technical expertise and aptitude to be able to pull it off.

But today, this stuff has been simplified down so much, and because of the [00:42:00] innovations that are happening in efficiency, it’s become cheap enough that you can use a lot more of this type of stuff. So I just wanted to draw that parallel for you. If you’re sitting here listening to this episode, what JB is talking about is all the behind-the-scenes stuff that’s happening so that I can just sit at a bus stop and pull out my phone and use ChatGPT, and it seems so simple.

JB: Yeah. And at your bus stop, just using ChatGPT, now you’re using ChatGPT instead of Google in a lot of cases, right? And behind the scenes, the infrastructure and the processing and the way that they’re managing the data and working to give you your response, your quick ChatGPT query versus the Google query takes 10 times as much power.

So, you know that’s another,

Josh: Power? Are you meaning like literal electricity?

JB: Yeah. Yeah. Kilowatt hours or [00:43:00] watt hours of energy. Yes.
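The 10x figure JB cites lines up with commonly quoted early estimates, roughly 0.3 Wh for a conventional web search versus about 3 Wh for an LLM query; treat both numbers as illustrative rather than measured. A quick sketch of what migration at scale does to daily demand:

```python
# Illustrative per-query energy figures (commonly cited estimates,
# not vendor measurements):
SEARCH_WH = 0.3  # conventional web search
LLM_WH = 3.0     # LLM chat query, roughly 10x

def fleet_energy_kwh(queries_per_day: float, wh_per_query: float) -> float:
    """Daily energy for a given query volume, converted Wh -> kWh."""
    return queries_per_day * wh_per_query / 1000

# If a billion daily searches migrated to LLM queries:
before = fleet_energy_kwh(1e9, SEARCH_WH)  # ~300,000 kWh/day
after = fleet_energy_kwh(1e9, LLM_WH)      # ~3,000,000 kWh/day
print(after / before)  # ~10x
```

Same query volume, ten times the daily energy, which is why the conversation turns to the grid.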

Josh: Okay, I am glad you said that, ’cause this is one of the things that keeps coming up in my head, and one of the things that absolutely terrifies me, if I’m being totally honest. I’m just gonna lay my cards out on the table, right? I think we’re smart enough at this point to know, at least I hope, that we have a finite number of resources on this planet, and we’re kind of going through some of ’em at some pretty rapid paces.

And what’s interesting is that we’re moving so much to electric everything, from cars to you name it. So across the board, our energy consumption of electricity is skyrocketing. And if you’re talking about just the energy needed to run ChatGPT being 10x Google...

Well, that wasn’t a big deal when like five people were using ChatGPT, [00:44:00] but all of a sudden, when the entire planet starts realizing, oh, this thing’s way better than Google, this is great, I’m gonna move to ChatGPT, well, we just 10x’d our expectation for energy consumption. How are we meeting that? If we’re already talking about having issues with electricity and the power grid now, I mean, we’re again going down the conversation of what becomes the weakest link.

And I’m thinking in my head, dude, this could be as simple as literally the freaking power grid just can’t handle this crap. Who cares about Google’s ability to manage its own electricity, but can the freaking power grid handle it? And all of a sudden, if everybody is lying in bed at night in town and decides they want to jump on ChatGPT, all of a sudden we nuke our own little grid and we go off grid and send ourselves back to the Stone Age by trying to [00:45:00] live in 2050 using ChatGPT.

These are the kinds of things that get my head going. And I’m like, I just don’t understand the complexities behind the requirements for all of this, but even my simple brain starts to wonder, what is this gonna mean for us long term?

JB: Well, it’s gonna mean we have to innovate. Sorry to go back to that, but through the decades, society and the engineering teams have been great at improving efficiencies, right? I mean, you look at how many light bulbs are used today versus 50 years ago.

But the power consumption is so much more efficient that even as we electrified the world, and this we only saw over the past few decades, I forget my specific stats on when that started, there was only [00:46:00] a one to two percent per year growth in total power consumption.

And that’s with all this electrification, using more light bulbs, using more electric vehicles, et cetera. Data centers are kind of the break in that. They’ve been growing at 12 to 16 percent per year in total consumption globally. So that’s eight x the rate of everything else.

And just, the stat was, I think from 2022 to 2024, we added the equivalent of France to the global power grid just for data centers. So data centers on their own are now, if you were to separate them out as a country, the fifth largest country in the world for consumption of electricity, behind China, the US, India, and Japan.

I think. So the power of data [00:47:00] centers, if they were their own country, they would be fifth. And if we didn’t have a lot of the innovation in, say, processors being able to do a hundred times the amount of work per watt as they did 10 years ago, and the efficiencies in cooling, et cetera, then you know that would be skyrocketing beyond there.

So the pressure is on to continue the innovation in that power efficiency and compute efficiency, to deal with being able to meet that increasing demand for consumption of compute capability. But yeah, we can’t rely upon one energy source to provide the electricity.

Fossil fuels are certainly a significant contributor, but we’ve gotta use renewables, we gotta use geothermal, we gotta use all kinds of innovation in how we generate that power.
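To make JB’s growth rates concrete, here is a small compounding sketch. The rates come from his one-to-two percent overall versus 12-to-16 percent data center figures; the 10-year horizon and the index base of 100 are arbitrary assumptions for illustration.

```python
def compound(base: float, annual_rate: float, years: int) -> float:
    """Project consumption after compounding annual growth."""
    return base * (1 + annual_rate) ** years

# Index both to 100 units of consumption today, project a decade out:
overall = compound(100, 0.02, 10)      # ~2%/yr overall electricity
datacenters = compound(100, 0.14, 10)  # midpoint of 12-16%/yr
print(round(overall, 1), round(datacenters, 1))  # 121.9 370.7
```

At those rates, data center demand roughly triples relative to everything else within a decade, which is why a France-sized addition to the grid in two years stops sounding surprising.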

Josh: I’m [00:48:00] glad you used that example of light bulbs. That helped me too, how we have more light bulbs than ever but very little additional power consumption, just because they’ve gotten more efficient. And yeah, I remember even when I was a kid, my dad yelling at me, like, turn the light off if you leave the room.

Like, I’m not paying for that, right? And now I have so many stinking lights in this house and they’re always on, and I don’t even pay attention to it. And I bet you, if we left for a month versus we’re home for a month, my electric bill would change by like five bucks. I mean, just the efficiency of these modern, super efficient LED bulbs and stuff, you don’t even notice.

So there is that happening. But I’m curious if you’ve heard, is there any conversation around, until we get better at energy optimization, how we handle this? I mean, very [00:49:00] bluntly, I’m gonna ask: have you heard of talk about throttling any of this stuff? Saying, hey, look, if we turn the world loose on using AI for absolutely everything at the pace at which it’s going, we cannot innovate power consumption fast enough, we cannot create enough energy fast enough, and we’ll put ourselves in a dire position if we don’t throttle this.

Like, is that even a thing? Is that being talked about?

JB: I haven’t heard of that as a direct initiative. However, in looking at the power supply and all this, I have seen plenty of instances where they’ve said, hey, like Northern Virginia, for example, which is a big data center alley, the guys that wanna build data centers, they’ve got the money, they’ve got the land, they can build the infrastructure, but the power supply isn’t there.

And so the data [00:50:00] center build-outs are getting delayed because the electric grid can’t be built out fast enough to supply the energy for these new data centers. And then if you look at, like, the Stargate initiative that the US has, some of those are now a co-development, right?

You’re building out all of this data center floor space, but you’re also building a power generation plant, co-located or nearby. You see Microsoft bringing up one of the reactors, I think on Three Mile Island, to support their data center build-out. And I’ve seen a lot of buzz around small nuclear reactors, SMRs, small modular reactors, where it’s like one nuclear reactor per data center.

And these aren’t the nuclear reactors that [00:51:00] I grew up with. They’re vastly more efficient, and they can even use waste from other, older nuclear plants. I mean, the innovation in the electrical generation is amazing, and it’s just an entirely different industry. So it takes all of these different industries and different technologies coming together to help satisfy this growing demand.

Josh: Yeah, I mean, that is what’s crazy. Just even in this hour that you and I have been chatting, how many different industries we’ve talked about that need to have innovation happen for innovation in AI to happen, right? There are so many different things that go into it. And as you were talking about power consumption and everything, and the different ways they’re thinking about it...

I think that’s also what creates some of these [00:52:00] things the public perceives as massive leaps in a technology. So again, keep me honest here, but let’s say tomorrow somebody comes outta nowhere, at least to us, right? And says, hey, we have created the, like, Tony Stark Iron Man reactor. We have unlimited energy, easily created, very few resources required.

All of a sudden, we increase our ability to deliver power by, I don’t know, something ridiculous, like 500000000000%, right? All of a sudden, yes, you’d see that crazy innovation in energy production, right? You’d see it in that sector, but you’d also see it in AI and data centers, right?

Because all of those data centers that you were just talking about that could not get built all of a [00:53:00] sudden can get built now, ’cause they’ve got the power. So you would see this massive innovation happen in data centers, because there was a massive innovation that happened in a totally separate vertical.

JB: Yeah, they feed off each other. And, like, already, you mentioned the GTC last week. So Jensen Huang talks about using their chips, the GPUs, and AI to help design the next generation of processors and the next generation of AI. So AI is helping innovate in AI. But also, in a different way, AI is being used to improve the efficiency of the power grid, improve the efficiency of the power generation.

So, everything feeds off of everything else.

Josh: Yeah. Isn’t that interesting? I don’t know, I’m sure you’ve seen, there was a picture going around the internet recently of some rocket scientists who had built an AI model to actually go out and redesign a more [00:54:00] efficient rocket booster.

JB: Okay.

Josh: And this thing is wild looking. Like it’s super trippy.

It looks almost like a negative-space painting as opposed to a more traditional painting. It’s super weird looking, but they’re showing all the stats of just how crazy more efficient this thing is than the rocket booster they based the design off of. They basically built the model for this AI, taught it, gave it all the data, all the different studies they’d done on their rocket boosters in the past.

And then they said, take this one and make it more efficient. And then they said, actually, just throw away everything we’ve done before and just build the most efficient one possible. And what it turned out, it kind of resembles a rocket booster, but kind of doesn’t. And it’s exponentially more efficient.

It’s cheaper to produce. So, it’s, like feeding that cycle is pretty fascinating to watch.

JB: Yeah. Yeah, because it can [00:55:00] iterate millions of times faster than we can iterate in our brains.

Josh: Yeah. What are some of the things that kind of concern you about the next set of innovations, and where the next set of bottlenecks could come? So let’s say you guys solve the bottleneck whose ball is currently in your court, right? That gets solved. Where do you think it shifts next, and what concerns you about that next stage?

JB: Networking will certainly continue to be one of the moles that pops up in the whack-a-mole of bottlenecks. I’m kind of thinking out loud here, but, you know, the processors, man, they’re just growing in their capability. I do worry about that power density becoming a bottleneck.

You know, there’s the innovation with the direct-to-chip liquid cooling and the immersion cooling, but [00:56:00] still, going from one person in the elevator to 10 people in the elevator, and how much heat gets generated there as those processors increase their power density.

That is something that is gonna be a challenge, right? How can you continue to cool that increasing power density? And maybe it’s this thing that you talked about for the quantum computing cooling, with Microsoft, just a totally different means to get there.

Josh: I think it’s interesting that’s what you brought up. ’Cause, again, this is not my area of expertise, but that’s where my head was going too. The one that kind of concerns me beyond just the power consumption is the cooling of all of this stuff. And again, going back to what you were saying, we started with fans.

Well, fans are only gonna get efficient to a certain point, and you’re gonna max that out probably pretty quickly. At the same time, those fans take [00:57:00] electricity, so you gotta generate electricity to move the fans, et cetera. So there’s that thing. And then we’re talking about liquid cooling and stuff like that.

Well, we’ve gotta get this liquid, so what is it? And sure, if it’s water, we’ve got quite a bit of that. But I would venture to guess it’s probably not just straight water, right? So again, we’re using more resources. Even just thinking about the whole Google sinking-that-thing idea, I don’t know. Like, I’m a technologist.

I love the innovation. I like capitalism. I think it brings about some pretty cool stuff. But at the same time, I don’t know, man, I don’t know how I feel about sinking a bunch of stuff on our beautiful ocean floor. And do we wanna get to the point where my grandkids don’t even really want to go swimming in the ocean anymore?

’Cause it’s just a bunch of Google storage containers for data centers right off the coast, and the coast isn’t pretty anymore. I don’t know, these are the things that you think about. [00:58:00] Again, is this gonna be, I don’t know if I wanna use the word catastrophic, but catastrophic in my lifetime?

Probably not. In my kids’? Maybe. In my grandkids’? Really, maybe. I don’t know. You start to get to the point where you see some of these old sci-fi movies and it’s like everything has been industrialized, there’s no nature. I don’t know, do we really want that for our planet? I don’t think so.

So I, I don’t know. It’s

JB: so I,

Josh: crazy like give and take, but at the same time, I really love using ChatGPT, so.

JB: Yeah, I mean, there’s a lot of work on sustainability and how we make things more circular with the heat generation. And I saw something recently, and [00:59:00] I forget which vendor was doing it, but they were taking some of the heat that’s generated from the processors and the compute infrastructure and then leveraging that for heating other buildings. So rather than it just becoming waste heat that we need to get out into the environment, it’s like, hey, well, we need heat elsewhere, so let’s reuse this. So, I think we can, it’s all a matter of time, right? But we can certainly start to make things more recyclable, more of a circular use of all of the aspects that go into supporting the data center infrastructure.

From the components to the

Josh: Yeah, that kind of stuff is fascinating.

JB: Yeah,

Josh: I should probably do a little more research on it before talking about it too much. ’Cause it was just, I’m pretty sure it was an ad on Instagram, but I remember [01:00:00] seeing this, I wanna say it’s on, like, Kickstarter or something right now, and it’s a little portable heater, like an office space heater.

But it’s a Bitcoin mining machine. And they’re like, hey, we’re gonna use this thing to mine Bitcoin, but at the same time, the heat that’s generated we’ll use as a space heater. So they’re like, hey, we wanna mine more Bitcoin, but also you need to heat your office, so let’s do both at the same time.

I’m like, ah, that’s pretty cool. Simple little solutions. It’s not like solving world hunger, but, hey, a step at a time.

Yeah, rather than just wasting the heat. Yeah. I was like, ah, that’s pretty cool. So, yeah, I don’t know, maybe we’ll start to see more stuff like that. And again, all it takes is one huge breakthrough in one area, and like you were saying, it totally shifts which [01:01:00] mole we’ve gotta whack.

JB: yeah.

Josh: Well, JB, this has been absolutely fascinating, man. Thank you so much for coming and talking to me about this stuff. Just to peel back the layers of transparency, it was JB’s PR team that reached out and was like, hey, we have a guest for your podcast.

And I got the bio and what they were pitching, and I was like, I don’t think this makes any sense for me. And then I was like, no, wait, actually, this is really fascinating. Because like I was saying, if you’re sitting there as somebody who’s taking up the charge, leading the next generation of innovation for a community financial institution, a lot of that’s centered around tech.

And it’s really fascinating to understand what’s happening below the tip of the iceberg in tech, to be able to give us the ability to have this type of tool set at our disposal, unlike anything we’ve ever had before. And it’s all of these different advancements [01:02:00] that are happening across our farm field.

To be able to help us do the things that we’re doing in our industry. And again, I know it sounds silly, but the place of a community financial institution is hopefully to improve the financial health, wellbeing, and future of their account holders and the communities that they serve.

And today, a lot of that’s done through technology. And so all of this innovation that’s happening in making data centers more efficient, as crazy as it sounds, is also having an impact on our ability to provide those really amazing services to our communities and our people. So all of this stuff goes hand in hand.

And it was really fascinating to talk through with you just what’s happening, how it’s impacting our industry, and how it’s taking place in yours. So thank you.

JB: And a lot of the things we talked about, [01:03:00] the environmental impacts, they’re a little more abstract than what your primary listeners might be worried about on a daily basis. But all of these innovations also contribute to making all of their IT infrastructure more cost efficient,

such that they can get more done, not just per watt, like I think about that efficiency, but more done per dollar that they spend on their IT infrastructure as well, through the innovations that are being delivered.

Josh: Yeah, that’s a great point, right? I mean, all of this, again, is driving down cost. Think about how much it used to cost you, maybe you were a credit union that ran your core and maybe digital banking in-house, what that cost you holistically, versus now running it on shared compute resources in a cloud environment.

And maybe it’s not even just about [01:04:00] dollar-for-dollar cheaper. Not only is it maybe cheaper, but it’s exponentially more efficient. You’re getting so much more processing power, more transactions are flowing through, and you’re able to offer more services to your members, right? All of that is because of these innovations that are happening here.

So, yeah, good point. Well, JB, before I turn you loose on your day, I’ve got two final questions for you, sir. First off, where do you go to get information about what’s happening in the industry, and how do you stay informed on which mole is popping up that you guys gotta whack?

JB: Well, I recently found one newsletter, and they do a pretty good job of hitting different topics and putting ’em in historical perspective. It is more AI focused, and that’s AI Supremacy. And then

Josh: I like 

JB: For the industry that I am very focused on, the data storage and the memory, [01:05:00] Storage Newsletter and StorageReview are two of the kind of quintessential sources that we go to, to find out what’s new and interesting, what’s going on with different vendors, what’s going on with different technologies. And then at the data center level, as I’ve been doing more of this global type of research, Uptime Institute is a great source for what’s going on with data centers.

And they often have surveys and results, so you can see how you’re doing against your peers, or what other people are worried about with their data center infrastructure.

Josh: Oh, that’s awesome. I’m gonna have to check a few of those out for sure. And then, last but not least, if people want to connect with you, or if they wanna learn more about ScaleFlux and what you guys are doing, how can they do that?

JB: Sure. LinkedIn’s pretty easy. I’m JB Baker on LinkedIn, and you can find me via ScaleFlux. We’ve got our [01:06:00] website, scaleflux.com. We’re very active on LinkedIn as well, and we even have a YouTube channel, so you can search ScaleFlux on YouTube and see our channel there.

And we have mostly educational videos. We have one recently that’s pretty fun, so you might take a look at the fun video too.

Josh: Well, now I’m intrigued. No, that’s awesome. Thank you, sir. Well, again, I really appreciate this. This has been kind of a fun detour from the normal conversations that I would have on the podcast, but it was really informative about, again, just how innovation both in and out of our direct industry directly impacts the innovation that happens in our industry.

So, thanks for taking time to come and be a guest, JB. I appreciate it.

JB: Thanks. I appreciate the opportunity here, Josh.
