Monday, December 23, 2024

From Chips to Systems: How AI Is Revolutionizing Compute and Infrastructure | William Blair


In this episode of William Blair Thinking Presents, Jason Ader, co-group head of the technology, media, and communications sector, interviews Sebastien Naji, an equity research associate specializing in semiconductors and infrastructure systems, on their team’s latest report, “From Chips to Systems: How AI Is Revolutionizing Compute and Infrastructure,” which explores the shift from serial to parallel computing, the rise of AI accelerators, and the evolving landscape of semiconductor companies.

Podcast Transcript

00:20
Chris
Hey, everybody. Today is October 16th, 2024. We’re back for another episode of William Blair Thinking Presents, but with a bit of a spin this time around. Instead of yours truly moderating, we’re welcoming back Jason Ader. He’s partner and co-group head of William Blair’s technology, media, and communications research sector. We’ve had him on a couple of times now.

And then he’ll be interviewing Sebastien Naji, an equity research associate in our technology research group who focuses on semiconductors and infrastructure systems.

And before I hand it off to Jason and Sebastien, the audience may remember past episodes in which I spoke with the William Blair technology team about the shifts caused by the rise of gen AI since the launch of ChatGPT, and how this new technology has reshuffled investments and demand for new infrastructure, data services, and security.

In this new report, called From Chips to Systems: How AI Is Revolutionizing Compute and Infrastructure, Jason and Sebastien dig one level deeper to understand fundamentally how the rise of AI impacts the computing layer and, really, the entirety of data center infrastructure technology.

So with that, Jason and Sebastian, welcome to the show.
Jason, I thought we’d kick it off by having you walk through a brief history of computing, which will set the stage nicely for the rest of the key takeaways of the report.

01:41
Jason
Yeah. You know, I would say we’re here today to really drill down, in some ways, on just the hardware element of this paradigm shift to generative AI, to have Sebastien go through some of the research that he’s done on that topic, and to talk about what we can think about going forward.

You know, one of our key takeaways in the report focuses on how AI represents the next generational shift in computing, and on the breakthroughs that we’ve seen on the computing side that have really enabled that shift.

We are definitely still in the infancy of generative AI. I think we all agree on that.

The expectation is that it will create a massive multi-trillion-dollar market opportunity for a lot of different players over the next decade or so. And I think that sets the stage for this discussion. So, Sebastien, more specifically, can you talk a little bit about your research on the history of computing and how we’ve arrived at this point today?

03:00
Sebastien
Sure, yeah, and thanks, Chris, for having me on. Jason, as you said, Gen AI has become a focus over the last two years, really because it underpins the next platform shift in computing. We’ve been through similar platform shifts in the past. I think if we look back at the history of Silicon Valley in the 1960s through the 1970s, the mainframe really emerged as the first accessible computing platform for enterprises.

And then in the 1980s through the 90s, we saw the rise of the PC, which brought with it new ways of working and a new set of computing leaders. And then we saw another shift again in the early 2000s towards mobile computing as smartphones became the primary driver of computing.

And I think now, with the release of ChatGPT in November 2022 serving as sort of the iPhone moment for gen AI, we are currently in the early innings of this latest platform shift in computing toward gen AI applications and parallel computing infrastructure.

And I think it’s important to note that every time there has been one of these platform shifts, it has resulted in a realignment of the key vendors within the semi industry. Each time, a different set of companies emerges as the leaders. And that’s why, in our report, we really focused on what is changing as we shift to gen AI and which companies in the semi space are best positioned to lead this generational shift.

04:13
Jason
Great.

One of the things that I think has been on people’s minds is the fear that we’re in some type of a bubble here with generative AI. Can you talk about your perspective on how this paradigm shift might be different from what we’ve seen in the past, and your prognostication on how long a runway we have here for this supercycle?

04:41
Sebastien
Yeah, I think it’s an important question, and I think everyone’s really focused on how much spending can continue here. There’s already been a ton. If we look at the infrastructure layer specifically, the scale of computing required and the high cost of these specialized AI systems really has demanded, and continues to demand, massive investment.

We’ve seen the hyperscalers all shell out billions of dollars in CapEx in order to stand up these new AI data centers and train the latest and greatest LLMs. Model performance continues to improve with scale, and there are new types of data: synthetic data, audio data. And so, as training has become a massive source of demand for AI computing platforms, I think expectations are for that investment momentum and that CapEx to continue for the foreseeable future.

And then if we look beyond training, we’re also starting to see growing demand for inference infrastructure, which can power this new generation of AI applications and workflows. The rise of agentic AI, for example, which is essentially having multiple models all talking to each other, is just one example of how, even on the inference side, the computing requirements remain extensive.
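
To make the “multiple models talking to each other” idea concrete, here is a minimal toy sketch in Python. The model_call function is a placeholder of our own invention (a real system would call an LLM API); the point is simply that each agent round triggers additional inference calls, which is why agentic workloads multiply compute demand.

```python
# Toy sketch of an "agentic" loop. model_call is a stand-in for a real
# LLM API call; every planning/critique round below is an extra
# inference request, which is why agentic workflows multiply compute.

def model_call(role: str, prompt: str) -> str:
    # Placeholder: a real system would send `prompt` to an LLM here.
    return f"[{role} response to: {prompt[:40]}...]"

def run_agents(task: str, rounds: int = 3) -> str:
    draft = model_call("planner", f"Plan how to accomplish: {task}")
    for _ in range(rounds):
        # Each loop iteration costs two more inference calls.
        critique = model_call("critic", f"Find flaws in: {draft}")
        draft = model_call("planner", f"Revise {draft} given {critique}")
    return draft

print(run_agents("summarize a quarterly earnings report"))
```

With three rounds, one user task fans out into seven model calls rather than one, which is the inference-demand multiplier Sebastien is describing.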

And then as AI applications move into production, I do expect that the inference opportunity will start to eclipse that of training. All in, across both training and inference, we’ve seen estimates of this market opportunity that range from hundreds of billions of dollars today, growing to trillions of dollars in spending over the longer term. And this opportunity is driven by a combination of new AI workloads that are being built on these parallel computing systems, as well as the replacement of existing serial IT infrastructure that helps support some of these newer parallel computing workloads.

06:22
Jason
So I think it was Mark Twain who said that history doesn’t repeat itself, but it rhymes. Can you maybe talk about some of the similarities and differences between this current era of generative AI and the cycle we’re seeing here, compared to the late-nineties dot-com era?

06:40
Sebastien
Yeah. I mean, I think there are a few differences. I think everyone’s looking at the 1990s and the dot-com bubble as a potential forewarning of what this cycle might look like, in terms of the heavy investments that we’ve seen so far.

I think what’s a little bit different here is, first, we have real demand for this infrastructure. In the late 90s, a lot of the build-out of the infrastructure for the internet came well ahead of the demand for it, and was powered by startups and larger companies with highly levered balance sheets that were pushing some of these investments in a very speculative way.

This time around, that’s not what we’re seeing. We’re seeing the largest companies in the world, with very strong balance sheets and a lot of cash, spending on building infrastructure and data centers that are being used today for training these models. And then, in the very early innings, they’re also running inference, not only for their own applications and the AI they’re building into those applications, but for potential enterprise customers that are also starting to build their own AI applications.

And so I think it’s a very different supply-demand environment than it was in the 90s. It’s a very different financial-health environment in terms of liquidity risk, the lack of leverage that we’ve seen, and the fact that interest rates are not at zero like they were in the early part of that dot-com bubble and are actually potentially coming down. Overall, it does put us in a position that is a little bit healthier, and for me, it gives me confidence that there’s continued momentum here, at least for the next few years, to continue investing in this platform.

08:29
Jason
Yeah. I think it’s fair to say, too, that there was massive value creation in the period from 1995 to 2002 or so. Even though we think about a sort of bubble bursting, some of the biggest companies today were created in that period. So I think that’s a good parallel, maybe a similarity: I think there will be a tremendous amount of value creation during this period. And as you said, there’s a healthier financial backdrop for the companies making the investments.

Let’s shift over to the nuts and bolts of the semiconductor industry right now in terms of how it’s changing. You alluded to that earlier: the shift from serial computing to parallel computing. Maybe talk about the broader concept of verticalization of the computing stack and why that might be important, in particular, right now.

09:33
Sebastien
That’s a great question, and it really gets to the heart of our whitepaper. I think since the invention of the transistor, there’s been this guiding paradigm in computing, or in Silicon Valley, called Moore’s Law. It observes that the number of transistors on a chip, and with it the processing capability, doubles roughly every two years. And for many decades, that steady improvement in processing was really driven by improvements in transistor science, which allowed manufacturers to shrink transistors and fit more and more of them onto the same size chip.
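
As a quick back-of-envelope sketch of what that doubling cadence compounds to (the starting transistor count below is illustrative, not a figure from the report):

```python
# Back-of-envelope Moore's Law arithmetic: doubling every two years
# compounds to roughly a 1,000x density gain over two decades.
# Starting count is illustrative (an Intel 4004-class chip, ~1971).

start_transistors = 2_300
for years in (2, 10, 20):
    doublings = years / 2
    count = start_transistors * 2 ** doublings
    print(f"after {years:>2} years: ~{count:,.0f} transistors")

# after  2 years: ~4,600 transistors
# after 10 years: ~73,600 transistors
# after 20 years: ~2,355,200 transistors  (about 1,024x the start)
```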

So you got higher transistor density, which led to faster, more powerful chips. But more recently, I think this trend has really stopped working, in large part because we’ve started to reach the limits of transistor science. As we’ve gotten to five-nanometer and three-nanometer node sizes, the ability to shrink further is running up against very real physical limitations.

I mean, at these very small scales, the characteristics of individual atoms become important. And so, as a result, it’s taken longer and become increasingly expensive to build smaller transistors and denser chips. Instead, semi companies have decided to look up the stack at the full computing system in order to help drive continued, predictable performance gains.

And so system-level engineering is now an important new source of performance improvements, as semi vendors build capabilities in networking, software, and clustering, all to help run workloads more efficiently on their chips. This has meant that the real value delivered by semi companies in the world of AI is not so much faster GPUs or CPUs, but rather much faster data center systems that come already integrated with specific amounts of memory, storage, networking capabilities, and software, all together resulting in better, faster performance for AI workloads.

And you talked a little bit about the shift here from serial to parallel computing. Underpinning all these systems, at the end of the day, is this shift to parallel computing on GPUs. And that’s being driven by just the massive computing requirements of AI workloads. Training LLMs, inferencing, it all requires thousands of times more computing resources than your traditional application.

And that’s why parallel computing on GPUs, which chops a problem up into lots of smaller chunks and runs them simultaneously, has become so popular. It greatly reduces the amount of time it takes to train a model or perform a process. So you’re talking about training times coming down from years to weeks by shifting away from the serial computing of CPUs toward the parallel computing that is taking over the data center with the rise of GPUs and AI.
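
For a concrete, if simplified, picture of that serial-to-parallel difference, here is a minimal Python sketch. The workload and chunk count are illustrative; real AI training spreads matrix math across thousands of GPU cores rather than a handful of CPU processes, but the chop-up-and-run-simultaneously idea is the same.

```python
# Minimal sketch of serial vs. parallel execution of the same work.
# Illustrative only: AI systems parallelize matrix math on GPU cores,
# but the principle of running chunks simultaneously is identical.
import time
from multiprocessing import Pool

def heavy_chunk(chunk: range) -> int:
    # Stand-in for a compute-heavy kernel (e.g., one shard of a matmul).
    return sum(i * i for i in chunk)

def make_chunks(n: int, parts: int) -> list[range]:
    step = n // parts
    return [range(k * step, (k + 1) * step) for k in range(parts)]

if __name__ == "__main__":
    chunks = make_chunks(10_000_000, 8)

    t0 = time.perf_counter()
    serial = sum(heavy_chunk(c) for c in chunks)      # one chunk after another
    t1 = time.perf_counter()

    with Pool(processes=8) as pool:                   # all chunks at once
        parallel = sum(pool.map(heavy_chunk, chunks))
    t2 = time.perf_counter()

    assert serial == parallel
    print(f"serial:   {t1 - t0:.2f}s")
    print(f"parallel: {t2 - t1:.2f}s on 8 worker processes")
```

The same total work finishes in a fraction of the wall-clock time when the chunks run concurrently, which is why moving from serial CPUs to parallel GPUs can collapse training times from years to weeks.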

12:15
Jason
Okay. And then, when you think about the implications of the verticalization of the computing stack, as you talked about, what does that mean for the margin structure of the semiconductor industry? Will that result in some type of sustainable improvement in margins? Or is this just a temporary shift, where over time prices come down, more competition comes in, and margins revert to where they’ve been historically for some of these companies?

12:46
Sebastien
That’s a good question, and my straightforward answer is that I think the leading semi companies have an ability to sustain these higher margins. Semiconductor vendors were for a long time viewed as cyclical vendors of commodity solutions. And while the cyclicality is still there, increasingly I think the best-performing semi companies are those that understand that building systems can reduce the commoditization of individual components and thereby improve their technical moat, their differentiation, and their pricing power, which at the end of the day drives those better gross margins.

And this is what I think we’re seeing from the best-performing semi vendors. It’s allowed the leaders in the AI space that we talk about in the report to drive gross margins above the 40% level that is more traditional in the semi world, to margins in the high 60s or even mid-70% range.

And in a sense, some of these vendors’ margins look a lot more like cloud margins, because rather than selling a component of a data center, they really are providing a fully bundled offering that embeds all the networking, system engineering, and software know-how into a full-stack solution.

13:57
Jason
Where do you see the main bottlenecks right now in the advancement of this generative AI paradigm?

14:07
Sebastien
I mean, I think there are a few bottlenecks. Ironically, GPUs, even though supply is limited, are not so much a bottleneck in these systems; they end up going underutilized in the vast majority of systems in which they’re deployed. And the reason is that the networking components and the memory components are all still playing catch-up.

So from an infrastructure perspective, networking and memory are two of the key bottlenecks. But as we expand the lens more broadly, because we are talking about data center systems and not just servers or boxes, there are also bottlenecks around power and energy. I mean, you need ten times as much power for some of these AI systems as you do for a typical system.

There are bottlenecks around cooling and whether there’s enough liquid cooling within these data centers. There are bottlenecks just in getting regulatory approvals and setting the foundations for these data centers, wherever they might be. So I think there are a lot of bottlenecks around building enough supply, and they range from within the infrastructure and the stack itself, at the network and memory level, to more broadly at the data center level: getting enough power, getting the right cooling, and getting all of the ancillary equipment built into the system so that it can operate and function holistically, the way that you want it to.

15:32
Jason
Okay. And I think the last question for me would be, at a very high level, how should we think about the return on this multibillion, tens-of-billions-of-dollars investment? I know there’s been some concern in the investment community and in the industry about, well, where are the apps? Are these billions of dollars of investment going to see a return? And what is the risk, back to the earlier point, that we’ll see this massive spending cycle and then there won’t be enough ways for companies to make money and, again, return that investment?

16:16
Sebastien
I think it’s a question everyone is focused on. One of the answers is that we are still very early. I mean, we’re still getting improvements in models and training. We’re seeing new models come out that can do new things, like reasoning and multimodality. So I think we’re still figuring out exactly what the end state of these models looks like in order to build an application that can run inference on top of that.

So I don’t want to put the cart before the horse when we’re still testing the waters and experimenting on the training side. But I do think that when you talk about ROI, that typically implies inference: what kind of value are we getting out of inference on these models?

I think we’re starting to see some of that revenue come in, at least very early on. I mean, some of the earliest companies working with AI have already built recommendation systems and search engine systems on top of this accelerated computing infrastructure. It’s a core part of their business, and it will continue to be a core part of their business.

We’ve seen the largest, or the most advanced, vendor of LLMs put out some numbers showing incredibly rapid revenue growth, a lot of it driven not so much by the consumer application but by the enterprise side and enterprise usage of its LLMs. We’ve seen some of the larger software companies talk about embedding these models into the back end of all of their applications and all of their systems.

And we’ve seen companies talk about how, just in terms of workflow improvement, they have already been able to speed up time to market for applications, reduce the costs of doing certain tasks, and automate things that used to require a lot of manual labor. So I think we’re starting to see those initial signs of ROI. I think the expectation should be that it will continue to grow very rapidly, and over the next 12 to 18 months, we’re really going to start to see some of those proof points show that there is a lot of ROI that can be attached to these investments.

18:15
Jason
Gotcha. So we’re still, call it, in the second inning of the arms race, and there’s more risk of underinvesting than overinvesting.

18:26
Sebastien
Yeah, that’s exactly right. And I think that’s the sentiment we’ve gotten, at least over the last six months, from some of the leaders at these largest companies making these massive investments: right now, they see the risk as underinvesting in this platform change and all the potential benefits that will come with it, rather than overinvesting.

18:46
Chris
Well, Jason, Sebastien, that’s all the time we’ve got for today. It’s been a real pleasure getting to learn more about this from both of you.

For those interested in the report, it’s called From Chips to Systems: How AI Is Revolutionizing Compute and Infrastructure. You can request a copy by reaching out to us at williamblair.com/contact-us. Thanks for taking the time to be with us today.
