1
00:00:08,320 --> 00:00:12,400
This is how intelligence is made.
2
00:00:12,400 --> 00:00:15,200
A new kind of factory,
3
00:00:15,200 --> 00:00:18,320
a generator of tokens, the building blocks
4
00:00:18,320 --> 00:00:21,480
of AI.
5
00:00:24,480 --> 00:00:27,359
Tokens have opened a new frontier,
6
00:00:27,359 --> 00:00:31,359
turning data into knowledge and drawing
7
00:00:31,359 --> 00:00:35,239
on all we have learned.
8
00:00:35,600 --> 00:00:38,800
Tokens are harnessing a new wave of
9
00:00:38,800 --> 00:00:42,200
clean energy
10
00:00:44,719 --> 00:00:50,039
and unlocking the secrets of the stars.
11
00:00:51,600 --> 00:00:54,320
In virtual worlds, they help robots
12
00:00:54,320 --> 00:01:01,239
learn, and in the physical world, perfect,
13
00:01:02,800 --> 00:01:07,159
forging new paths
14
00:01:11,119 --> 00:01:13,680
and clearing the way for a bountiful
15
00:01:13,680 --> 00:01:16,680
harvest.
16
00:01:16,960 --> 00:01:20,240
In the moments that matter, tokens are
17
00:01:20,240 --> 00:01:23,640
already there.
18
00:01:24,560 --> 00:01:27,200
And in the miles between, they never
19
00:01:27,200 --> 00:01:30,200
stop.
20
00:01:31,840 --> 00:01:37,159
They work where human hands cannot.
21
00:01:38,159 --> 00:01:42,840
So we may all breathe easier.
22
00:01:45,119 --> 00:01:50,360
And the smallest hearts beat stronger.
23
00:02:04,159 --> 00:02:09,560
Tokens are helping us break new ground
24
00:02:10,879 --> 00:02:13,760
on a scale never attempted
25
00:02:13,760 --> 00:02:18,360
to empower the world.
26
00:02:25,599 --> 00:02:28,000
So we can reach Starcloud-1.
27
00:02:28,000 --> 00:02:33,480
Separation confirmed. And well beyond it.
28
00:02:40,160 --> 00:02:43,200
Together we take the next great leap
29
00:02:43,200 --> 00:02:47,480
into a bright new future
30
00:02:49,360 --> 00:02:53,720
built for all mankind.
31
00:02:58,640 --> 00:03:01,280
And here
32
00:03:01,280 --> 00:03:05,080
is where it all begins.
33
00:03:13,760 --> 00:03:16,239
Welcome to the stage, Nvidia founder and
34
00:03:16,239 --> 00:03:20,519
CEO, Jensen Huang.
35
00:03:28,239 --> 00:03:32,200
Welcome to GTC.
36
00:03:36,720 --> 00:03:38,560
I just want to remind you this is a tech
37
00:03:38,560 --> 00:03:41,560
conference.
38
00:03:42,239 --> 00:03:44,400
All these people lining up so early in
39
00:03:44,400 --> 00:03:47,040
the morning. All of you in here, it's
40
00:03:47,040 --> 00:03:49,360
great to see you.
41
00:03:49,360 --> 00:03:51,519
GTC
42
00:03:51,519 --> 00:03:53,840
GTC. We're going to talk about
43
00:03:53,840 --> 00:03:56,000
technology. We're going to talk about
44
00:03:56,000 --> 00:03:59,599
platforms. Nvidia has three platforms.
45
00:03:59,599 --> 00:04:01,840
You think that we mostly talk about one
46
00:04:01,840 --> 00:04:05,439
of them. It's related to CUDA X. Our
47
00:04:05,439 --> 00:04:08,080
systems are another platform, and now we
48
00:04:08,080 --> 00:04:10,159
have a new platform called AI factories.
49
00:04:10,159 --> 00:04:11,599
We're going to talk about all of them
50
00:04:11,599 --> 00:04:12,959
and most importantly we're going to talk
51
00:04:12,959 --> 00:04:17,680
about ecosystems. But before I start,
52
00:04:17,680 --> 00:04:21,440
let me thank our pregame show hosts. I
53
00:04:21,440 --> 00:04:24,320
thought they did a great job. Sarah Guo
54
00:04:24,320 --> 00:04:27,040
of Conviction,
55
00:04:27,040 --> 00:04:30,479
Alfred Lin, Sequoia Capital, Nvidia's
56
00:04:30,479 --> 00:04:35,120
first venture capitalist, Gavin Baker,
57
00:04:35,120 --> 00:04:37,680
Nvidia's first major institutional
58
00:04:37,680 --> 00:04:41,360
investor. These three people are deep in
59
00:04:41,360 --> 00:04:44,479
technology, deep in what's going on and
60
00:04:44,479 --> 00:04:46,639
of course they have just a really broad
61
00:04:46,639 --> 00:04:48,800
reach across the technology ecosystem. And then
62
00:04:48,800 --> 00:04:51,440
of course all of the VIPs that I hand
63
00:04:51,440 --> 00:04:54,639
selected to join us today, an all-star team.
64
00:04:54,639 --> 00:04:58,840
I want to thank all of you for that.
65
00:05:02,160 --> 00:05:03,680
I also want to thank all the companies
66
00:05:03,680 --> 00:05:06,320
that are here.
67
00:05:06,320 --> 00:05:09,199
Nvidia as you know is a platform
68
00:05:09,199 --> 00:05:11,600
company. We have technology, we have our
69
00:05:11,600 --> 00:05:15,360
platforms, we have a rich ecosystem, and
70
00:05:15,360 --> 00:05:20,320
today there are probably 100% of the
71
00:05:20,320 --> 00:05:22,320
hundred trillion dollars of industry
72
00:05:22,320 --> 00:05:25,680
here. 450 companies sponsored this
73
00:05:25,680 --> 00:05:30,000
event. I want to thank you. A thousand
74
00:05:30,000 --> 00:05:34,000
technical sessions, 2,000 speakers. This
75
00:05:34,000 --> 00:05:36,400
conference is going to cover
76
00:05:36,400 --> 00:05:39,120
every single layer of the five layer
77
00:05:39,120 --> 00:05:41,360
cake of artificial intelligence from
78
00:05:41,360 --> 00:05:43,440
land, power, and shell, the infrastructure,
79
00:05:43,440 --> 00:05:47,440
to chips to the platforms the models and
80
00:05:47,440 --> 00:05:49,360
of course the most important and
81
00:05:49,360 --> 00:05:51,919
ultimately what's going to get this
82
00:05:51,919 --> 00:05:54,400
industry to take off is all of the
83
00:05:54,400 --> 00:05:56,880
applications.
84
00:05:56,880 --> 00:06:00,560
Where did it all begin? It all began here.
85
00:06:00,560 --> 00:06:04,639
This is the 20th anniversary of CUDA.
86
00:06:04,639 --> 00:06:09,479
We've been working on CUDA for 20 years.
87
00:06:12,240 --> 00:06:14,000
For 20 years, we've been dedicated to
88
00:06:14,000 --> 00:06:16,400
this architecture. This revolutionary
89
00:06:16,400 --> 00:06:20,160
invention, SIMT, single instruction,
90
00:06:20,160 --> 00:06:23,919
multi-threaded, writing scalar code
91
00:06:23,919 --> 00:06:26,240
could spawn off into multi-threaded
92
00:06:26,240 --> 00:06:29,120
applications. Much, much easier to program
93
00:06:29,120 --> 00:06:32,960
than SIMD. We recently added tiles so
94
00:06:32,960 --> 00:06:36,080
that we could help people program tensor
95
00:06:36,080 --> 00:06:38,960
cores and the structures of mathematics
96
00:06:38,960 --> 00:06:41,120
that are so foundational to artificial
97
00:06:41,120 --> 00:06:43,919
intelligence today.
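The SIMT idea described above — you write scalar, per-thread code and the hardware runs it across many threads — can be sketched in plain Python. This is a toy emulation only: real CUDA kernels are written in C++ and execute in parallel on the GPU, and the `launch`/`saxpy` names here are made up for illustration.

```python
# Toy emulation of the SIMT model: the "kernel" is scalar code for one
# thread; launch() runs one logical thread per array element.
# Real CUDA executes the threads in parallel on the GPU; here we just loop.

def launch(kernel, n, *args):
    """Emulate a 1-D grid: invoke the kernel once per thread index."""
    for tid in range(n):
        kernel(tid, *args)

def saxpy(tid, a, x, y, out):
    # Scalar, per-thread code: each thread touches exactly one element.
    out[tid] = a * x[tid] + y[tid]

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * len(x)
launch(saxpy, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

The contrast with SIMD is the point of the passage: with SIMD you write explicit vector operations, while under SIMT each thread's code stays scalar and the parallelism comes from launching many threads.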
98
00:06:43,919 --> 00:06:46,960
Thousands of tools and compilers and
99
00:06:46,960 --> 00:06:49,680
frameworks and libraries
100
00:06:49,680 --> 00:06:51,520
in open source. There's a couple of
101
00:06:51,520 --> 00:06:55,440
hundred thousand public projects. CUDA
102
00:06:55,440 --> 00:06:57,600
literally is integrated into every
103
00:06:57,600 --> 00:06:59,599
single ecosystem.
104
00:06:59,599 --> 00:07:02,080
This chart
105
00:07:02,080 --> 00:07:05,280
basically describes 100% of Nvidia's
106
00:07:05,280 --> 00:07:08,000
strategy. You've been watching me talk
107
00:07:08,000 --> 00:07:09,919
about this slide from the very
108
00:07:09,919 --> 00:07:12,720
beginning. And ultimately, the single
109
00:07:12,720 --> 00:07:15,440
hardest thing to achieve is the thing on
110
00:07:15,440 --> 00:07:19,039
the bottom, installed base. It has taken
111
00:07:19,039 --> 00:07:22,560
us 20 years to build up
112
00:07:22,560 --> 00:07:24,880
hundreds of millions of GPUs and
113
00:07:24,880 --> 00:07:26,400
computing systems around the world that
114
00:07:26,400 --> 00:07:29,360
run CUDA. We are in every cloud. We're
115
00:07:29,360 --> 00:07:32,240
in every computer company.
116
00:07:32,240 --> 00:07:33,759
We serve just about every single
117
00:07:33,759 --> 00:07:38,560
industry. The installed base of CUDA is
118
00:07:38,560 --> 00:07:41,120
the reason why the flywheel is
119
00:07:41,120 --> 00:07:43,280
accelerating. The install base is what
120
00:07:43,280 --> 00:07:45,440
attracts developers, who then create new
121
00:07:45,440 --> 00:07:47,759
algorithms that achieve a breakthrough.
122
00:07:47,759 --> 00:07:50,880
For example, deep learning. There are so
123
00:07:50,880 --> 00:07:54,240
many others. Those breakthroughs lead
124
00:07:54,240 --> 00:07:56,960
to entirely new markets which builds new
125
00:07:56,960 --> 00:07:58,800
ecosystems around them with other
126
00:07:58,800 --> 00:08:01,280
companies that join which creates a
127
00:08:01,280 --> 00:08:04,000
larger installed base. This flywheel,
128
00:08:04,000 --> 00:08:06,720
this flywheel is now accelerating. The
129
00:08:06,720 --> 00:08:09,039
number of downloads of Nvidia libraries
130
00:08:09,039 --> 00:08:11,599
is accelerating incredibly. It is at a very
131
00:08:11,599 --> 00:08:13,680
large scale and growing faster than
132
00:08:13,680 --> 00:08:17,440
ever. This flywheel is what makes this
133
00:08:17,440 --> 00:08:20,879
computing platform able to sustain so
134
00:08:20,879 --> 00:08:23,039
many applications, so many new
135
00:08:23,039 --> 00:08:27,120
breakthroughs. But most importantly,
136
00:08:27,120 --> 00:08:29,280
it also enables
137
00:08:29,280 --> 00:08:30,960
these infrastructures to have
138
00:08:30,960 --> 00:08:33,599
extraordinarily useful life. And the
139
00:08:33,599 --> 00:08:35,680
reason for that is very obvious. There's
140
00:08:35,680 --> 00:08:37,680
so many applications that you can run on
141
00:08:37,680 --> 00:08:40,719
Nvidia CUDA. We support every
142
00:08:40,719 --> 00:08:43,360
single phase of the AI life cycle. We
143
00:08:43,360 --> 00:08:45,600
address every single data processing
144
00:08:45,600 --> 00:08:48,640
platform. We accelerate scientific
145
00:08:48,640 --> 00:08:50,320
principled solvers of all different
146
00:08:50,320 --> 00:08:53,680
kinds. And so the application reach is
147
00:08:53,680 --> 00:08:57,279
so great that once you install Nvidia
148
00:08:57,279 --> 00:08:59,680
GPUs, their useful life is
149
00:08:59,680 --> 00:09:02,080
incredibly long. It is also one of the
150
00:09:02,080 --> 00:09:05,120
reasons why Ampere, which we shipped
151
00:09:05,120 --> 00:09:07,760
some six years ago, the pricing of Ampere
152
00:09:07,760 --> 00:09:10,800
in the cloud is going up. And so all of
153
00:09:10,800 --> 00:09:13,040
that is made possible fundamentally
154
00:09:13,040 --> 00:09:14,880
because the install base is high, the
155
00:09:14,880 --> 00:09:17,279
flywheel is high, the developer reach is
156
00:09:17,279 --> 00:09:20,399
great. And when all of that happens and
157
00:09:20,399 --> 00:09:23,360
we continuously update our software,
158
00:09:23,360 --> 00:09:25,920
the computing cost
159
00:09:25,920 --> 00:09:28,880
declines. The combination of accelerated
160
00:09:28,880 --> 00:09:31,120
computing speeds up applications
161
00:09:31,120 --> 00:09:34,080
tremendously. Meanwhile, as we continue
162
00:09:34,080 --> 00:09:35,760
to nurture and continue to update
163
00:09:35,760 --> 00:09:38,399
software over its life, not only do you
164
00:09:38,399 --> 00:09:40,560
get the first time pop, you get the
165
00:09:40,560 --> 00:09:42,880
continuous cost reduction of accelerated
166
00:09:42,880 --> 00:09:44,959
computing over time. And we're willing
167
00:09:44,959 --> 00:09:48,480
to nurture, willing to support every
168
00:09:48,480 --> 00:09:50,240
single one of these GPUs in the world
169
00:09:50,240 --> 00:09:51,440
because they're all architecturally
170
00:09:51,440 --> 00:09:53,680
compatible. We're willing to do so
171
00:09:53,680 --> 00:09:55,839
because the install base is so large. If
172
00:09:55,839 --> 00:09:57,920
we release a new optimization, it
173
00:09:57,920 --> 00:10:00,080
benefits millions.
174
00:10:00,080 --> 00:10:03,040
This applies to everybody in the world.
175
00:10:03,040 --> 00:10:06,560
This combination of dynamics is what
176
00:10:06,560 --> 00:10:10,240
makes the NVIDIA architecture expand its
177
00:10:10,240 --> 00:10:13,200
reach, accelerating its growth, at the
178
00:10:13,200 --> 00:10:15,600
same time driving down computing cost,
179
00:10:15,600 --> 00:10:17,680
which ultimately
180
00:10:17,680 --> 00:10:20,959
encourages new growth. So, CUDA is at
181
00:10:20,959 --> 00:10:23,200
the center of it. But our journey to
182
00:10:23,200 --> 00:10:26,240
CUDA actually started 25 years ago.
183
00:10:26,240 --> 00:10:29,240
GeForce.
184
00:10:34,560 --> 00:10:36,160
I know many of you grew up with
185
00:10:36,160 --> 00:10:38,320
GeForce.
186
00:10:38,320 --> 00:10:41,279
GeForce is Nvidia's greatest marketing
187
00:10:41,279 --> 00:10:42,959
campaign.
188
00:10:42,959 --> 00:10:47,120
We attract future customers starting
189
00:10:47,120 --> 00:10:49,279
long before you could afford to pay for
190
00:10:49,279 --> 00:10:52,880
it yourself. Your parents paid
191
00:10:52,880 --> 00:10:56,079
for it. Your parents
192
00:10:56,079 --> 00:10:58,320
paid for you to be Nvidia customers. And
193
00:10:58,320 --> 00:11:00,959
every single year they paid up year
194
00:11:00,959 --> 00:11:04,079
after year after year until someday you
195
00:11:04,079 --> 00:11:06,640
became an amazing computer scientist and
196
00:11:06,640 --> 00:11:09,360
became a proper customer, a proper
197
00:11:09,360 --> 00:11:13,440
developer. But this is the house
198
00:11:13,440 --> 00:11:17,200
that GeForce made 25 years ago. We
199
00:11:17,200 --> 00:11:19,760
started our journey which led to CUDA.
200
00:11:19,760 --> 00:11:21,760
25 years ago, we invented the
201
00:11:21,760 --> 00:11:26,000
programmable shader. A perfectly
202
00:11:26,000 --> 00:11:28,800
unobvious invention to make an
203
00:11:28,800 --> 00:11:31,680
accelerator programmable. The world's
204
00:11:31,680 --> 00:11:33,680
first programmable accelerator, the
205
00:11:33,680 --> 00:11:37,760
pixel shader 25 years ago. That led us
206
00:11:37,760 --> 00:11:40,880
to explore further and further, and
207
00:11:40,880 --> 00:11:43,279
five years later, the invention of
208
00:11:43,279 --> 00:11:45,200
CUDA. One of the biggest investments
209
00:11:45,200 --> 00:11:47,519
that we made and we couldn't afford it
210
00:11:47,519 --> 00:11:50,480
at the time, and it consumed the vast
211
00:11:50,480 --> 00:11:53,360
majority of our company's profits was to
212
00:11:53,360 --> 00:11:56,480
take CUDA on the backs of GeForce to
213
00:11:56,480 --> 00:11:59,279
every single computer. We dedicated
214
00:11:59,279 --> 00:12:01,200
ourselves to creating this platform
215
00:12:01,200 --> 00:12:03,760
because we felt so
216
00:12:03,760 --> 00:12:05,839
strongly about its potential. But
217
00:12:05,839 --> 00:12:08,720
ultimately the company's dedication to
218
00:12:08,720 --> 00:12:11,279
it despite the hardships in the
219
00:12:11,279 --> 00:12:14,320
beginning, believing in it every single day
220
00:12:14,320 --> 00:12:18,079
for 13 generations, or 20 years,
221
00:12:18,079 --> 00:12:20,399
we now have CUDA installed everywhere.
222
00:12:20,399 --> 00:12:22,880
The pixel shader
223
00:12:22,880 --> 00:12:24,959
led to of course the revolution of
224
00:12:24,959 --> 00:12:26,480
GeForce.
225
00:12:26,480 --> 00:12:29,600
And then,
226
00:12:29,600 --> 00:12:31,360
about eight
227
00:12:31,360 --> 00:12:34,480
years ago, we introduced RTX,
228
00:12:34,480 --> 00:12:37,200
a complete redesign of our architecture
229
00:12:37,200 --> 00:12:39,920
for the modern era of computer graphics.
230
00:12:39,920 --> 00:12:42,160
GeForce brought CUDA to the world.
231
00:12:42,160 --> 00:12:43,920
GeForce
232
00:12:43,920 --> 00:12:47,680
therefore enabled Alex Krizhevsky and Ilya
233
00:12:47,680 --> 00:12:50,399
Sutskever and Geoff Hinton, Andrew Ng, and
234
00:12:50,399 --> 00:12:54,000
so many others to discover that the GPU
235
00:12:54,000 --> 00:12:56,240
could be their friend in accelerating
236
00:12:56,240 --> 00:12:58,720
deep learning. It started the big bang
237
00:12:58,720 --> 00:13:02,480
of AI. 10 years ago, we decided that we
238
00:13:02,480 --> 00:13:04,959
would fuse
239
00:13:04,959 --> 00:13:08,240
programmable shading and introduce two
240
00:13:08,240 --> 00:13:11,360
new ideas: ray tracing, hardware ray
241
00:13:11,360 --> 00:13:13,519
tracing, which is incredibly hard to do.
242
00:13:13,519 --> 00:13:16,240
And a new idea at the time, imagine
243
00:13:16,240 --> 00:13:18,800
about 10 years ago, we thought that AI
244
00:13:18,800 --> 00:13:21,519
would revolutionize computer graphics.
245
00:13:21,519 --> 00:13:24,720
Just as GeForce brought AI to the world,
246
00:13:24,720 --> 00:13:26,880
AI is now going to go back and
247
00:13:26,880 --> 00:13:29,040
revolutionize how computer graphics is
248
00:13:29,040 --> 00:13:31,760
done altogether. Well, today I'm going
249
00:13:31,760 --> 00:13:33,440
to show you something of the future.
250
00:13:33,440 --> 00:13:36,240
This is our next generation of graphics
251
00:13:36,240 --> 00:13:39,440
technology. We call it neural rendering.
252
00:13:39,440 --> 00:13:41,200
The fusion,
253
00:13:41,200 --> 00:13:45,040
the fusion of 3D graphics and artificial
254
00:13:45,040 --> 00:13:47,839
intelligence. This is DLSS 5. Take a
255
00:13:47,839 --> 00:13:51,000
look at it.
256
00:14:05,519 --> 00:14:08,519
Heat. Heat.
257
00:14:46,800 --> 00:14:50,440
Heat. Heat.
258
00:14:57,519 --> 00:15:00,920
Is that incredible?
259
00:15:03,600 --> 00:15:05,839
Computer graphics comes to life. Now
260
00:15:05,839 --> 00:15:08,880
what did we do? We fused
261
00:15:08,880 --> 00:15:11,199
controllable 3D graphics, the ground
262
00:15:11,199 --> 00:15:14,000
truth of virtual worlds, the structured
263
00:15:14,000 --> 00:15:16,480
data, remember this word, the structured
264
00:15:16,480 --> 00:15:19,920
data of virtual worlds, of
265
00:15:19,920 --> 00:15:22,399
generated worlds. We combine 3D
266
00:15:22,399 --> 00:15:24,880
graphics, structured data with
267
00:15:24,880 --> 00:15:27,040
generative AI,
268
00:15:27,040 --> 00:15:29,760
probabilistic computing. One of them is
269
00:15:29,760 --> 00:15:31,920
completely predictive, the other one
270
00:15:31,920 --> 00:15:34,959
probabilistic yet highly realistic. We
271
00:15:34,959 --> 00:15:37,920
combine these two ideas. Combine these
272
00:15:37,920 --> 00:15:40,000
two ideas controlled through structured
273
00:15:40,000 --> 00:15:44,480
data controlled perfectly and yet
274
00:15:44,480 --> 00:15:46,639
generating at the same time. And as a
275
00:15:46,639 --> 00:15:48,480
result,
276
00:15:48,480 --> 00:15:51,839
the content is beautiful, amazing, as
277
00:15:51,839 --> 00:15:54,800
well as controllable. This concept of
278
00:15:54,800 --> 00:15:57,920
fusing structured information and
279
00:15:57,920 --> 00:16:01,040
generative AI will repeat itself in one
280
00:16:01,040 --> 00:16:02,880
industry after another industry after
281
00:16:02,880 --> 00:16:05,839
another industry. Structured data is the
282
00:16:05,839 --> 00:16:12,079
foundation of trustworthy AI. Well,
283
00:16:12,079 --> 00:16:13,600
this is going to scare you a little bit.
284
00:16:13,600 --> 00:16:15,839
I'm going to flip the slide. And don't
285
00:16:15,839 --> 00:16:18,720
gasp.
286
00:16:18,720 --> 00:16:19,920
So, we're going to go through the
287
00:16:19,920 --> 00:16:24,360
schematic for the rest of the time.
288
00:16:25,199 --> 00:16:28,079
This is my best slide. Every time I
289
00:16:28,079 --> 00:16:29,680
asked the team, "What's my
290
00:16:29,680 --> 00:16:32,160
best slide?" Repeatedly, this was it.
291
00:16:32,160 --> 00:16:34,240
They say, "Don't do it, Jensen. Don't do
292
00:16:34,240 --> 00:16:38,560
it." I said, "No, these seats are
293
00:16:38,560 --> 00:16:40,959
free
294
00:16:40,959 --> 00:16:44,160
for some of you.
295
00:16:44,160 --> 00:16:46,560
So this is your price of admission. So
296
00:16:46,560 --> 00:16:49,360
this is structured data. You've
297
00:16:49,360 --> 00:16:54,000
heard of it. SQL, Spark, Pandas, Velox,
298
00:16:54,000 --> 00:16:56,480
some of these really really important
299
00:16:56,480 --> 00:17:00,480
very large platforms: Snowflake,
300
00:17:00,480 --> 00:17:05,520
Databricks, Amazon EMR,
301
00:17:05,520 --> 00:17:08,720
Azure Fabric,
302
00:17:08,720 --> 00:17:12,480
Google Cloud, BigQuery. All of these
303
00:17:12,480 --> 00:17:15,199
platforms are processing data frames.
304
00:17:15,199 --> 00:17:17,600
These data frames are giant spreadsheets
305
00:17:17,600 --> 00:17:20,799
and they hold all of life's information.
306
00:17:20,799 --> 00:17:24,079
This is the structured data, the ground
307
00:17:24,079 --> 00:17:26,799
truth of business. This is the ground
308
00:17:26,799 --> 00:17:30,240
truth of enterprise computing. Well, now
309
00:17:30,240 --> 00:17:32,559
we're going to have AI use structured
310
00:17:32,559 --> 00:17:35,360
data and we better accelerate the living
311
00:17:35,360 --> 00:17:37,840
daylights out of it. It used to be okay
312
00:17:37,840 --> 00:17:40,320
and we would, you know, of course we
313
00:17:40,320 --> 00:17:42,880
would accelerate structured data so
314
00:17:42,880 --> 00:17:45,039
that we could do more. We could do it
315
00:17:45,039 --> 00:17:46,720
more cheaply. We could do it more
316
00:17:46,720 --> 00:17:49,039
frequently per day and keep the company
317
00:17:49,039 --> 00:17:51,440
running at a much more synchronized way.
318
00:17:51,440 --> 00:17:53,520
However, in the future, what's going to
319
00:17:53,520 --> 00:17:55,280
happen is these data structures are
320
00:17:55,280 --> 00:17:57,840
going to be used by AI and AI is going
321
00:17:57,840 --> 00:18:00,240
to be much much faster than us. Future
322
00:18:00,240 --> 00:18:02,400
agents are going to use structured
323
00:18:02,400 --> 00:18:04,880
databases as well. And then of course
324
00:18:04,880 --> 00:18:07,120
the unstructured database, the
325
00:18:07,120 --> 00:18:10,400
generative database. This database
326
00:18:10,400 --> 00:18:12,000
represents the vast majority of the
327
00:18:12,000 --> 00:18:14,960
world. Vector databases, unstructured
328
00:18:14,960 --> 00:18:19,039
data, PDFs, videos, speeches, all of the
329
00:18:19,039 --> 00:18:21,440
world's information. About 90% of what's
330
00:18:21,440 --> 00:18:23,200
generated every single year is
331
00:18:23,200 --> 00:18:26,400
unstructured data. Until now, this data
332
00:18:26,400 --> 00:18:28,160
has been completely useless to the
333
00:18:28,160 --> 00:18:30,240
world. We read it, we put it into our
334
00:18:30,240 --> 00:18:32,080
file system, and that's it.
335
00:18:32,080 --> 00:18:34,240
Unfortunately, we can't query it. We
336
00:18:34,240 --> 00:18:35,840
can't search for it. It's hard to do
337
00:18:35,840 --> 00:18:38,480
that. And the reason for that is because
338
00:18:38,480 --> 00:18:41,280
there's no easy indexing of unstructured
339
00:18:41,280 --> 00:18:42,880
data. You have to understand its
340
00:18:42,880 --> 00:18:45,440
meaning, its purpose. And so now we have
341
00:18:45,440 --> 00:18:48,480
AI do that. Just as AI was able to solve
342
00:18:48,480 --> 00:18:51,039
multi-modality
343
00:18:51,039 --> 00:18:53,679
perception and understanding, you
344
00:18:53,679 --> 00:18:55,760
can use that same technology
345
00:18:55,760 --> 00:18:57,600
to go read
346
00:18:57,600 --> 00:18:59,919
a PDF, to
347
00:18:59,919 --> 00:19:02,160
understand its meaning and from that
348
00:19:02,160 --> 00:19:05,919
meaning, embed it into a larger structure
349
00:19:05,919 --> 00:19:08,240
that we can search into, that we can query
350
00:19:08,240 --> 00:19:12,160
into. NVIDIA created two foundational
351
00:19:12,160 --> 00:19:14,400
libraries. Just like we created RTX for
352
00:19:14,400 --> 00:19:17,520
3D graphics, we created cuDF for data
353
00:19:17,520 --> 00:19:20,559
frames, structured data. We created cuVS
354
00:19:20,559 --> 00:19:23,919
for vector stores, semantic data,
355
00:19:23,919 --> 00:19:27,679
unstructured data, AI data. These two
356
00:19:27,679 --> 00:19:30,240
platforms are going to be two of the
357
00:19:30,240 --> 00:19:32,480
most important platforms in the future.
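A toy sketch of the core operation of a vector store, the kind of workload the cuVS library mentioned above accelerates: documents are embedded as vectors, and a query is answered by finding the most similar stored vector. The three-dimensional embeddings and document names here are hand-made for illustration; a real system gets embeddings from a learned model and searches billions of vectors.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by the vectors' lengths.
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return dot / norm

# Tiny "vector store": document id -> hand-made embedding.
store = {
    "gpu_guide":   [0.9, 0.1, 0.0],
    "recipe_book": [0.0, 0.2, 0.9],
}

query = [0.8, 0.2, 0.1]  # stand-in embedding of a GPU-related question
best = max(store, key=lambda doc: cosine(query, store[doc]))
print(best)  # gpu_guide
```

This is the "search by meaning" idea from the passage: the match is found by geometric closeness of embeddings, not by keyword indexing.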
358
00:19:32,480 --> 00:19:34,480
Super excited to see their adoption
359
00:19:34,480 --> 00:19:36,480
throughout the network, this complicated
360
00:19:36,480 --> 00:19:38,160
network of the world's data processing
361
00:19:38,160 --> 00:19:39,840
systems. And the reason for that is
362
00:19:39,840 --> 00:19:41,440
because data processing has been around
363
00:19:41,440 --> 00:19:45,039
a long time, and therefore there are so many
364
00:19:45,039 --> 00:19:47,440
different companies and platforms and
365
00:19:47,440 --> 00:19:50,080
services. It has taken us a long time to
366
00:19:50,080 --> 00:19:53,600
integrate deeply into this ecosystem.
367
00:19:53,600 --> 00:19:55,120
I'm super proud of the work that we're
368
00:19:55,120 --> 00:19:57,440
doing here. And then today we're
369
00:19:57,440 --> 00:20:01,600
announcing several of them. IBM,
370
00:20:01,600 --> 00:20:04,160
the inventor of SQL,
371
00:20:04,160 --> 00:20:05,520
one of the most important domain
372
00:20:05,520 --> 00:20:07,679
specific languages of all
373
00:20:07,679 --> 00:20:11,679
time, is accelerating watsonx.data with
374
00:20:11,679 --> 00:20:16,600
cuDF. Let's take a look at it.
375
00:20:17,440 --> 00:20:20,960
60 years ago, IBM introduced the System
376
00:20:20,960 --> 00:20:22,480
360
377
00:20:22,480 --> 00:20:24,080
the first modern platform for
378
00:20:24,080 --> 00:20:26,400
general-purpose computing, launching the
379
00:20:26,400 --> 00:20:30,000
computing era. Then SQL, a declarative
380
00:20:30,000 --> 00:20:32,320
language to query data without requiring
381
00:20:32,320 --> 00:20:34,480
the computer to be instructed step by
382
00:20:34,480 --> 00:20:36,000
step
383
00:20:36,000 --> 00:20:38,720
and the data warehouse. Each is a
384
00:20:38,720 --> 00:20:40,400
foundation of modern enterprise
385
00:20:40,400 --> 00:20:44,080
computing. Today, IBM and NVIDIA are
386
00:20:44,080 --> 00:20:46,559
reinventing data processing for the era
387
00:20:46,559 --> 00:20:50,880
of AI by accelerating IBM watsonx.data
388
00:20:50,880 --> 00:20:53,520
SQL engines with NVIDIA GPU computing
389
00:20:53,520 --> 00:20:57,280
libraries. Data is the ground truth that
390
00:20:57,280 --> 00:21:00,400
gives AI context and meaning. AI needs
391
00:21:00,400 --> 00:21:03,280
rapid access to massive data sets.
392
00:21:03,280 --> 00:21:05,919
Today's CPU data processing systems
393
00:21:05,919 --> 00:21:09,280
can't keep up. Nestle makes thousands of
394
00:21:09,280 --> 00:21:12,240
supply chain decisions every day. Their
395
00:21:12,240 --> 00:21:14,480
order-to-cash data mart aggregates
396
00:21:14,480 --> 00:21:17,360
every supply order and delivery event
397
00:21:17,360 --> 00:21:20,400
across global operations in 185
398
00:21:20,400 --> 00:21:22,080
countries.
399
00:21:22,080 --> 00:21:24,880
On CPUs, Nestle refreshed the data mart
400
00:21:24,880 --> 00:21:27,679
a few times a day. With accelerated
401
00:21:27,679 --> 00:21:31,280
watsonx.data running on Nvidia GPUs,
402
00:21:31,280 --> 00:21:33,919
Nestle can run the same workload five
403
00:21:33,919 --> 00:21:38,080
times faster at 83% lower cost.
404
00:21:38,080 --> 00:21:40,799
The next computing platform has arrived.
405
00:21:40,799 --> 00:21:46,840
Accelerated computing for the era of AI.
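The declarative idea the video describes — SQL states what rows you want, and the engine decides how to get them — can be shown in a minimal sketch using Python's built-in sqlite3 module. The table and values are made up for illustration; an engine like watsonx.data applies the same idea at warehouse scale.

```python
import sqlite3

# An in-memory database standing in for a data warehouse.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (country TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("CH", 120.0), ("US", 80.0), ("CH", 30.0)])

# Declarative: we state the result we want; the engine plans the scan,
# filter, and aggregation itself -- no step-by-step instructions.
total = con.execute(
    "SELECT SUM(amount) FROM orders WHERE country = 'CH'"
).fetchone()[0]
print(total)  # 150.0
```

Because the query only declares the result, the same SQL can be executed by a CPU engine or an accelerated GPU engine without change — which is what makes drop-in acceleration of SQL engines possible.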
406
00:21:52,320 --> 00:21:54,320
NVIDIA accelerates data processing in
407
00:21:54,320 --> 00:21:56,240
the cloud. We also accelerate data
408
00:21:56,240 --> 00:21:59,919
processing on prem. As you know, Dell is
409
00:21:59,919 --> 00:22:02,400
the world-leading computer systems maker
410
00:22:02,400 --> 00:22:05,280
and they also are one of the world's
411
00:22:05,280 --> 00:22:06,960
leading storage providers and they
412
00:22:06,960 --> 00:22:09,840
worked with us to create the Dell AI
413
00:22:09,840 --> 00:22:12,400
data platform that integrates cuDF and
414
00:22:12,400 --> 00:22:15,280
cuVS to create an accelerated data
415
00:22:15,280 --> 00:22:19,600
platform for the era of AI. And
416
00:22:19,600 --> 00:22:21,200
this is an example of what they did with
417
00:22:21,200 --> 00:22:25,039
NTT Data, a huge speedup. This is cloud:
418
00:22:25,039 --> 00:22:27,520
Google Cloud. And Google Cloud, as you
419
00:22:27,520 --> 00:22:28,880
know, we've been working with Google
420
00:22:28,880 --> 00:22:32,400
Cloud for a very long time. We accelerate
421
00:22:32,400 --> 00:22:35,280
Google's Vertex AI. We now accelerate
422
00:22:35,280 --> 00:22:38,320
BigQuery, a really important framework
423
00:22:38,320 --> 00:22:40,320
and really important platform. And this
424
00:22:40,320 --> 00:22:42,480
is an example of our work together with
425
00:22:42,480 --> 00:22:45,360
Snapchat where we reduce their cost of
426
00:22:45,360 --> 00:22:49,280
computing by nearly 80%.
427
00:22:49,280 --> 00:22:51,840
When you accelerate data processing,
428
00:22:51,840 --> 00:22:54,320
when you accelerate computing, you get
429
00:22:54,320 --> 00:22:56,159
the benefit of speed, you get the
430
00:22:56,159 --> 00:22:59,360
benefit of scale, but most importantly,
431
00:22:59,360 --> 00:23:02,559
you also get the benefit of cost. And so
432
00:23:02,559 --> 00:23:05,360
all of those come together as one. It
433
00:23:05,360 --> 00:23:07,360
was originally called Moore's law.
434
00:23:07,360 --> 00:23:08,640
Moore's law was about getting
435
00:23:08,640 --> 00:23:10,799
performance doubling every couple of
436
00:23:10,799 --> 00:23:13,840
years. It's another way of saying so
437
00:23:13,840 --> 00:23:15,600
long as the price remains about the same
438
00:23:15,600 --> 00:23:17,600
and most computers remained about the
439
00:23:17,600 --> 00:23:19,679
same, you're also getting twice the
440
00:23:19,679 --> 00:23:21,919
performance every year or you're
441
00:23:21,919 --> 00:23:23,679
reducing the cost of computing every
442
00:23:23,679 --> 00:23:26,080
single year. Well, Moore's law has run
443
00:23:26,080 --> 00:23:29,039
out of steam. We need a new approach.
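The doubling arithmetic in the Moore's-law passage above, as a back-of-envelope sketch: if performance doubles every two years at roughly constant price, cost per unit of work halves on the same schedule. The numbers are illustrative only, not Nvidia's figures.

```python
# Five 2-year doubling generations = roughly 10 years of Moore's law.
cost_per_unit = 1.0
for _ in range(5):
    cost_per_unit /= 2          # performance doubles at constant price
print(cost_per_unit)            # 0.03125, i.e. ~32x cheaper per unit of work
```

This compounding is the "reducing the cost of computing every single year" the talk refers to — and the dynamic that stalls when the doubling stops.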
444
00:23:29,039 --> 00:23:31,600
Accelerated computing allows us to take
445
00:23:31,600 --> 00:23:33,919
these giant leaps forward and as you
446
00:23:33,919 --> 00:23:36,240
will see later because we continue to
447
00:23:36,240 --> 00:23:38,880
optimize the algorithms
448
00:23:38,880 --> 00:23:41,280
and Nvidia is an algorithm company. As
449
00:23:41,280 --> 00:23:43,120
we continue to optimize the algorithms
450
00:23:43,120 --> 00:23:45,440
and because our reach is so large
451
00:23:45,440 --> 00:23:48,000
and our install base is so large we can
452
00:23:48,000 --> 00:23:50,480
reduce the computing cost, increasing the
453
00:23:50,480 --> 00:23:53,360
scale, increasing the speed for everybody
454
00:23:53,360 --> 00:23:56,240
continuously. This is Google Cloud. You
455
00:23:56,240 --> 00:23:58,559
could see this pattern I just mentioned.
456
00:23:58,559 --> 00:24:00,240
I just wanted to show you three versions
457
00:24:00,240 --> 00:24:03,679
of it. Nvidia built the accelerated
458
00:24:03,679 --> 00:24:06,480
computing platform, which has a bunch of
459
00:24:06,480 --> 00:24:08,240
libraries on top. I gave you three
460
00:24:08,240 --> 00:24:10,880
examples. RTX is one of them. cuDF is
461
00:24:10,880 --> 00:24:13,120
another, cuVS, and we'll show you a few
462
00:24:13,120 --> 00:24:16,080
more. These libraries sit on top of our
463
00:24:16,080 --> 00:24:19,200
platform. But ultimately
464
00:24:19,200 --> 00:24:22,640
we integrate into the world's cloud
465
00:24:22,640 --> 00:24:25,760
services, into the world's OEMs, and
466
00:24:25,760 --> 00:24:28,559
other platforms that I'll
467
00:24:28,559 --> 00:24:30,799
show you. Together we're able to reach the
468
00:24:30,799 --> 00:24:34,559
world. This pattern, NVIDIA, Google
469
00:24:34,559 --> 00:24:37,520
Cloud, Snapchat will repeat over and
470
00:24:37,520 --> 00:24:39,679
over again. And kind of looks like this.
471
00:24:39,679 --> 00:24:41,919
And so this is one example. Nvidia with
472
00:24:41,919 --> 00:24:44,720
Google Cloud. We accelerate Vertex AI.
473
00:24:44,720 --> 00:24:47,200
We accelerate BigQuery.
474
00:24:47,200 --> 00:24:49,120
I'm super proud of the work that
475
00:24:49,120 --> 00:24:51,679
we've done with JAX and XLA. We are
476
00:24:51,679 --> 00:24:54,080
incredible on PyTorch. We're the only
477
00:24:54,080 --> 00:24:55,360
accelerator in the world that's
478
00:24:55,360 --> 00:24:58,159
incredible on PyTorch and incredible on
479
00:24:58,159 --> 00:25:00,640
JAX and XLA. And the customers that we
480
00:25:00,640 --> 00:25:02,720
support, the Basetens, the
480
00:25:00,640 --> 00:25:02,720
CrowdStrikes, Puma, Salesforce, they're not
482
00:25:06,240 --> 00:25:09,039
our customers, but they're customers,
483
00:25:09,039 --> 00:25:10,799
developers of ours that we've integrated
484
00:25:10,799 --> 00:25:13,919
the NVIDIA technologies into that we can
485
00:25:13,919 --> 00:25:16,880
then land on the clouds.
486
00:25:16,880 --> 00:25:18,559
Our relationship with cloud service
487
00:25:18,559 --> 00:25:21,840
providers is essentially us bringing
488
00:25:21,840 --> 00:25:24,880
customers to them. We integrate our
489
00:25:24,880 --> 00:25:27,600
libraries, we accelerate workloads, and
490
00:25:27,600 --> 00:25:30,640
we land those customers in the clouds.
491
00:25:30,640 --> 00:25:33,279
And so, as you could see, most of our
492
00:25:33,279 --> 00:25:35,120
cloud service providers love working
493
00:25:35,120 --> 00:25:38,640
with us. And they're always asking us
494
00:25:38,640 --> 00:25:40,559
to land the next customer on their
495
00:25:40,559 --> 00:25:43,200
cloud. And I just want to let you know
496
00:25:43,200 --> 00:25:45,760
there are a lot of customers.
497
00:25:45,760 --> 00:25:48,320
We're going to accelerate everybody. And
498
00:25:48,320 --> 00:25:49,520
so, there will be lots and lots of
499
00:25:49,520 --> 00:25:50,880
customers we'll be able to land in your
500
00:25:50,880 --> 00:25:53,120
cloud. Just be patient with us. And so
501
00:25:53,120 --> 00:25:55,919
this is Google Cloud. This is AWS. We've
502
00:25:55,919 --> 00:25:58,000
been working with AWS a long time. And
503
00:25:58,000 --> 00:26:00,000
one of the
504
00:26:00,000 --> 00:26:02,080
things I'm super excited about this year
505
00:26:02,080 --> 00:26:05,520
is we're going to bring OpenAI to AWS.
506
00:26:05,520 --> 00:26:07,360
And so it's going to drive enormous
507
00:26:07,360 --> 00:26:09,760
consumption of cloud computing at AWS.
508
00:26:09,760 --> 00:26:11,679
It's going to expand the reach, expand
509
00:26:11,679 --> 00:26:14,640
the compute of OpenAI. And as you know,
510
00:26:14,640 --> 00:26:17,279
they are completely compute constrained.
511
00:26:17,279 --> 00:26:20,240
And so AWS, we accelerate EMR, we
512
00:26:20,240 --> 00:26:22,799
accelerate SageMaker, we accelerate
513
00:26:22,799 --> 00:26:25,279
Bedrock. NVIDIA's integrated really
514
00:26:25,279 --> 00:26:28,320
deeply into AWS. They were our first
515
00:26:28,320 --> 00:26:30,320
cloud partner,
516
00:26:30,320 --> 00:26:32,480
Microsoft Azure.
517
00:26:32,480 --> 00:26:36,159
NVIDIA's A100 supercomputer
518
00:26:36,159 --> 00:26:39,679
the first one we built was
519
00:26:39,679 --> 00:26:42,400
for NVIDIA. The first one we installed
520
00:26:42,400 --> 00:26:46,000
was at Azure. And that led to
521
00:26:46,000 --> 00:26:48,000
the big, successful partnership
522
00:26:48,000 --> 00:26:50,240
with OpenAI, but we've been working with
523
00:26:50,240 --> 00:26:52,080
Azure for quite a long time. We
524
00:26:52,080 --> 00:26:56,000
accelerate Azure cloud, now their
524
00:26:52,080 --> 00:26:56,000
AI Foundry, which we partner deeply with. We
525
00:26:56,000 --> 00:26:58,720
accelerate Bing search. We work with them
527
00:27:02,080 --> 00:27:05,440
on Azure regions. This is one of the
528
00:27:05,440 --> 00:27:09,200
areas that is incredibly important as we
529
00:27:09,200 --> 00:27:11,200
continue to expand AI throughout the
530
00:27:11,200 --> 00:27:13,840
world. One of the capabilities that we
531
00:27:13,840 --> 00:27:18,000
offer is confidential computing.
532
00:27:18,000 --> 00:27:20,320
In confidential computing, you want
533
00:27:20,320 --> 00:27:23,679
to make sure that even the operator
534
00:27:23,679 --> 00:27:27,200
cannot see your data. Even the operator
535
00:27:27,200 --> 00:27:30,320
cannot touch or see your models.
536
00:27:30,320 --> 00:27:32,799
Confidential computing: NVIDIA's GPUs are
537
00:27:32,799 --> 00:27:35,039
the first in the world to do that.
538
00:27:35,039 --> 00:27:37,279
It's now able to support confidential
539
00:27:37,279 --> 00:27:39,760
computing and protected deployment of
540
00:27:39,760 --> 00:27:42,080
these very valuable OpenAI models
541
00:27:42,080 --> 00:27:44,799
and Anthropic models throughout clouds
542
00:27:44,799 --> 00:27:47,600
and different regions and all because of
543
00:27:47,600 --> 00:27:49,600
our confidential computing.
544
00:27:49,600 --> 00:27:50,960
Confidential computing is super
545
00:27:50,960 --> 00:27:53,679
important. And here's an example where
546
00:27:53,679 --> 00:27:55,039
we have different customers that we work
547
00:27:55,039 --> 00:27:57,440
with. Synopsys, a great partner of ours.
548
00:27:57,440 --> 00:27:59,200
We're accelerating all of their EDA and
549
00:27:59,200 --> 00:28:01,919
CAE workflows. And then we landed at
550
00:28:01,919 --> 00:28:04,399
Microsoft Azure.
551
00:28:04,399 --> 00:28:08,480
We were Oracle's first AI customer.
552
00:28:08,480 --> 00:28:10,080
Most people would have thought we were
553
00:28:10,080 --> 00:28:11,919
their first supplier. We were their
554
00:28:11,919 --> 00:28:14,240
first supplier also, but we were their
555
00:28:14,240 --> 00:28:16,880
first AI customer. I'm quite proud of
556
00:28:16,880 --> 00:28:20,399
the fact that I explained AI clouds to
557
00:28:20,399 --> 00:28:22,480
Oracle for the first time and we were
558
00:28:22,480 --> 00:28:24,640
their first customer. Since then,
559
00:28:24,640 --> 00:28:27,279
they've really taken off. We've landed a
560
00:28:27,279 --> 00:28:29,520
whole bunch of our partners there.
561
00:28:29,520 --> 00:28:31,840
Cohere and Fireworks, and of course, very
562
00:28:31,840 --> 00:28:35,720
famously, OpenAI.
563
00:28:35,840 --> 00:28:38,880
A great partnership with
564
00:28:38,880 --> 00:28:42,399
CoreWeave. They're the world's first AI
565
00:28:42,399 --> 00:28:45,840
native cloud. A company that was built
566
00:28:45,840 --> 00:28:48,320
with one singular purpose: to
567
00:28:48,320 --> 00:28:51,360
provision and host GPUs as the era of
568
00:28:51,360 --> 00:28:53,840
accelerated computing showed up, and to
569
00:28:53,840 --> 00:28:56,320
host AI clouds. They've got some
570
00:28:56,320 --> 00:28:58,000
fantastic customers and they're growing
571
00:28:58,000 --> 00:29:00,159
incredibly. One of the platforms that
572
00:29:00,159 --> 00:29:02,960
I'm quite excited about is Palantir and
573
00:29:02,960 --> 00:29:05,840
Dell. The three of our companies have
574
00:29:05,840 --> 00:29:08,000
made it possible to stand up a brand new
575
00:29:08,000 --> 00:29:10,399
type of AI platform, the Palantir
576
00:29:10,399 --> 00:29:13,039
ontology platform and AI platform. And
577
00:29:13,039 --> 00:29:15,520
we could stand up these platforms in any
578
00:29:15,520 --> 00:29:18,720
country, in any air-gapped region,
579
00:29:18,720 --> 00:29:21,600
completely on-prem, completely on-site,
580
00:29:21,600 --> 00:29:23,840
completely in the field. AI could be
581
00:29:23,840 --> 00:29:25,760
deployed literally everywhere. Without
582
00:29:25,760 --> 00:29:27,760
our confidential computing capability,
583
00:29:27,760 --> 00:29:29,840
without our ability to build the
584
00:29:29,840 --> 00:29:32,640
end-to-end system, as well as offer the
585
00:29:32,640 --> 00:29:34,240
entire
586
00:29:34,240 --> 00:29:36,880
accelerated computing and AI stack, from
587
00:29:36,880 --> 00:29:39,120
data processing, whether it's vectors or
588
00:29:39,120 --> 00:29:41,679
structures, all the way to AI, it wouldn't
589
00:29:41,679 --> 00:29:44,240
have been possible. I wanted to show you
590
00:29:44,240 --> 00:29:46,080
these examples.
591
00:29:46,080 --> 00:29:50,080
This is our special working relationship
592
00:29:50,080 --> 00:29:52,399
with the world's cloud service providers,
593
00:29:52,399 --> 00:29:55,200
and many, well, all of them are here, and I
594
00:29:55,200 --> 00:29:56,720
get the benefit of seeing them during the
595
00:29:56,720 --> 00:29:58,640
booth tour, and it's just so incredibly
596
00:29:58,640 --> 00:30:00,320
exciting. I just want to thank all of
597
00:30:00,320 --> 00:30:02,240
you for the hard work. What NVIDIA has
598
00:30:02,240 --> 00:30:04,720
done is this and you're going to see
599
00:30:04,720 --> 00:30:07,919
this theme over and over again.
600
00:30:07,919 --> 00:30:10,080
NVIDIA is vertically integrated, the
601
00:30:10,080 --> 00:30:14,399
world's first vertically integrated
602
00:30:14,399 --> 00:30:17,919
but horizontally open company
603
00:30:17,919 --> 00:30:20,399
and the reason that's necessary is very
604
00:30:20,399 --> 00:30:23,440
simple. Accelerated
605
00:30:23,440 --> 00:30:26,240
computing is not a chip problem.
606
00:30:26,240 --> 00:30:28,320
Accelerated computing is not a systems
607
00:30:28,320 --> 00:30:31,360
problem. Accelerated computing has a
608
00:30:31,360 --> 00:30:34,000
missing word. We just never say it
609
00:30:34,000 --> 00:30:38,159
anymore. Application acceleration.
610
00:30:38,159 --> 00:30:40,559
If I could make a computer run
611
00:30:40,559 --> 00:30:43,760
everything faster, that's called a CPU.
612
00:30:43,760 --> 00:30:46,399
But that's run out of steam. The only
613
00:30:46,399 --> 00:30:48,720
way for us to accelerate applications
614
00:30:48,720 --> 00:30:51,120
going forward and continue to bring
615
00:30:51,120 --> 00:30:53,520
tremendous speed up, tremendous cost
616
00:30:53,520 --> 00:30:56,320
reduction is through application or
617
00:30:56,320 --> 00:30:59,120
domain specific acceleration. I dropped
618
00:30:59,120 --> 00:31:01,440
that phrase in the front, and
619
00:31:01,440 --> 00:31:02,960
therefore it just became
620
00:31:02,960 --> 00:31:05,600
accelerated computing, and that is the
621
00:31:05,600 --> 00:31:07,520
reason why NVIDIA has to go library
622
00:31:07,520 --> 00:31:09,520
after library, domain after domain,
623
00:31:09,520 --> 00:31:11,840
vertical after vertical.
624
00:31:11,840 --> 00:31:14,320
We are a vertically integrated computing
625
00:31:14,320 --> 00:31:17,840
company. There is no other way. We have
626
00:31:17,840 --> 00:31:19,440
to understand the applications. We have
627
00:31:19,440 --> 00:31:20,960
to understand the domain. We have to
628
00:31:20,960 --> 00:31:23,360
understand fundamentally the algorithms.
629
00:31:23,360 --> 00:31:26,000
And we have to figure out how to deploy
630
00:31:26,000 --> 00:31:27,760
the algorithm
631
00:31:27,760 --> 00:31:29,600
in whatever scenario it wants to be
632
00:31:29,600 --> 00:31:32,080
deployed. Whether it's a data center,
633
00:31:32,080 --> 00:31:34,720
cloud, on-prem, at the edge, or in a
634
00:31:34,720 --> 00:31:36,960
robotic system. All of those computing
635
00:31:36,960 --> 00:31:39,760
systems are different. And finally, the
636
00:31:39,760 --> 00:31:42,720
systems and chips. We are vertically
637
00:31:42,720 --> 00:31:44,960
integrated. What makes it incredibly
638
00:31:44,960 --> 00:31:46,480
powerful and the reason why you saw all
639
00:31:46,480 --> 00:31:48,640
the slides is because Nvidia is
640
00:31:48,640 --> 00:31:51,519
horizontally open. We work and integrate
641
00:31:51,519 --> 00:31:53,600
Nvidia's technology into whatever
642
00:31:53,600 --> 00:31:54,960
platform you would like us to integrate
643
00:31:54,960 --> 00:31:57,200
into. We offer you the software. We
644
00:31:57,200 --> 00:31:59,840
offer you libraries. We integrate with
645
00:31:59,840 --> 00:32:02,000
your technology so that we can bring
646
00:32:02,000 --> 00:32:04,559
accelerated computing to everybody in
647
00:32:04,559 --> 00:32:06,480
the world.
648
00:32:06,480 --> 00:32:08,320
Well,
649
00:32:08,320 --> 00:32:11,279
this GTC is really a great demonstration
650
00:32:11,279 --> 00:32:13,679
of that. You know, most of the time
651
00:32:13,679 --> 00:32:15,200
you'll see me talk
652
00:32:15,200 --> 00:32:16,640
about these verticals and I'll use some
653
00:32:16,640 --> 00:32:20,000
examples, but in every single case,
654
00:32:20,000 --> 00:32:22,080
whether it's automotive. By the way,
655
00:32:22,080 --> 00:32:24,320
financial services, the largest
656
00:32:24,320 --> 00:32:27,120
percentage of attendees at this GTC is
657
00:32:27,120 --> 00:32:30,080
from the financial services industry.
658
00:32:30,080 --> 00:32:33,200
I know. I'm hoping it's developers,
659
00:32:33,200 --> 00:32:35,360
not traders.
660
00:32:35,360 --> 00:32:38,360
Guys,
661
00:32:42,080 --> 00:32:44,240
Here's
662
00:32:44,240 --> 00:32:49,120
one thing I wanted to say. And so
663
00:32:49,120 --> 00:32:51,440
the audience represents NVIDIA's
664
00:32:51,440 --> 00:32:54,880
ecosystem, upstream of our supply chain
665
00:32:54,880 --> 00:32:57,360
and downstream of our supply chain. And
666
00:32:57,360 --> 00:32:59,200
we think about our supply chain
667
00:32:59,200 --> 00:33:02,000
upstream and downstream. And it's just
668
00:33:02,000 --> 00:33:05,519
so exciting that
669
00:33:05,519 --> 00:33:07,919
our entire upstream supply chain this
670
00:33:07,919 --> 00:33:10,080
last year
671
00:33:10,080 --> 00:33:11,840
irrespective of whether you're a
672
00:33:11,840 --> 00:33:13,679
50-year-old company. We have 70-year-old
673
00:33:13,679 --> 00:33:16,480
companies. We have a 150-year-old
674
00:33:16,480 --> 00:33:19,440
company, who are now part of the NVIDIA
675
00:33:19,440 --> 00:33:20,960
supply chain and partnering with us
676
00:33:20,960 --> 00:33:23,120
either upstream or downstream. And last
677
00:33:23,120 --> 00:33:24,880
year
678
00:33:24,880 --> 00:33:28,799
you had your record year, did you not?
679
00:33:28,799 --> 00:33:31,799
Congratulations.
680
00:33:35,039 --> 00:33:37,120
We're on to something here. This is the
681
00:33:37,120 --> 00:33:39,120
beginning of something very, very big.
682
00:33:39,120 --> 00:33:41,039
And so if you look at accelerated
683
00:33:41,039 --> 00:33:42,960
computing, we've now set the computing
684
00:33:42,960 --> 00:33:47,039
platform. But in order for us to
685
00:33:47,039 --> 00:33:49,360
activate those computing platforms, we
686
00:33:49,360 --> 00:33:51,679
need to have domain specific libraries
687
00:33:51,679 --> 00:33:54,640
that solve very important problems in
688
00:33:54,640 --> 00:33:56,559
each one of the verticals that we
689
00:33:56,559 --> 00:33:58,480
address. You see us addressing every
690
00:33:58,480 --> 00:34:01,519
single one of this. Autonomous vehicles,
691
00:34:01,519 --> 00:34:04,960
our reach, our breadth, our impact.
692
00:34:04,960 --> 00:34:07,279
Incredible. We have a track on that.
693
00:34:07,279 --> 00:34:08,960
financial services. I just mentioned
694
00:34:08,960 --> 00:34:11,839
algorithmic trading is going from
695
00:34:11,839 --> 00:34:14,240
classical machine learning with human
696
00:34:14,240 --> 00:34:16,879
feature engineering, called quant (the
697
00:34:16,879 --> 00:34:20,480
quants did that), to now supercomputers
698
00:34:20,480 --> 00:34:22,639
studying massive amounts of data,
699
00:34:22,639 --> 00:34:24,560
discovering insight and discovering
700
00:34:24,560 --> 00:34:27,679
patterns by themselves. And so this is going
701
00:34:27,679 --> 00:34:29,520
through its deep learning and its
702
00:34:29,520 --> 00:34:31,839
transformer moment. Healthcare
703
00:34:31,839 --> 00:34:34,159
is going through its ChatGPT moment.
704
00:34:34,159 --> 00:34:35,760
Some really exciting work there.
705
00:34:35,760 --> 00:34:38,320
We have a great keynote track
706
00:34:38,320 --> 00:34:39,760
here.
707
00:34:39,760 --> 00:34:41,280
Kimberly Powell is doing a great keynote
708
00:34:41,280 --> 00:34:43,839
track for healthcare. We're talking
709
00:34:43,839 --> 00:34:48,639
about AI physics or AI biology for drug
710
00:34:48,639 --> 00:34:51,440
discovery, AI agents for customer
711
00:34:51,440 --> 00:34:56,240
service and support, for diagnosis,
712
00:34:56,240 --> 00:34:58,720
and of course physical AI, robotic
713
00:34:58,720 --> 00:35:01,520
systems. All these different vectors of
714
00:35:01,520 --> 00:35:03,839
AI have different platforms that NVIDIA
715
00:35:03,839 --> 00:35:07,119
provides. Industrial: we are completely
716
00:35:07,119 --> 00:35:09,680
resetting and starting the largest
717
00:35:09,680 --> 00:35:13,599
buildout in human history, and most of
718
00:35:13,599 --> 00:35:16,079
the world's industries building AI
719
00:35:16,079 --> 00:35:18,320
factories, building chip plants, building
720
00:35:18,320 --> 00:35:20,400
computer plants are represented here
721
00:35:20,400 --> 00:35:23,359
today. Media and entertainment, gaming of
722
00:35:23,359 --> 00:35:27,200
course: real-time AI platforms so that we
723
00:35:27,200 --> 00:35:30,880
can do translation and broadcast support
724
00:35:30,880 --> 00:35:34,720
and live games and live video; an
725
00:35:34,720 --> 00:35:36,800
enormous amount of it will be augmented
726
00:35:36,800 --> 00:35:39,280
with AI. We have a platform
727
00:35:39,280 --> 00:35:41,839
called Holoscan. Quantum: there are 35
728
00:35:41,839 --> 00:35:44,000
different companies here building with
729
00:35:44,000 --> 00:35:47,280
us the next generation of quantum-GPU
730
00:35:47,280 --> 00:35:50,960
hybrid systems. Retail and CPG: using
731
00:35:50,960 --> 00:35:54,320
NVIDIA for supply chain, creating
732
00:35:54,320 --> 00:35:56,880
agentic shopping systems,
733
00:35:56,880 --> 00:36:00,240
AI agents for customer support; a lot of
734
00:36:00,240 --> 00:36:03,040
work being done here, a $35 trillion
735
00:36:03,040 --> 00:36:06,160
industry. Robotics, a $50 trillion industry
736
00:36:06,160 --> 00:36:08,240
in manufacturing. NVIDIA has been working
737
00:36:08,240 --> 00:36:10,880
in this area for a decade now building
738
00:36:10,880 --> 00:36:12,480
three computers, the fundamental
739
00:36:12,480 --> 00:36:14,800
computers necessary to build robotic
740
00:36:14,800 --> 00:36:17,359
systems. We are integrated with, and working
741
00:36:17,359 --> 00:36:20,079
with literally every single company that
742
00:36:20,079 --> 00:36:22,880
we know of building robots. We have 110
743
00:36:22,880 --> 00:36:25,440
robots here at the show. And then
744
00:36:25,440 --> 00:36:27,280
telecommunications
745
00:36:27,280 --> 00:36:29,440
about as large as the world's IT
746
00:36:29,440 --> 00:36:31,839
industry, about $2 trillion. We
747
00:36:31,839 --> 00:36:34,320
see of course base stations everywhere.
748
00:36:34,320 --> 00:36:37,200
It's one of the world's infrastructures.
749
00:36:37,200 --> 00:36:39,119
It was the infrastructure of the last
750
00:36:39,119 --> 00:36:41,119
generation of computing. That
751
00:36:41,119 --> 00:36:42,560
infrastructure is going to get
752
00:36:42,560 --> 00:36:44,640
completely reinvented. And the reason
753
00:36:44,640 --> 00:36:47,200
for that is very simple. That base
754
00:36:47,200 --> 00:36:50,400
station, which today
755
00:36:50,400 --> 00:36:53,760
does one thing,
756
00:36:53,760 --> 00:36:56,079
is going to be an AI infrastructure
757
00:36:56,079 --> 00:36:58,480
platform in the future. AI will run at
758
00:36:58,480 --> 00:37:01,520
the edge. And so lots of great
759
00:37:01,520 --> 00:37:04,079
discussion there. And our
760
00:37:04,079 --> 00:37:06,480
platform there is called Aerial, or AI-
761
00:37:06,480 --> 00:37:08,960
RAN. Big partnership with Nokia, big
762
00:37:08,960 --> 00:37:10,400
partnership with T-Mobile and many
763
00:37:10,400 --> 00:37:12,240
others.
764
00:37:12,240 --> 00:37:15,359
At the core of our business,
765
00:37:15,359 --> 00:37:17,839
everything that I just mentioned,
766
00:37:17,839 --> 00:37:20,000
computing platforms, but very
767
00:37:20,000 --> 00:37:23,440
importantly, our CUDA-X libraries. Our
768
00:37:23,440 --> 00:37:26,000
CUDA-X libraries are the
769
00:37:26,000 --> 00:37:28,400
algorithms that NVIDIA invents. We are
770
00:37:28,400 --> 00:37:30,079
an algorithm company. That's what makes
771
00:37:30,079 --> 00:37:32,160
us special. That's what makes
772
00:37:32,160 --> 00:37:34,320
it possible for me to be able to go into
773
00:37:34,320 --> 00:37:36,079
every single one of these industries,
774
00:37:36,079 --> 00:37:38,480
imagine the future and have the world's
775
00:37:38,480 --> 00:37:40,960
best computer scientists describe and
776
00:37:40,960 --> 00:37:43,440
solve problems, refactor it, re-express
777
00:37:43,440 --> 00:37:44,960
it,
778
00:37:44,960 --> 00:37:47,280
and turn it into a library. We have so
779
00:37:47,280 --> 00:37:50,560
many. I think at this show we're
780
00:37:50,560 --> 00:37:54,560
announcing 100... something, 70 libraries,
781
00:37:54,560 --> 00:37:57,520
maybe 40 models,
782
00:37:57,520 --> 00:37:59,440
and that's just at the show. We're
783
00:37:59,440 --> 00:38:01,839
updating these all the time.
784
00:38:01,839 --> 00:38:03,359
The
785
00:38:03,359 --> 00:38:06,640
libraries are the crown jewels of our
786
00:38:06,640 --> 00:38:08,400
company. It is what makes it possible
787
00:38:08,400 --> 00:38:10,560
for that platform, the computing
788
00:38:10,560 --> 00:38:13,760
platform to be activated in service of
789
00:38:13,760 --> 00:38:16,480
solving a problem, making impact. One of
790
00:38:16,480 --> 00:38:17,920
the biggest, one of the most important
791
00:38:17,920 --> 00:38:22,240
libraries that we ever created: cuDNN,
792
00:38:22,240 --> 00:38:25,359
CUDA Deep Neural Network. It completely
793
00:38:25,359 --> 00:38:27,839
revolutionized artificial intelligence,
794
00:38:27,839 --> 00:38:30,320
caused a big bang of modern AI. Let me
795
00:38:30,320 --> 00:38:34,920
show you a short video about CUDA-X.
796
00:38:36,160 --> 00:38:39,280
20 years ago, we built CUDA, a single
797
00:38:39,280 --> 00:38:42,320
architecture for accelerated computing.
798
00:38:42,320 --> 00:38:45,359
Today, we've reinvented computing. A
799
00:38:45,359 --> 00:38:47,599
thousand CUDA-X libraries help
800
00:38:47,599 --> 00:38:49,920
developers make breakthroughs in every
801
00:38:49,920 --> 00:38:52,640
field of science and engineering.
802
00:38:52,640 --> 00:38:56,880
cuOpt for decision optimization.
803
00:38:56,880 --> 00:39:01,040
cuLitho for computational lithography.
804
00:39:01,040 --> 00:39:05,839
cuDSS for direct sparse solvers.
805
00:39:05,839 --> 00:39:08,400
cuEquivariance for geometry-aware
806
00:39:08,400 --> 00:39:11,119
neural networks.
807
00:39:11,119 --> 00:39:15,280
Aerial for AI-RAN.
808
00:39:15,280 --> 00:39:19,480
Warp for differentiable physics.
809
00:39:19,680 --> 00:39:22,320
Parabricks for genomics.
810
00:39:22,320 --> 00:39:25,440
At their foundation are algorithms and
811
00:39:25,440 --> 00:39:29,720
they are beautiful.
812
00:40:08,240 --> 00:40:10,560
Wow.
813
00:40:10,560 --> 00:40:13,560
[Music]
814
00:40:35,599 --> 00:40:38,599
[Music]
815
00:40:56,000 --> 00:40:59,000
[Music]
816
00:41:13,119 --> 00:41:16,200
[Music]
817
00:41:34,560 --> 00:41:37,560
[Music]
818
00:41:39,599 --> 00:41:42,599
[Music]
819
00:42:00,160 --> 00:42:02,640
Everything you saw was a simulation.
820
00:42:02,640 --> 00:42:05,440
Some of it was principled solvers,
821
00:42:05,440 --> 00:42:07,680
fundamental physics solvers. Some of it
822
00:42:07,680 --> 00:42:11,359
was AI surrogates, AI physical models
823
00:42:11,359 --> 00:42:14,720
and some of it was physical AI robotics
824
00:42:14,720 --> 00:42:17,839
models. Everything was simulated.
825
00:42:17,839 --> 00:42:20,560
Nothing was animated. Nothing was
826
00:42:20,560 --> 00:42:23,359
articulated. Everything was completely
827
00:42:23,359 --> 00:42:26,720
simulated. That is what fundamentally
828
00:42:26,720 --> 00:42:30,160
Nvidia does. It is through the
829
00:42:30,160 --> 00:42:32,640
connection of understanding of the
830
00:42:32,640 --> 00:42:35,280
algorithms with our computing platforms
831
00:42:35,280 --> 00:42:37,680
that we're able to open up to unlock
832
00:42:37,680 --> 00:42:40,160
these opportunities. Nvidia is a
833
00:42:40,160 --> 00:42:43,599
vertically integrated computing company
834
00:42:43,599 --> 00:42:45,760
with open
835
00:42:45,760 --> 00:42:48,960
horizontal integration with the world.
836
00:42:48,960 --> 00:42:52,720
So that's CUDA-X. Well, just now you saw
837
00:42:52,720 --> 00:42:55,359
a whole bunch of companies. You saw
838
00:42:55,359 --> 00:42:58,400
Walmart, and you know, there's L'Oreal, and
839
00:42:58,400 --> 00:43:00,480
incredible companies, established
840
00:43:00,480 --> 00:43:03,200
companies, JP Morgan and Ro, and these
841
00:43:03,200 --> 00:43:06,240
are companies that have defined
842
00:43:06,240 --> 00:43:10,319
society up to today. Toyota is here. These
843
00:43:10,319 --> 00:43:12,000
are some of the largest companies in the
844
00:43:12,000 --> 00:43:13,760
world.
845
00:43:13,760 --> 00:43:16,160
It is also true
846
00:43:16,160 --> 00:43:17,599
that there's a whole bunch of companies
847
00:43:17,599 --> 00:43:19,839
you've never heard of. These are
848
00:43:19,839 --> 00:43:22,960
companies; we call them AI natives. A
849
00:43:22,960 --> 00:43:25,119
whole bunch of small companies. The
850
00:43:25,119 --> 00:43:27,280
list is gigantic.
851
00:43:27,280 --> 00:43:29,280
This is just a little tiny bit of
852
00:43:29,280 --> 00:43:32,560
it. And I couldn't decide whether
853
00:43:32,560 --> 00:43:34,560
to show you more or show you less. And
854
00:43:34,560 --> 00:43:36,400
so I made it so that you couldn't see
855
00:43:36,400 --> 00:43:39,400
any
856
00:43:39,599 --> 00:43:41,599
and nobody's feelings are hurt.
857
00:43:41,599 --> 00:43:44,560
However, inside this list are a bunch of
858
00:43:44,560 --> 00:43:46,240
brand new companies. There are companies
859
00:43:46,240 --> 00:43:48,480
like, for example, you might have heard of a
860
00:43:48,480 --> 00:43:51,040
couple of them, OpenAI, Anthropic, but
861
00:43:51,040 --> 00:43:52,880
there's a whole bunch of others,
862
00:43:52,880 --> 00:43:54,720
a whole bunch of others, and they serve
863
00:43:54,720 --> 00:43:56,880
different verticals
864
00:43:56,880 --> 00:43:59,200
Something happened in the last two years,
865
00:43:59,200 --> 00:44:01,040
particularly this last year. We've been
866
00:44:01,040 --> 00:44:02,560
working with the AI natives for a long
867
00:44:02,560 --> 00:44:05,040
time, and this last year it just
868
00:44:05,040 --> 00:44:07,040
skyrocketed, and I'll explain to you why
869
00:44:07,040 --> 00:44:09,280
it happened. This industry has
870
00:44:09,280 --> 00:44:12,000
skyrocketed: $150 billion of
871
00:44:12,000 --> 00:44:15,280
venture investment into
872
00:44:15,280 --> 00:44:18,960
startups, the largest in human history.
873
00:44:18,960 --> 00:44:22,079
This is also the first time that the
874
00:44:22,079 --> 00:44:24,160
scale of the investments went from
875
00:44:24,160 --> 00:44:26,319
millions of dollars, tens of millions of
876
00:44:26,319 --> 00:44:28,319
dollars to hundreds of millions of
877
00:44:28,319 --> 00:44:30,720
dollars and billions of dollars. And the
878
00:44:30,720 --> 00:44:33,040
reason for that is this is the first
879
00:44:33,040 --> 00:44:36,000
time in history that every single one of
880
00:44:36,000 --> 00:44:38,160
these companies
881
00:44:38,160 --> 00:44:41,040
needs compute and lots and lots of it.
882
00:44:41,040 --> 00:44:43,440
They need tokens, lots and lots of it.
883
00:44:43,440 --> 00:44:45,119
They're either going
884
00:44:45,119 --> 00:44:48,079
to create tokens
885
00:44:48,079 --> 00:44:51,440
and generate tokens, or they're going to
886
00:44:51,440 --> 00:44:52,960
integrate
887
00:44:52,960 --> 00:44:55,359
and add value to tokens
888
00:44:55,359 --> 00:44:57,359
that are available, created by Anthropic
889
00:44:57,359 --> 00:45:00,160
and OpenAI and others. And so this
890
00:45:00,160 --> 00:45:02,720
industry is different in so many
891
00:45:02,720 --> 00:45:04,400
different ways, but the one thing that is
892
00:45:04,400 --> 00:45:06,160
very clear is the impact that they're
893
00:45:06,160 --> 00:45:08,720
making; the incredible value that
894
00:45:08,720 --> 00:45:10,960
they're delivering is already quite
895
00:45:10,960 --> 00:45:14,319
tangible. AI natives.
896
00:45:14,319 --> 00:45:17,760
All because we reinvented computing.
897
00:45:17,760 --> 00:45:19,839
Just like during the PC revolution, a
898
00:45:19,839 --> 00:45:21,040
whole bunch of new companies were
899
00:45:21,040 --> 00:45:24,000
created. Just as, during the
900
00:45:24,000 --> 00:45:25,520
internet revolution, a whole bunch of
901
00:45:25,520 --> 00:45:27,119
companies were created, and mobile-cloud,
902
00:45:27,119 --> 00:45:28,720
a whole bunch of companies were created.
903
00:45:28,720 --> 00:45:30,240
Each one of them had their own standards,
904
00:45:30,240 --> 00:45:32,079
and we'll talk about one of the
905
00:45:32,079 --> 00:45:34,720
major standards that just happened.
906
00:45:34,720 --> 00:45:36,960
Incredibly important. And this
907
00:45:36,960 --> 00:45:40,160
generation, we also have our own large
908
00:45:40,160 --> 00:45:42,720
number of very, very special companies.
909
00:45:42,720 --> 00:45:45,359
We reinvented computing. It stands to
910
00:45:45,359 --> 00:45:47,200
reason there's going to be a whole new
911
00:45:47,200 --> 00:45:50,560
crop of really important companies,
912
00:45:50,560 --> 00:45:52,560
consequential companies for the future
913
00:45:52,560 --> 00:45:55,119
of the world. The Googles, the
914
00:45:55,119 --> 00:45:57,280
Amazons, the Metas, consequential
915
00:45:57,280 --> 00:45:59,280
companies that have come as a result of
916
00:45:59,280 --> 00:46:02,400
the last computing platform shift. We
917
00:46:02,400 --> 00:46:03,680
are now at the beginning of a new
918
00:46:03,680 --> 00:46:05,440
platform shift. But what happened in the
919
00:46:05,440 --> 00:46:07,920
last couple years? Well, we've been
920
00:46:07,920 --> 00:46:09,200
watching, as you know, we've been
921
00:46:09,200 --> 00:46:10,640
working on deep learning and working on
922
00:46:10,640 --> 00:46:13,520
AI, the big bang of modern AI. We were
923
00:46:13,520 --> 00:46:15,119
right there at the spot and we've been
924
00:46:15,119 --> 00:46:16,880
advancing this field for quite some
925
00:46:16,880 --> 00:46:20,400
time. But why the last two years? What
926
00:46:20,400 --> 00:46:21,760
happened in the last two years? Well,
927
00:46:21,760 --> 00:46:25,920
three things. ChatGPT, of course, started
928
00:46:25,920 --> 00:46:28,960
the generative AI era. It's able to not
929
00:46:28,960 --> 00:46:32,400
just perceive and
930
00:46:32,400 --> 00:46:34,880
understand. It's able to also translate
931
00:46:34,880 --> 00:46:37,200
and generate unique
932
00:46:37,200 --> 00:46:39,920
content. I showed you the fusion of
933
00:46:39,920 --> 00:46:42,400
generative AI with computer graphics and
934
00:46:42,400 --> 00:46:44,800
it brought computer graphics to life.
935
00:46:44,800 --> 00:46:46,880
You guys — everybody in the world —
936
00:46:46,880 --> 00:46:48,880
should be using ChatGPT. I know I use
937
00:46:48,880 --> 00:46:50,319
it every single morning. Used it plenty
938
00:46:50,319 --> 00:46:52,640
this morning. And so ChatGPT started the
939
00:46:52,640 --> 00:46:56,400
generative AI era. And by the
940
00:46:56,400 --> 00:46:59,839
way — generative computing
941
00:46:59,839 --> 00:47:01,839
versus the way we used to do computing:
942
00:47:01,839 --> 00:47:04,160
generative AI is a
943
00:47:04,160 --> 00:47:06,560
capability of software, but it has
944
00:47:06,560 --> 00:47:08,880
profoundly changed how computing is
945
00:47:08,880 --> 00:47:10,400
done.
946
00:47:10,400 --> 00:47:13,040
Computing used to be retrieval-based; now
947
00:47:13,040 --> 00:47:15,440
it's generative. Keep that thought in
948
00:47:15,440 --> 00:47:17,359
mind when I talk about certain things
949
00:47:17,359 --> 00:47:20,240
and you'll realize why it is that
950
00:47:20,240 --> 00:47:22,400
everything that we do is going to change
951
00:47:22,400 --> 00:47:24,240
how computers are architected, how
952
00:47:24,240 --> 00:47:26,240
computers are provided, how computers
953
00:47:26,240 --> 00:47:28,000
are going to be built out and what is
954
00:47:28,000 --> 00:47:31,520
the meaning of computing altogether.
955
00:47:31,520 --> 00:47:36,560
Generative AI: end of 2022, into 2023. The
956
00:47:36,560 --> 00:47:40,319
next, reasoning AI: o1,
957
00:47:40,319 --> 00:47:43,280
which then took off with o3.
958
00:47:43,280 --> 00:47:45,920
Reasoning allowed it to reflect, allowed
959
00:47:45,920 --> 00:47:48,400
it to think to itself, allowed it to
960
00:47:48,400 --> 00:47:51,520
plan, to break down problems
961
00:47:51,520 --> 00:47:54,319
and decompose a problem it couldn't
962
00:47:54,319 --> 00:47:56,720
understand into steps or parts that it
963
00:47:56,720 --> 00:47:59,200
could understand. It could ground itself
964
00:47:59,200 --> 00:48:03,760
on research. o1 made generative AI
965
00:48:03,760 --> 00:48:07,760
trustworthy and grounded on truth. That
966
00:48:07,760 --> 00:48:11,920
caused ChatGPT to simply take off. And
967
00:48:11,920 --> 00:48:14,000
that was a very, very big moment. The
968
00:48:14,000 --> 00:48:16,560
amount of input tokens that was
969
00:48:16,560 --> 00:48:18,880
necessary in order to produce, and the
970
00:48:18,880 --> 00:48:20,880
amount of output tokens it
971
00:48:20,880 --> 00:48:23,599
generated in order to reason — the model
972
00:48:23,599 --> 00:48:26,000
was a little bit larger. You know, of
973
00:48:26,000 --> 00:48:27,440
course you could have much larger models;
974
00:48:27,440 --> 00:48:29,680
the model o1 was a little bit larger, not
975
00:48:29,680 --> 00:48:33,040
much larger. But its input token usage
976
00:48:33,040 --> 00:48:34,960
for context
977
00:48:34,960 --> 00:48:38,319
and its output tokens for thinking
978
00:48:38,319 --> 00:48:39,839
increased the amount of computation
979
00:48:39,839 --> 00:48:43,040
tremendously. Then came Claude Code, the
980
00:48:43,040 --> 00:48:46,240
first agentic model. It was able to read
981
00:48:46,240 --> 00:48:50,400
files, code, compile it, test it,
982
00:48:50,400 --> 00:48:53,280
evaluate it, go back and iterate on it.
983
00:48:53,280 --> 00:48:55,680
Claude Code has revolutionized software
984
00:48:55,680 --> 00:48:58,640
engineering. As all of you know, 100% of
985
00:48:58,640 --> 00:49:02,160
NVIDIA is using a combination of them, or
986
00:49:02,160 --> 00:49:04,240
oftentimes all three of them: Claude
987
00:49:04,240 --> 00:49:07,280
Code, Codex, and Cursor — all over
988
00:49:07,280 --> 00:49:09,280
Nvidia. There's not one software
989
00:49:09,280 --> 00:49:12,000
engineer today who is not assisted by
990
00:49:12,000 --> 00:49:15,599
one or many AI agents helping them code.
991
00:49:15,599 --> 00:49:18,720
Claude Code completely represents
992
00:49:18,720 --> 00:49:21,599
the new inflection. And for the first
993
00:49:21,599 --> 00:49:23,599
time.
994
00:49:23,599 --> 00:49:27,599
You don't ask an AI what,
995
00:49:27,599 --> 00:49:31,520
where, when, how.
996
00:49:31,520 --> 00:49:33,520
You ask it
997
00:49:33,520 --> 00:49:37,520
create, do, build.
998
00:49:37,520 --> 00:49:40,720
You ask it to use tools,
999
00:49:40,720 --> 00:49:44,319
take your context, read files. It's able
1000
00:49:44,319 --> 00:49:46,720
to agentically break down a problem,
1001
00:49:46,720 --> 00:49:49,440
reason about it, reflect on it. It's
1002
00:49:49,440 --> 00:49:51,680
able to solve problems, and actually
1003
00:49:51,680 --> 00:49:55,680
perform tasks. An AI that was able to
1004
00:49:55,680 --> 00:49:58,160
perceive became an AI that could
1005
00:49:58,160 --> 00:50:00,079
generate. An AI that could generate
1006
00:50:00,079 --> 00:50:01,920
became an AI that could reason. An AI
1007
00:50:01,920 --> 00:50:04,160
that could reason now became an AI that
1008
00:50:04,160 --> 00:50:07,119
can actually do work. Very productive
1009
00:50:07,119 --> 00:50:09,839
work. The amount of computation in the
1010
00:50:09,839 --> 00:50:12,559
last two years, we know that everybody
1011
00:50:12,559 --> 00:50:15,760
in this room knows the computing demand
1012
00:50:15,760 --> 00:50:19,359
for NVIDIA GPUs is off the charts. Spot
1013
00:50:19,359 --> 00:50:21,920
pricing is skyrocketing. You couldn't
1014
00:50:21,920 --> 00:50:24,160
find a GPU if you tried. And yet, in the
1015
00:50:24,160 --> 00:50:27,200
meantime, we're shipping GPUs out,
1016
00:50:27,200 --> 00:50:29,200
incredible amounts of it, and demand
1017
00:50:29,200 --> 00:50:31,680
just keeps on going up. There's a reason
1018
00:50:31,680 --> 00:50:36,000
for that: this fundamental inflection.
1019
00:50:36,000 --> 00:50:39,520
Finally, AI is able to do productive
1020
00:50:39,520 --> 00:50:43,520
work and therefore the inflection point
1021
00:50:43,520 --> 00:50:47,280
of inference has arrived.
1022
00:50:47,280 --> 00:50:50,400
AI now has to think. In order to think,
1023
00:50:50,400 --> 00:50:53,440
it has to inference. AI now has to do.
1024
00:50:53,440 --> 00:50:55,839
In order to do, it has to inference. AI
1025
00:50:55,839 --> 00:50:57,599
has to read. In order to do so, it has
1026
00:50:57,599 --> 00:50:59,680
to inference. It has to reason. It has
1027
00:50:59,680 --> 00:51:04,000
to inference. Every part of AI —
1028
00:51:04,000 --> 00:51:05,839
every time it has to think, it has to
1029
00:51:05,839 --> 00:51:08,880
reason, it has to do, it has to generate
1030
00:51:08,880 --> 00:51:12,160
tokens — it has to inference. It's way past
1031
00:51:12,160 --> 00:51:15,040
training now; it's in the field of
1032
00:51:15,040 --> 00:51:17,280
inference. So the inference
1033
00:51:17,280 --> 00:51:20,000
inflection has arrived
1034
00:51:20,000 --> 00:51:22,480
at the time when the amount of tokens —
1035
00:51:22,480 --> 00:51:24,960
the amount of compute necessary —
1036
00:51:24,960 --> 00:51:28,559
increased by roughly 10,000 times. Now
1037
00:51:28,559 --> 00:51:31,200
combine this with the fact that
1038
00:51:31,200 --> 00:51:32,960
in the last two years the
1039
00:51:32,960 --> 00:51:35,200
computing demand of the
1040
00:51:35,200 --> 00:51:38,240
work has gone up by 10,000 times and the
1041
00:51:38,240 --> 00:51:41,119
amount of usage
1042
00:51:41,119 --> 00:51:43,760
the amount of usage has probably gone up
1043
00:51:43,760 --> 00:51:48,079
by a hundred times.
1044
00:51:48,079 --> 00:51:50,960
People have heard me say I believe that
1045
00:51:50,960 --> 00:51:54,079
computing demand has increased by 1
1046
00:51:54,079 --> 00:51:57,680
million times in the last two years. It
1047
00:51:57,680 --> 00:52:00,160
is the feeling that we all have. It is
1048
00:52:00,160 --> 00:52:02,160
the feeling every startup has. It's the
1049
00:52:02,160 --> 00:52:04,160
feeling that OpenAI has. It's the
1050
00:52:04,160 --> 00:52:06,000
feeling that Anthropic has. If they
1051
00:52:06,000 --> 00:52:08,319
could just get more capacity, they could
1052
00:52:08,319 --> 00:52:10,160
generate more tokens. Their revenues
1053
00:52:10,160 --> 00:52:12,319
would go up. More people could use it.
1054
00:52:12,319 --> 00:52:14,720
The more advanced, the smarter the AI
1055
00:52:14,720 --> 00:52:18,319
could become. We are now at that
1056
00:52:18,319 --> 00:52:20,960
positive flywheel system. We
1057
00:52:20,960 --> 00:52:22,640
have reached that moment. The
1058
00:52:22,640 --> 00:52:25,760
inflection, the inference inflection has
1059
00:52:25,760 --> 00:52:32,240
arrived. Last year at this time, I said
1060
00:52:32,240 --> 00:52:33,839
that
1061
00:52:33,839 --> 00:52:37,839
where I stood at that moment in time, we
1062
00:52:37,839 --> 00:52:40,720
saw about
1063
00:52:40,720 --> 00:52:42,800
$500
1064
00:52:42,800 --> 00:52:44,800
billion.
1065
00:52:44,800 --> 00:52:48,480
We saw $500 billion
1066
00:52:48,480 --> 00:52:51,200
of very high confidence demand and
1067
00:52:51,200 --> 00:52:53,680
purchase orders
1068
00:52:53,680 --> 00:52:59,839
for Blackwell and Rubin through 2026.
1069
00:52:59,839 --> 00:53:03,119
I said that last year.
1070
00:53:03,119 --> 00:53:05,359
Now, I don't know if you guys feel the
1071
00:53:05,359 --> 00:53:08,559
same way, but $500 billion is an
1072
00:53:08,559 --> 00:53:12,800
enormous amount of revenue.
1073
00:53:12,800 --> 00:53:16,280
No one impressed.
1074
00:53:22,319 --> 00:53:23,839
I know why you're not impressed. Because
1075
00:53:23,839 --> 00:53:27,800
all of you had record years.
1076
00:53:29,440 --> 00:53:33,280
Well, I'm here to tell you
1077
00:53:33,280 --> 00:53:36,319
that right now where I stand, a few
1078
00:53:36,319 --> 00:53:40,160
short months after GTC DC,
1079
00:53:40,160 --> 00:53:43,280
one year after last GTC, right here
1080
00:53:43,280 --> 00:53:45,760
where I stand,
1081
00:53:45,760 --> 00:53:51,000
I see through 2027
1082
00:53:51,680 --> 00:53:56,520
at least $1 trillion
1083
00:54:03,520 --> 00:54:06,800
Now, does it make any sense?
1084
00:54:06,800 --> 00:54:08,240
And that's what I'm going to spend the
1085
00:54:08,240 --> 00:54:11,359
rest of the time talking about. In fact,
1086
00:54:11,359 --> 00:54:15,599
we are going to be short. I am certain
1087
00:54:15,599 --> 00:54:17,359
computing demand will be much higher
1088
00:54:17,359 --> 00:54:19,599
than that. And there's a reason for
1089
00:54:19,599 --> 00:54:22,800
that. So, the first thing is
1090
00:54:22,800 --> 00:54:24,160
we did a lot of work in the last
1091
00:54:24,160 --> 00:54:27,520
year. Of course, as you know, 2025 was
1092
00:54:27,520 --> 00:54:30,160
NVIDIA's year of inference. We wanted to
1093
00:54:30,160 --> 00:54:31,760
make sure that not only were we good at
1094
00:54:31,760 --> 00:54:33,599
training and post-training, that we
1095
00:54:33,599 --> 00:54:35,599
were incredibly good at every single
1096
00:54:35,599 --> 00:54:38,319
phase of AI so that the investments that
1097
00:54:38,319 --> 00:54:40,559
were made, investments made in our
1098
00:54:40,559 --> 00:54:43,040
infrastructure could scale out for as
1099
00:54:43,040 --> 00:54:44,880
long as they would like to use it. And
1100
00:54:44,880 --> 00:54:46,800
the useful life of Nvidia's
1101
00:54:46,800 --> 00:54:48,559
infrastructure would be long and
1102
00:54:48,559 --> 00:54:50,240
therefore the cost would be incredibly
1103
00:54:50,240 --> 00:54:52,400
low. The longer you could use it, the
1104
00:54:52,400 --> 00:54:54,400
lower the cost. There's no question in
1105
00:54:54,400 --> 00:54:57,440
my mind NVIDIA systems are the lowest
1106
00:54:57,440 --> 00:55:00,079
cost AI infrastructure you could get
1107
00:55:00,079 --> 00:55:02,000
in the world. And so the
1108
00:55:02,000 --> 00:55:04,559
first part was last year was all about
1109
00:55:04,559 --> 00:55:07,280
AI for inference and it drove this
1110
00:55:07,280 --> 00:55:11,280
inflection point. Simultaneously
1111
00:55:11,280 --> 00:55:13,520
we were very pleased last year that
1112
00:55:13,520 --> 00:55:16,480
Anthropic has come to NVIDIA, that MSL,
1113
00:55:16,480 --> 00:55:20,400
Meta Superintelligence Labs, has chosen NVIDIA.
1114
00:55:20,400 --> 00:55:24,480
Meanwhile, as a group,
1115
00:55:24,480 --> 00:55:26,400
this represents
1116
00:55:26,400 --> 00:55:30,800
one-third of the world's AI compute. Open-
1117
00:55:30,800 --> 00:55:33,359
source models have
1118
00:55:33,359 --> 00:55:35,359
reached near the frontier, and they are
1119
00:55:35,359 --> 00:55:38,000
literally everywhere. And NVIDIA, as you
1120
00:55:38,000 --> 00:55:40,640
know today we're the only platform in
1121
00:55:40,640 --> 00:55:43,119
the world that runs every single
1122
00:55:43,119 --> 00:55:45,920
domain of AI
1123
00:55:45,920 --> 00:55:48,079
across every single one of these AI
1124
00:55:48,079 --> 00:55:50,160
models
1125
00:55:50,160 --> 00:55:52,400
in language and biology and computer
1126
00:55:52,400 --> 00:55:56,960
graphics computer vision and speech
1127
00:55:56,960 --> 00:55:59,440
proteins and chemicals robotics and
1128
00:55:59,440 --> 00:56:03,760
otherwise edge or cloud any language
1129
00:56:03,760 --> 00:56:06,799
NVIDIA's architecture is fungible for all
1130
00:56:06,799 --> 00:56:08,400
of that and we're incredible for all of
1131
00:56:08,400 --> 00:56:11,119
that. That allows us to be the lowest
1132
00:56:11,119 --> 00:56:14,480
cost, the highest confidence platform
1133
00:56:14,480 --> 00:56:15,680
because when you're building these
1134
00:56:15,680 --> 00:56:18,000
systems, as I mentioned, a trillion
1135
00:56:18,000 --> 00:56:19,920
dollars is an enormous amount of
1136
00:56:19,920 --> 00:56:21,839
infrastructure. You have to have
1137
00:56:21,839 --> 00:56:23,839
complete confidence that the trillion
1138
00:56:23,839 --> 00:56:26,559
dollars you're putting down will be
1139
00:56:26,559 --> 00:56:29,599
utilized, will be performant, will be
1140
00:56:29,599 --> 00:56:31,599
incredibly cost-effective, and have a
1141
00:56:31,599 --> 00:56:34,799
useful life for as long as you can see.
1142
00:56:34,799 --> 00:56:37,119
That infrastructure investment you could
1143
00:56:37,119 --> 00:56:39,040
make on NVIDIA you could make with
1144
00:56:39,040 --> 00:56:41,119
complete confidence.
1145
00:56:41,119 --> 00:56:44,079
We have now proven that it is the only
1146
00:56:44,079 --> 00:56:45,839
infrastructure in the world that you
1147
00:56:45,839 --> 00:56:48,400
could go anywhere in the world and build
1148
00:56:48,400 --> 00:56:50,400
with complete confidence. You want to
1149
00:56:50,400 --> 00:56:52,160
put it in any of the clouds, we're
1150
00:56:52,160 --> 00:56:53,920
delighted by that. You want to put it on
1151
00:56:53,920 --> 00:56:56,000
prem, we're happy about that. You want
1152
00:56:56,000 --> 00:56:57,760
to put it in any country, anywhere,
1153
00:56:57,760 --> 00:57:00,559
we're delighted to support you. We are
1154
00:57:00,559 --> 00:57:02,160
now
1155
00:57:02,160 --> 00:57:04,880
a computing platform that runs all of
1156
00:57:04,880 --> 00:57:09,119
AI. Now, our business
1157
00:57:09,119 --> 00:57:12,079
is already starting to show that: 60% of our
1158
00:57:12,079 --> 00:57:16,000
business is hyperscalers. The top five
1159
00:57:16,000 --> 00:57:17,520
hyperscalers.
1160
00:57:17,520 --> 00:57:19,440
However, even within that top five
1161
00:57:19,440 --> 00:57:22,880
hyperscalers, some of it is internal AI
1162
00:57:22,880 --> 00:57:25,359
consumption. The internal AI consumption
1163
00:57:25,359 --> 00:57:27,280
really important work like Rexus is
1164
00:57:27,280 --> 00:57:29,680
moving from recommener systems of tables
1165
00:57:29,680 --> 00:57:31,839
and collaborative filtering and content
1166
00:57:31,839 --> 00:57:33,920
filtering. It's moving towards deep
1167
00:57:33,920 --> 00:57:35,920
learning and large language models.
1168
00:57:35,920 --> 00:57:38,799
Search moving to deep learning large
1169
00:57:38,799 --> 00:57:41,200
language models. Almost all of these
1170
00:57:41,200 --> 00:57:43,280
different hyperscale workloads are now
1171
00:57:43,280 --> 00:57:45,760
shifting towards a workload that
1172
00:57:45,760 --> 00:57:48,000
Nvidia GPUs are incredibly good at. But
1173
00:57:48,000 --> 00:57:50,000
on top of that, because we work with
1174
00:57:50,000 --> 00:57:52,880
every AI lab, because
1175
00:57:52,880 --> 00:57:55,359
we accelerate every AI model, and
1176
00:57:55,359 --> 00:57:57,839
because we have a large ecosystem of AI
1177
00:57:57,839 --> 00:57:59,760
natives that we work with that we can
1178
00:57:59,760 --> 00:58:03,200
bring to the clouds — that investment, no
1179
00:58:03,200 --> 00:58:06,160
matter how large, no matter how quick,
1180
00:58:06,160 --> 00:58:09,040
that compute will be consumed. And that
1181
00:58:09,040 --> 00:58:11,040
represents 60% of our business. The
1182
00:58:11,040 --> 00:58:14,480
other 40% is just everywhere. Regional
1183
00:58:14,480 --> 00:58:17,599
clouds, sovereign clouds, enterprise,
1184
00:58:17,599 --> 00:58:21,599
industrial, robotics, edge, big systems,
1185
00:58:21,599 --> 00:58:24,000
supercomputing systems, small servers,
1186
00:58:24,000 --> 00:58:26,240
enterprise servers.
1187
00:58:26,240 --> 00:58:30,000
The number of systems, incredible.
1188
00:58:30,000 --> 00:58:33,359
The diversity of AI
1189
00:58:33,359 --> 00:58:36,160
is also its resilience.
1190
00:58:36,160 --> 00:58:39,920
The span of reach of AI is its
1191
00:58:39,920 --> 00:58:42,319
resilience. There is no question this is
1192
00:58:42,319 --> 00:58:45,599
not a one-app technology. This is now
1193
00:58:45,599 --> 00:58:49,119
fundamental. This is absolutely a new
1194
00:58:49,119 --> 00:58:53,040
computing platform shift. Well, our job
1195
00:58:53,040 --> 00:58:55,520
is to continue to advance the technology
1196
00:58:55,520 --> 00:58:56,960
and one of the most important things
1197
00:58:56,960 --> 00:58:59,119
that I mentioned last year was last year
1198
00:58:59,119 --> 00:59:02,880
was our year of inference. We dedicated
1199
00:59:02,880 --> 00:59:06,559
everything. We took a giant chance and
1200
00:59:06,559 --> 00:59:10,160
reinvented while Hopper was at its prime
1201
00:59:10,160 --> 00:59:13,520
and it was just cooking. We decided that
1202
00:59:13,520 --> 00:59:16,720
the Hopper architecture the MVL link by
1203
00:59:16,720 --> 00:59:19,839
8 had to be taken to the next level. We
1204
00:59:19,839 --> 00:59:21,920
completely rearchitected the system,
1205
00:59:21,920 --> 00:59:24,079
disaggregated the computing system
1206
00:59:24,079 --> 00:59:27,760
altogether, and created NVLink 72. The way
1207
00:59:27,760 --> 00:59:29,119
that it's built, the way it's
1208
00:59:29,119 --> 00:59:31,440
manufactured, the way it's programmed
1209
00:59:31,440 --> 00:59:33,839
completely changed. Grace Blackwell
1210
00:59:33,839 --> 00:59:37,440
NVLink 72 was a giant bet, and it wasn't
1211
00:59:37,440 --> 00:59:39,359
easy for anybody and many of my partners
1212
00:59:39,359 --> 00:59:41,040
here in the room. I want to thank all of
1213
00:59:41,040 --> 00:59:42,480
you for the hard work that you guys did.
1214
00:59:42,480 --> 00:59:45,640
Thank you.
1215
00:59:48,400 --> 00:59:50,319
NVLink 72.
1216
00:59:50,319 --> 00:59:55,359
NVFP4 — not just FP4 precision; NVFP4 is a
1217
00:59:55,359 --> 00:59:57,119
whole different type of tensor core and
1218
00:59:57,119 --> 00:59:59,680
computational unit. We've demonstrated
1219
00:59:59,680 --> 01:00:03,359
now that we can inference with NVFP4
1220
01:00:03,359 --> 01:00:06,799
without loss of precision but gigantic
1221
01:00:06,799 --> 01:00:08,480
boost in performance and energy
1222
01:00:08,480 --> 01:00:10,559
efficiency. We've also been able to use
1223
01:00:10,559 --> 01:00:15,680
NVFP4 for training. So NVLink 72, NVFP4,
1224
01:00:15,680 --> 01:00:19,359
the invention of Dynamo, TensorRT-LLM, a
1225
01:00:19,359 --> 01:00:21,280
whole bunch of new algorithms. We even
1226
01:00:21,280 --> 01:00:23,520
built a supercomputer to help us
1227
01:00:23,520 --> 01:00:25,599
optimize kernels and help us optimize
1228
01:00:25,599 --> 01:00:27,839
our complete stack. We call it DGX
1229
01:00:27,839 --> 01:00:30,640
cloud. We invested billions of dollars
1230
01:00:30,640 --> 01:00:33,680
of supercomputing capability to help us
1231
01:00:33,680 --> 01:00:36,319
create the kernels, the software that
1232
01:00:36,319 --> 01:00:40,319
made inference possible. Well,
1233
01:00:40,319 --> 01:00:44,559
the results all came together. People
1234
01:00:44,559 --> 01:00:46,640
used to tell me, "But Jensen,
1235
01:00:46,640 --> 01:00:49,440
inference is so easy." Inference is the
1236
01:00:49,440 --> 01:00:51,680
ultimate hard. Inference is the ultimate
1237
01:00:51,680 --> 01:00:53,599
hard. It is also ultimately important
1238
01:00:53,599 --> 01:00:55,760
because it drives your revenues. And so
1239
01:00:55,760 --> 01:00:58,240
this is the outcome. This is from
1240
01:00:58,240 --> 01:01:00,400
SemiAnalysis. This is the largest, most
1241
01:01:00,400 --> 01:01:04,559
comprehensive sweep of AI
1242
01:01:04,559 --> 01:01:06,720
inference that has ever been done. And
1243
01:01:06,720 --> 01:01:09,440
what you see here on
1244
01:01:09,440 --> 01:01:13,839
this side is tokens per watt.
1245
01:01:13,839 --> 01:01:16,000
Tokens per watt is important because
1246
01:01:16,000 --> 01:01:18,799
every data center every single factory
1247
01:01:18,799 --> 01:01:21,119
by definition is power constrained. A
1248
01:01:21,119 --> 01:01:22,880
one gigawatt factory will never become
1249
01:01:22,880 --> 01:01:25,839
two. It's physically constrained — the
1250
01:01:25,839 --> 01:01:28,559
laws of atoms, the laws of physicality.
1251
01:01:28,559 --> 01:01:32,079
And so from that one gigawatt data center
1252
01:01:32,079 --> 01:01:34,880
you want to drive the maximum number of
1253
01:01:34,880 --> 01:01:38,240
tokens, which is the production — the
1254
01:01:38,240 --> 01:01:40,160
product — of that factory. So you want
1255
01:01:40,160 --> 01:01:42,079
to be on top of that curve,
1256
01:01:42,079 --> 01:01:46,640
as high as you can. The x-axis is
1257
01:01:46,640 --> 01:01:49,760
the interactivity — the speed of inference,
1258
01:01:49,760 --> 01:01:52,480
the speed of each inference. The faster
1259
01:01:52,480 --> 01:01:54,720
you can inference,
1260
01:01:54,720 --> 01:01:57,440
the faster you could of course respond.
1261
01:01:57,440 --> 01:01:59,760
But very importantly, the faster you can
1262
01:01:59,760 --> 01:02:02,400
inference, the larger the models, the
1263
01:02:02,400 --> 01:02:04,400
more context you could process, the more
1264
01:02:04,400 --> 01:02:07,599
tokens you can think through. This axis
1265
01:02:07,599 --> 01:02:11,839
is the same as smartness of the AI. And
1266
01:02:11,839 --> 01:02:13,920
so this is the throughput of the AI.
1267
01:02:13,920 --> 01:02:17,440
This is the smartness of the AI. Notice
1268
01:02:17,440 --> 01:02:20,400
the smarter the AI, the lower your
1269
01:02:20,400 --> 01:02:22,000
throughput. Makes sense? You're thinking
1270
01:02:22,000 --> 01:02:26,240
longer. Okay? And so this axis is the
1271
01:02:26,240 --> 01:02:27,359
speed. And I'm going to come back to
1272
01:02:27,359 --> 01:02:29,119
this. This is important. This is where I
1273
01:02:29,119 --> 01:02:32,000
torture all of you. But it's too
1274
01:02:32,000 --> 01:02:34,799
important. You watch — every CEO
1275
01:02:34,799 --> 01:02:37,119
in the world will study
1276
01:02:37,119 --> 01:02:39,200
their business from now on in the way
1277
01:02:39,200 --> 01:02:41,359
I'm about to describe
1278
01:02:41,359 --> 01:02:45,040
because this is your token factory. This
1279
01:02:45,040 --> 01:02:47,920
is your AI factory. This is your
1280
01:02:47,920 --> 01:02:49,839
revenues. There's no question about that
1281
01:02:49,839 --> 01:02:51,359
going forward. And so this is the
1282
01:02:51,359 --> 01:02:53,599
throughput. This is the intelligence.
1283
01:02:53,599 --> 01:02:57,040
Better perf per watt for a given power
1284
01:02:57,040 --> 01:02:59,599
of data center. The more throughput, the
1285
01:02:59,599 --> 01:03:01,520
more tokens you could produce. On this
1286
01:03:01,520 --> 01:03:05,359
side is cost. Notice Nvidia is the
1287
01:03:05,359 --> 01:03:07,440
highest performance in the world. Nobody
1288
01:03:07,440 --> 01:03:10,160
would be surprised by that. They would
1289
01:03:10,160 --> 01:03:13,119
be surprised by the fact that in one
1290
01:03:13,119 --> 01:03:15,599
generation — whereas Moore's law, through
1291
01:03:15,599 --> 01:03:19,520
transistors, would have given us 50%,
1292
01:03:19,520 --> 01:03:21,839
maybe two times —
1293
01:03:21,839 --> 01:03:24,559
Moore's law would probably give us one
1294
01:03:24,559 --> 01:03:27,440
and a half times more performance. You
1295
01:03:27,440 --> 01:03:30,319
would have expected from Hopper H200 one
1296
01:03:30,319 --> 01:03:32,319
and a half times higher. Nobody would
1297
01:03:32,319 --> 01:03:36,400
have expected 35 times higher. I said
1298
01:03:36,400 --> 01:03:40,319
last year at this time that Nvidia's
1299
01:03:40,319 --> 01:03:44,240
Grace Blackwell NVLink 72 was 35 times
1300
01:03:44,240 --> 01:03:48,400
perf per watt. Nobody believed me. And
1301
01:03:48,400 --> 01:03:51,680
then SemiAnalysis came out and Dylan
1302
01:03:51,680 --> 01:03:54,000
Patel had a quote.
1303
01:03:54,000 --> 01:03:58,039
He accused me of sandbagging.
1304
01:03:58,640 --> 01:04:00,400
He accused me of sandbagging. He says,
1305
01:04:00,400 --> 01:04:03,359
"Jensen sandbagged. It's actually 50
1306
01:04:03,359 --> 01:04:06,319
times." And he's not wrong. He's not
1307
01:04:06,319 --> 01:04:12,200
wrong. And so our cost per token, yeah,
1308
01:04:15,760 --> 01:04:18,480
our cost per token is the lowest in the
1309
01:04:18,480 --> 01:04:21,680
world. You can't beat it.
1310
01:04:21,680 --> 01:04:23,839
I've said before, if you have the wrong
1311
01:04:23,839 --> 01:04:26,480
architecture, even if it's free, it's
1312
01:04:26,480 --> 01:04:28,799
not cheap enough. And the reason for
1313
01:04:28,799 --> 01:04:30,960
that is because no matter what happens,
1314
01:04:30,960 --> 01:04:33,440
you still have to build a gigawatt data
1315
01:04:33,440 --> 01:04:35,520
center. You still have to build a
1316
01:04:35,520 --> 01:04:37,440
gigawatt factory. And that gigawatt
1317
01:04:37,440 --> 01:04:40,400
factory, amortized across 15 years,
1318
01:04:40,400 --> 01:04:42,480
is about $40
1319
01:04:42,480 --> 01:04:45,039
billion. Even when you put nothing on
1320
01:04:45,039 --> 01:04:47,520
it, it's $40 billion in. You better make
1321
01:04:47,520 --> 01:04:49,680
for darn sure you put the best computer
1322
01:04:49,680 --> 01:04:51,599
system on that thing so that you could
1323
01:04:51,599 --> 01:04:55,039
have the best token cost. Nvidia's token
1324
01:04:55,039 --> 01:04:57,599
cost is world class.
1325
01:04:57,599 --> 01:04:59,839
basically untouchable at the moment. And
1326
01:04:59,839 --> 01:05:02,079
the reason that's true is because of
1327
01:05:02,079 --> 01:05:04,400
extreme co-design. And so I'm very
1328
01:05:04,400 --> 01:05:08,920
happy that he named us.
1329
01:05:25,119 --> 01:05:27,599
There was a monkey king,
1330
01:05:27,599 --> 01:05:31,000
token king.
1331
01:05:33,039 --> 01:05:35,359
Well, we take all of our
1332
01:05:35,359 --> 01:05:36,799
software — as I told you, we
1333
01:05:36,799 --> 01:05:37,920
vertically integrate, but we
1334
01:05:37,920 --> 01:05:40,240
horizontally open. Vertical
1335
01:05:40,240 --> 01:05:42,000
integration, horizontal openness. We
1336
01:05:42,000 --> 01:05:43,599
integrate all of our software and all of
1337
01:05:43,599 --> 01:05:45,359
our technology, however we could package
1338
01:05:45,359 --> 01:05:47,920
it up and integrate it into the world's
1339
01:05:47,920 --> 01:05:51,520
inference service providers. And these
1340
01:05:51,520 --> 01:05:54,640
these companies are growing so fast.
1341
01:05:54,640 --> 01:05:57,440
They're growing so fast. Fireworks. Lynn
1342
01:05:57,440 --> 01:06:00,319
is here together. They're just growing
1343
01:06:00,319 --> 01:06:02,960
so incredibly fast. A hundred times in
1344
01:06:02,960 --> 01:06:07,200
the last year. They are token factories.
1345
01:06:07,200 --> 01:06:09,760
And the effectiveness, the performance
1346
01:06:09,760 --> 01:06:12,400
and the token cost production capability
1347
01:06:12,400 --> 01:06:14,720
for their factories is everything to
1348
01:06:14,720 --> 01:06:17,359
them. And this is what happened.
1349
01:06:17,359 --> 01:06:21,440
We updated their software — same
1350
01:06:21,440 --> 01:06:23,200
system.
1351
01:06:23,200 --> 01:06:24,960
And notice
1352
01:06:24,960 --> 01:06:27,200
their token speeds.
1353
01:06:27,200 --> 01:06:30,960
Incredible. The difference: before
1354
01:06:30,960 --> 01:06:33,039
NVIDIA updated everything — all of our
1355
01:06:33,039 --> 01:06:34,720
algorithms and software and all the
1356
01:06:34,720 --> 01:06:37,359
technology that we bring to bear —
1357
01:06:37,359 --> 01:06:41,760
about 700 tokens per second average went
1358
01:06:41,760 --> 01:06:45,839
to nearly 5,000 — seven times higher. And so
1359
01:06:45,839 --> 01:06:49,359
this is the incredible power of extreme
1360
01:06:49,359 --> 01:06:51,599
co-design. I mentioned earlier the
1361
01:06:51,599 --> 01:06:54,000
importance of factories. This is the
1362
01:06:54,000 --> 01:06:55,839
importance of the factory. Your data center,
1363
01:06:55,839 --> 01:06:57,839
it used to be a data center for files.
1364
01:06:57,839 --> 01:07:01,359
It's now a factory to generate tokens.
1365
01:07:01,359 --> 01:07:03,520
Your factory is limited no matter what.
1366
01:07:03,520 --> 01:07:05,440
Everybody's looking for land, power, and
1367
01:07:05,440 --> 01:07:08,079
shell. Once you build it, you are power
1368
01:07:08,079 --> 01:07:10,559
limited. Within that power-limited
1369
01:07:10,559 --> 01:07:13,039
infrastructure, you'd better make for darn
1370
01:07:13,039 --> 01:07:15,119
sure — because you know
1371
01:07:15,119 --> 01:07:17,280
inference is your workload,
1372
01:07:17,280 --> 01:07:19,680
tokens are your new commodity, and that
1373
01:07:19,680 --> 01:07:22,480
compute is your revenues — that the
1374
01:07:22,480 --> 01:07:25,039
architecture is as
1375
01:07:25,039 --> 01:07:27,839
optimized as it can be. In the future,
1376
01:07:27,839 --> 01:07:30,880
every single CSP,
1377
01:07:30,880 --> 01:07:32,880
every single computer company, every
1378
01:07:32,880 --> 01:07:34,880
single cloud company, every single AI
1379
01:07:34,880 --> 01:07:37,520
company, every single
1380
01:07:37,520 --> 01:07:40,000
company, period, is going to be thinking
1381
01:07:40,000 --> 01:07:42,960
about their token factory effectiveness.
1382
01:07:42,960 --> 01:07:45,599
This is your factory in the future. And
1383
01:07:45,599 --> 01:07:47,440
the reason why I know that is because
1384
01:07:47,440 --> 01:07:49,440
everybody in this room is powered by
1385
01:07:49,440 --> 01:07:51,839
intelligence. And in the future, that
1386
01:07:51,839 --> 01:07:53,520
intelligence will be augmented by
1387
01:07:53,520 --> 01:07:56,880
tokens. So, let me show you how we got
1388
01:07:56,880 --> 01:07:59,880
here.
1389
01:07:59,920 --> 01:08:04,000
On April 6th, 2016, a decade ago, we
1390
01:08:04,000 --> 01:08:06,240
introduced DGX-1,
1391
01:08:06,240 --> 01:08:08,240
the world's first computer designed for
1392
01:08:08,240 --> 01:08:10,720
deep learning.
1393
01:08:10,720 --> 01:08:13,359
Eight Pascal GPUs connected with the
1394
01:08:13,359 --> 01:08:16,080
first generation NVLink.
1395
01:08:16,080 --> 01:08:18,799
170 teraflops in one computer. The
1396
01:08:18,799 --> 01:08:21,600
world's first computer designed for AI
1397
01:08:21,600 --> 01:08:24,600
researchers.
1398
01:08:24,640 --> 01:08:27,359
With Volta, we introduced the NVLink
1399
01:08:27,359 --> 01:08:30,319
switch. 16 GPUs connected with full
1400
01:08:30,319 --> 01:08:32,880
all-to-all bandwidth, operating as one
1401
01:08:32,880 --> 01:08:36,239
giant GPU. A giant step forward, but
1402
01:08:36,239 --> 01:08:39,759
model sizes continued to grow. The data
1403
01:08:39,759 --> 01:08:42,480
center needed to become a single unit of
1404
01:08:42,480 --> 01:08:47,799
computing. So, Mellanox joined Nvidia.
1405
01:08:49,440 --> 01:08:53,520
In 2020, the DGX A100 SuperPOD became the
1406
01:08:53,520 --> 01:08:56,480
first GPU supercomputer combining scale-up
1407
01:08:56,480 --> 01:08:59,279
and scale-out architecture.
1408
01:08:59,279 --> 01:09:02,719
NVLink 3 for scale-up, ConnectX-6 and
1409
01:09:02,719 --> 01:09:07,319
Quantum InfiniBand for scale-out.
1410
01:09:08,159 --> 01:09:11,279
Then Hopper, the first GPU with the FP8
1411
01:09:11,279 --> 01:09:13,279
Transformer engine that launched the
1412
01:09:13,279 --> 01:09:18,319
generative AI era. NVLink 4, ConnectX-7,
1413
01:09:18,319 --> 01:09:21,600
BlueField-3 DPUs, second-generation
1414
01:09:21,600 --> 01:09:24,400
Quantum InfiniBand. It revolutionized
1415
01:09:24,400 --> 01:09:27,400
computing.
1416
01:09:29,359 --> 01:09:31,759
Blackwell redefined AI supercomputing
1417
01:09:31,759 --> 01:09:35,759
system architecture with NVLink 72. 72
1418
01:09:35,759 --> 01:09:39,279
GPUs connected by an NVLink spine, 130
1419
01:09:39,279 --> 01:09:41,359
terabytes per second of all-to-all
1420
01:09:41,359 --> 01:09:43,199
bandwidth.
1421
01:09:43,199 --> 01:09:45,600
Compute trays integrate Blackwell GPUs,
1422
01:09:45,600 --> 01:09:51,839
Grace CPUs, ConnectX-8, and BlueField-3.
1423
01:09:51,839 --> 01:09:54,880
Scale-out runs over Spectrum-4 Ethernet.
1424
01:09:54,880 --> 01:09:57,280
With three scaling laws at full steam,
1425
01:09:57,280 --> 01:09:59,360
pre-training, post-training, and
1426
01:09:59,360 --> 01:10:02,080
inference, and now Agentic systems,
1427
01:10:02,080 --> 01:10:04,080
compute demand continues to grow
1428
01:10:04,080 --> 01:10:07,080
exponentially.
1429
01:10:07,840 --> 01:10:10,640
And now Vera Rubin
1430
01:10:10,640 --> 01:10:12,960
architected for every phase of Agentic
1431
01:10:12,960 --> 01:10:16,239
AI, advancing every pillar of computing,
1432
01:10:16,239 --> 01:10:20,000
including CPU, storage, networking, and
1433
01:10:20,000 --> 01:10:21,920
security.
1434
01:10:21,920 --> 01:10:26,560
Vera Rubin NVLink 72: 3.6 exaflops of
1435
01:10:26,560 --> 01:10:30,880
compute, 260 terabytes per second of all-to-all
1436
01:10:30,880 --> 01:10:33,440
NVLink bandwidth, the engine
1437
01:10:33,440 --> 01:10:36,400
supercharging the era of Agentic AI. The
1438
01:10:36,400 --> 01:10:40,159
Vera CPU rack, designed for orchestration
1439
01:10:40,159 --> 01:10:44,159
and agentic workflows. The STX rack:
1440
01:10:44,159 --> 01:10:47,199
AI-native storage built with BlueField-4.
1441
01:10:47,199 --> 01:10:49,440
Scale-out with Spectrum-X co-packaged
1442
01:10:49,440 --> 01:10:52,320
optics, increasing energy efficiency and
1443
01:10:52,320 --> 01:10:54,960
resiliency. And an incredible new
1444
01:10:54,960 --> 01:10:58,800
addition, the Groq 3 LPX rack. Tightly
1445
01:10:58,800 --> 01:11:01,520
connected to Vera Rubin, Groq's LPUs
1446
01:11:01,520 --> 01:11:04,159
bring massive on-chip SRAM: a token
1447
01:11:04,159 --> 01:11:06,400
accelerator to the already incredibly
1448
01:11:06,400 --> 01:11:10,480
fast Vera Rubin. Together, 35 times more
1449
01:11:10,480 --> 01:11:13,600
throughput per megawatt. The new Vera
1450
01:11:13,600 --> 01:11:16,960
Rubin platform: seven chips, five
1451
01:11:16,960 --> 01:11:19,760
rack-scale computers, one revolutionary AI
1452
01:11:19,760 --> 01:11:23,120
supercomputer for agentic AI.
1453
01:11:23,120 --> 01:11:26,480
40 million times more compute in just 10
1454
01:11:26,480 --> 01:11:29,480
years.
1455
01:11:39,679 --> 01:11:42,960
Now, in the good old days, when I
1456
01:11:42,960 --> 01:11:45,120
would say Hopper, I would hold up a
1457
01:11:45,120 --> 01:11:48,000
chip.
1458
01:11:48,000 --> 01:11:51,480
That's just adorable.
1459
01:11:51,600 --> 01:11:57,560
This is Vera Rubin.
1460
01:11:59,920 --> 01:12:02,400
When we think Vera Rubin, we
1461
01:12:02,400 --> 01:12:05,040
think the entire system vertically
1462
01:12:05,040 --> 01:12:06,640
integrated
1463
01:12:06,640 --> 01:12:08,880
completely with software
1464
01:12:08,880 --> 01:12:12,480
extended end to end, optimized as one
1465
01:12:12,480 --> 01:12:14,400
giant system. The reason why it's
1466
01:12:14,400 --> 01:12:16,000
designed for agentic systems is very
1467
01:12:16,000 --> 01:12:18,800
clear, because for agents, of course, the most
1468
01:12:18,800 --> 01:12:21,920
important workload is thinking: the
1469
01:12:21,920 --> 01:12:23,679
large language model. The large language
1470
01:12:23,679 --> 01:12:25,360
models are going to get larger and
1471
01:12:25,360 --> 01:12:27,040
larger. It's going to generate more
1472
01:12:27,040 --> 01:12:28,960
and more tokens, more quickly, so it can
1473
01:12:28,960 --> 01:12:31,840
think more quickly. But it also has to
1474
01:12:31,840 --> 01:12:34,239
access memory. It's going to pound on
1475
01:12:34,239 --> 01:12:38,560
memory really hard: KV cache, structured
1476
01:12:38,560 --> 01:12:42,320
data, unstructured data. It's
1477
01:12:42,320 --> 01:12:44,719
going to be pounding on the
1478
01:12:44,719 --> 01:12:46,800
storage system really hard, which
1479
01:12:46,800 --> 01:12:49,040
is the reason why we reinvented the
1480
01:12:49,040 --> 01:12:52,400
storage system. It is also going to use
1481
01:12:52,400 --> 01:12:56,320
tools. And unlike humans, who are more
1482
01:12:56,320 --> 01:12:59,679
tolerant of slower computers,
1483
01:12:59,679 --> 01:13:02,560
AI wants the tools to be as fast as
1484
01:13:02,560 --> 01:13:06,080
possible. These tools are web browsers; in
1485
01:13:06,080 --> 01:13:08,159
the future, they could also be virtual
1486
01:13:08,159 --> 01:13:11,520
PCs in the cloud. Those PCs and
1487
01:13:11,520 --> 01:13:13,280
those computers have to be as fast
1488
01:13:13,280 --> 01:13:17,520
as possible. We created a brand new CPU.
1489
01:13:17,520 --> 01:13:20,640
A brand new CPU that's designed for
1490
01:13:20,640 --> 01:13:22,640
extremely high single-threaded
1491
01:13:22,640 --> 01:13:24,960
performance,
1492
01:13:24,960 --> 01:13:26,480
incredibly
1493
01:13:26,480 --> 01:13:29,520
high data output, incredibly good at
1494
01:13:29,520 --> 01:13:33,199
data processing, and extreme energy
1495
01:13:33,199 --> 01:13:36,320
efficiency. It is the only data center
1496
01:13:36,320 --> 01:13:40,239
CPU in the world that uses LPDDR5,
1497
01:13:40,239 --> 01:13:44,159
LPDDR5, and incredible single-thread
1498
01:13:44,159 --> 01:13:46,400
performance and performance per watt
1499
01:13:46,400 --> 01:13:48,400
that is unrivaled.
1500
01:13:48,400 --> 01:13:51,120
And so we built that so that it
1501
01:13:51,120 --> 01:13:53,040
could go along with the rest of these
1502
01:13:53,040 --> 01:13:57,280
racks for agentic processing. And so
1503
01:13:57,280 --> 01:14:01,199
here it is. This is the Grace Blackwell.
1504
01:14:01,199 --> 01:14:03,199
Oh no, Vera Rubin. Where is it? Here it
1505
01:14:03,199 --> 01:14:05,440
is. Okay, so this is the Vera Rubin
1506
01:14:05,440 --> 01:14:08,719
system. Notice, since the last time: 100%
1507
01:14:08,719 --> 01:14:12,159
liquid-cooled. All of the cables are gone.
1508
01:14:12,159 --> 01:14:16,239
What used to take
1509
01:14:16,239 --> 01:14:20,480
two days to install now takes two hours.
1510
01:14:20,480 --> 01:14:22,640
Incredible. And so the manufacturing
1511
01:14:22,640 --> 01:14:25,040
cycle time is going to be dramatically reduced.
1512
01:14:25,040 --> 01:14:27,600
This is also a supercomputer that is
1513
01:14:27,600 --> 01:14:32,960
cooled by hot water, 45°C,
1514
01:14:32,960 --> 01:14:34,800
which takes the pressure off of the data
1515
01:14:34,800 --> 01:14:37,360
center, takes all of that cost and all of
1516
01:14:37,360 --> 01:14:39,440
that energy that's used to cool the data
1517
01:14:39,440 --> 01:14:42,000
center and makes it available for the
1518
01:14:42,000 --> 01:14:46,800
system. This is the secret sauce. We're
1519
01:14:46,800 --> 01:14:48,239
the only company in the
1520
01:14:48,239 --> 01:14:51,600
world that has today built the
1521
01:14:51,600 --> 01:14:54,640
sixth-generation scale-up switching
1522
01:14:54,640 --> 01:14:57,280
system. This is not Ethernet. This is
1523
01:14:57,280 --> 01:15:00,159
not InfiniBand. This is NVLink. This is
1524
01:15:00,159 --> 01:15:02,719
the sixth-generation NVLink. This is
1525
01:15:02,719 --> 01:15:05,040
insanely hard to do. Well, it is
1526
01:15:05,040 --> 01:15:07,040
insanely hard to do. Period. And I'm
1527
01:15:07,040 --> 01:15:09,199
just super proud of the team. NVLink,
1528
01:15:09,199 --> 01:15:12,880
completely cooled. This is the brand new
1529
01:15:12,880 --> 01:15:14,480
Groq system. And I'll show you a little
1530
01:15:14,480 --> 01:15:17,600
bit more about this system.
1531
01:15:17,600 --> 01:15:21,120
Eight Groq chips. This is the LPU 3.0. The
1532
01:15:21,120 --> 01:15:22,880
world's never seen it. Anything that the
1533
01:15:22,880 --> 01:15:25,760
world has ever seen is a v1. This is third
1534
01:15:25,760 --> 01:15:27,280
generation.
1535
01:15:27,280 --> 01:15:30,560
And we're in volume production now. And
1536
01:15:30,560 --> 01:15:32,239
I'll show you more about that in just a
1537
01:15:32,239 --> 01:15:35,520
second. The world's first
1538
01:15:35,520 --> 01:15:37,440
CPO
1539
01:15:37,440 --> 01:15:41,280
Spectrum-X switch. This is also in full
1540
01:15:41,280 --> 01:15:45,360
production. Co-packaged optics. Optics
1541
01:15:45,360 --> 01:15:47,760
comes directly onto this chip,
1542
01:15:47,760 --> 01:15:50,159
interfaces directly to silicon.
1543
01:15:50,159 --> 01:15:53,679
Electrons get translated to photons, and
1544
01:15:53,679 --> 01:15:56,080
they get connected directly to
1545
01:15:56,080 --> 01:15:58,159
this chip. We invented the process
1546
01:15:58,159 --> 01:16:00,480
technology with TSMC. We're the only one
1547
01:16:00,480 --> 01:16:02,320
in production with it today. It's called
1548
01:16:02,320 --> 01:16:04,960
COUPE. It's completely revolutionary.
1549
01:16:04,960 --> 01:16:07,120
Nvidia is in full production with
1550
01:16:07,120 --> 01:16:10,679
Spectrum-X.
1551
01:16:12,400 --> 01:16:14,719
This is the Vera system. Twice the
1552
01:16:14,719 --> 01:16:17,920
performance per watt of any CPU in
1553
01:16:17,920 --> 01:16:20,159
the world today. It is also in
1554
01:16:20,159 --> 01:16:23,040
production. Well, you know, we never
1555
01:16:23,040 --> 01:16:26,239
never thought we would be selling CPUs
1556
01:16:26,239 --> 01:16:29,440
standalone. Um, we are selling a lot of
1557
01:16:29,440 --> 01:16:32,000
CPUs standalone. This is already for sure
1558
01:16:32,000 --> 01:16:33,280
going to be a multi-billion dollar
1559
01:16:33,280 --> 01:16:34,960
business for us. So, I'm very very
1560
01:16:34,960 --> 01:16:37,040
pleased with our CPU architects. We
1561
01:16:37,040 --> 01:16:40,800
designed a revolutionary CPU. And this is
1562
01:16:40,800 --> 01:16:42,640
the CX-9,
1563
01:16:42,640 --> 01:16:46,960
powered with the Vera CPU, the BlueField-4,
1564
01:16:46,960 --> 01:16:49,840
STX, our new storage platform. Okay, so
1565
01:16:49,840 --> 01:16:52,800
these are the four
1566
01:16:52,800 --> 01:16:56,159
racks, and it's connected to
1567
01:16:56,159 --> 01:16:58,800
each one of these racks: the NVLink
1568
01:16:58,800 --> 01:17:00,960
rack.
1569
01:17:00,960 --> 01:17:03,840
I've shown you guys this before.
1570
01:17:03,840 --> 01:17:05,280
It's super heavy, and it seems to get
1571
01:17:05,280 --> 01:17:08,159
heavier every year.
1572
01:17:08,159 --> 01:17:09,679
because I think there's just more cables
1573
01:17:09,679 --> 01:17:11,760
in there every year. And so this is
1574
01:17:11,760 --> 01:17:14,880
the NVLink rack. We've also taken this
1575
01:17:14,880 --> 01:17:19,040
technology, because it is so
1576
01:17:19,040 --> 01:17:22,159
efficient to create a data center with
1577
01:17:22,159 --> 01:17:24,239
these cabling systems, structured
1578
01:17:24,239 --> 01:17:26,480
cables. So, we decided to do that for
1579
01:17:26,480 --> 01:17:31,840
Ethernet. So, this is Ethernet, 256
1580
01:17:31,840 --> 01:17:35,360
liquid-cooled nodes in one rack. And it
1581
01:17:35,360 --> 01:17:39,840
is also connected with these incredible
1582
01:17:39,840 --> 01:17:42,840
connectors.
1583
01:17:43,199 --> 01:17:46,159
You guys want to see um
1584
01:17:46,159 --> 01:17:49,480
Rubin Ultra.
1585
01:18:01,920 --> 01:18:03,920
So this is the Rubin Ultra compute
1586
01:18:03,920 --> 01:18:08,640
node. Unlike Rubin, which slides in
1587
01:18:08,640 --> 01:18:12,239
horizontally, Rubin Ultra goes into a
1588
01:18:12,239 --> 01:18:15,280
whole new rack. It's called Kyber, and it
1589
01:18:15,280 --> 01:18:20,800
enables us to connect 144 GPUs in one
1590
01:18:20,800 --> 01:18:24,560
NVLink domain. And so the Kyber rack,
1591
01:18:24,560 --> 01:18:28,719
this, I could lift it, I'm sure, but I
1592
01:18:28,719 --> 01:18:30,640
won't.
1593
01:18:30,640 --> 01:18:33,760
It's quite heavy. This is one
1594
01:18:33,760 --> 01:18:35,920
compute node, and it slides into the
1595
01:18:35,920 --> 01:18:38,719
Kyber rack vertically.
1596
01:18:38,719 --> 01:18:41,120
This is where it connects into. This is
1597
01:18:41,120 --> 01:18:44,800
the midplane. In the Kyber rack, those
1598
01:18:44,800 --> 01:18:48,719
four top NVLink connectors slide in and
1599
01:18:48,719 --> 01:18:52,000
connect into this. And this becomes one
1600
01:18:52,000 --> 01:18:54,000
of the nodes.
1601
01:18:54,000 --> 01:18:55,760
And each one of these racks is a
1602
01:18:55,760 --> 01:18:57,840
different compute node. And this is the
1603
01:18:57,840 --> 01:19:01,760
amazing part. This is the midplane.
1604
01:19:01,760 --> 01:19:04,080
And the back of the midplane, instead of
1605
01:19:04,080 --> 01:19:06,560
the cabling system,
1606
01:19:06,560 --> 01:19:08,960
which has its limits in terms of how far
1607
01:19:08,960 --> 01:19:11,840
we could drive cables, copper cables, we
1608
01:19:11,840 --> 01:19:15,440
now have this system to connect 144
1609
01:19:15,440 --> 01:19:20,719
GPUs. This is the new MVLink. This sits
1610
01:19:20,719 --> 01:19:24,400
also vertically and it con connects into
1611
01:19:24,400 --> 01:19:27,440
the midplanes on the back. Compute in
1612
01:19:27,440 --> 01:19:31,520
the front, MVLink switches in the back.
1613
01:19:31,520 --> 01:19:37,440
One giant computer. Okay. So that is
1614
01:19:37,440 --> 01:19:40,920
Reuben Ultra
1615
01:19:46,080 --> 01:19:50,520
as I mentioned. as I mentioned.
1616
01:19:51,360 --> 01:19:54,560
How about we t take this back down?
1617
01:19:54,560 --> 01:19:56,960
I need the rest of my slides.
1618
01:19:56,960 --> 01:19:59,199
>> Oh, it's coming down. Okay. Thank you,
1619
01:19:59,199 --> 01:20:01,679
Janine.
1620
01:20:01,679 --> 01:20:04,320
This is what happens when you This is
1621
01:20:04,320 --> 01:20:08,280
what happens when you don't practice.
1622
01:20:11,760 --> 01:20:17,040
Okay. All right. So, um you saw you
1623
01:20:17,040 --> 01:20:20,400
Take your time. Just don't get hurt.
1624
01:20:20,400 --> 01:20:23,600
You saw you saw this slide. You know,
1625
01:20:23,600 --> 01:20:25,600
only at Nvidia's keynote will you see
1626
01:20:25,600 --> 01:20:28,640
last year's slide presented again. And
1627
01:20:28,640 --> 01:20:30,159
the reason for that is I just want to
1628
01:20:30,159 --> 01:20:31,840
let you know that last year I told you
1629
01:20:31,840 --> 01:20:33,760
something very, very important. And it's
1630
01:20:33,760 --> 01:20:35,280
so important. It's worthwhile to tell
1631
01:20:35,280 --> 01:20:37,040
you again.
1632
01:20:37,040 --> 01:20:39,040
This is probably the single most
1633
01:20:39,040 --> 01:20:41,679
important chart for the future of AI
1634
01:20:41,679 --> 01:20:44,320
factories. And every CEO, every CEO in
1635
01:20:44,320 --> 01:20:46,080
the world will be tracking it. We'll be
1636
01:20:46,080 --> 01:20:49,040
studying it very deeply. It's much much
1637
01:20:49,040 --> 01:20:50,320
more complicated than this. It's
1638
01:20:50,320 --> 01:20:51,920
multi-dimensional.
1639
01:20:51,920 --> 01:20:55,040
But you will be studying the throughput
1640
01:20:55,040 --> 01:20:58,159
and this token speed of your AI
1641
01:20:58,159 --> 01:21:00,800
factories. The throughput, token speed
1642
01:21:00,800 --> 01:21:03,600
at ISO power because that's all the
1643
01:21:03,600 --> 01:21:06,400
power you have. Throughput and token
1644
01:21:06,400 --> 01:21:09,840
speed for your factories forever. And
1645
01:21:09,840 --> 01:21:12,800
that that analysis is going to lead
1646
01:21:12,800 --> 01:21:15,520
directly to your revenues. What you do
1647
01:21:15,520 --> 01:21:18,640
this year will show up precisely next
1648
01:21:18,640 --> 01:21:21,600
year as your revenues. And this chart is
1649
01:21:21,600 --> 01:21:23,600
what it's all about. And I said on the
1650
01:21:23,600 --> 01:21:25,920
vertical axis, on the vertical axis,
1651
01:21:25,920 --> 01:21:28,719
thank you guys. On the vertical axis is
1652
01:21:28,719 --> 01:21:31,199
throughput. On the horizontal axis is
1653
01:21:31,199 --> 01:21:33,120
token rate. Today I'm going to show you
1654
01:21:33,120 --> 01:21:35,520
this
1655
01:21:35,520 --> 01:21:37,679
because we're able because we're now
1656
01:21:37,679 --> 01:21:40,880
able to increase the token speed and
1657
01:21:40,880 --> 01:21:43,600
because model sizes are increasing
1658
01:21:43,600 --> 01:21:46,080
because the token length the context
1659
01:21:46,080 --> 01:21:48,640
length depending on the different grades
1660
01:21:48,640 --> 01:21:51,360
of a different application use case
1661
01:21:51,360 --> 01:21:54,239
continues to grow from maybe a 100,000
1662
01:21:54,239 --> 01:21:58,719
tokens input length to maybe millions.
1663
01:21:58,719 --> 01:22:01,600
the token input length is growing and
1664
01:22:01,600 --> 01:22:03,920
also the output token length is growing.
1665
01:22:03,920 --> 01:22:09,440
And so all of these play into ultimately
1666
01:22:09,440 --> 01:22:12,320
the marketing and the pricing of future
1667
01:22:12,320 --> 01:22:15,840
tokens. Tokens are the new commodity and
1668
01:22:15,840 --> 01:22:18,080
like all commodities once it reaches an
1669
01:22:18,080 --> 01:22:20,320
inflection once it becomes mature or
1670
01:22:20,320 --> 01:22:22,719
becomes maturing it will segment into
1671
01:22:22,719 --> 01:22:27,199
different parts. The high throughput
1672
01:22:27,199 --> 01:22:29,760
low speed could be used for the free
1673
01:22:29,760 --> 01:22:32,400
tier. The next tier could be the medium
1674
01:22:32,400 --> 01:22:35,360
tier. Larger model, maybe higher speed
1675
01:22:35,360 --> 01:22:39,840
for sure, larger input context length.
1676
01:22:39,840 --> 01:22:42,000
That translates to a different price
1677
01:22:42,000 --> 01:22:44,239
point. You could see from all the
1678
01:22:44,239 --> 01:22:46,080
different services, this one is free.
1679
01:22:46,080 --> 01:22:47,840
It's a free tier. The first tier could
1680
01:22:47,840 --> 01:22:50,560
be $3 per million tokens. The next tier
1681
01:22:50,560 --> 01:22:53,199
could be $6 per million tokens. You
1682
01:22:53,199 --> 01:22:54,800
would like to be able to keep pushing
1683
01:22:54,800 --> 01:22:57,600
this boundary because the larger the
1684
01:22:57,600 --> 01:23:01,040
model smarter, the more input token
1685
01:23:01,040 --> 01:23:04,480
context length, more relevant, the
1686
01:23:04,480 --> 01:23:07,360
higher the speed, the long the more you
1687
01:23:07,360 --> 01:23:10,080
can think and iterate smarter AI models.
1688
01:23:10,080 --> 01:23:12,800
So this is about smarter AI models. And
1689
01:23:12,800 --> 01:23:15,120
when you have smarter AI models, each
1690
01:23:15,120 --> 01:23:17,120
one of these clicks allows you to
1691
01:23:17,120 --> 01:23:20,000
increase the price. So this is $45. And
1692
01:23:20,000 --> 01:23:22,000
maybe one day there'll be a premium
1693
01:23:22,000 --> 01:23:24,880
model that allows you a premium service
1694
01:23:24,880 --> 01:23:28,080
that allows you to generate token speeds
1695
01:23:28,080 --> 01:23:30,239
that are incredibly high because you're
1696
01:23:30,239 --> 01:23:32,239
in a critical path or maybe you're doing
1697
01:23:32,239 --> 01:23:35,520
really long research and $150 per
1698
01:23:35,520 --> 01:23:38,159
million tokens is just not a thing. So
1699
01:23:38,159 --> 01:23:40,960
let's translate that. Suppose you were
1700
01:23:40,960 --> 01:23:43,199
to use 50 million tokens per day as a
1701
01:23:43,199 --> 01:23:46,880
researcher at $150 per million tokens.
1702
01:23:46,880 --> 01:23:48,960
As it turns out, as a research team,
1703
01:23:48,960 --> 01:23:51,440
that's not even a thing. So, we believe
1704
01:23:51,440 --> 01:23:53,679
that this is the future. This is where
1705
01:23:53,679 --> 01:23:55,920
AI wants to go. This is where it is
1706
01:23:55,920 --> 01:23:57,840
today.
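The premium-tier arithmetic in that example checks out as a quick back-of-the-envelope calculation (the 50-million-token daily usage is the speaker's hypothetical; only the rates come from the talk):

```python
# Daily cost for the hypothetical heavy research user described above:
# 50 million tokens per day at the $150-per-million-token premium tier.
tokens_per_day = 50_000_000
price_per_million_usd = 150
daily_cost_usd = tokens_per_day / 1_000_000 * price_per_million_usd
print(daily_cost_usd)  # 7500.0 (i.e., $7,500 per day)
```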
1707
01:23:57,840 --> 01:24:00,320
It had to start here to establish the
1708
01:24:00,320 --> 01:24:02,480
value and establish it usefulness and
1709
01:24:02,480 --> 01:24:04,400
get better and better and better. In the
1710
01:24:04,400 --> 01:24:05,679
future, you're going to see most
1711
01:24:05,679 --> 01:24:08,080
services encompass all of
1712
01:24:08,080 --> 01:24:10,719
that. This is Hopper.
1713
01:24:10,719 --> 01:24:12,960
Hopper started here, and I moved the
1714
01:24:12,960 --> 01:24:16,080
chart. This is 50. This is 100. Hopper
1715
01:24:16,080 --> 01:24:17,600
looks like this. And you would have
1716
01:24:17,600 --> 01:24:20,080
expected the next generation after Hopper to
1717
01:24:20,080 --> 01:24:22,000
be higher, but nobody would have
1718
01:24:22,000 --> 01:24:24,239
expected it to be that much higher. This
1719
01:24:24,239 --> 01:24:26,560
is Grace Blackwell. What Grace Blackwell
1720
01:24:26,560 --> 01:24:29,280
did is, at your free tier, increase your
1721
01:24:29,280 --> 01:24:31,600
throughput tremendously.
1722
01:24:31,600 --> 01:24:33,760
However,
1723
01:24:33,760 --> 01:24:36,960
where you mostly monetize your service,
1724
01:24:36,960 --> 01:24:39,199
it increased your throughput by 35
1725
01:24:39,199 --> 01:24:41,840
times. This is no different than any
1726
01:24:41,840 --> 01:24:44,239
product that every company makes. The
1727
01:24:44,239 --> 01:24:46,639
higher the tier, the higher the quality,
1728
01:24:46,639 --> 01:24:49,040
the higher the performance, the lower
1729
01:24:49,040 --> 01:24:51,600
the volume, the lower the capacity. And
1730
01:24:51,600 --> 01:24:53,600
so it is no different than any other
1731
01:24:53,600 --> 01:24:56,159
business in the world. And so now we're
1732
01:24:56,159 --> 01:25:00,639
able to increase this tier by 35x.
1733
01:25:00,639 --> 01:25:04,880
And we introduced a whole new tier.
1734
01:25:04,880 --> 01:25:06,960
This is the benefit of Grace
1735
01:25:06,960 --> 01:25:10,639
Blackwell. A huge jump over Hopper.
1736
01:25:10,639 --> 01:25:14,679
Well, this is what we're doing with
1737
01:25:15,920 --> 01:25:19,120
Okay. So, this is Grace Blackwell. Okay.
1738
01:25:19,120 --> 01:25:22,080
Let me just reset this.
1739
01:25:22,080 --> 01:25:25,760
And this is Vera Rubin.
1740
01:25:25,760 --> 01:25:28,760
Okay.
1741
01:25:31,840 --> 01:25:33,520
Now, just think what just
1742
01:25:33,520 --> 01:25:36,639
happened at every single tier. At every
1743
01:25:36,639 --> 01:25:38,880
single tier, we
1744
01:25:38,880 --> 01:25:41,520
increased the throughput. And at the tier
1745
01:25:41,520 --> 01:25:43,920
where your highest ASP and your
1746
01:25:43,920 --> 01:25:46,960
most valuable segment are, we increased it
1747
01:25:46,960 --> 01:25:48,960
by 10x.
1748
01:25:48,960 --> 01:25:51,920
That is the hard work. This is
1749
01:25:51,920 --> 01:25:54,159
incredibly hard to do out here. This is
1750
01:25:54,159 --> 01:25:56,239
the benefit of NVLink 72. This is the
1751
01:25:56,239 --> 01:25:58,480
benefit of extremely low latency. This
1752
01:25:58,480 --> 01:26:00,480
is the benefit of extreme co-design
1753
01:26:00,480 --> 01:26:03,280
that we could shift the entire area up.
1754
01:26:03,280 --> 01:26:04,880
Now, what does it mean from a customer
1755
01:26:04,880 --> 01:26:06,960
perspective in the end? Suppose I were
1756
01:26:06,960 --> 01:26:09,280
to take all of that and I just, you
1757
01:26:09,280 --> 01:26:11,760
know, multiply it out. Suppose I took
1758
01:26:11,760 --> 01:26:14,239
25% of my power and used it in the free tier,
1759
01:26:14,239 --> 01:26:16,880
25% of my power in the medium tier, 25%
1760
01:26:16,880 --> 01:26:18,800
of my power in the high tier and 25% of
1761
01:26:18,800 --> 01:26:21,440
my power in the premium tier. My data
1762
01:26:21,440 --> 01:26:23,840
center only has a gigawatt.
1763
01:26:23,840 --> 01:26:26,159
And so I get to decide how I want to
1764
01:26:26,159 --> 01:26:28,080
distribute it. The free tier allows me to
1765
01:26:28,080 --> 01:26:30,239
attract more customers.
1766
01:26:30,239 --> 01:26:32,480
This allows me to serve my most valuable
1767
01:26:32,480 --> 01:26:34,080
customers.
1768
01:26:34,080 --> 01:26:36,480
And the combination, the product of all
1769
01:26:36,480 --> 01:26:39,920
of that, basically the revenues you can
1770
01:26:39,920 --> 01:26:42,159
generate, assuming this
1771
01:26:42,159 --> 01:26:45,120
simplistic example, allows
1772
01:26:45,120 --> 01:26:48,960
Blackwell to generate five times more
1773
01:26:48,960 --> 01:26:51,199
revenues,
1774
01:26:51,199 --> 01:26:56,280
and Vera Rubin five times more than that. Yeah.
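The power-allocation idea in this passage can be sketched as a toy model. Assume an even four-way power split as in the talk; the per-tier throughput and price figures below are invented for illustration and are not NVIDIA's numbers:

```python
# Hypothetical revenue model for a power-limited token factory.
# The even 25%/25%/25%/25% power split mirrors the example above;
# the per-tier throughput (tokens/sec per MW) and prices ($/million
# tokens) are invented placeholders.
tiers = {
    "free":    {"tokens_per_sec_per_mw": 8000, "usd_per_million": 0},
    "medium":  {"tokens_per_sec_per_mw": 4000, "usd_per_million": 3},
    "high":    {"tokens_per_sec_per_mw": 2000, "usd_per_million": 6},
    "premium": {"tokens_per_sec_per_mw": 500,  "usd_per_million": 45},
}

def daily_revenue_usd(total_mw: float) -> float:
    """Revenue per day when the power budget is split evenly across tiers."""
    mw_per_tier = total_mw / len(tiers)
    seconds_per_day = 86_400
    revenue = 0.0
    for tier in tiers.values():
        tokens = tier["tokens_per_sec_per_mw"] * mw_per_tier * seconds_per_day
        revenue += tokens / 1_000_000 * tier["usd_per_million"]
    return revenue

# A 1-gigawatt data center is 1,000 MW of power budget.
print(daily_revenue_usd(1000))
```

The point of the sketch is the structure, not the numbers: at fixed power, raising throughput at the highest-priced tiers (what the talk attributes to Vera Rubin and to adding Groq) multiplies the revenue total directly.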
1775
01:27:00,080 --> 01:27:02,159
So if you're a Rubin customer, you should get
1776
01:27:02,159 --> 01:27:05,199
there as soon as you can. And the reason
1777
01:27:05,199 --> 01:27:07,600
for that is because your cost of
1778
01:27:07,600 --> 01:27:09,440
tokens goes down and your throughput
1779
01:27:09,440 --> 01:27:12,639
goes up now. But we want even more. We
1780
01:27:12,639 --> 01:27:14,239
want even more. And so let me just show
1781
01:27:14,239 --> 01:27:17,840
you. Back to this. As
1782
01:27:17,840 --> 01:27:20,159
I told you, this throughput requires a
1783
01:27:20,159 --> 01:27:24,080
ton of flops. This latency, this
1784
01:27:24,080 --> 01:27:26,320
interactivity requires an enormous amount
1785
01:27:26,320 --> 01:27:29,600
of bandwidth. Computers don't like
1786
01:27:29,600 --> 01:27:31,520
extreme amounts of flops and extreme amounts
1787
01:27:31,520 --> 01:27:32,880
of bandwidth because there's only so
1788
01:27:32,880 --> 01:27:35,920
much surface area for chips in any
1789
01:27:35,920 --> 01:27:38,880
system. And so optimizing for high
1790
01:27:38,880 --> 01:27:40,800
throughput and optimizing for low
1791
01:27:40,800 --> 01:27:43,760
latency are in fact enemies of each
1792
01:27:43,760 --> 01:27:46,560
other. And so this is what happened when
1793
01:27:46,560 --> 01:27:49,440
we combined with Groq. Okay. And so we
1794
01:27:49,440 --> 01:27:51,280
acquired the team that worked on the
1795
01:27:51,280 --> 01:27:53,199
Groq chips and licensed the technology,
1796
01:27:53,199 --> 01:27:55,120
and we've been working together now to
1797
01:27:55,120 --> 01:27:57,679
integrate the system. This is what that
1798
01:27:57,679 --> 01:28:01,760
looks like. So, at the most valuable tier,
1799
01:28:01,760 --> 01:28:03,920
we're now
1800
01:28:03,920 --> 01:28:06,719
going to increase performance by 35x.
1801
01:28:06,719 --> 01:28:10,159
Now, this very simple chart reveals to
1802
01:28:10,159 --> 01:28:14,639
you exactly the reason why Nvidia is so
1803
01:28:14,639 --> 01:28:17,280
strong in the vast majority of the
1804
01:28:17,280 --> 01:28:19,600
workloads so far. And the reason for
1805
01:28:19,600 --> 01:28:22,639
that is because up in this area
1806
01:28:22,639 --> 01:28:26,080
throughput matters so much. NVLink 72 is
1807
01:28:26,080 --> 01:28:28,639
so game-changing. It is exactly the right
1808
01:28:28,639 --> 01:28:31,760
architecture, and it's hard to beat
1809
01:28:31,760 --> 01:28:35,840
even as you add Groq to it. However,
1810
01:28:35,840 --> 01:28:38,800
if you extended this chart way out here
1811
01:28:38,800 --> 01:28:40,880
and you said you wanted to have services
1812
01:28:40,880 --> 01:28:43,679
that deliver not 400 tokens per second
1813
01:28:43,679 --> 01:28:46,000
but a thousand tokens per second, all of
1814
01:28:46,000 --> 01:28:48,880
a sudden NVLink 72 runs out of steam and
1815
01:28:48,880 --> 01:28:51,360
it simply can't get there. We just don't
1816
01:28:51,360 --> 01:28:53,120
have enough bandwidth. And so this is
1817
01:28:53,120 --> 01:28:55,440
where Groq comes in, and this is what
1818
01:28:55,440 --> 01:28:59,280
happens when we push that out. So it
1819
01:28:59,280 --> 01:29:04,440
goes out beyond... Thank you.
1820
01:29:05,440 --> 01:29:08,000
It goes out beyond even the limits of what
1821
01:29:08,000 --> 01:29:10,239
NVLink 72 can do. And if you were to do
1822
01:29:10,239 --> 01:29:14,880
that, translate that into revenues
1823
01:29:14,880 --> 01:29:19,199
relative to Blackwell, Vera Rubin is 5x.
1824
01:29:19,199 --> 01:29:21,280
If most of your workload is
1825
01:29:21,280 --> 01:29:24,080
high-throughput, I would stick with just 100%
1826
01:29:24,080 --> 01:29:27,199
Vera Rubin. If a lot of your workload
1827
01:29:27,199 --> 01:29:31,600
wants to be coding and very high-value
1828
01:29:31,600 --> 01:29:34,400
engineering token generation, I would
1829
01:29:34,400 --> 01:29:36,719
add Groq to it. I would add Groq to
1830
01:29:36,719 --> 01:29:39,199
maybe 25% of my total data center. The
1831
01:29:39,199 --> 01:29:41,760
rest of my data center is all 100% Vera
1832
01:29:41,760 --> 01:29:45,600
Rubin. And so that gives you a sense of
1833
01:29:45,600 --> 01:29:48,800
how you would add Groq to Vera Rubin
1834
01:29:48,800 --> 01:29:51,040
and extend its performance and extend
1835
01:29:51,040 --> 01:29:53,280
its value even more. This is what
1836
01:29:53,280 --> 01:29:55,600
happens.
1837
01:29:55,600 --> 01:29:58,000
This is a contrast. The reason why
1838
01:29:58,000 --> 01:29:59,600
Groq was so attractive
1839
01:29:59,600 --> 01:30:02,800
to me is because their computing system
1840
01:30:02,800 --> 01:30:05,920
is a deterministic data flow processor. It
1841
01:30:05,920 --> 01:30:09,440
is statically compiled. It is compiler
1842
01:30:09,440 --> 01:30:12,239
scheduled, meaning the compiler figures
1843
01:30:12,239 --> 01:30:15,360
out when to do the compute so that
1844
01:30:15,360 --> 01:30:17,199
the compute and the data arrive at
1845
01:30:17,199 --> 01:30:19,360
the same time. All of that is done
1846
01:30:19,360 --> 01:30:21,840
statically in advance
1847
01:30:21,840 --> 01:30:25,360
and scheduled completely in software.
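The statically compiled, compiler-scheduled execution described here can be sketched in a few lines. This is an illustration of the general idea only; the function names and structure are invented and are not Groq's actual toolchain.

```python
# Hypothetical sketch of compiler-scheduled (static) execution, as opposed
# to dynamic scheduling: every operation's time slot is fixed before the
# program runs, so compute and data meet deterministically at runtime.

def compile_schedule(ops):
    """Assign each op a fixed cycle, entirely 'at compile time'."""
    return [(cycle, op) for cycle, op in enumerate(ops)]

def run(schedule, x):
    """Execute the schedule in order; no runtime arbitration or queues."""
    for cycle, op in schedule:
        x = op(x)          # the operand is guaranteed ready at this cycle
    return x

ops = [lambda v: v * 2, lambda v: v + 3, lambda v: v * v]
schedule = compile_schedule(ops)   # fully determined before execution
print(run(schedule, 4))            # (4*2 + 3)^2 = 121
```

The point of the sketch is that the schedule is data-independent: the same fixed plan runs every time, which is what "no dynamic scheduling" means here.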
1848
01:30:25,360 --> 01:30:28,080
There's no dynamic scheduling.
1849
01:30:28,080 --> 01:30:30,320
The architecture is designed with
1850
01:30:30,320 --> 01:30:33,360
massive amounts of SRAM. It is designed
1851
01:30:33,360 --> 01:30:36,719
just for inference. This one workload.
1852
01:30:36,719 --> 01:30:38,639
Now, this one workload, as it turns out,
1853
01:30:38,639 --> 01:30:41,360
is the workload of AI factories. And as
1854
01:30:41,360 --> 01:30:43,280
the world continues to increase the
1855
01:30:43,280 --> 01:30:45,920
amount of high-speed tokens it wants to
1856
01:30:45,920 --> 01:30:48,239
generate with super smart tokens it
1857
01:30:48,239 --> 01:30:50,400
wants to generate, the value of this
1858
01:30:50,400 --> 01:30:52,560
integration is going to get even higher.
1859
01:30:52,560 --> 01:30:54,400
And so these are two extreme processors.
1860
01:30:54,400 --> 01:30:58,880
You could see one chip, 500 megabytes;
1861
01:30:58,880 --> 01:31:02,480
one Vera Rubin chip, 288
1862
01:31:02,480 --> 01:31:05,199
gigabytes.
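The capacity gap just described can be checked with a back-of-envelope calculation. The 500 MB and 288 GB figures come from the talk; the one-byte-per-parameter precision is an assumption for illustration (e.g. 8-bit weights).

```python
# Back-of-envelope: how many chips does it take to hold a
# trillion-parameter model in on-chip SRAM versus HBM?
params = 1_000_000_000_000          # 1 trillion parameters
bytes_per_param = 1                 # assumed 8-bit weights
model_bytes = params * bytes_per_param

sram_per_groq_chip = 500 * 10**6    # ~500 MB SRAM per chip (as stated)
hbm_per_rubin_chip = 288 * 10**9    # 288 GB per chip (as stated)

groq_chips = -(-model_bytes // sram_per_groq_chip)   # ceiling division
rubin_chips = -(-model_bytes // hbm_per_rubin_chip)
print(groq_chips, rubin_chips)      # 2000 chips vs 4 chips
```

That 2000-versus-4 gap, before even counting KV cache, is why pairing the two memory systems matters in the argument that follows.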
1863
01:31:05,199 --> 01:31:08,639
It would take a lot of Groq chips to be
1864
01:31:08,639 --> 01:31:11,600
able to hold the parameter size of
1865
01:31:11,600 --> 01:31:14,159
the model as well as all of the context
1866
01:31:14,159 --> 01:31:16,320
that has to go, the KV cache that has to
1867
01:31:16,320 --> 01:31:18,719
go along with it. So that limited
1868
01:31:18,719 --> 01:31:21,040
Groq's ability to really reach the
1869
01:31:21,040 --> 01:31:23,760
mainstream to really take off until we
1870
01:31:23,760 --> 01:31:25,840
had a great idea. What if we
1871
01:31:25,840 --> 01:31:28,080
disaggregated inference altogether with a
1872
01:31:28,080 --> 01:31:30,159
piece of software called Dynamo? What if
1873
01:31:30,159 --> 01:31:32,639
we rearchitected the way that inference
1874
01:31:32,639 --> 01:31:35,360
is done in the pipeline, so that we
1875
01:31:35,360 --> 01:31:37,760
could put the work that makes perfect
1876
01:31:37,760 --> 01:31:41,920
sense on Vera Rubin and then offload the
1877
01:31:41,920 --> 01:31:45,360
decode generation, the low-latency,
1878
01:31:45,360 --> 01:31:48,080
bandwidth-limited part of the
1879
01:31:48,080 --> 01:31:51,360
workload, to Groq. And so we
1880
01:31:51,360 --> 01:31:52,880
unified
1881
01:31:52,880 --> 01:31:55,600
two processors of extreme differences
1882
01:31:55,600 --> 01:31:57,760
one for high throughput, one for low
1883
01:31:57,760 --> 01:32:00,080
latency. It still doesn't change the fact
1884
01:32:00,080 --> 01:32:02,639
that we need a lot of memory and so
1885
01:32:02,639 --> 01:32:04,239
Groq, we're just going to
1886
01:32:04,239 --> 01:32:06,960
add a whole bunch of Groq chips, which
1887
01:32:06,960 --> 01:32:09,600
expands the amount of memory it has and
1888
01:32:09,600 --> 01:32:12,639
so if you could just imagine
1889
01:32:12,639 --> 01:32:15,600
out of a trillion-parameter model, we
1890
01:32:15,600 --> 01:32:18,480
have to store all of that in Groq chips;
1891
01:32:18,480 --> 01:32:21,440
however it sits next to Nvidia Vera
1892
01:32:21,440 --> 01:32:23,760
Rubin, where we could hold the
1893
01:32:23,760 --> 01:32:26,800
massive amounts of KV cache that's
1894
01:32:26,800 --> 01:32:28,800
necessary in processing all of these
1895
01:32:28,800 --> 01:32:31,440
agentic AI systems. It's based upon this
1896
01:32:31,440 --> 01:32:35,280
idea of disaggregated inference. We do
1897
01:32:35,280 --> 01:32:38,560
the prefill, that's the easy part, but we
1898
01:32:38,560 --> 01:32:42,320
also tightly integrate the decode so the
1899
01:32:42,320 --> 01:32:44,239
attention part of decode is done on
1900
01:32:44,239 --> 01:32:46,960
Nvidia's Vera Rubin, which needs a lot of
1901
01:32:46,960 --> 01:32:50,800
math, and the feed-forward network part
1902
01:32:50,800 --> 01:32:53,840
of it, the decode part, the
1903
01:32:53,840 --> 01:32:55,600
token generation part, is done
1904
01:32:55,600 --> 01:32:58,000
on the Groq chip, the two of
1905
01:32:58,000 --> 01:32:59,920
them working tightly coupled together
1906
01:32:59,920 --> 01:33:03,600
over today's Ethernet with a special mode
1907
01:33:03,600 --> 01:33:06,320
to reduce its latency by about half. And
1908
01:33:06,320 --> 01:33:08,719
so that capability allows us to
1909
01:33:08,719 --> 01:33:10,639
integrate these two systems. We run
1910
01:33:10,639 --> 01:33:12,880
Dynamo, this incredible operating system
1911
01:33:12,880 --> 01:33:15,600
for AI factories on top of it. And you
1912
01:33:15,600 --> 01:33:20,320
get a 35 times increase. 35 times
1913
01:33:20,320 --> 01:33:23,040
increase. Not to mention additional new
1914
01:33:23,040 --> 01:33:26,560
tiers of inference performance for token
1915
01:33:26,560 --> 01:33:28,639
generation the world's never seen. So
1916
01:33:28,639 --> 01:33:32,520
this is it. This is Groq.
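The disaggregation just described, prefill and attention on the high-throughput pool, feed-forward token generation on the low-latency pool, can be sketched as a routing policy. This is an illustrative sketch only; the pool names and the route() function are hypothetical and are not the Dynamo API.

```python
# Toy routing policy in the spirit of disaggregated inference:
# send each pipeline stage to the pool it is best suited for.

def route(stage):
    """Pick an execution pool per inference pipeline stage."""
    if stage in ("prefill", "decode_attention"):
        return "rubin"   # math- and KV-cache-heavy work
    if stage in ("decode_ffn", "token_generation"):
        return "groq"    # latency- and bandwidth-sensitive work
    raise ValueError(f"unknown stage: {stage}")

pipeline = ["prefill", "decode_attention", "decode_ffn", "token_generation"]
plan = {stage: route(stage) for stage in pipeline}
print(plan)
```

The design point is that the split is per stage, not per request: every token still flows through both pools, coupled over the network.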
1917
01:33:38,719 --> 01:33:41,600
the Vera Rubin systems, including Groq.
1918
01:33:41,600 --> 01:33:43,840
I want to thank Samsung, who
1919
01:33:43,840 --> 01:33:47,040
manufactures the Groq LP30 chip for us,
1920
01:33:47,040 --> 01:33:48,639
and they're cranking as hard as they
1921
01:33:48,639 --> 01:33:51,120
can. I really appreciate you
1922
01:33:51,120 --> 01:33:53,360
guys. We're in production with the Groq
1923
01:33:53,360 --> 01:33:56,320
chip, and we'll ship it in
1924
01:33:56,320 --> 01:33:58,239
the second half probably about Q3 time
1925
01:33:58,239 --> 01:34:01,920
frame. Okay.
1926
01:34:01,920 --> 01:34:06,199
Groq LPX
1927
01:34:09,280 --> 01:34:12,000
Vera Rubin you know it's kind of hard
1928
01:34:12,000 --> 01:34:15,520
to imagine any more
1929
01:34:15,520 --> 01:34:17,520
customers
1930
01:34:17,520 --> 01:34:20,480
You know, the really great
1931
01:34:20,480 --> 01:34:23,280
thing is, Grace Blackwell early
1932
01:34:23,280 --> 01:34:25,280
sampling of it was really complicated
1933
01:34:25,280 --> 01:34:27,360
because of the coming together of NVLink
1934
01:34:27,360 --> 01:34:29,760
72 but the sampling of Vera Rubin is
1935
01:34:29,760 --> 01:34:32,080
just going incredibly well and in fact
1936
01:34:32,080 --> 01:34:34,560
Satya, I think, texted out already that
1937
01:34:34,560 --> 01:34:36,800
the first Vera Rubin rack is already up
1938
01:34:36,800 --> 01:34:39,120
and running at Microsoft Azure and so
1939
01:34:39,120 --> 01:34:40,800
I'm super excited for them. We're just
1940
01:34:40,800 --> 01:34:42,960
going to keep cranking these things out.
1941
01:34:42,960 --> 01:34:45,360
We have now set up a supply chain that
1942
01:34:45,360 --> 01:34:48,960
could manufacture thousands a week of
1943
01:34:48,960 --> 01:34:51,840
these systems essentially multi-
1944
01:34:51,840 --> 01:34:55,440
gigawatts of AI factories per month
1945
01:34:55,440 --> 01:34:58,000
inside our supply chain. And so we're
1946
01:34:58,000 --> 01:34:59,760
going to crank out these Vera
1947
01:34:59,760 --> 01:35:01,440
Rubin racks while we're cranking out the
1948
01:35:01,440 --> 01:35:04,320
GB300 racks. We are in full production.
1949
01:35:04,320 --> 01:35:06,560
The Vera CPUs are
1950
01:35:06,560 --> 01:35:08,639
incredibly successful. And the reason
1951
01:35:08,639 --> 01:35:11,760
for that is because AI needs CPUs for
1952
01:35:11,760 --> 01:35:15,199
tool use and Vera CPU was designed just
1953
01:35:15,199 --> 01:35:17,360
perfectly for that sweet spot.
1954
01:35:17,360 --> 01:35:19,760
Incredible for the next generation of
1955
01:35:19,760 --> 01:35:22,800
data processing, Vera CPU is ideal. The
1956
01:35:22,800 --> 01:35:26,320
Vera CPU plus BlueField plus CX9 connected
1957
01:35:26,320 --> 01:35:28,800
into the BlueField-4 stack.
1958
01:35:28,800 --> 01:35:31,600
100%
1959
01:35:31,600 --> 01:35:35,040
100% of the world's storage industry is
1960
01:35:35,040 --> 01:35:38,960
joining us on this system and the reason
1961
01:35:38,960 --> 01:35:40,960
for that is because they see exactly the
1962
01:35:40,960 --> 01:35:43,600
same thing. The storage system is going
1963
01:35:43,600 --> 01:35:45,679
to get pounded. It's going to get
1964
01:35:45,679 --> 01:35:47,760
pounded because we used to have humans
1965
01:35:47,760 --> 01:35:49,600
using the storage systems. We used to
1966
01:35:49,600 --> 01:35:51,679
have humans using SQL. Now we're going
1967
01:35:51,679 --> 01:35:54,159
to have AIs using these storage systems,
1968
01:35:54,159 --> 01:35:57,199
and it's going to store cuDF-accelerated
1969
01:35:57,199 --> 01:36:00,159
storage, cuVS-accelerated storage, as well
1970
01:36:00,159 --> 01:36:03,360
as very importantly KV caching. Okay, so
1971
01:36:03,360 --> 01:36:06,159
this is the Vera Rubin system. Now
1972
01:36:06,159 --> 01:36:08,960
what's amazing is this. In just two
1973
01:36:08,960 --> 01:36:13,600
years' time, in a one-gigawatt factory,
1974
01:36:13,600 --> 01:36:15,679
in just two years' time, in a one-gigawatt
1975
01:36:15,679 --> 01:36:18,639
factory, using the mathematics that I
1976
01:36:18,639 --> 01:36:21,440
showed you earlier. Whereas Moore's law
1977
01:36:21,440 --> 01:36:23,199
would have given us a couple of steps, we
1978
01:36:23,199 --> 01:36:26,719
would have, you know, X-factored the
1979
01:36:26,719 --> 01:36:28,320
number of transistors, we would have
1980
01:36:28,320 --> 01:36:30,400
X-factored the number of flops, we would
1981
01:36:30,400 --> 01:36:33,280
have X-factored the amount of
1982
01:36:33,280 --> 01:36:36,239
bandwidth. But with this architecture,
1983
01:36:36,239 --> 01:36:38,000
we're going to take our token
1984
01:36:38,000 --> 01:36:42,480
generation rate from 2
1985
01:36:42,480 --> 01:36:48,159
million to 700 million, a 350 times
1986
01:36:48,159 --> 01:36:51,159
increase.
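The arithmetic behind the claim is easy to verify: going from 2 million to 700 million tokens in the same one-gigawatt factory is indeed a 350x increase.

```python
# Checking the stated speedup: 2 million -> 700 million tokens.
before_rate = 2_000_000
after_rate = 700_000_000
speedup = after_rate // before_rate
print(speedup)   # 350
```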
1987
01:36:53,679 --> 01:36:56,480
This is the power of extreme
1988
01:36:56,480 --> 01:36:59,520
co-design. This is what I mean when we
1989
01:36:59,520 --> 01:37:02,560
integrate and optimize vertically but
1990
01:37:02,560 --> 01:37:04,880
then we open it horizontally for
1991
01:37:04,880 --> 01:37:06,719
everybody to enjoy. This is our road
1992
01:37:06,719 --> 01:37:09,760
map. Very quickly,
1993
01:37:09,760 --> 01:37:13,520
Blackwell is here, the Oberon system. In
1994
01:37:13,520 --> 01:37:15,760
the case of Rubin, we have the Oberon
1995
01:37:15,760 --> 01:37:17,199
system. We're always backwards
1996
01:37:17,199 --> 01:37:19,360
compatible. So that if you wanted to not
1997
01:37:19,360 --> 01:37:21,280
change anything and just keep on moving
1998
01:37:21,280 --> 01:37:22,800
through with the new architecture, you
1999
01:37:22,800 --> 01:37:24,560
could do so.
2000
01:37:24,560 --> 01:37:27,920
The standard rack system,
2001
01:37:27,920 --> 01:37:31,520
Oberon, is still available. Oberon is copper
2002
01:37:31,520 --> 01:37:34,960
scale up. And with Oberon, we could also
2003
01:37:34,960 --> 01:37:37,440
use optical scale out or excuse me,
2004
01:37:37,440 --> 01:37:42,800
optical scale up, to expand to NVLink
2005
01:37:42,800 --> 01:37:44,480
576.
2006
01:37:44,480 --> 01:37:45,840
Okay. And so there's a lot of
2007
01:37:45,840 --> 01:37:47,600
conversation about is Nvidia going to
2008
01:37:47,600 --> 01:37:50,639
copper scale up or optical scale up.
2009
01:37:50,639 --> 01:37:52,880
We're going to do both.
2010
01:37:52,880 --> 01:37:55,920
So, we're going to have NVLink 144 with
2011
01:37:55,920 --> 01:38:00,080
Kyber, and then with
2012
01:38:00,080 --> 01:38:03,360
Oberon, we're going to NVLink 72
2013
01:38:03,360 --> 01:38:08,639
plus optical to get to NVLink 576.
2014
01:38:08,639 --> 01:38:11,920
The next generation of Rubin comes with
2015
01:38:11,920 --> 01:38:14,159
Rubin Ultra. We have the Rubin Ultra
2016
01:38:14,159 --> 01:38:16,400
chip, which is coming, which is taping
2017
01:38:16,400 --> 01:38:20,400
out and we have a brand new chip LP35.
2018
01:38:20,400 --> 01:38:23,920
LP35 will for the first time incorporate
2019
01:38:23,920 --> 01:38:28,159
Nvidia's NVFP4 computing structure, giving
2020
01:38:28,159 --> 01:38:30,639
you another few-X factor speed-up.
2021
01:38:30,639 --> 01:38:34,719
Okay. And so this is Oberon, NVLink 72
2022
01:38:34,719 --> 01:38:39,119
optical scale up and it uses Spectrum 6
2023
01:38:39,119 --> 01:38:43,199
the world's first co-packaged optics, and
2024
01:38:43,199 --> 01:38:46,159
all of this is in production. The
2025
01:38:46,159 --> 01:38:48,880
next generation from here
2026
01:38:48,880 --> 01:38:52,320
is Feynman. Feynman has a new GPU, of
2027
01:38:52,320 --> 01:38:56,960
course. It also has a new LPU
2028
01:38:56,960 --> 01:38:58,800
LP40.
2029
01:38:58,800 --> 01:39:02,000
Big step up. Incredible. Incredible new
2030
01:39:02,000 --> 01:39:04,719
technology. Now
2031
01:39:04,719 --> 01:39:08,880
uniting the scale of Nvidia and the Groq
2032
01:39:08,880 --> 01:39:11,840
team building together LP40. It's going
2033
01:39:11,840 --> 01:39:14,320
to be incredible. A brand new CPU called
2034
01:39:14,320 --> 01:39:16,560
Rosa,
2035
01:39:16,560 --> 01:39:19,600
short for Roslin. BlueField-5, which
2036
01:39:19,600 --> 01:39:22,800
connects the next CPU with the next
2037
01:39:22,800 --> 01:39:24,320
SuperNIC,
2038
01:39:24,320 --> 01:39:26,639
CX10.
2039
01:39:26,639 --> 01:39:29,600
We will have Kyber
2040
01:39:29,600 --> 01:39:32,800
which is copper scale up. We will also
2041
01:39:32,800 --> 01:39:36,000
have Kyber
2042
01:39:36,000 --> 01:39:40,400
CPO scale up. So for the first time we
2043
01:39:40,400 --> 01:39:45,840
will scale up with both copper and
2044
01:39:45,840 --> 01:39:49,360
co-packaged optics. Okay. And so a lot of
2045
01:39:49,360 --> 01:39:51,199
people have been asking, you know, Jensen,
2046
01:39:51,199 --> 01:39:53,440
is copper going to still be
2047
01:39:53,440 --> 01:39:56,560
important? The answer is yes.
2048
01:39:56,560 --> 01:39:58,960
Jensen, are you going to scale up
2049
01:39:58,960 --> 01:40:02,719
optical? Yes.
2050
01:40:02,719 --> 01:40:05,440
Are you going to scale out optical?
2051
01:40:05,440 --> 01:40:07,199
Yes.
2052
01:40:07,199 --> 01:40:09,040
And so for everybody who is in our
2053
01:40:09,040 --> 01:40:13,520
ecosystem, we need a lot more capacity
2054
01:40:13,520 --> 01:40:15,600
and that's really the key. We need a lot
2055
01:40:15,600 --> 01:40:17,600
more capacity for copper. We
2056
01:40:17,600 --> 01:40:19,920
need a lot more capacity for optics. We
2057
01:40:19,920 --> 01:40:22,719
need a lot more capacity for CPO and
2058
01:40:22,719 --> 01:40:24,159
that's the reason why we've been working
2059
01:40:24,159 --> 01:40:26,239
with all of you to lay the foundation
2060
01:40:26,239 --> 01:40:29,199
for this level of growth. And so Feynman
2061
01:40:29,199 --> 01:40:31,679
will have all of that. Let me see if I
2062
01:40:31,679 --> 01:40:34,000
missed anything. That's it. Every
2063
01:40:34,000 --> 01:40:36,880
single year. Brand new architecture.
2064
01:40:36,880 --> 01:40:40,119
Very quick.
2065
01:40:43,280 --> 01:40:46,480
Very quickly, Nvidia went from a chip
2066
01:40:46,480 --> 01:40:50,800
company to an AI factory company, or AI
2067
01:40:50,800 --> 01:40:52,159
infrastructure company, AI computing
2068
01:40:52,159 --> 01:40:54,719
company. These systems
2069
01:40:54,719 --> 01:40:56,880
and now we're building entire AI
2070
01:40:56,880 --> 01:41:01,040
factories. There's so much power
2071
01:41:01,040 --> 01:41:03,440
that is squandered in these AI
2072
01:41:03,440 --> 01:41:05,760
factories. We want to make sure that
2073
01:41:05,760 --> 01:41:07,920
these AI factories come together
2074
01:41:07,920 --> 01:41:10,719
designed in the best possible way. Most
2075
01:41:10,719 --> 01:41:12,560
of these components never meet each
2076
01:41:12,560 --> 01:41:14,480
other. Most of us technology
2077
01:41:14,480 --> 01:41:17,199
vendors now, we all know each other, but
2078
01:41:17,199 --> 01:41:19,360
in the past we never met each other
2079
01:41:19,360 --> 01:41:21,440
until the data center. That can't
2080
01:41:21,440 --> 01:41:23,679
happen. We're building super complex
2081
01:41:23,679 --> 01:41:25,600
systems and so we have to meet each
2082
01:41:25,600 --> 01:41:28,080
other virtually somewhere else and so we
2083
01:41:28,080 --> 01:41:32,080
created Omniverse and the Omniverse DSX
2084
01:41:32,080 --> 01:41:34,639
world, a platform where all of us can
2085
01:41:34,639 --> 01:41:37,360
meet and design these gigafactories, the,
2086
01:41:37,360 --> 01:41:40,000
you know, gigawatt AI factories,
2087
01:41:40,000 --> 01:41:44,159
virtually, in system. We have simulation
2088
01:41:44,159 --> 01:41:46,800
systems for the racks for mechanical,
2089
01:41:46,800 --> 01:41:49,840
thermal, electrical, networking. Those
2090
01:41:49,840 --> 01:41:52,159
simulation systems integrated into all
2091
01:41:52,159 --> 01:41:54,000
of our ecosystem partners of incredible
2092
01:41:54,000 --> 01:41:58,320
tools companies. We also operate it,
2093
01:41:58,320 --> 01:42:00,960
connected to the grid so that we could
2094
01:42:00,960 --> 01:42:02,639
interact with each other, send each
2095
01:42:02,639 --> 01:42:04,800
other information so that we could
2096
01:42:04,800 --> 01:42:06,480
adjust
2097
01:42:06,480 --> 01:42:08,800
grid power and data center power
2098
01:42:08,800 --> 01:42:11,760
accordingly, saving energy. And then
2099
01:42:11,760 --> 01:42:14,639
inside the data center using Max Q so
2100
01:42:14,639 --> 01:42:16,320
that we could adjust the system
2101
01:42:16,320 --> 01:42:19,280
dynamically across power and cooling and
2102
01:42:19,280 --> 01:42:20,960
all of the different technologies we all
2103
01:42:20,960 --> 01:42:23,760
work on together so that we leave no
2104
01:42:23,760 --> 01:42:25,920
power squandered
2105
01:42:25,920 --> 01:42:28,639
so that we run at the most optimal rate
2106
01:42:28,639 --> 01:42:30,880
to deliver an enormous amount of token
2107
01:42:30,880 --> 01:42:33,040
throughput. There's no question in my
2108
01:42:33,040 --> 01:42:35,840
mind there's a factor of two in here and
2109
01:42:35,840 --> 01:42:37,600
a factor of two at the scale we're
2110
01:42:37,600 --> 01:42:40,639
talking about is gigantic. We call this
2111
01:42:40,639 --> 01:42:43,520
the NVIDIA DSX platform. And just as with all
2112
01:42:43,520 --> 01:42:45,440
of our platforms, there's the hardware
2113
01:42:45,440 --> 01:42:48,000
layer, there's the library layer, and
2114
01:42:48,000 --> 01:42:49,760
there's the ecosystem layer. It's
2115
01:42:49,760 --> 01:42:52,000
exactly the same way. Let's show it to
2116
01:42:52,000 --> 01:42:55,000
you.
2117
01:42:56,159 --> 01:42:58,239
The greatest infrastructure buildout in
2118
01:42:58,239 --> 01:43:00,400
history is underway.
2119
01:43:00,400 --> 01:43:03,040
The world is racing to build chips, systems,
2120
01:43:03,040 --> 01:43:05,840
and AI factories. And every month of
2121
01:43:05,840 --> 01:43:09,119
delay costs billions in lost revenues.
2122
01:43:09,119 --> 01:43:11,520
AI factory revenues are equal to tokens
2123
01:43:11,520 --> 01:43:15,119
per watt. So with power constraints,
2124
01:43:15,119 --> 01:43:19,280
every unused watt is revenue lost.
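The framing here, revenue as tokens per watt under a power cap, can be made concrete with a toy calculation. Every number below is invented purely for illustration.

```python
# Toy numbers for the "revenue ~ tokens per watt" framing.
site_watts = 1_000_000_000          # a 1 GW facility
unused_watts = 100_000_000          # 100 MW sitting idle
tokens_per_watt_sec = 0.5           # assumed factory efficiency
price_per_million_tokens = 2.0      # assumed $ per million tokens

tokens_per_sec = site_watts * tokens_per_watt_sec
revenue_per_sec = tokens_per_sec / 1_000_000 * price_per_million_tokens
# An idle 100 MW forgoes a proportional share of that revenue:
lost_per_sec = revenue_per_sec * unused_watts / site_watts
print(revenue_per_sec, lost_per_sec)   # 1000.0 100.0
```

Under these made-up numbers, 10% of the power left unused is 10% of the revenue left on the table, every second, which is the point the narration is making.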
2125
01:43:19,280 --> 01:43:22,560
NVIDIA DSX is an Omniverse digital twin
2126
01:43:22,560 --> 01:43:25,520
blueprint for designing and operating AI
2127
01:43:25,520 --> 01:43:28,080
factories for maximum token throughput,
2128
01:43:28,080 --> 01:43:31,360
resilience, and energy efficiency.
2129
01:43:31,360 --> 01:43:34,159
Developers connect through several APIs.
2130
01:43:34,159 --> 01:43:37,600
DSX SIM for physical, electrical,
2131
01:43:37,600 --> 01:43:40,800
thermal and network simulation, DSX
2132
01:43:40,800 --> 01:43:43,600
exchange for AI factory operational
2133
01:43:43,600 --> 01:43:47,679
data, DSX Flex for secure dynamic power
2134
01:43:47,679 --> 01:43:51,040
management between the grid and DSX Max
2135
01:43:51,040 --> 01:43:53,679
Q to dynamically maximize token
2136
01:43:53,679 --> 01:43:55,920
throughput.
2137
01:43:55,920 --> 01:43:57,840
It starts with SIM ready assets from
2138
01:43:57,840 --> 01:44:01,119
NVIDIA and equipment manufacturers
2139
01:44:01,119 --> 01:44:04,960
managed by PTC Windchill PLM.
2140
01:44:04,960 --> 01:44:07,440
Then model-based systems engineering is
2141
01:44:07,440 --> 01:44:12,440
done in Dassault Systèmes 3DEXPERIENCE.
2142
01:44:12,480 --> 01:44:14,320
Jacobs brings the data into their custom
2143
01:44:14,320 --> 01:44:18,719
Omniverse app to finalize design.
2144
01:44:18,719 --> 01:44:20,480
It's tested with leading simulation
2145
01:44:20,480 --> 01:44:24,320
tools, using Siemens Star-CCM+ for
2146
01:44:24,320 --> 01:44:26,960
external thermals,
2147
01:44:26,960 --> 01:44:29,600
Cadence Reality for internal,
2148
01:44:29,600 --> 01:44:32,800
ETAP for electrical, and NVIDIA's network
2149
01:44:32,800 --> 01:44:36,400
simulator DSX Air
2150
01:44:36,400 --> 01:44:39,199
and virtually commission through Procore
2151
01:44:39,199 --> 01:44:42,800
to ensure accelerated construction time.
2152
01:44:42,800 --> 01:44:45,119
When the site goes live, the digital
2153
01:44:45,119 --> 01:44:49,040
twin becomes the operator. AI agents
2154
01:44:49,040 --> 01:44:52,239
work with DSX Max Q to dynamically
2155
01:44:52,239 --> 01:44:54,960
orchestrate infrastructure.
2156
01:44:54,960 --> 01:44:57,040
Phaidra's agent oversees cooling and
2157
01:44:57,040 --> 01:44:59,679
electrical systems, sending signals to
2158
01:44:59,679 --> 01:45:02,320
Max Q, which continuously optimizes
2159
01:45:02,320 --> 01:45:03,840
compute throughput and energy
2160
01:45:03,840 --> 01:45:05,679
efficiency.
2161
01:45:05,679 --> 01:45:08,080
Emerald AI agents interpret live grid
2162
01:45:08,080 --> 01:45:11,440
demand and stress signals and adjust
2163
01:45:11,440 --> 01:45:15,119
power dynamically.
2164
01:45:15,119 --> 01:45:18,239
With DSX, Nvidia and our ecosystem of
2165
01:45:18,239 --> 01:45:20,480
partners are racing to build AI
2166
01:45:20,480 --> 01:45:22,639
infrastructure around the world,
2167
01:45:22,639 --> 01:45:26,000
ensuring extreme resiliency, efficiency,
2168
01:45:26,000 --> 01:45:29,239
and throughput.
2169
01:45:31,840 --> 01:45:36,760
It's incredible, right? Well,
2170
01:45:37,280 --> 01:45:40,400
Omniverse was designed to
2171
01:45:40,400 --> 01:45:42,880
hold the world's digital twin starting
2172
01:45:42,880 --> 01:45:45,040
from the earth and it's going to hold
2173
01:45:45,040 --> 01:45:47,199
digital twins of all sizes. And so we
2174
01:45:47,199 --> 01:45:49,119
have just such a great ecosystem of
2175
01:45:49,119 --> 01:45:50,960
partners. I want to thank all of you.
2176
01:45:50,960 --> 01:45:53,360
All of these companies are brand new to
2177
01:45:53,360 --> 01:45:55,600
our world. We didn't know many of you
2178
01:45:55,600 --> 01:45:57,840
just a couple years ago. And now we're
2179
01:45:57,840 --> 01:46:00,400
working so closely together to
2180
01:46:00,400 --> 01:46:03,360
build the largest computer the
2181
01:46:03,360 --> 01:46:05,760
world's ever seen and also to do it at
2182
01:46:05,760 --> 01:46:08,880
planetary scale. So NVIDIA DSX is our
2183
01:46:08,880 --> 01:46:14,360
new AI factory platform.
2184
01:46:17,840 --> 01:46:19,440
I'll spend very little time on this at
2185
01:46:19,440 --> 01:46:21,440
this time. However, we're going to
2186
01:46:21,440 --> 01:46:23,440
space. We've already been out in space.
2187
01:46:23,440 --> 01:46:26,159
Thor is radiation approved, and
2188
01:46:26,159 --> 01:46:28,239
we're in satellites. We do imaging from
2189
01:46:28,239 --> 01:46:30,000
satellites. In the future,
2190
01:46:30,000 --> 01:46:32,639
we'll also build data centers
2191
01:46:32,639 --> 01:46:35,360
in space. Obviously very
2192
01:46:35,360 --> 01:46:37,840
complicated to do. So we're
2193
01:46:37,840 --> 01:46:39,520
working with our partners on a new
2194
01:46:39,520 --> 01:46:42,239
computer called Vera Rubin Space 1 and
2195
01:46:42,239 --> 01:46:44,159
it's going to go out to space and start
2196
01:46:44,159 --> 01:46:46,639
data centers out in space. Now, of
2197
01:46:46,639 --> 01:46:50,960
course, in space, there's no conduction,
2198
01:46:50,960 --> 01:46:52,800
there's no convection, there's just
2199
01:46:52,800 --> 01:46:55,040
radiation. And so, we have to figure out
2200
01:46:55,040 --> 01:46:57,920
how to cool these systems out
2201
01:46:57,920 --> 01:46:59,440
in space. But, we've got lots of great
2202
01:46:59,440 --> 01:47:01,520
engineers working on it. Let me talk to
2203
01:47:01,520 --> 01:47:05,239
you about something new.
2204
01:47:10,239 --> 01:47:13,440
So,
2205
01:47:13,440 --> 01:47:16,320
Peter Steinberger is here, and
2206
01:47:16,320 --> 01:47:19,280
he wrote a piece of software. It's
2207
01:47:19,280 --> 01:47:23,040
called OpenClaw, and I don't know
2208
01:47:23,040 --> 01:47:26,000
if he realized how successful it was
2209
01:47:26,000 --> 01:47:28,880
going to be. But the importance is
2210
01:47:28,880 --> 01:47:32,080
profound. OpenClaw is the number one.
2211
01:47:32,080 --> 01:47:35,280
It's the most popular open-source project
2212
01:47:35,280 --> 01:47:37,760
in the history of humanity and it did so
2213
01:47:37,760 --> 01:47:41,719
in just a few weeks.
2214
01:47:44,000 --> 01:47:46,800
It exceeded what Linux did
2215
01:47:46,800 --> 01:47:50,080
in 30 years. And it's that important. It
2216
01:47:50,080 --> 01:47:55,080
is that important. It will do well.
2217
01:47:55,119 --> 01:47:57,920
This is all you do. Okay? We're
2218
01:47:57,920 --> 01:48:00,159
announcing our support of it. Let me
2219
01:48:00,159 --> 01:48:01,360
just quickly go through this.
2220
01:48:01,360 --> 01:48:02,320
I want to show you a couple of
2221
01:48:02,320 --> 01:48:05,679
things. You simply type
2222
01:48:05,679 --> 01:48:09,440
this into a console, and it
2223
01:48:09,440 --> 01:48:11,679
goes out, it finds OpenClaw, it
2224
01:48:11,679 --> 01:48:15,520
downloads it, it builds you an AI agent,
2225
01:48:15,520 --> 01:48:17,520
and then you could tell it whatever else
2226
01:48:17,520 --> 01:48:19,760
you need to do. Okay, so let's take a
2227
01:48:19,760 --> 01:48:22,760
look.
2228
01:49:53,199 --> 01:49:54,880
An open source project just dropped.
2229
01:49:54,880 --> 01:49:56,639
>> Andrej Karpathy has just launched
2230
01:49:56,639 --> 01:49:58,480
something called research. It is a huge
2231
01:49:58,480 --> 01:49:58,880
deal.
2232
01:49:58,880 --> 01:50:00,960
>> You give an AI agent a task, go to
2233
01:50:00,960 --> 01:50:02,639
sleep, it runs 100 experiments
2234
01:50:02,639 --> 01:50:04,480
overnight, keeping what works and
2235
01:50:04,480 --> 01:50:07,960
killing what doesn't.
2236
01:50:16,000 --> 01:50:19,679
I really love what my stuff enables that
2237
01:50:19,679 --> 01:50:21,679
person to do. And he had like one guy,
2238
01:50:21,679 --> 01:50:23,280
he told me like he installed it as a
2239
01:50:23,280 --> 01:50:25,760
60-year-old dad and like they made beer,
2240
01:50:25,760 --> 01:50:27,520
connected the machine via Bluetooth to
2241
01:50:27,520 --> 01:50:28,880
OpenClaw. And then we automated
2242
01:50:28,880 --> 01:50:30,239
everything including the whole website
2243
01:50:30,239 --> 01:50:35,000
for people to order lobster.
2244
01:50:38,960 --> 01:50:40,880
Hundreds of people are queuing up for
2245
01:50:40,880 --> 01:50:43,760
lobsters via OpenClaw.
2246
01:50:43,760 --> 01:50:45,520
>> OpenClaw.
2247
01:50:45,520 --> 01:50:47,600
>> You want to build OpenClaw with
2248
01:50:47,600 --> 01:50:47,920
OpenClaw.
2249
01:50:47,920 --> 01:50:50,320
>> Everyone is talking about OpenClaw. But
2250
01:50:50,320 --> 01:50:52,080
what is OpenClaw?
2251
01:50:52,080 --> 01:50:53,679
>> Believe it or not, there's already a
2252
01:50:53,679 --> 01:50:57,159
ClawCon.
2253
01:51:07,280 --> 01:51:10,719
Incredible. Incredible. Now, I
2254
01:51:10,719 --> 01:51:12,960
illustrated effectively what OpenClaw
2255
01:51:12,960 --> 01:51:15,040
is in this way so all of you can
2256
01:51:15,040 --> 01:51:16,800
understand it. But let's just think what
2257
01:51:16,800 --> 01:51:19,840
happened. What is OpenClaw?
2258
01:51:23,360 --> 01:51:25,360
It's a system. It calls and
2259
01:51:25,360 --> 01:51:27,440
connects to large language models. So
2260
01:51:27,440 --> 01:51:29,600
the first thing: it has resources
2261
01:51:29,600 --> 01:51:31,440
that it manages. It could
2262
01:51:31,440 --> 01:51:33,199
access tools. It could access file
2263
01:51:31,440 --> 01:51:33,199
systems. It could access large language
2264
01:51:33,199 --> 01:51:36,480
models. It's able to do scheduling.
2265
01:51:36,480 --> 01:51:40,000
It's able to do cron jobs. It's able to
2266
01:51:40,000 --> 01:51:43,440
decompose a problem, a prompt
2267
01:51:43,440 --> 01:51:45,280
that you gave it, into step by step by
2268
01:51:45,280 --> 01:51:48,080
step. It could spawn off and call upon
2269
01:51:48,080 --> 01:51:50,560
other sub-agents.
2270
01:51:50,560 --> 01:51:53,440
It has IO. You could talk to it in any
2271
01:51:53,440 --> 01:51:55,920
modality you want. You could wave at it
2272
01:51:55,920 --> 01:51:58,239
and it understands you. You could talk
2273
01:51:58,239 --> 01:52:01,199
to it in any modality you want. It sends you
2274
01:52:01,199 --> 01:52:04,159
messages, it texts you, sends you email.
2275
01:52:04,159 --> 01:52:07,440
So, it's got IO.
2276
01:52:07,440 --> 01:52:11,679
Um, what else does it have? Well, based
2277
01:52:11,679 --> 01:52:15,599
on that, you could say, in fact,
2278
01:52:15,599 --> 01:52:18,800
it's an operating system. I've just used
2279
01:52:18,800 --> 01:52:20,800
the same syntax that I would use to describe an
2280
01:52:20,800 --> 01:52:23,840
operating system.
2281
01:52:23,840 --> 01:52:27,360
OpenClaw has open-sourced essentially
2282
01:52:27,360 --> 01:52:30,880
the operating system of agent computers.
2283
01:52:30,880 --> 01:52:33,440
It is no different than how Windows made
2284
01:52:33,440 --> 01:52:36,159
it possible for us to create personal
2285
01:52:36,159 --> 01:52:38,719
computers. Now OpenClaw has made it
2286
01:52:38,719 --> 01:52:41,199
possible for us to create personal
2287
01:52:41,199 --> 01:52:42,960
agents.
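The capability list just given, managed resources, tool and file access, LLM calls, scheduling, task decomposition, sub-agents, multi-modal IO, really does read like an operating system's service table. A minimal sketch of that idea, entirely hypothetical and not OpenClaw's actual architecture or API:

```python
# Hypothetical "operating system for agents" kernel: it owns the shared
# resources (LLM, tools, a job queue) and lets tasks be decomposed into
# steps and delegated to sub-agents.

class AgentKernel:
    def __init__(self, llm, tools):
        self.llm = llm            # managed resource: language model
        self.tools = tools        # managed resource: tool registry
        self.jobs = []            # scheduler / cron-style job queue

    def schedule(self, task):
        self.jobs.append(task)

    def spawn(self):
        """Delegate to a sub-agent sharing the same resources."""
        return AgentKernel(self.llm, self.tools)

    def run(self):
        """Decompose each queued task into steps and execute them."""
        results = []
        for task in self.jobs:
            for step in self.llm(task):          # the LLM plans the steps
                results.append(self.tools[step]())
        self.jobs.clear()
        return results

# Toy stand-ins for the LLM and the tools:
fake_llm = lambda task: ["read", "write"]
tools = {"read": lambda: "read ok", "write": lambda: "write ok"}
kernel = AgentKernel(fake_llm, tools)
kernel.schedule("update the website")
print(kernel.run())   # ['read ok', 'write ok']
```

The analogy in the talk is exactly this separation: the kernel mediates every resource, so applications (agents) can be written against one stable interface.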
2288
01:52:42,960 --> 01:52:45,760
The implication is incredible. The
2289
01:52:45,760 --> 01:52:47,599
implication is incredible. First of all,
2290
01:52:47,599 --> 01:52:50,400
the adoption says something you know all
2291
01:52:50,400 --> 01:52:53,040
by itself. However, the most important
2292
01:52:53,040 --> 01:52:54,880
thing is this. Every single company now
2293
01:52:54,880 --> 01:52:56,880
realizes, every single company, every
2294
01:52:56,880 --> 01:52:58,400
single software company, every single
2295
01:52:58,400 --> 01:53:01,199
technology company, that for the CEOs the
2296
01:53:01,199 --> 01:53:02,639
question is: what's your OpenClaw
2297
01:53:02,639 --> 01:53:04,719
strategy?
2298
01:53:04,719 --> 01:53:06,719
Just as we need to all have a Linux
2299
01:53:06,719 --> 01:53:09,599
strategy, we all needed to have an HTTP
2300
01:53:09,599 --> 01:53:12,400
and HTML strategy, which started the
2301
01:53:12,400 --> 01:53:13,920
internet. We all needed to have a
2302
01:53:13,920 --> 01:53:15,599
Kubernetes strategy which made it
2303
01:53:15,599 --> 01:53:17,920
possible for mobile cloud to happen.
2304
01:53:17,920 --> 01:53:20,080
Every company in the world today needs
2305
01:53:20,080 --> 01:53:22,639
to have an OpenClaw strategy and an
2306
01:53:22,639 --> 01:53:25,440
agentic system strategy. This is the new
2307
01:53:25,440 --> 01:53:28,800
computer. Now this is just the exciting
2308
01:53:28,800 --> 01:53:31,679
part. This is enterprise IT before
2309
01:53:31,679 --> 01:53:34,159
OpenClaw. I mentioned
2310
01:53:34,159 --> 01:53:37,360
earlier the way enterprise IT works, and
2311
01:53:37,360 --> 01:53:38,960
the reason why they're
2312
01:53:38,960 --> 01:53:40,560
called data centers is because these
2313
01:53:40,560 --> 01:53:42,960
large rooms, these large buildings, held
2314
01:53:42,960 --> 01:53:46,000
data: the files of people, the
2315
01:53:46,000 --> 01:53:48,480
structured data of business. It would
2316
01:53:48,480 --> 01:53:51,679
pass through software that has tools,
2317
01:53:51,679 --> 01:53:53,760
systems of record, and all
2318
01:53:53,760 --> 01:53:56,159
kinds of workflow that's codified into
2319
01:53:56,159 --> 01:53:58,560
it, and that turns into tools that human
2320
01:53:58,560 --> 01:54:00,719
workers,
2321
01:54:00,719 --> 01:54:02,960
digital workers, would use. That is the
2322
01:54:02,960 --> 01:54:05,760
old IT industry: software companies
2323
01:54:05,760 --> 01:54:09,280
creating tools, saving files, and of
2324
01:54:09,280 --> 01:54:11,280
course GSIs, consultants that help
2325
01:54:11,280 --> 01:54:12,800
companies figure out how to use these
2326
01:54:12,800 --> 01:54:14,560
tools and integrate these tools. These
2327
01:54:14,560 --> 01:54:16,560
tools are incredibly valuable
2328
01:54:16,560 --> 01:54:19,119
for governance and security and privacy
2329
01:54:19,119 --> 01:54:21,280
and compliance, and all of that
2330
01:54:21,280 --> 01:54:23,840
continues to be true.
2331
01:54:23,840 --> 01:54:26,880
It's just that post-OpenClaw, post-
2332
01:54:26,880 --> 01:54:28,800
agentic, this is what it's going to look
2333
01:54:28,800 --> 01:54:31,440
like. This is the extraordinary part.
2334
01:54:31,440 --> 01:54:34,159
Every single IT company, every single
2335
01:54:34,159 --> 01:54:37,599
company, every SaaS company,
2336
01:54:37,599 --> 01:54:42,639
every SaaS company will become
2337
01:54:42,639 --> 01:54:45,280
a GaaS company.
2338
01:54:45,280 --> 01:54:48,080
No question about it. Every single SaaS
2339
01:54:48,080 --> 01:54:49,840
company will become a GaaS company, an
2340
01:54:49,840 --> 01:54:52,960
agentic-as-a-service company. And what's
2341
01:54:52,960 --> 01:54:55,840
amazing is this. OpenClaw gave
2342
01:54:55,840 --> 01:54:58,239
us, gave the industry, exactly what it
2343
01:54:58,239 --> 01:55:01,920
needed at exactly the right time.
2344
01:55:01,920 --> 01:55:05,119
Just as Linux gave the industry exactly
2345
01:55:05,119 --> 01:55:07,040
what it needed at exactly the right time, just
2346
01:55:07,040 --> 01:55:09,360
as Kubernetes showed up at exactly the
2347
01:55:09,360 --> 01:55:12,560
right time, just as HTML showed up, it
2348
01:55:12,560 --> 01:55:14,560
made it possible for the entire industry
2349
01:55:14,560 --> 01:55:18,000
to grab onto this open-source stack and
2350
01:55:18,000 --> 01:55:19,840
go do something with it. There's just
2351
01:55:19,840 --> 01:55:21,840
one catch.
2352
01:55:21,840 --> 01:55:24,080
Agentic systems
2353
01:55:24,080 --> 01:55:26,880
in the corporate network can have access
2354
01:55:26,880 --> 01:55:30,080
to sensitive information. They can execute
2355
01:55:30,080 --> 01:55:33,760
code, and they can communicate externally.
2356
01:55:33,760 --> 01:55:35,599
Just say that out loud. Okay, think
2357
01:55:35,599 --> 01:55:38,320
about it. Access sensitive information,
2358
01:55:38,320 --> 01:55:41,599
execute code, communicate externally.
2359
01:55:41,599 --> 01:55:43,760
You could of course access employee
2360
01:55:43,760 --> 01:55:45,280
information,
2361
01:55:45,280 --> 01:55:47,119
access supply chain, access finance
2362
01:55:47,119 --> 01:55:48,719
information, sensitive information and
2363
01:55:48,719 --> 01:55:51,119
send it out, communicate externally.
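One way to picture that risk, and the guardrail that addresses it, is a wrapper that checks every tool call against a policy before executing it. This is a minimal sketch with invented tool names and a single invented rule, not the actual Open Shell or NeMo Claw interface.

```python
# Minimal sketch of a policy guardrail: every tool call is checked
# against a rule before it runs. The tool names and the single rule
# (no external communication after touching sensitive data) are
# illustrative assumptions, not a real Open Shell interface.
SENSITIVE_TOOLS = {"read_employee_records", "read_finance_data"}
EXTERNAL_TOOLS = {"send_email", "http_post"}

class PolicyViolation(Exception):
    """Raised when a tool call would break the session policy."""

def guarded_call(session: list, tool: str, execute):
    # Block external communication once sensitive data is in the session.
    if tool in EXTERNAL_TOOLS and any(t in SENSITIVE_TOOLS for t in session):
        raise PolicyViolation(f"{tool} blocked: sensitive data in session")
    session.append(tool)
    return execute()

session = []
guarded_call(session, "read_finance_data", lambda: "Q3 numbers")
try:
    guarded_call(session, "send_email", lambda: "sent")
    blocked = False
except PolicyViolation:
    blocked = True
```

A production system would replace the hard-coded rule with a pluggable policy engine, which is exactly the connection point the keynote goes on to describe.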
2364
01:55:51,119 --> 01:55:52,880
Obviously,
2365
01:55:52,880 --> 01:55:55,440
this can't possibly be allowed. And so,
2366
01:55:55,440 --> 01:55:57,520
what we did was we worked with Peter. We
2367
01:55:57,520 --> 01:55:59,920
took some of the world's best security
2368
01:55:59,920 --> 01:56:02,560
and computing experts and we worked with
2369
01:56:02,560 --> 01:56:06,239
Peter to make OpenClaw,
2370
01:56:06,239 --> 01:56:09,199
to make OpenClaw enterprise-
2371
01:56:09,199 --> 01:56:13,520
secure and enterprise-privacy capable.
2372
01:56:13,520 --> 01:56:16,239
And we call that
2373
01:56:16,239 --> 01:56:19,199
NeMo Claw. This is our NVIDIA reference
2374
01:56:19,199 --> 01:56:21,920
for OpenClaw, NeMo Claw, which is a reference
2375
01:56:21,920 --> 01:56:24,880
for OpenClaw, and it has all these
2376
01:56:24,880 --> 01:56:27,599
agentic AI toolkits. The first part
2377
01:56:27,599 --> 01:56:30,800
of it is technology we call Open Shell,
2378
01:56:30,800 --> 01:56:33,520
which has now been integrated into Open-
2379
01:56:33,520 --> 01:56:38,080
Claw, so now it's enterprise-ready. This
2380
01:56:38,080 --> 01:56:41,199
stack, this stack with a reference design,
2381
01:56:41,199 --> 01:56:43,760
we call NeMo Claw.
2382
01:56:43,760 --> 01:56:45,760
Okay, with a reference stack we call
2383
01:56:45,760 --> 01:56:47,840
NeMo Claw. You could download it, play
2384
01:56:47,840 --> 01:56:52,560
with it, and you could connect to it the
2385
01:56:52,560 --> 01:56:55,119
policy engine of all of the SAS
2386
01:56:55,119 --> 01:56:57,280
companies in the world. And your policy
2387
01:56:57,280 --> 01:56:59,440
engines are super important, super
2388
01:56:59,440 --> 01:57:01,920
valuable. So the policy engines could be
2389
01:57:01,920 --> 01:57:05,440
connected: NeMo Claw, or OpenClaw with
2390
01:57:05,440 --> 01:57:07,440
Open Shell, would be able to execute that
2391
01:57:07,440 --> 01:57:11,599
policy engine. It has a policy
2392
01:57:11,599 --> 01:57:14,719
guardrail. It has a privacy router, and
2393
01:57:14,719 --> 01:57:18,400
as a result we could protect and keep
2394
01:57:18,400 --> 01:57:22,400
the Claws executing inside our
2395
01:57:22,400 --> 01:57:25,440
company and do it safely. We also added
2396
01:57:25,440 --> 01:57:28,080
several things to the agent system and
2397
01:57:28,080 --> 01:57:29,440
one of the most important things you
2398
01:57:29,440 --> 01:57:32,480
want to do with your own
2399
01:57:32,480 --> 01:57:35,760
Claw, your custom Claws, is to
2400
01:57:35,760 --> 01:57:37,920
have your custom models. And this is
2401
01:57:37,920 --> 01:57:40,719
Nvidia's open model initiative. We are
2402
01:57:40,719 --> 01:57:44,560
now at the frontier of every single
2403
01:57:44,560 --> 01:57:47,679
domain of AI models. Whether it's
2404
01:57:47,679 --> 01:57:50,400
Nemotron; Cosmos, the world foundation
2405
01:57:50,400 --> 01:57:53,280
model; GR00T, artificial general
2406
01:57:53,280 --> 01:57:55,760
robotics, humanoid robotics models;
2407
01:57:55,760 --> 01:57:59,440
Alpamayo for autonomous vehicles; BioNeMo
2408
01:57:59,440 --> 01:58:01,920
for digital biology;
2409
01:58:01,920 --> 01:58:04,400
Earth-2 for AI physics. We are at the
2410
01:58:04,400 --> 01:58:06,400
frontier on every single one. Take a
2411
01:58:06,400 --> 01:58:09,400
look.
2412
01:58:09,840 --> 01:58:12,800
The world is diverse. No single model
2413
01:58:12,800 --> 01:58:15,280
can serve every industry.
2414
01:58:15,280 --> 01:58:17,520
Open models are one of the largest and
2415
01:58:17,520 --> 01:58:20,400
most diverse AI ecosystems in the world.
2416
01:58:20,400 --> 01:58:22,639
Nearly 3 million open models across
2417
01:58:22,639 --> 01:58:25,840
language, vision, biology, physics, and
2418
01:58:25,840 --> 01:58:28,800
autonomous systems enable AI builders for
2419
01:58:28,800 --> 01:58:31,840
specialized domains. NVIDIA is one of
2420
01:58:31,840 --> 01:58:33,840
the largest contributors to open-source
2421
01:58:33,840 --> 01:58:36,880
AI. We build and release six families of
2422
01:58:36,880 --> 01:58:39,119
open frontier models, plus the training
2423
01:58:39,119 --> 01:58:42,159
data, recipes, and frameworks to help
2424
01:58:42,159 --> 01:58:44,320
developers customize and adapt. New
2425
01:58:44,320 --> 01:58:46,560
leaderboard-topping models are launching
2426
01:58:46,560 --> 01:58:50,400
for every family. At the core, Nemotron
2427
01:58:50,400 --> 01:58:53,040
reasoning models for language, visual
2428
01:58:53,040 --> 01:58:57,400
understanding, RAG,
2429
01:58:57,760 --> 01:59:00,159
safety,
2430
01:59:00,159 --> 01:59:01,280
and speech.
2431
01:59:01,280 --> 01:59:03,520
>> Can you hear me now? Hello. Yes, I can
2432
01:59:03,520 --> 01:59:05,199
hear you now.
2433
01:59:05,199 --> 01:59:08,880
>> Cosmos Frontier models for physical AI
2434
01:59:08,880 --> 01:59:12,920
world generation and understanding.
2435
01:59:13,520 --> 01:59:16,320
Alpamayo, the world's first thinking and
2436
01:59:16,320 --> 01:59:19,920
reasoning autonomous vehicle AI
2437
01:59:19,920 --> 01:59:22,639
GR00T foundation models for general-
2438
01:59:22,639 --> 01:59:26,639
purpose robots. BioNeMo open models for
2439
01:59:26,639 --> 01:59:28,960
biology, chemistry, and molecular
2440
01:59:28,960 --> 01:59:30,800
design.
2441
01:59:30,800 --> 01:59:33,280
Earth-2 models for weather and climate
2442
01:59:33,280 --> 01:59:37,440
forecasting rooted in AI physics.
2443
01:59:37,440 --> 01:59:39,920
NVIDIA open models give researchers and
2444
01:59:39,920 --> 01:59:42,239
developers the foundation to build and
2445
01:59:42,239 --> 01:59:44,320
deploy AI for their own specialized
2446
01:59:44,320 --> 01:59:46,719
domains.
2447
01:59:46,719 --> 01:59:51,480
Our models, thank you.
2448
01:59:52,639 --> 01:59:54,719
our models are valuable to all of you
2449
01:59:54,719 --> 01:59:56,880
Because, number one, it's at the top of
2450
01:59:56,880 --> 02:00:01,199
the leaderboard. It's world class. But
2451
02:00:01,199 --> 02:00:03,679
most importantly, it's because we are
2452
02:00:03,679 --> 02:00:05,119
not going to give up working on it.
2453
02:00:05,119 --> 02:00:06,480
We're going to keep on working on it
2454
02:00:06,480 --> 02:00:08,719
every single day. Nemotron 3 is going to
2455
02:00:08,719 --> 02:00:11,199
be followed by Nemotron 4. Cosmos 1 was
2456
02:00:11,199 --> 02:00:14,400
followed by Cosmos 2. GR00T is at
2457
02:00:14,400 --> 02:00:16,320
generation 2. Each and every one of these
2458
02:00:16,320 --> 02:00:18,639
we're going to continue to advance these
2459
02:00:18,639 --> 02:00:21,920
models. Vertical integration,
2460
02:00:21,920 --> 02:00:24,000
horizontal openness, so that we can
2461
02:00:24,000 --> 02:00:27,199
enable everybody to join the AI
2462
02:00:27,199 --> 02:00:29,360
revolution. Number one on the leaderboard
2463
02:00:29,360 --> 02:00:31,840
across research and voice and world
2464
02:00:31,840 --> 02:00:34,000
models and artificial general robotics
2465
02:00:34,000 --> 02:00:37,920
and self-driving cars and reasoning and
2466
02:00:37,920 --> 02:00:40,960
of course one of the most important one.
2467
02:00:40,960 --> 02:00:44,880
This is Nemotron 3 in
2468
02:00:44,880 --> 02:00:47,679
OpenClaw. This is Nemotron 3 in Open-
2469
02:00:47,679 --> 02:00:50,560
Claw. And look at the top three. Those
2470
02:00:50,560 --> 02:00:53,360
are the three best models in the world.
2471
02:00:53,360 --> 02:00:58,280
Okay. So, we are at the frontier.
2472
02:01:00,880 --> 02:01:03,840
It is also true. It is also true that we
2473
02:01:03,840 --> 02:01:05,599
want to create the foundation model so
2474
02:01:05,599 --> 02:01:07,360
that all of you could fine-tune it and
2475
02:01:07,360 --> 02:01:10,320
post-train it into exactly the
2476
02:01:10,320 --> 02:01:12,960
intelligence you need. This is Nemotron 3
2477
02:01:12,960 --> 02:01:16,239
Ultra. It is going to be the best base
2478
02:01:16,239 --> 02:01:18,639
model the world's ever created. This
2479
02:01:18,639 --> 02:01:22,800
allows us to help every country build
2480
02:01:22,800 --> 02:01:24,960
their sovereign AI. And we're working
2481
02:01:24,960 --> 02:01:26,560
with so many different companies out
2482
02:01:26,560 --> 02:01:28,400
there. And one of the most exciting
2483
02:01:28,400 --> 02:01:29,840
things that we're doing today, I'm
2484
02:01:29,840 --> 02:01:35,440
announcing today is a Nemotron coalition.
2485
02:01:35,440 --> 02:01:37,520
We are so dedicated to this. We have
2486
02:01:37,520 --> 02:01:39,280
invested billions of dollars of AI
2487
02:01:39,280 --> 02:01:41,040
infrastructure so that we could develop
2488
02:01:41,040 --> 02:01:43,280
the core engines for AI that are necessary
2489
02:01:43,280 --> 02:01:45,360
for all the libraries of inference and
2490
02:01:45,360 --> 02:01:49,760
so on. But also to create the AI models
2491
02:01:49,760 --> 02:01:52,960
to activate every single industry in the
2492
02:01:52,960 --> 02:01:54,880
world. Large language models are really
2493
02:01:54,880 --> 02:01:56,560
important. Of course, it's important.
2494
02:01:56,560 --> 02:01:58,320
How could human intelligence
2495
02:01:58,320 --> 02:02:01,199
not be? However, in different industries
2496
02:02:01,199 --> 02:02:03,040
around the world, in different countries
2497
02:02:03,040 --> 02:02:05,040
around the world, you need to have the
2498
02:02:05,040 --> 02:02:08,080
ability to customize your own models, and
2499
02:02:08,080 --> 02:02:10,560
the domain
2500
02:02:10,560 --> 02:02:12,239
of the models is radically different
2501
02:02:12,239 --> 02:02:14,880
from biology to physics to self-driving
2502
02:02:14,880 --> 02:02:16,960
cars to general robotics to of course
2503
02:02:16,960 --> 02:02:18,800
human language. And we have the ability
2504
02:02:18,800 --> 02:02:21,119
to work with every single region to
2505
02:02:21,119 --> 02:02:23,440
create their domain-specific, their
2506
02:02:23,440 --> 02:02:26,080
sovereign AI. Today we're announcing a
2507
02:02:26,080 --> 02:02:29,199
coalition to partner with us to make
2508
02:02:29,199 --> 02:02:33,199
Neotron 4 even more amazing. And that
2509
02:02:33,199 --> 02:02:36,000
coalition has some amazing companies in
2510
02:02:36,000 --> 02:02:38,800
it. Black Forest Labs, the imaging company.
2511
02:02:38,800 --> 02:02:41,360
Cursor, the famous coding company, we use
2512
02:02:41,360 --> 02:02:45,119
lots of it. LangChain, a billion downloads,
2513
02:02:45,119 --> 02:02:48,880
for creating custom agents. Mistral, the one
2514
02:02:48,880 --> 02:02:50,560
Arthur mentioned, I think he's
2515
02:02:50,560 --> 02:02:52,800
here. Incredible, incredible company.
2516
02:02:52,800 --> 02:02:56,080
Perplexity, Perplexity's computer, absolutely
2517
02:02:56,080 --> 02:02:59,119
use it, everybody use it. It is so good.
2518
02:02:59,119 --> 02:03:03,040
A multimodal agentic system. Reflection.
2519
02:03:03,040 --> 02:03:05,840
Sarvam from India. Thinking Machines, Mira
2520
02:03:05,840 --> 02:03:08,239
Murati's lab. Incredible companies
2521
02:03:08,239 --> 02:03:12,199
joining us. Thank you.
2522
02:03:14,719 --> 02:03:18,080
I said I said that every single
2523
02:03:18,080 --> 02:03:19,679
enterprise company, every single
2524
02:03:19,679 --> 02:03:22,159
software company in the world needs an
2525
02:03:22,159 --> 02:03:25,440
agentic system, needs an agent strategy.
2526
02:03:25,440 --> 02:03:27,599
You need to have an OpenClaw strategy.
2527
02:03:27,599 --> 02:03:29,760
And they all agree
2528
02:03:29,760 --> 02:03:31,920
and they're all partnering with us to
2529
02:03:31,920 --> 02:03:36,320
integrate NeMo, the NeMo Claw reference
2530
02:03:36,320 --> 02:03:40,560
design, the NVIDIA agentic AI toolkit,
2531
02:03:40,560 --> 02:03:43,199
and of course all of our open models.
2532
02:03:43,199 --> 02:03:45,119
One company after another. There's so
2533
02:03:45,119 --> 02:03:46,800
many. And we're partnering with all of
2534
02:03:46,800 --> 02:03:49,360
you. I'm really grateful for that. And
2535
02:03:49,360 --> 02:03:51,760
this is our moment. This is a
2536
02:03:51,760 --> 02:03:53,840
reinvention. This is a
2537
02:03:53,840 --> 02:03:55,599
renaissance,
2538
02:03:55,599 --> 02:03:59,760
a renaissance of enterprise IT: from
2539
02:03:59,760 --> 02:04:03,360
what would be a $2 trillion industry.
2540
02:04:03,360 --> 02:04:05,280
This is going to become a multi-
2541
02:04:05,280 --> 02:04:07,920
trillion dollar industry offering not
2542
02:04:07,920 --> 02:04:11,199
just tools for people to use but agents
2543
02:04:11,199 --> 02:04:13,119
that are specialized in very special
2544
02:04:13,119 --> 02:04:15,440
domains that you're expert in that we
2545
02:04:15,440 --> 02:04:18,239
could rent. I could totally imagine in
2546
02:04:18,239 --> 02:04:21,199
the future every single engineer in our
2547
02:04:21,199 --> 02:04:23,520
company will need an annual token
2548
02:04:23,520 --> 02:04:25,040
budget.
2549
02:04:25,040 --> 02:04:27,360
They're going to make a few hundred thousand
2550
02:04:27,360 --> 02:04:29,840
a year as their base pay. I'm going to give
2551
02:04:29,840 --> 02:04:33,119
them probably half of that on top of it
2552
02:04:33,119 --> 02:04:35,679
as tokens so that they could be
2553
02:04:35,679 --> 02:04:39,360
amplified 10x. Of course, we would. It
2554
02:04:39,360 --> 02:04:42,480
is now one of the recruiting tools in
2555
02:04:42,480 --> 02:04:45,360
Silicon Valley: how many tokens come
2556
02:04:45,360 --> 02:04:48,320
along with my job? And the reason for
2557
02:04:48,320 --> 02:04:50,480
that is very clear because every
2558
02:04:50,480 --> 02:04:53,760
engineer that has access to tokens will
2559
02:04:53,760 --> 02:04:56,239
be more productive and those tokens as
2560
02:04:56,239 --> 02:04:58,719
you know will be produced by AI
2561
02:04:58,719 --> 02:05:01,679
factories that all of you and us we
2562
02:05:01,679 --> 02:05:04,800
partner to build. Okay. So every single
2563
02:05:04,800 --> 02:05:07,599
enterprise company today sits on top
2564
02:05:07,599 --> 02:05:10,159
of file systems and data centers. Every
2565
02:05:10,159 --> 02:05:12,880
single software company of the future
2566
02:05:12,880 --> 02:05:15,360
will be agentic and they will be token
2567
02:05:15,360 --> 02:05:17,679
manufacturers. They'll be token users
2568
02:05:17,679 --> 02:05:19,520
for their engineers and they'll be token
2569
02:05:19,520 --> 02:05:21,280
manufacturers for all of their
2570
02:05:21,280 --> 02:05:25,599
customers. The OpenClaw event, the
2571
02:05:25,599 --> 02:05:28,480
OpenClaw event, cannot be overstated.
2572
02:05:28,480 --> 02:05:31,760
This is as big of a deal as HTML. This
2573
02:05:31,760 --> 02:05:34,400
is as big of a deal as Linux. We have
2574
02:05:34,400 --> 02:05:39,280
now a world-class open agentic framework
2575
02:05:39,280 --> 02:05:42,480
that all of us could use to build our
2576
02:05:42,480 --> 02:05:45,760
open claw strategy. And we've created a
2577
02:05:45,760 --> 02:05:47,840
reference design we call NeMo
2578
02:05:47,840 --> 02:05:50,880
Claw that all of you could use, that
2579
02:05:50,880 --> 02:05:54,080
is optimized. It's performant. It is
2580
02:05:54,080 --> 02:05:57,560
safe and secure.
2581
02:06:04,000 --> 02:06:07,040
Speaking of agents, agents as you know
2582
02:06:07,040 --> 02:06:10,560
perceive, reason and act. Most of the
2583
02:06:10,560 --> 02:06:12,560
agents in the world today that I've
2584
02:06:12,560 --> 02:06:14,800
spoken about are digital agents. They
2585
02:06:14,800 --> 02:06:17,840
act in the digital world. They reason.
2586
02:06:17,840 --> 02:06:20,800
They write software. It's all digital.
2587
02:06:20,800 --> 02:06:23,760
But we also have been working on
2588
02:06:23,760 --> 02:06:25,920
physically embodied agents for a long
2589
02:06:25,920 --> 02:06:28,320
time. We call them robots. And the AIs
2590
02:06:28,320 --> 02:06:31,119
that they need are physical AIs. We have
2591
02:06:31,119 --> 02:06:33,199
some big announcements here. I'm going
2592
02:06:33,199 --> 02:06:36,400
to just walk through a few of them. 110
2593
02:06:36,400 --> 02:06:39,360
robots here. Almost every single company
2594
02:06:39,360 --> 02:06:41,840
in the world that is building robots,
2595
02:06:41,840 --> 02:06:43,440
I can't think of one that isn't, is working with
2596
02:06:43,440 --> 02:06:45,840
NVIDIA. We have three computers. The
2597
02:06:45,840 --> 02:06:47,920
training computer, the synthetic data
2598
02:06:47,920 --> 02:06:50,000
generation and simulation computer, and
2599
02:06:50,000 --> 02:06:51,920
of course the robotics computer that
2600
02:06:51,920 --> 02:06:54,239
sits inside the robot itself. We have
2601
02:06:54,239 --> 02:06:56,079
all the software stacks necessary to do
2602
02:06:56,079 --> 02:07:00,320
so, and the AI models to help you.
2603
02:07:00,320 --> 02:07:02,079
And all of this is integrated into
2604
02:07:02,079 --> 02:07:04,480
ecosystems around the world and all of
2605
02:07:04,480 --> 02:07:07,840
our partners from Siemens to Cadence,
2606
02:07:07,840 --> 02:07:10,239
incredible partners everywhere. And
2607
02:07:10,239 --> 02:07:12,079
today, we're announcing a whole bunch of
2608
02:07:12,079 --> 02:07:14,800
new partners. As you know, we've
2609
02:07:14,800 --> 02:07:16,320
been working on self-driving cars for a
2610
02:07:16,320 --> 02:07:18,000
long time. The ChatGPT moment of
2611
02:07:18,000 --> 02:07:20,400
self-driving cars has arrived. We now
2612
02:07:20,400 --> 02:07:23,040
know we could successfully autonomously
2613
02:07:23,040 --> 02:07:26,159
drive cars. And today we are announcing
2614
02:07:26,159 --> 02:07:30,880
four new partners for NVIDIA's robotaxi-
2615
02:07:30,880 --> 02:07:33,360
ready platform.
2616
02:07:33,360 --> 02:07:35,360
BYD,
2617
02:07:35,360 --> 02:07:36,880
Hyundai,
2618
02:07:36,880 --> 02:07:38,480
Nissan,
2619
02:07:38,480 --> 02:07:43,360
Ji all together, 18 million cars built
2620
02:07:43,360 --> 02:07:46,159
each year, joining our partners from
2621
02:07:46,159 --> 02:07:49,679
before: Mercedes, Toyota,
2622
02:07:49,679 --> 02:07:53,760
GM. The number of robotaxi-ready cars
2623
02:07:53,760 --> 02:07:55,520
in the future is going to be
2624
02:07:55,520 --> 02:07:58,159
incredible. And we're announcing also a
2625
02:07:58,159 --> 02:08:00,719
big partnership with Uber.
2626
02:08:00,719 --> 02:08:02,639
In multiple cities we're going to be
2627
02:08:02,639 --> 02:08:05,760
deploying and connecting these robo taxi
2628
02:08:05,760 --> 02:08:08,480
ready vehicles into their network. And
2629
02:08:08,480 --> 02:08:11,520
so a whole bunch of new cars. We have
2630
02:08:11,520 --> 02:08:15,840
ABB, Universal Robots, KUKA, so
2631
02:08:15,840 --> 02:08:17,920
many robotics companies here and we're
2632
02:08:17,920 --> 02:08:20,880
working with them to implement our
2633
02:08:20,880 --> 02:08:22,880
physical AI models integrated into
2634
02:08:22,880 --> 02:08:24,480
simulation system so that we could
2635
02:08:24,480 --> 02:08:27,119
deploy these robots into manufacturing
2636
02:08:27,119 --> 02:08:29,440
lines all over. We have Caterpillar
2637
02:08:29,440 --> 02:08:32,880
here. We even have T-Mobile here. And
2638
02:08:32,880 --> 02:08:34,960
the reason for that is, in the future,
2639
02:08:34,960 --> 02:08:37,440
that radio tower, what used to be a
2640
02:08:37,440 --> 02:08:40,000
radio tower, is going to be an NVIDIA
2641
02:08:40,000 --> 02:08:43,199
Aerial AI-RAN. And so this is going to
2642
02:08:43,199 --> 02:08:46,159
be a robotics radio tower. Meaning it
2643
02:08:46,159 --> 02:08:48,560
can reason about the traffic, figure
2644
02:08:48,560 --> 02:08:50,880
out how to adjust its beamforming so
2645
02:08:50,880 --> 02:08:52,639
that it could save as much energy as
2646
02:08:52,639 --> 02:08:55,199
possible and increase
2647
02:08:55,199 --> 02:08:57,920
fidelity as much as possible. There's so many
2648
02:08:57,920 --> 02:09:00,320
humanoid robots here, but one of my
2649
02:09:00,320 --> 02:09:04,159
favorites, one of my favorites is a
2650
02:09:04,159 --> 02:09:06,880
Disney robot. You know what?
2651
02:09:06,880 --> 02:09:08,560
Let me just show you some of the
2652
02:09:08,560 --> 02:09:13,239
videos. Let's look at that first.
2653
02:09:18,800 --> 02:09:21,119
The first global rollout of physical AI
2654
02:09:21,119 --> 02:09:25,679
at scale is here. Autonomous vehicles.
2655
02:09:25,679 --> 02:09:28,560
And with NVIDIA Alpamo, vehicles now
2656
02:09:28,560 --> 02:09:30,960
have reasoning, helping them operate
2657
02:09:30,960 --> 02:09:33,599
safely and intelligently across
2658
02:09:33,599 --> 02:09:35,599
scenarios.
2659
02:09:35,599 --> 02:09:38,400
We ask the car to narrate its actions.
2660
02:09:38,400 --> 02:09:40,000
>> I'm changing lanes to the right to
2661
02:09:40,000 --> 02:09:42,719
follow my route.
2662
02:09:42,719 --> 02:09:44,560
>> Explain its thinking as it makes
2663
02:09:44,560 --> 02:09:46,159
decisions.
2664
02:09:46,159 --> 02:09:47,760
>> There's a double parked vehicle in my
2665
02:09:47,760 --> 02:09:52,079
lane. I'm going around it.
2666
02:09:52,079 --> 02:09:53,760
>> And follow instructions.
2667
02:09:53,760 --> 02:09:57,040
>> Hey, Mercedes. Can you speed up?
2668
02:09:57,040 --> 02:10:00,920
>> Sure, I'll speed up.
2669
02:10:02,159 --> 02:10:04,960
>> This is the age of physical AI and
2670
02:10:04,960 --> 02:10:07,280
robotics.
2671
02:10:07,280 --> 02:10:09,040
Around the world, developers are
2672
02:10:09,040 --> 02:10:11,760
building robots of every kind. But the
2673
02:10:11,760 --> 02:10:14,079
real world is massively diverse,
2674
02:10:14,079 --> 02:10:17,520
unpredictable, full of edge cases. Real
2675
02:10:17,520 --> 02:10:19,440
world data will never be enough to train
2676
02:10:19,440 --> 02:10:21,520
for every scenario.
2677
02:10:21,520 --> 02:10:23,920
We need data generated from AI and
2678
02:10:23,920 --> 02:10:29,199
simulation. For robots, compute is data.
2679
02:10:29,199 --> 02:10:31,280
Developers pre-train world foundation
2680
02:10:31,280 --> 02:10:33,760
models on internet-scale video and human
2681
02:10:33,760 --> 02:10:36,320
demonstrations and evaluate the model's
2682
02:10:36,320 --> 02:10:38,239
performance to prepare them for
2683
02:10:38,239 --> 02:10:41,239
post-training.
2684
02:10:41,520 --> 02:10:44,639
Using classical and neural simulation,
2685
02:10:44,639 --> 02:10:46,320
they generate massive amounts of
2686
02:10:46,320 --> 02:10:48,960
synthetic data and train policies at
2687
02:10:48,960 --> 02:10:51,119
scale.
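The loop the narration describes, randomize the simulation, generate synthetic rollouts, and harden a policy across the variations, can be sketched in miniature. The toy physics and the single-parameter policy below are invented stand-ins, not Isaac Lab's actual training loop.

```python
# Toy sketch of domain-randomized policy training: each episode draws
# randomized "physics" (friction, disturbance), rolls out the policy,
# and nudges its one parameter uphill. The environment and update rule
# are invented stand-ins, not Isaac Lab's actual API.
import random

def rollout(friction: float, push: float, gain: float) -> float:
    # Stand-in dynamics: the robot stays upright when its corrective
    # gain roughly cancels the randomized disturbance. Reward <= 0.
    return -abs(push - gain * friction)

def train_policy(episodes: int = 2000, lr: float = 0.05, seed: int = 0) -> float:
    rng = random.Random(seed)
    gain = 0.0  # the policy's single parameter
    for _ in range(episodes):
        friction = rng.uniform(0.5, 1.5)  # domain randomization
        push = rng.uniform(0.5, 1.5)
        reward = rollout(friction, push, gain)
        # Finite-difference hill climb on the policy parameter.
        if rollout(friction, push, gain + lr) > reward:
            gain += lr
        elif rollout(friction, push, gain - lr) > reward:
            gain -= lr
    return gain

gain = train_policy()
```

Because every episode samples fresh physics, the learned parameter has to work across the whole randomized range rather than overfitting one environment, which is the point of closing the data gap with simulation.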
2688
02:10:51,119 --> 02:10:53,760
To accelerate developers, Nvidia built
2689
02:10:53,760 --> 02:10:56,159
open-source Isaac Lab for robot training
2690
02:10:56,159 --> 02:10:58,960
and evaluation in simulation.
2691
02:10:58,960 --> 02:11:01,280
Newton for extensible and GPU
2692
02:11:01,280 --> 02:11:02,880
accelerated differentiable physics
2693
02:11:02,880 --> 02:11:04,639
simulation.
2694
02:11:04,639 --> 02:11:06,560
Cosmos world models for neural
2695
02:11:06,560 --> 02:11:08,719
simulation
2696
02:11:08,719 --> 02:11:10,639
and Groot open robotics foundation
2697
02:11:10,639 --> 02:11:12,800
models for robot reasoning and action
2698
02:11:12,800 --> 02:11:15,199
generation.
2699
02:11:15,199 --> 02:11:17,840
With enough compute, developers
2700
02:11:17,840 --> 02:11:20,000
everywhere are closing the physical AI
2701
02:11:20,000 --> 02:11:22,239
data gap.
2702
02:11:22,239 --> 02:11:24,800
Paratas AI trains their operating room
2703
02:11:24,800 --> 02:11:27,760
assistant robot in NVIDIA Isaac Lab,
2704
02:11:27,760 --> 02:11:29,440
multiplying their data with NVIDIA
2705
02:11:29,440 --> 02:11:33,119
Cosmos World models. Skilled AI uses
2706
02:11:33,119 --> 02:11:35,679
Isaac Lab and Cosmos to generate
2707
02:11:35,679 --> 02:11:38,000
post-training data for their skilled AI
2708
02:11:38,000 --> 02:11:40,719
brain. They use reinforcement learning
2709
02:11:40,719 --> 02:11:43,280
to harden the model across thousands of
2710
02:11:43,280 --> 02:11:45,760
variations. Humanoid
2711
02:11:45,760 --> 02:11:48,400
uses Isaac Lab to train whole body
2712
02:11:48,400 --> 02:11:52,000
control and manipulation policies.
2713
02:11:52,000 --> 02:11:54,800
Hexagon Robotics uses Isaac Lab for
2714
02:11:54,800 --> 02:11:57,360
training and data generation.
2715
02:11:57,360 --> 02:12:00,079
Foxconn fine-tunes GR00T models in Isaac
2716
02:12:00,079 --> 02:12:01,679
Lab,
2717
02:12:01,679 --> 02:12:04,560
as does Noble Machines.
2718
02:12:04,560 --> 02:12:06,560
Disney research uses their chamino
2719
02:12:06,560 --> 02:12:09,199
physics simulator in Newton and Isaac
2720
02:12:09,199 --> 02:12:11,679
lab to train policies across their
2721
02:12:11,679 --> 02:12:16,840
character robots in every universe.
2724
02:13:24,159 --> 02:13:28,360
>> Ladies and gentlemen, Olaf!
2725
02:13:30,079 --> 02:13:34,480
>> Coming through. Newton works.
2726
02:13:34,480 --> 02:13:35,599
>> Wow.
2727
02:13:35,599 --> 02:13:38,079
>> Omniverse works.
2728
02:13:38,079 --> 02:13:39,679
Olaf,
2729
02:13:39,679 --> 02:13:41,119
how are you?
2730
02:13:41,119 --> 02:13:44,079
>> I'm so happy now that I'm meeting you.
2731
02:13:44,079 --> 02:13:47,360
>> I know because I gave you your computer,
2732
02:13:47,360 --> 02:13:48,239
Jetson.
2733
02:13:48,239 --> 02:13:50,000
>> What's that?
2734
02:13:50,000 --> 02:13:53,760
Well, it's in your tummy.
2735
02:13:53,760 --> 02:13:55,520
>> That's going to be amazing.
2736
02:13:55,520 --> 02:13:59,040
>> And you learn how to walk inside
2737
02:13:59,040 --> 02:14:00,639
Omniverse.
2738
02:14:00,639 --> 02:14:04,079
>> I love walking. This is so much better
2739
02:14:04,079 --> 02:14:06,400
than riding on a reindeer gazing up at a
2740
02:14:06,400 --> 02:14:09,199
beautiful sky.
2741
02:14:09,199 --> 02:14:10,880
And
2742
02:14:10,880 --> 02:14:13,520
it was because of physics using this
2743
02:14:13,520 --> 02:14:17,040
Newton solver that runs on top of Nvidia
2744
02:14:17,040 --> 02:14:19,679
Warp that we jointly developed with
2745
02:14:19,679 --> 02:14:22,560
Disney and with DeepMind that made it
2746
02:14:22,560 --> 02:14:25,360
possible for you to be able to adapt to
2747
02:14:25,360 --> 02:14:27,440
the physical world. Check that out.
2748
02:14:27,440 --> 02:14:30,159
>> Not to say that
2749
02:14:30,159 --> 02:14:32,400
that's how smart you are.
2750
02:14:32,400 --> 02:14:37,560
>> I'm a snowman, not a snowed.
2751
02:14:38,800 --> 02:14:41,360
Could you imagine this? The future of
2752
02:14:41,360 --> 02:14:44,800
Disneyland. All these robots,
2753
02:14:44,800 --> 02:14:47,040
all these characters wandering around.
2754
02:14:47,040 --> 02:14:47,520
>> Oh,
2755
02:14:47,520 --> 02:14:49,040
>> you know, I have to admit though, I
2756
02:14:49,040 --> 02:14:51,920
thought you were going to be taller.
2757
02:14:51,920 --> 02:14:54,480
I've never seen such a short snowman, to
2758
02:14:54,480 --> 02:14:55,760
be honest.
2759
02:14:55,760 --> 02:14:57,679
>> Nope.
2760
02:14:57,679 --> 02:15:00,079
>> Hey, tell you what. You want to help me
2761
02:15:00,079 --> 02:15:00,880
out?
2762
02:15:00,880 --> 02:15:02,320
>> Hooray.
2763
02:15:02,320 --> 02:15:05,760
>> Okay. Usually I close the
2764
02:15:05,760 --> 02:15:08,079
keynote by telling you what I told
2765
02:15:08,079 --> 02:15:10,159
you. We talked about inference and the
2766
02:15:10,159 --> 02:15:11,920
inflection point. We talked about the AI
2767
02:15:11,920 --> 02:15:14,560
factory. We talked about the OpenClaw
2768
02:15:14,560 --> 02:15:16,719
agent revolution that's happening. And
2769
02:15:16,719 --> 02:15:19,280
of course we talked about physical AI
2770
02:15:19,280 --> 02:15:21,679
and robotics. But tell you what, why
2771
02:15:21,679 --> 02:15:23,599
don't we get some friends to help us
2772
02:15:23,599 --> 02:15:24,719
close it out?
2773
02:15:24,719 --> 02:15:25,920
>> Of course.
2774
02:15:25,920 --> 02:15:28,159
>> All right, play it.
2775
02:15:28,159 --> 02:15:29,599
Come on.
2776
02:15:29,599 --> 02:15:33,239
>> Terminating simulation.
2777
02:15:38,400 --> 02:15:41,400
Hello.
2778
02:15:43,119 --> 02:15:46,599
Anybody here?
2779
02:16:09,360 --> 02:16:12,320
The keynote's over, all was said. Jensen
2780
02:16:12,320 --> 02:16:15,360
mapped the road ahead. AI factories coming
2781
02:16:15,360 --> 02:16:18,320
alive. Agents learning how to drive. From
2782
02:16:18,320 --> 02:16:21,040
open models to robots too. Now we'll
2783
02:16:21,040 --> 02:16:24,599
break it all down.
2784
02:16:26,719 --> 02:16:29,920
Compute exploded, what we saw: from CNNs
2785
02:16:29,920 --> 02:16:32,880
to OpenClaw. Agents working across the
2786
02:16:32,880 --> 02:16:34,639
land, but they need the power to meet
2787
02:16:34,639 --> 02:16:37,200
demand. We solved the problem, it was
2788
02:16:37,200 --> 02:16:40,960
brilliant: we multiplied compute by 40
2789
02:16:40,960 --> 02:16:43,960
million.
2790
02:16:51,519 --> 02:16:55,120
But once upon AI time, training was the
2791
02:16:55,120 --> 02:16:58,479
paradigm. Sure, it taught models how.
2792
02:16:58,479 --> 02:17:01,519
Inference runs the whole world now, shows us
2793
02:17:01,519 --> 02:17:04,800
who's the boss. At 35 times less the cost,
2794
02:17:04,800 --> 02:17:08,319
Blackwell makes the tokens sing. NVIDIA,
2795
02:17:08,319 --> 02:17:12,359
the inference king.
2796
02:17:13,040 --> 02:17:15,679
Yeah, our factories once took years, as
2797
02:17:15,679 --> 02:17:18,160
vendors pulled racks and gears. Built
2798
02:17:18,160 --> 02:17:20,800
up slowly, piece by piece. No clear way
2799
02:17:20,800 --> 02:17:24,319
to scale this beast. DSX and Dynamo know
2800
02:17:24,319 --> 02:17:26,639
what to do.
2801
02:17:26,639 --> 02:17:29,920
Turning power
2802
02:17:29,920 --> 02:17:33,479
into revenue.
2803
02:17:35,359 --> 02:17:38,399
Agents used to wait and see, now act
2804
02:17:38,399 --> 02:17:41,120
autonomously. But if they ever try to
2805
02:17:41,120 --> 02:17:43,920
stray, safeguards block and say no way.
2806
02:17:43,920 --> 02:17:47,519
NeMo Guardrails there to guard the course.
2807
02:17:47,519 --> 02:17:51,359
And yes, my friends,
2808
02:17:51,359 --> 02:17:55,719
it's open source.
2809
02:18:01,359 --> 02:18:03,359
Cars that think and droids that run.
2810
02:18:03,359 --> 02:18:05,840
This ain't the movies. It's all begun.
2811
02:18:05,840 --> 02:18:09,200
Alpamayo calls the shots. It's a GPT moment
2812
02:18:09,200 --> 02:18:12,240
for the bots. From sim to streets, now watch
2813
02:18:12,240 --> 02:18:17,679
them drive. Throw your hands up
2814
02:18:17,679 --> 02:18:22,599
for physical AI.
2815
02:18:31,840 --> 02:18:33,920
Industrial age built what came before.
2816
02:18:33,920 --> 02:18:36,479
Now we build for AI even more. Vera
2817
02:18:36,479 --> 02:18:38,080
Rubin plus Grok make the inference
2818
02:18:38,080 --> 02:18:39,840
splash. Put them together, now it's
2819
02:18:39,840 --> 02:18:42,000
raining cash. We build new architecture
2820
02:18:42,000 --> 02:18:44,080
every year, cuz Claws keep yelling "more
2821
02:18:44,080 --> 02:18:46,719
tokens" here. The AI stack's for all to
2822
02:18:46,719 --> 02:18:49,359
make, so let us all eat five-layer cake.
2823
02:18:49,359 --> 02:18:51,439
The moment's bright, the path is clear,
2824
02:18:51,439 --> 02:18:54,160
cuz open models led us here. When data's
2825
02:18:54,160 --> 02:18:56,160
missing, there's no dispute: we just
2826
02:18:56,160 --> 02:18:58,800
generate more with compute. Robots
2827
02:18:58,800 --> 02:19:01,439
learning without flaw, fueling the four
2828
02:19:01,439 --> 02:19:03,599
scaling laws. The future's here, won't you
2829
02:19:03,599 --> 02:19:09,319
come and see. Welcome, welcome all to GTC.
2830
02:19:22,319 --> 02:19:26,439
All right, have a great GTC.
2831
02:19:27,599 --> 02:19:29,679
Wave!
2832
02:19:29,679 --> 02:19:33,080
Thank you everybody.
2833
02:19:34,399 --> 02:19:37,800
I just met