NVIDIA’S HUGE AI Breakthroughs Just Changed Everything (Supercut)
“In five years, we improved computer graphics by 1,000 times using artificial intelligence and accelerated computing. Moore’s Law, over the same five years, is probably running at about two times. A thousand times in five years, a thousand times in five years, is one million times in ten. We’re doing the same thing in artificial intelligence. Now the question is, what can you do when your computer is one million times faster?
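As a quick sanity check of that compounding claim, here is the arithmetic in a few lines of Python. The 1,000x and 2x five-year rates are the figures quoted above, not measurements of ours:

```python
# Compounding the five-year rates quoted above over a decade.
accelerated_5yr = 1_000   # claimed speedup per five years
moores_law_5yr = 2        # Moore's Law pace cited in the talk

print(accelerated_5yr ** 2)  # 1000000: one million times in ten years
print(moores_law_5yr ** 2)   # 4: the cited Moore's Law pace over the same decade
```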
AI for Ray Tracing
I have a lot to tell you and very little time, so let’s get going. Ray tracing, simulating the characteristics of light and materials, is the ultimate accelerated computing challenge. Six years ago, we demonstrated rendering this scene for the very first time; it took a couple of hours. Let’s take a look at the difference just five years makes. Roll it.
This is running on CUDA GPUs. Six years ago, rendering this beautiful image would have taken a couple of hours on a CPU. So this was a giant breakthrough, already an enormous speedup from accelerated computing. And then we invented the RTX GPU. Run it, please.
The Holy Grail of computer graphics, ray tracing, is now possible in real time. AI made it possible for us to do that. Everything that you saw would have been utterly impossible without AI. For every single pixel we render, we use AI to predict seven others; for every pixel we compute, AI predicts seven more. The amount of energy we save and the amount of performance we get are incredible. We used AI to render the scene. We’re going to also use AI to bring it alive.
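To make the “render one pixel, predict seven” arithmetic concrete, here is a minimal sketch. It assumes a DLSS-style scheme that upscales 2x in each spatial dimension and generates every other frame; it illustrates the ratio only, not NVIDIA’s actual algorithm:

```python
# Toy illustration of "render 1 pixel, AI predicts 7 others".
# Rendering at half resolution computes 1/4 of the pixels, and generating
# every other frame halves that again, so only 1 in 8 pixels is fully rendered.

def rendered_fraction(scale: float, generated_frames: int) -> float:
    """Fraction of output pixels actually rendered; the rest are predicted."""
    spatial = scale ** 2                    # 0.5 scale -> 1/4 of the pixels
    temporal = 1 / (1 + generated_frames)   # 1 generated frame -> half the frames
    return spatial * temporal

print(rendered_fraction(scale=0.5, generated_frames=1))  # 0.125 -> 1 of every 8 pixels
```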
Generative AI for Avatars
Today, we’re announcing Nvidia ACE, the Avatar Cloud Engine, which is designed to bring a digital avatar to life through animation. It has several capabilities: speech recognition, text-to-speech, and natural language understanding, basically a large language model. Using the sound you generate with your voice, it animates the face, and using that sound and the expression of what you’re saying, it animates your gestures. All of this is completely trained by AI, and it is completely rendered with ray tracing. Everything is real-time.
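Here is a minimal sketch of that avatar loop in Python. Every stage is a hypothetical stub standing in for a real model (this is not the actual ACE API), but it shows how the four capabilities chain together:

```python
# Hypothetical stubs standing in for real models; NOT the NVIDIA ACE API.

def speech_to_text(audio: bytes) -> str:
    return "Hey Jin, how are you?"          # stub: a real ASR model goes here

def llm_reply(backstory: str, user_text: str) -> str:
    return "Unfortunately, not so good."    # stub: a real LLM, conditioned on the backstory

def text_to_speech(text: str) -> bytes:
    return text.encode()                    # stub: a real TTS model goes here

def audio_to_face(audio: bytes) -> list:
    return [0.0] * 52                       # stub: facial blendshape weights driven by sound

def avatar_turn(player_audio: bytes, backstory: str):
    text = speech_to_text(player_audio)     # 1. speech recognition
    reply = llm_reply(backstory, text)      # 2. language model chooses the reply
    audio = text_to_speech(reply)           # 3. synthesize the character's voice
    face = audio_to_face(audio)             # 4. animate the face from the sound
    return audio, face
```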
“Hey, Jin, how are you?” “Unfortunately, not so good.” “How come?” “I’m worried about the crime around here; it’s gotten bad lately. My ramen shop got caught in the crossfire.” “Can I help?” “If you want to do something about this, I have heard rumors that the powerful crime lord Kumon Aoki is causing all sorts of chaos in the city. He may be the root of this violence.” “I’ll talk to him. Where can I find him?” “I have heard he hangs out in the underground fight clubs on the city’s east side. Try there.” “Okay, I’ll go.” “Be careful.”
None of that conversation was scripted. We gave that AI character, Jin, a backstory: the story of his ramen shop and the story of this game. All you have to do is go up and talk to this character, and because this character has been infused with artificial intelligence and large language models, it can understand your meaning and interact with you in a reasonable way. All of the facial animation was done entirely by the AI. We have made it possible for all kinds of characters to be generated; they have their own domain knowledge. You can customize it, so everybody’s game is different, and look how wonderfully beautiful and natural they are. This is the future of video games.
This compute for generative AI is the new computer industry. Software is no longer programmed just by computer engineers; software is programmed by computer engineers working with AI supercomputers. These AI supercomputers are a new type of factory. It is very logical that the car industry has factories; they build things you can see: cars. It is very logical that the computer industry has computer factories; they build things you can see: computers. In the future, every single major company will also have AI factories, and you will build and produce your company’s intelligence. We are intelligence producers already; it’s just that the intelligence producers are people. In the future, we will also be artificial intelligence producers, and every single company will have factories built this way, and intelligence will be their throughput. Just now, in the beginning, I showed you computer graphics. It turns out that the friends we met at the University of Toronto, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, discovered the continuous scaling of deep learning networks, which led to the ChatGPT breakthrough.
The breakthrough, of course, is very, very clear, and I’m sure that everybody here has already tried ChatGPT. But the important thing is this: we now have a software capability to learn the structure of almost any information. We can learn the structure of text, sound, and images; there’s structure in all of this. Physics, proteins, DNA, chemicals, anything that has structure, we can learn that language. And then the next breakthrough came: generative AI. Once you can learn the language of certain information, then with control and guidance from another source of information that we call prompts, we can guide the AI to generate information of all kinds. We can generate text to text and text to image, but the important thing is this: transforming one kind of information into another is now possible. Text to proteins, text to chemicals, images to 3D, video to video. So many different types of information can now be transformed for the very first time in history. We can now apply the instrument of our industry to so many different fields that were impossible before. This is the reason why everybody is so excited.
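As one concrete, open-source instance of the prompt-guided text-to-image transformation described above (not the tooling used on stage), Hugging Face’s diffusers library turns a prompt into an image in a few lines:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load an open text-to-image model and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The prompt is the "control and guidance" that steers generation.
image = pipe("a night market food stall at dusk, photorealistic").images[0]
image.save("night_market.png")
```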
Newest Generative AI Examples
Let’s take a look at what it can do. Here’s a prompt, and this prompt says, “Hi, Computex, I’m here to tell you how wonderful stinky tofu is. You can enjoy it right here in Taiwan; it’s best from the night market. I was just there the other night.” Okay, play it.
“Hi, Computex, I’m here to tell you about how wonderful stinky tofu is. You can enjoy it right here in Taiwan; it’s best from the Night Market.”
The only input was words; the output was that video. Okay, here’s another prompt: “I am here at Computex; I will make you like me best. Sing it with me: I really like Nvidia.” Okay, so these are the words, and I say, “Hey, voice mod, could you write me a song?” These are the words. Okay, play it.
“I am here at Computex…”
[Music]
[Applause]
Okay, so obviously this is a very, very important new capability, and that’s the reason why there are so many generative AI startups. We’re working with some 1,600 generative AI startups; it’s just utterly incredible. There’s no question that we’re in a new computing era. This computing era does not need new applications; it can succeed with old applications, and it’s going to have new applications as well. The rate of progress, because AI is so easy to use, is the reason why it’s growing so fast.
Generative AI for Communications
1964, the year after I was born, was a very good year for technology. IBM, of course, launched the System/360, and AT&T demonstrated to the world their first Picturephone: video encoded, compressed, and streamed over copper twisted-pair telephone wires, then decoded on the other end onto a tiny black-and-white screen. To this day, this very experience is largely the same, of course at much, much higher volumes, for all the reasons we all know. Video calls are now one of the most important things we do; everybody does it. About 65 percent of the internet’s traffic is now video. And yet, the way it’s done is fundamentally still the same: compress it on the device, stream it, and decompress it on the other end. Nothing has changed in 60 years.
We treat communications like it goes down a dumb pipe. The question is, what would happen if we applied generative AI to that? Let’s take a look.
The future of wireless and video communications will be 3D, generated by AI. Let’s take a look at how Nvidia Maxine 3D, running on the Nvidia Grace Hopper Superchip, can enable 3D video conferencing on any device without specialized software or hardware. Starting with a standard 2D camera sensor that’s in most cell phones, laptops, and webcams, and tapping into the processing power of Grace Hopper, Maxine 3D converts these 2D videos to 3D using cloud services. This brings a new dimension to video conferencing, with Maxine 3D visualization creating an enhanced sense of depth and presence. You can dynamically adjust the camera to see every angle in motion, engage with others more directly with enhanced eye contact, and personalize your experience with animated avatars, stylizing them with simple text prompts. With Maxine’s language capabilities, your avatar can speak in other languages, even ones you don’t know.
[The avatar speaks in another language.] Nvidia Maxine 3D, together with Grace Hopper, brings immersive 3D video conferencing to anyone with a mobile device, revolutionizing the way we connect, communicate, and collaborate.
All of the words coming out of my mouth were generated by AI. Instead of compressed, streamed, and decompressed, in the future communications will be perceived, streamed, and regenerated, and they can be regenerated in all kinds of different ways. It can be generated in 3D, of course. It can regenerate your speech in another language. So, we now have a universal translator.
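A back-of-the-envelope comparison shows why regenerating beats compressing. The numbers below are our illustrative assumptions, not Maxine’s: suppose perception reduces a face to about 100 keypoints per frame, which the receiver’s generative model redraws.

```python
# Bytes per frame under each approach (illustrative assumptions only).
raw_1080p = 1920 * 1080 * 3     # raw RGB pixels in a 1080p frame
compressed = raw_1080p // 100   # assume a ~100:1 video codec
keypoints = 100 * 2 * 4         # assume 100 (x, y) float32 keypoints

print(compressed)  # ~62,000 bytes: compress / stream / decompress
print(keypoints)   # 800 bytes: perceive / stream / regenerate
```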
The Benefits of Accelerated Computing
I just talked about how we are going to extend the frontier of AI, but there are so many different applications in so many different areas: scientific computing, data processing, large language model training, the generative AI inference we just talked about, cloud, video, and graphics. So let’s take a look at the benefit for one of them. This is a simple image processing application. If you run it on a GPU with Nvidia AI Enterprise instead of on a CPU, you get 31.8 images per minute, basically 24 times the throughput, and you pay only about 5 percent of the cost per image. This is really quite amazing.
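The throughput and cost figures are consistent with each other: assuming roughly equal instance pricing (our assumption, not stated in the talk), 24 times the throughput means roughly one twenty-fourth of the cost per image.

```python
speedup = 24                 # GPU throughput relative to CPU
cost_ratio = 1 / speedup     # cost per image at equal instance pricing
print(f"{cost_ratio:.1%}")   # ~4.2%, in line with the "5 percent" figure
```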
Generative AI for Digital Twins
Now let me talk to you about the next phase of AI, where AI meets a digital twin. Why does AI need a digital twin? Let me give you just a simple example. In the future, you will say to your robot, “I would like you to do something,” and the robot will understand your words and generate animation. Remember, I said earlier that you can go from text to text, from text to image, from text to music. Why can’t you go from text to animation? In the future, robotics will be highly revolutionized by the technology we already have in front of us. However, how does this robot know that the motion it is generating is grounded in reality, grounded in physics? You need a software system that understands the laws of physics, so the AI uses Nvidia Omniverse in a reinforcement learning loop to ground itself. You have actually seen something similar already with ChatGPT, which used reinforcement learning from human feedback. Using human feedback, ChatGPT was grounded in and aligned with our principles. So, just as reinforcement learning with human feedback is really important, reinforcement learning with physics feedback is very important. Let me show you. Everything you’re about to see is a simulation. Let’s roll it, please.
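To make “reinforcement learning with physics feedback” concrete, here is a minimal loop using the open-source Gymnasium cartpole simulator as a stand-in for Omniverse/Isaac Sim. It takes random actions instead of learning a policy, but it shows where the physics engine supplies the grounding signal:

```python
import gymnasium as gym

# A physics simulator in the loop: each step returns physically consistent
# observations and a reward, which is what grounds a learned policy.
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()      # a trained policy would act here
    obs, reward, terminated, truncated, info = env.step(action)  # physics feedback
    total_reward += reward
    if terminated or truncated:
        obs, info = env.reset()
print(total_reward)
```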
[Music]
That was a simulation. Nothing was art; everything was a simulation. Now I’m going to show you, very quickly, Omniverse in the cloud. Let’s take a look at Omniverse Cloud. This is just a web browser, and we’re looking into Omniverse Factory Explorer. It’s running 10,000 kilometers away in our Santa Clara headquarters, and we’re leveraging the power of our data center to visualize this factory floor, using real factory data from Siemens and Autodesk Revit. It’s a cloud application, so we can have multiple users collaborating. Let’s go ahead and bring up Eloise Green; we can see that we have these two users in this environment, and Jeff, on the left there, is going to look at some markup. We have a task to perform; we need to move this object, so Eloise can just go ahead and grab that conveyor belt and move it over, and as that happens, you’ll see that it’s reflected accurately and completely in real time on Jeff’s screen. So, we’re able to collaborate with multiple users, and even in bringing up this demo, we had users from around the globe working on the process: the East and West Coasts of the United States, Germany, even Sydney, and of course, here in Taipei. Now, as we modify our production line, one of the things we want to do is add the necessary safety equipment. We’re able to simply drag and drop items into Omniverse, modify this production environment, and begin tweaking and optimizing it for performance even before we break ground on construction.
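For a sense of what that conveyor-belt edit looks like at the data level, here is a minimal sketch using the open-source OpenUSD Python API (Omniverse scenes are USD). The file and prim paths are hypothetical:

```python
from pxr import Usd, UsdGeom, Gf

# Open the shared factory stage and grab the conveyor belt prim.
stage = Usd.Stage.Open("factory.usd")                 # hypothetical stage file
prim = stage.GetPrimAtPath("/Factory/ConveyorBelt")   # hypothetical prim path

# Move it; in a live-synced session, collaborators see the change immediately.
xform = UsdGeom.Xformable(prim)
xform.AddTranslateOp().Set(Gf.Vec3d(2.0, 0.0, 0.0))   # shift 2 units along X
stage.GetRootLayer().Save()
```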
That is so cool. This is in California, 6,264 miles away or something like that, 34 milliseconds away at the speed of light one way, and it’s completely interactive. Everything is ray-traced. No art is necessary. You bring everything, the entire CAD model, into Omniverse: open up a browser, bring your data in, bring your factory in. No art is necessary; the lighting just does what lighting does, and physics does what physics does. If you want to turn physics off, you can; if you want to turn it on, you can. And multiple users, as many as you like, can enter the Omniverse at the same time and work together, with one unified source of data across your entire company.

Just now, it was humans interacting with Omniverse. In the future, we’ll also have generative AIs interacting with humans in Omniverse. Imagine Jin, from the very beginning; a character like that could be one of the users of Omniverse, interacting with you, answering questions, helping you. We can also use generative AI to help us create virtual worlds. For example, this is a plastic bottle rendered in Omniverse. It could be placed in a whole bunch of different types of environments, rendered beautifully and physically. You could place it just by giving a prompt: “I would like to put these bottles in a lifestyle-photograph-style backdrop of a modern, warm farmhouse bathroom.” Change the background, and everything is integrated and rendered again. Okay, so generative AI will come together with Omniverse to assist the creation of virtual worlds. In the future, whenever you engage a particular ad, it could be generated just for you. Today, in the current computing model, when you engage information, it is retrieved. In the future, when you engage information, much of it will be generated. Notice the computing model has changed.

WPP generates 25 percent of the ads that the world sees. Sixty percent of the world’s largest companies are already clients, and so they made a video of how they would use this technology: The world’s industries are racing to realize the benefits of AI. Nvidia and WPP are building a groundbreaking generative-AI-enabled content engine for the next evolution of the 700-billion-dollar digital advertising industry, built on Nvidia AI and Omniverse. This engine gives brands the ability to build and deploy highly personalized and compelling visual content faster and more efficiently than ever before. The process starts by building a physically accurate digital twin of a product using Omniverse Cloud, which connects product design data from industry-standard tools. Then WPP artists create customized and diverse virtual sets using a combination of digitized environments and generative AI tools, from organizations such as Getty Images and Adobe, trained on fully licensed data using Nvidia Picasso. This unique combination of technologies allows WPP to build accurate, photorealistic visual content and e-commerce experiences that bring new levels of realism and scale to the industry. It’s completely digital. Imagine if you have digital information in your hands, what can you do with it? Almost everything.
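The “34 milliseconds at the speed of light one way” figure quoted above checks out:

```python
distance_km = 6264 * 1.609   # 6,264 miles -> roughly 10,080 km
c_km_per_s = 299_792         # speed of light in vacuum
print(distance_km / c_km_per_s * 1000)  # ~33.6 ms one way
```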
Generative AI for Robotics
What you just saw means that basically every factory in the future will be digital. Each factory will itself be a robot, and inside the factories there will be other robots that the factory is orchestrating. We are also going to build robots that move themselves. So far, the robots that you saw are stationary; now we’re also going to have robots that move. Everything that moves in the future will have artificial intelligence and robotic capability. We built the entire robotic stack top to bottom, from the chip to the algorithms. We have state-of-the-art perception for multimodal sensors, state-of-the-art mapping, state-of-the-art localization and planning, and a cloud mapping system. Everything has been created, and however you would like to use it, you can use pieces of it. It’s open and available to you, including all the cloud mapping systems. So, this is Isaac AMR. It includes a chip called Orin, which goes into a computer, which goes into Nvidia Nova Orin, a reference system, a blueprint for AMRs. This is the most advanced AMR in the world today, and in simulation it thinks it is in the real environment; it cannot tell the difference, because all the sensors work, physics works, it can navigate, and it can localize itself; everything is physically based. Therefore, we can design the robot, simulate the robot, and train the robot all in Isaac Sim, and then we take the brain, the software, and put it into the actual robot, and with some amount of adaptation, it should be able to perform the same job. This is the future: robotics, Omniverse, and AI working together. The excitement in the hard industries, the heavy industries, has been incredible. We have been connecting Omniverse all over the world with tools companies, robotics companies, sensor companies, all kinds of industries. There are three industries right now, as we speak, that are pouring enormous investments into the world: number one, of course, is the chip industry; number two, the electric battery industry; number three, the electric vehicle industry. Trillions of dollars will be invested in the next several years, and they would all like to do it better, in a modern way. For the very first time, we now give them a system, a platform, tools that allow them to do that. I want to thank all of you for your partnership over the years. Thank you. Thanks.
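Here is a schematic of the perception, localization, and planning loop described above, with hypothetical stubs in place of the real models. This is not the Isaac AMR API, but it illustrates why the same brain can run unchanged in Isaac Sim and on the physical robot:

```python
# Hypothetical stubs standing in for real models; NOT the Isaac AMR API.

def perceive(sensor_data: dict) -> list:
    return sensor_data.get("obstacles", [])            # stub: multimodal perception

def localize(sensor_data: dict, map_data: dict) -> tuple:
    return map_data.get("last_pose", (0.0, 0.0, 0.0))  # stub: pose on the cloud map

def plan(pose: tuple, goal: tuple, obstacles: list) -> list:
    return [pose, goal]                                # stub: collision-free waypoints

def control_step(sensor_data: dict, map_data: dict, goal: tuple) -> list:
    obstacles = perceive(sensor_data)        # perception from the sensors
    pose = localize(sensor_data, map_data)   # mapping and localization
    return plan(pose, goal, obstacles)       # planning toward the goal

# Whether sensor_data comes from simulated or physical sensors, the loop is
# identical, which is what makes sim-to-real transfer possible.
print(control_step({"obstacles": []}, {"last_pose": (0, 0, 0)}, (5.0, 3.0, 0.0)))
```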