NVIDIA to showcase 20 groundbreaking research papers at the influential SIGGRAPH computer graphics conference
Today, NVIDIA unveiled a wave of pioneering AI research aimed at helping developers and artists bring their visions to life, whether those visions are still or moving, 2D or 3D, hyper-realistic or abstract.
A total of 20 NVIDIA research papers advancing generative AI and neural graphics will be presented at SIGGRAPH 2023, the premier computer graphics conference, taking place in Los Angeles from August 6-10. The work includes collaborations with more than a dozen universities in the U.S., Europe, and Israel.
The research papers include generative AI models that turn text into personalized images, inverse rendering tools that transform still ideas into 3D objects, neural physics models that use AI to simulate complex 3D elements with stunning realism, and neural rendering models that unlock new capabilities for generating real-time, AI-powered visual details.
NVIDIA regularly shares its innovations with developers on GitHub and incorporates them into products, including the NVIDIA Omniverse platform for building and operating metaverse applications and NVIDIA Picasso, a recently announced foundry for custom generative AI models for visual design. Years of NVIDIA graphics research helped bring film-style rendering to games, exemplified by the highly acclaimed Cyberpunk 2077 Ray Tracing: Overdrive Mode, the world's first AAA title to use path tracing.
The research advancements presented at this year's SIGGRAPH will not only assist developers and businesses in rapidly generating synthetic data to populate virtual worlds for robotics and autonomous vehicle training but also empower creators in art, architecture, graphic design, game development, and film to accelerate the creation of top-notch visuals for storyboarding, previsualization, and production.
Personalized AI: Tailored Text-to-Image Generative Models
Generative AI models that transform text into images are powerful tools for creating concept art or storyboards for films, video games, and 3D virtual worlds. These text-to-image tools can turn a simple prompt like "children's toys" into a wealth of visuals, from stuffed animals to blocks and puzzles, giving creators a ready source of inspiration.
However, artists often have a specific subject in mind. A creative director for a toy brand, for example, might be planning an ad campaign around a new teddy bear and want to visualize the toy in different scenarios, such as a teddy bear tea party. To enable this level of specificity in the output of a generative AI model, researchers from Tel Aviv University and NVIDIA are presenting two SIGGRAPH papers that let users provide example images the model quickly learns from.
One of these papers describes a technique that needs just a single example image to customize the model's output. It drastically speeds up the personalization process, cutting it to roughly 11 seconds on a single NVIDIA A100 Tensor Core GPU, a 60x speedup over previous personalization methods.
The second paper introduces Perfusion, a highly compact model that takes a handful of concept images and lets users combine multiple personalized elements, such as a specific teddy bear and a teapot, into a single AI-generated visual.
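The core idea behind this kind of personalization can be caricatured in a few lines: freeze a pretrained generator and fit only a new concept embedding until it reproduces the user's example image. The snippet below is a deliberately minimal stand-in, with a linear "generator" and a closed-form fit; it is not NVIDIA's actual model, just a sketch of the principle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained text-to-image generator:
# a fixed linear map G from a 16-d concept embedding to a 64-d "image".
G = rng.normal(size=(64, 16))

def generate(embedding):
    """Render a toy 'image' from a concept embedding."""
    return G @ embedding

# One example image of the user's specific subject (e.g. their teddy bear).
true_embedding = rng.normal(size=16)
example_image = generate(true_embedding)

# Personalization: keep the generator frozen and fit only a new concept
# embedding so the model reproduces the single example (least squares).
learned_embedding, *_ = np.linalg.lstsq(G, example_image, rcond=None)

# The learned concept now reproduces the subject.
error = np.linalg.norm(generate(learned_embedding) - example_image)
print(f"reconstruction error: {error:.6f}")
```

Real systems optimize embeddings (and sometimes a few model weights) by gradient descent inside a diffusion model rather than solving a linear system, but the division of labor is the same: the generator stays fixed, and only a tiny concept representation is learned from the examples.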
Sculpting in 3D: The Leap Forward in Inverse Rendering and Character Creation
Once creators have concept art for a virtual world, the next step is to render the environment and populate it with 3D objects and characters. NVIDIA Research is developing AI techniques to accelerate this otherwise time-consuming process by automatically converting 2D images and videos into 3D representations that creators can import into graphics applications for further editing.
A third paper, a collaboration with the University of California, San Diego, introduces technology that can generate and render a photorealistic 3D head-and-shoulders model from a single 2D portrait. The technique runs in real time on a consumer desktop and can produce a realistic or stylized 3D likeness from an ordinary webcam or smartphone photo, making AI-driven 3D avatar creation and 3D video conferencing broadly accessible.
A fourth paper, from a collaboration with Stanford University, brings lifelike motion to 3D characters. The researchers built an AI system that learns a range of tennis skills from 2D video of real matches and applies that motion to 3D characters. The simulated players can hit the ball to target spots on the virtual court and even play extended rallies with other characters.
Beyond the specific instance of tennis, this SIGGRAPH paper tackles the challenging task of creating 3D characters capable of executing a wide array of skills with realistic movement. Remarkably, this is accomplished without the need for costly motion-capture data.
Precision Down to the Last Strand: Leveraging Neural Physics for Lifelike Simulations
Once a 3D character is built, artists layer in realistic details such as hair, a feature that is both visually intricate and computationally expensive to animate.
Humans have an average of around 100,000 hairs on their heads, each responding dynamically to the individual's motion and the surrounding environment. Historically, creators have used simplified physics formulas to approximate hair movement, trading accuracy for whatever compute is available. That is why hair on virtual characters in big-budget films looks far more detailed than hair on real-time video game avatars.
The fifth paper introduces a method that uses neural physics, an AI technique in which a neural network is trained to predict how objects move in the real world, to simulate tens of thousands of hairs in high resolution and in real time. Optimized specifically for modern GPUs, the approach delivers a significant performance boost over state-of-the-art CPU-based solvers, cutting simulation times from days to hours while also raising the quality of real-time hair simulations. It finally enables hair grooming that is both physically accurate and interactive.
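As a rough illustration of the neural-physics idea, one can train a model on input-to-output pairs produced by a conventional solver and then step the cheap learned model instead of the expensive one. The sketch below uses a single damped-spring point and a linear least-squares fit as stand-ins for real strand geometry and a deep network; everything here is an illustrative assumption, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy ground-truth solver: one strand point as a damped spring toward rest.
# The state is (position, velocity); this stands in for an expensive
# CPU-based physics solver.
def solver_step(state, k=20.0, damping=1.5, dt=0.01):
    pos, vel = state
    acc = -k * pos - damping * vel
    return np.array([pos + vel * dt, vel + acc * dt])

# Generate training pairs (state -> next state) from the reference solver.
states = rng.normal(size=(500, 2))
targets = np.array([solver_step(s) for s in states])

# "Neural physics" stand-in: fit a model predicting the next state.
# (A real system trains a deep network on full strand geometry.)
W, *_ = np.linalg.lstsq(states, targets, rcond=None)

def learned_step(state):
    return state @ W

# Roll out 100 steps and compare the learned model against the solver.
s_true = s_pred = np.array([1.0, 0.0])
for _ in range(100):
    s_true = solver_step(s_true)
    s_pred = learned_step(s_pred)

deviation = np.abs(s_true - s_pred).max()
print("max deviation over 100 steps:", deviation)
```

Because this toy system is linear, a linear fit recovers it exactly; real hair dynamics are highly nonlinear, which is why the paper's approach relies on a trained neural network and GPU-friendly batching across many strands.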
Elevating Real-Time Graphics: Neural Rendering Unleashes Film-Quality Detail
Once a virtual environment is populated with animated 3D objects and characters, real-time rendering simulates the physics of light as it interacts with the scene. NVIDIA's latest research shows how AI models for textures, materials, and volumes can transform real-time graphics, delivering film-quality, photorealistic visuals for video games and digital twins.
More than twenty years ago, NVIDIA blazed the trail for programmable shading, allowing developers to tailor the graphics pipeline to their needs. Building upon this foundation, the latest advancements in neural rendering involve researchers integrating AI models deep within NVIDIA's real-time graphics pipelines.
In their sixth SIGGRAPH paper, NVIDIA researchers will unveil neural texture compression, a method that delivers up to 16x more texture detail without consuming additional GPU memory. Neural texture compression can dramatically increase the realism of 3D scenes; in the accompanying image, neural-compressed textures (right) preserve sharper detail than prior formats, in which the text remains blurry (middle).
In addition, NeuralVDB, a related paper announced last year, is now available in early access. NeuralVDB uses AI-powered data compression to cut the memory needed to represent volumetric data, such as smoke, fire, clouds, and water, by up to 100x.
NVIDIA also shared more details today about its neural materials research, which appeared in the most recent NVIDIA GTC keynote. The paper describes an AI system that learns how light interacts with multilayered, photorealistic materials and distills these assets into small neural networks that run in real time, enabling up to 10x faster shading.
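The gist of distilling a material into a small network can be sketched as follows: sample an expensive reference shading function, then fit a compact approximator that evaluates cheaply at render time. The reference BRDF, network size, and closed-form training shortcut below are all illustrative assumptions, not the paper's approach.

```python
import numpy as np

rng = np.random.default_rng(3)

# Reference "layered material": diffuse base plus a glossy clear coat.
# Inputs are the cosines n.l and n.h in [0, 1]; this function stands in
# for an expensive multi-layer light-transport simulation.
def reference_brdf(ndl, ndh):
    return 0.6 * ndl + 0.4 * ndh ** 16

# Sample the reference material.
X = rng.uniform(0, 1, size=(2000, 2))
y = reference_brdf(X[:, 0], X[:, 1])

# Compact neural stand-in: one hidden layer of random ReLU features with
# a linear readout solved in closed form (an extreme-learning-machine
# shortcut; a real neural material would be trained end to end).
H = 64
W1 = rng.normal(size=(2, H))
b1 = rng.normal(size=H)
hidden = np.maximum(X @ W1 + b1, 0.0)
W2, *_ = np.linalg.lstsq(hidden, y, rcond=None)

def neural_material(ndl, ndh):
    """Cheap learned shading lookup used in place of the simulation."""
    h = np.maximum(np.array([ndl, ndh]) @ W1 + b1, 0.0)
    return h @ W2

# Evaluate approximation quality on fresh samples.
Xt = rng.uniform(0, 1, size=(500, 2))
pred = np.maximum(Xt @ W1 + b1, 0.0) @ W2
err = np.abs(pred - reference_brdf(Xt[:, 0], Xt[:, 1])).mean()
print(f"mean abs shading error: {err:.4f}")
```

The speedup in the real system comes from the same substitution: instead of simulating light transport through the material layers per shading point, the renderer evaluates a small network that has memorized the layers' aggregate response.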
The level of realism achievable with neural rendering is exemplified by the neural-rendered teapot, which accurately captures the ceramic material, the imperfections of the clear-coat glaze, fingerprints and smudges, and even dust.