NVIDIA’s AI Workbench Beta Coming Later This Month

At CES, the company announced a host of new generative AI technology updates and releases, including GeForce RTX SUPER GPUs, updates to the TensorRT-LLM open-source library, tools for RTX PCs and workstations, and new developer AI tools to create, test, and customize pre-trained generative AI models and LLMs.

Today at CES, NVIDIA announced GeForce RTX SUPER desktop GPUs for generative AI performance, new AI laptops from every top manufacturer, and new NVIDIA RTX-accelerated AI software and tools for both developers and consumers. These tools enhance PC experiences with generative AI: NVIDIA TensorRT acceleration of the popular Stable Diffusion XL model for text-to-image workflows, NVIDIA RTX Remix with generative AI texture tools, NVIDIA ACE microservices, and more games that use DLSS 3 technology with Frame Generation.

The new AI Workbench, a unified toolkit for AI developers, is coming to beta later this month. In addition, NVIDIA TensorRT-LLM (TRT-LLM), an open-source library that accelerates and optimizes inference performance of the latest large language models (LLMs), now supports more pre-optimized PC models. Accelerated by TRT-LLM, Chat with RTX, an NVIDIA tech demo also releasing this month, allows AI enthusiasts to interact with their notes, documents, and other content.

“Generative AI is the single most significant platform transition in computing history and will transform every industry, including gaming,” said NVIDIA founder and CEO Jensen Huang. “With over 100 million RTX AI PCs and workstations, NVIDIA is a massive installed base for developers and gamers to enjoy the magic of generative AI.”

Running generative AI locally on a PC is important for privacy, latency, and cost-sensitive applications. Still, it requires a large installed base of AI-ready systems and developer tools to optimize AI models for the PC platform. To accomplish this, NVIDIA is innovating across its full technology stack, driving new experiences, and building on the 500+ AI-enabled PC applications and games already accelerated by NVIDIA RTX technology.

RTX AI PCs and Workstations

NVIDIA RTX GPUs can run a range of applications at a high level of performance, unlocking the potential of generative AI on PCs. Tensor Cores in these GPUs dramatically speed up AI performance across demanding applications.

The new GeForce RTX 40 SUPER Series graphics cards announced today at CES include the GeForce RTX 4080 SUPER, 4070 Ti SUPER, and 4070 SUPER for top AI performance. The GeForce RTX 4080 SUPER generates AI video 1.5x faster — and images 1.7x faster — than the GeForce RTX 3080 Ti GPU. The Tensor Cores in SUPER GPUs deliver up to 836 trillion operations per second, bringing transformative AI capabilities to gaming, creating, and everyday productivity.

Leading manufacturers, including Acer, ASUS, Dell, HP, Lenovo, MSI, Razer, and Samsung, are releasing a new wave of RTX AI laptops, bringing generative AI capabilities to users right out of the box. The new systems, which deliver a performance increase ranging from 20x-60x compared with using neural processing units, will start shipping this month.

Mobile workstations with RTX GPUs can run NVIDIA AI Enterprise software, including TensorRT and NVIDIA RAPIDS, for simplified, secure, generative AI and data science development. A three-year license for NVIDIA AI Enterprise is included with every NVIDIA A800 40GB Active GPU.

New PC Developer Tools for Building AI Models

NVIDIA recently announced NVIDIA AI Workbench, which will help developers create, test, and customize pre-trained generative AI models and LLMs using PC-class performance and memory footprint. Available in beta later this month, it will offer streamlined access to popular repositories like Hugging Face, GitHub, and NVIDIA NGC, along with a simplified user interface that allows developers to reproduce, collaborate on, and migrate projects easily.

Projects can be scaled out to virtually anywhere, such as a data center, a public cloud, or NVIDIA DGX Cloud, and then brought back to local RTX systems on a PC or workstation for inference and light customization.

In collaboration with HP, NVIDIA is also simplifying AI model development by integrating NVIDIA AI Foundation Models and Endpoints, which include RTX-accelerated AI models and software development kits, into the HP AI Studio, a centralized platform for data science. Users can search, import, and deploy optimized models across PCs and the cloud.

After building AI models for PC use cases, developers can optimize them using NVIDIA TensorRT to take advantage of RTX GPUs’ Tensor Cores.

In addition, NVIDIA recently extended TensorRT to text-based applications with TensorRT-LLM for Windows, an open-source library for accelerating LLMs. The latest update to TensorRT-LLM, available now, adds Phi-2 to the growing list of pre-optimized models for PCs, which run up to 5x faster compared to other inference backends.

RTX-Accelerated Generative AI Powers New PC Experiences

At CES, NVIDIA and its developer partners are releasing new generative AI-powered applications and services for PCs, including:

  • NVIDIA RTX Remix, available in beta later this month, is a platform that creates RTX remasters of classic games. It delivers generative AI tools that transform basic textures from classic games into modern, 4K-resolution, physically based rendering materials.
  • NVIDIA ACE microservices, including generative AI-powered speech and animation models, enable developers to add intelligent, dynamic digital avatars to games.
  • TensorRT acceleration for Stable Diffusion XL (SDXL) Turbo and latent consistency models (LCMs), which improves performance for both by up to 60% compared with the previous fastest implementation. An updated version of the Stable Diffusion WebUI TensorRT extension is now available, including acceleration for SDXL, SDXL Turbo, and LCM-LoRA (Low-Rank Adaptation), along with improved LoRA support.
  • NVIDIA DLSS 3 with Frame Generation uses AI to increase frame rates up to 4x compared with native rendering and is featured in 12 out of 14 new RTX games announced, including Horizon Forbidden West, Pax Dei, and Dragon’s Dogma 2.
  • Chat with RTX, an NVIDIA tech demo available later this month, allows AI enthusiasts to connect PC LLMs to their data using the popular retrieval-augmented generation (RAG) technique. The demo, accelerated by TensorRT-LLM, enables users to interact with their notes, documents, and other content. It will also be available as an open-source reference project so developers can implement the same capabilities in their applications.
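
For readers unfamiliar with the retrieval-augmented generation (RAG) technique mentioned for Chat with RTX, the general idea can be sketched in a few lines of Python. This is an illustration only, not how the NVIDIA demo is implemented: it assumes a simple bag-of-words cosine similarity in place of the neural embeddings a real system would likely use, and the names `retrieve`, `build_prompt`, and `NOTES` are hypothetical. The step that sends the prompt to a local LLM is left out.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share words with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(query, documents, top_k=2):
    """Assemble the text a local LLM would receive: retrieved context plus question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical personal notes standing in for a user's local content.
NOTES = [
    "The quarterly budget review is scheduled for March 14.",
    "Grocery list: eggs, flour, butter.",
    "The budget review will cover GPU purchases for the render farm.",
]

if __name__ == "__main__":
    print(build_prompt("When is the budget review?", NOTES))
```

Running the script prints a prompt containing the two budget-related notes and the question, while the unrelated grocery note is filtered out. In a full RAG pipeline that prompt would then be passed to the language model, so the answer is grounded in the user's own content rather than only in the model's training data.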

Join NVIDIA at CES to learn more about generative AI.

Source: NVIDIA

Debbie Diamond Sarto is news editor at Animation World Network.