Custom NVIDIA AI Foundry Models The NeMo Retriever microservice

 

How Businesses Can Create Personalized Generative AI Models with NVIDIA AI Foundry.

NVIDIA AI Foundry

Companies looking to use  AI need specialized Custom models made to fit their particular sector requirements.

With the use of software tools, accelerated computation, and data, businesses can build and implement unique models with NVIDIA  AI Foundry, a service that may significantly boost their generative AI projects.

Similar to how TSMC produces chips made by other firms, NVIDIA AI Foundry offers the infrastructure and resources needed by other businesses to create and modify AI models. These resources include DGX Cloud, foundation models, NVIDIA NeMo software, NVIDIA knowledge, ecosystem tools, and support.

The product is the primary distinction: NVIDIA AI Foundry assists in the creation of Custom models, whereas TSMC manufactures actual semiconductor chips. Both foster creativity and provide access to a huge network of resources and collaborators.

Businesses can use AI Foundry to personalise NVIDIA and open Custom models models, such as NVIDIA Nemotron, CodeGemma by Google DeepMind, CodeLlama, Gemma by Google DeepMind, Mistral, Mixtral, Phi-3, StarCoder2, and others. This includes the recently released Llama 3.1 collection.

AI Innovation is Driven by Industry Pioneers

Among the first companies to use NVIDIA AI Foundry are industry leaders Amdocs, Capital One, Getty Images, KT, Hyundai Motor Company, SAP, ServiceNow, and Snowflake. A new era of AI-driven innovation in corporate software, technology, communications, and media is being ushered in by these trailblazers.

According to Jeremy Barnes, vice president of AI Product at ServiceNow, “organizations deploying AI can gain a competitive edge with Custom models that incorporate industry and business knowledge.” “ServiceNow is refining and deploying models that can easily integrate within customers’ existing workflows by utilising NVIDIA AI Foundry.”

The NVIDIA AI Foundry’s Foundation

The foundation models, corporate software, rapid computing, expert support, and extensive partner ecosystem are the main pillars that underpin NVIDIA AI Foundry.

Its software comprises the whole software platform for expediting model building, as well as AI foundation models from NVIDIA and the  AI community.

NVIDIA DGX Cloud, a network of accelerated compute resources co-engineered with the top public clouds in the world Amazon Web Services, Google  Cloud, and Oracle  Cloud Infrastructure is the computational powerhouse of NVIDIA  AI Foundry. Customers of AI Foundry may use DGX Cloud to grow their AI projects as needed without having to make large upfront hardware investments.

They can also create and optimize unique generative AI applications with previously unheard-of ease and efficiency. This adaptability is essential for companies trying to remain nimble in a market that is changing quickly.

NVIDIA AI Enterprise specialists are available to support customers of NVIDIA AI Foundry if they require assistance. In order to ensure that the models closely match their business requirements, NVIDIA experts may guide customers through every stage of the process of developing, optimizing, and deploying their models using private data.

Customers of NVIDIA AI Foundry have access to a worldwide network of partners who can offer a comprehensive range of support. Among the NVIDIA partners offering AI Foundry consulting services are Accenture, Deloitte, Infosys, and Wipro. These services cover the design, implementation, and management of AI-driven digital transformation initiatives. Accenture is the first to provide the Accenture AI Refinery framework, an AI Foundry-based solution for creating Custom models.

Furthermore, companies can get assistance from service delivery partners like Data Monsters, Quantiphi, Slalom, and SoftServe in navigating the challenges of incorporating AI into their current IT environments and making sure that these applications are secure, scalable, and in line with business goals.

Using AIOps and MLOps platforms from NVIDIA partners, such as Cleanlab, DataDog, Dataiku, Dataloop, DataRobot, Domino Data Lab, Fiddler AI, New Relic, Scale, and Weights & Biases, customers may create production-ready NVIDIA AI Foundry models.

Nemo retriever microservice

Clients can export their AI Foundry models as NVIDIA NIM inference microservices, which can be used on their choice accelerated infrastructure. These microservices comprise the Custom models, optimized engines, and a standard API.

NVIDIA TensorRT-LLM and other inferencing methods increase Llama 3.1 model efficiency by reducing latency and maximizing throughput. This lowers the overall cost of operating the models in production and allows businesses to create tokens more quickly. The NVIDIA  AI Enterprise software bundle offers security and support that is suitable for an enterprise.

Along with cloud instances from Amazon Web Services, Google Cloud, and Oracle  Cloud Infrastructure, the extensive array of deployment options includes NVIDIA-Certified Systems from worldwide server manufacturing partners like Cisco, Dell, HPE, Lenovo, and Supermicro.

Furthermore, Together  AI, a leading cloud provider for AI acceleration, announced today that it will make Llama 3.1 endpoints and other open models available on DGX  Cloud through the usage of its NVIDIA GPU-accelerated inference stack, which is accessible to its ecosystem of over 100,000 developers and businesses.

According to Together AI’s founder and CEO, Vipul Ved Prakash, “every enterprise running generative AI applications wants a faster user experience, with greater efficiency and lower cost.” “With NVIDIA DGX Cloud, developers and businesses can now optimize performance, scalability, and security by utilising the Together Inference Engine.”

NVIDIA NeMo

NVIDIA NeMo Accelerates and Simplifies the Creation of Custom Models

Developers can now easily curate data, modify foundation models, and assess performance using the capabilities provided by NVIDIA NeMo integrated into AI Foundry. NeMo technologies consist of:

  • A GPU-accelerated data-curation package called NeMo Curator enhances the performance of generative AI models by preparing large-scale, high-quality datasets for pretraining and fine-tuning.
  • NeMo Customizer is a scalable, high-performance microservice that makes it easier to align and fine-tune LLMs for use cases specific to a given domain.
  • On any accelerated cloud or data centre, NeMo Evaluator offers autonomous evaluation of generative AI models across bespoke and academic standards.
  • NeMo Guardrails is a dialogue management orchestrator that supports security, appropriateness, and correctness in large-scale language model smart applications, hence offering protection for generative AI applications.
  • Businesses can construct unique AI models that are perfectly matched to their needs by utilising the NeMo platform in NVIDIA AI Foundry.
  • Better alignment with strategic objectives, increased decision-making accuracy, and increased operational efficiency are all made possible by this customization.
  • For example, businesses can create models that comprehend industry-specific vernacular, adhere to legal specifications, and perform in unison with current processes.

According to Philipp Herzig, chief  AI officer at SAP, “as a next step of their partnership, SAP plans to use NVIDIA’s NeMo platform to help businesses to accelerate AI-driven productivity powered by SAP Business  AI.”

NeMo Retriever

NeMo Retriever microservice

Businesses can utilize NVIDIA NeMo Retriever NIM inference microservices to implement their own AI models in a live environment. With retrieval-augmented generation (RAG), these assist developers in retrieving private data to provide intelligent solutions for their AI applications.

According to Baris Gultekin, Head of AI at Snowflake, “safe, trustworthy AI is a non-negotiable for enterprises harnessing generative AI, with retrieval accuracy directly impacting the relevance and quality of generated responses in RAG systems.” “NeMo Retriever, a part of NVIDIA AI Foundry, is leveraged by Snowflake Cortex AI to further provide enterprises with simple, reliable answers using their custom data.”

Custom Models

Custom Models Provide a Competitive Edge

The capacity of NVIDIA AI Foundry to handle the particular difficulties that businesses encounter while implementing AI is one of its main benefits. Specific business demands and data security requirements may not be fully satisfied by generic AI models. On the other hand, Custom models are more flexible, adaptable, and perform better, which makes them perfect for businesses looking to get a competitive edge.

Post a Comment

0 Comments