Intel Gaudi 2 Tests GenAI Performance vs. Nvidia H100



For GenAI performance, Intel Gaudi 2 is still the only benchmarked alternative to the Nvidia H100.

MLCommons has released the results of the industry-standard MLPerf v4.0 inference benchmark. Intel’s results for its Gaudi 2 accelerators and 5th Gen Intel Xeon Scalable processors with Intel Advanced Matrix Extensions (Intel AMX) reinforce the company’s commitment to bringing “AI Everywhere” with a broad portfolio of competitive solutions.

When it comes to generative AI (GenAI) performance, the Intel Gaudi 2 AI accelerator remains the only benchmarked alternative to the Nvidia H100 and offers strong value for the money. Intel also remains the only server CPU vendor to submit MLPerf results.

Compared with 4th Gen Intel Xeon processors’ results in MLPerf Inference v3.1, Intel’s 5th Gen Xeon results improved by an average of 1.42x.

Why it matters: Intel’s MLPerf results give customers industry-standard benchmarks to assess AI performance, building on its training and inference results from prior MLPerf rounds.
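As a rough illustration of how such a geomean (geometric mean) speedup is aggregated across benchmarks, here is a minimal Python sketch; the per-workload speedup values are hypothetical placeholders, not actual MLPerf submission data.

```python
import math

# Hypothetical per-workload speedups of 5th Gen Xeon over 4th Gen Xeon
# (placeholder values for illustration only, not real MLPerf results).
speedups = [1.3, 1.5, 1.4, 1.45, 1.6]

# The geometric mean aggregates ratios so no single workload dominates the average.
geomean = math.prod(speedups) ** (1 / len(speedups))
print(f"Geomean speedup: {geomean:.2f}x")
```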

Intel Gaudi 2 vs Nvidia H100

Nvidia H100

  • High-Performance AI Accelerator: This robust GPU is designed for data centre use in applications such as real-time deep learning inference, rapid data analytics, and high-performance computing (HPC).
  • Leading Performance: The H100 offers up to 7x better performance for HPC workloads than its predecessor.
  • Large Language Model (LLM) Friendly: Its specialised Transformer Engine, designed to handle large LLMs, can run conversational AI workloads up to 30x faster.
  • Scalability and Security: The NVIDIA NVLink Switch System connects up to 256 H100 GPUs to handle exascale workloads, while built-in NVIDIA Confidential Computing secures data and applications.

Intel Gaudi 2

The Intel Gaudi software suite continues to expand its coverage of popular large language models (LLMs) and multimodal models. For MLPerf Inference v4.0, Intel submitted Gaudi 2 accelerator results for the state-of-the-art Llama v2-70B and Stable Diffusion XL models.

Hugging Face Text Generation Inference (TGI) has become very popular, and Gaudi’s Llama submission took advantage of it by using the TGI toolkit, which supports continuous batching and tensor parallelism, improving the efficiency of real-world LLM scaling (see the sketch below). For Llama v2-70B, Gaudi 2 delivered 8035.0 offline tokens-per-second and 6287.5 server tokens-per-second. For Stable Diffusion XL, Gaudi 2 delivered 6.26 offline samples-per-second and 6.25 server queries-per-second.
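As a hedged illustration of how a TGI-served Llama endpoint is typically queried, here is a minimal Python sketch using the huggingface_hub client; the endpoint URL, prompt, and generation parameters are assumptions for illustration, not details from Intel’s MLPerf submission.

```python
from huggingface_hub import InferenceClient

# Assumed local TGI endpoint serving a Llama v2-70B model
# (URL and parameters are illustrative, not from the MLPerf submission).
client = InferenceClient("http://localhost:8080")

# TGI performs continuous batching server-side, so many concurrent
# requests like this one are batched together for higher throughput.
response = client.text_generation(
    "Summarise the MLPerf Inference v4.0 results in one sentence.",
    max_new_tokens=128,
    temperature=0.7,
)
print(response)
```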

Intel Gaudi 2 specs

  • Process: 7nm process technology
  • Compute: heterogeneous architecture with dual matrix multiplication engines (MME) and 24 programmable tensor processor cores (TPC)
  • Memory: 96 GB of onboard HBM2E and 48 MB of SRAM
  • Networking: 24 on-chip integrated 100 Gbps Ethernet ports

The Intel Gaudi 2 delivers outstanding speed and scalability and is optimised for deep learning training and inference workloads. Built on a 7nm process, it features a heterogeneous compute architecture with dual matrix multiplication engines and 24 programmable tensor processor cores, a design that lets it handle a wide range of deep learning tasks efficiently.

Additionally, the Intel Gaudi 2 includes 96 GB of onboard HBM2E memory, providing ample capacity for data access. It also integrates 24 on-chip 100 Gbps Ethernet ports, allowing multiple Gaudi 2 accelerators to communicate at high speed. This makes the Gaudi 2 a good fit for deep learning clusters of any size.
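To show how a PyTorch workload is typically placed on a Gaudi device, here is a minimal sketch assuming the Intel Gaudi software stack (the habana_frameworks PyTorch bridge) is installed; the toy model and tensor shapes are illustrative only.

```python
import torch
import habana_frameworks.torch.core as htcore  # Gaudi PyTorch bridge (assumed installed)

# Illustrative toy model; any torch.nn.Module can be moved to the "hpu" device.
model = torch.nn.Linear(1024, 1024).to("hpu")
inputs = torch.randn(8, 1024).to("hpu")

with torch.no_grad():
    outputs = model(inputs)
    # In the default lazy execution mode, mark_step() triggers graph execution.
    htcore.mark_step()

print(outputs.shape)
```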

Intel Gaudi 2 price

These results indicate that the Intel Gaudi 2 continues to deliver competitive price/performance, a crucial factor to consider when examining the total cost of ownership (TCO).

AI Will Be Everywhere at Intel Vision 2024

Intel Vision, the company’s flagship event, will be held in Phoenix, Arizona, on April 8–9, 2024. It brings together leaders in business and technology to discuss the latest industry developments and solutions spanning client, edge, data centre, and cloud innovation.

Sign up now to take part in thought-provoking roundtables, captivating demonstrations, and cutting-edge AI insights from Intel executives and distinguished guests that will help you realise your technological vision.

About the Intel 5th Gen Xeon Results

With advancements in both hardware and software, Intel’s 5th Gen Xeon processors delivered a geomean improvement of 1.42x over 4th Gen Intel Xeon processors’ MLPerf Inference v3.1 results. For example, the 5th Gen Xeon GPT-J entry showed roughly 1.8x higher performance than the v3.1 submission thanks to software optimisations, including continuous batching. Similarly, MergedEmbeddingBag and further Intel AMX optimisations allowed DLRMv2 to deliver roughly 1.8x speedups at the 99.9% accuracy target.
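As a hedged illustration of the kind of software optimisation that exposes Intel AMX on Xeon, here is a minimal sketch using Intel Extension for PyTorch (intel_extension_for_pytorch) with bfloat16 inference; the model and shapes are placeholders, and this is not the code path used in Intel’s MLPerf submission.

```python
import torch
import intel_extension_for_pytorch as ipex  # Intel Extension for PyTorch (assumed installed)

# Placeholder model; AMX accelerates bf16/int8 matrix maths on supported Xeons.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 1024),
).eval()

# ipex.optimize applies operator fusions and prepares bf16 weights.
model = ipex.optimize(model, dtype=torch.bfloat16)

inputs = torch.randn(16, 1024)
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    outputs = model(inputs)

print(outputs.shape)
```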

Intel takes great pride in working with OEM partners such as Quanta, Supermicro, Cisco, Dell, and WiWynn to enable them to submit their own MLPerf results. Since 2020, Intel has also submitted MLPerf results for four generations of Xeon processors; in many of these submissions, Xeon serves as the host CPU.

How to Utilise Intel Developer Cloud AI Solutions

The Intel Developer Cloud offers evaluation access to 5th Gen Xeon processors and Intel Gaudi 2 accelerators. In this environment, users can manage AI compute resources, run training and inference production workloads at scale (LLM or GenAI), and much more.

News source: Gaudi 2 Intel

