Simplify and Scale AI/ML Workloads: Ray on Vertex AI Is Now Generally Available


 

Ray on Vertex AI

Scaling AI/ML workloads presents significant challenges for developers and engineers. One is obtaining the necessary AI infrastructure: AI and ML workloads demand large amounts of compute, including CPUs and GPUs, and developers must have enough resources available to run their workloads. Another is managing the different patterns and programming interfaces required to scale AI/ML workloads efficiently; developers may need to rework their code so that it runs well on the particular infrastructure available to them, which can be a difficult and time-consuming task.

Ray addresses these issues with a complete, user-friendly distributed framework for Python. Using a set of domain-specific libraries on top of a scalable cluster of compute resources, Ray lets you efficiently distribute common AI/ML operations such as training, tuning, and serving.

Google Cloud is happy to announce that the integration of Ray, a powerful distributed Python framework, with Google Cloud's Vertex AI is now generally available. By enabling AI developers to easily scale their workloads on Vertex AI's flexible infrastructure, this integration unlocks the full potential of distributed computing, machine learning, and data processing.

Why Ray on Vertex AI?

Accelerated and Scalable AI Development:

Ray's distributed computing platform connects seamlessly with Vertex AI's infrastructure services, offering a single experience for both predictive and generative AI. Scale your Python-based workloads for data processing, deep learning, reinforcement learning, machine learning, and scientific computing from a single machine to a large cluster, and take on even the most difficult AI problems without worrying about the intricacies of maintaining the supporting infrastructure.

Unified Development Experience:

By combining the Vertex AI SDK for Python with Ray's ergonomic API, AI developers can move smoothly from interactive prototyping in Vertex AI Colab Enterprise or their local development environment to production deployment on Vertex AI's managed infrastructure, with little to no code changes.

Enterprise-Grade Security:

Vertex AI's security features, such as VPC Service Controls, Private Service Connect, and Customer-Managed Encryption Keys (CMEK), can help protect your sensitive data and models while you take advantage of Ray's distributed processing capabilities. Vertex AI's extensive security architecture can help ensure that your Ray applications adhere to stringent enterprise security requirements.

Vertex AI Python SDK

Suppose you want to fine-tune a small language model (SLM) such as Gemma or Llama. Ray on Vertex AI lets you quickly create a Ray cluster, using either the console or the Vertex AI SDK for Python, which you can then use to fine-tune Gemma. You can monitor the cluster through either the Ray Dashboard or the integration with Google Cloud Logging.

Ray on Vertex AI currently supports Ray 2.9.3. And because you can build a custom image, you have additional flexibility over the dependencies included in your Ray cluster.

Once your Ray cluster is up and running, using Ray on Vertex AI for AI/ML application development is straightforward. The procedure varies with your development environment. Using the Vertex AI SDK for Python, you can connect to the Ray cluster from Colab Enterprise or any other preferred IDE and run your application interactively. Alternatively, you can use the Ray Jobs API to programmatically submit a Python script to the Ray cluster on Vertex AI.

Developing AI/ML applications with Ray on Vertex AI brings further advantages. For example, you can validate your tuning jobs using Vertex AI TensorBoard, a managed TensorBoard service that lets you monitor, compare, and visualise your tuning jobs while collaborating efficiently with your team. You can also conveniently store model checkpoints, metrics, and more in Cloud Storage, which lets you quickly consume the model for downstream AI/ML tasks, such as generating batch predictions with Ray Data.

How H-E-B and eDreams scale AI with Ray on Vertex AI

Accurate demand forecasting is crucial to the profitability of any large organisation, but it is especially important for grocery stores. Forecasting demand for one item can be challenging enough; now consider forecasting demand for millions of items across hundreds of stores. Scaling a forecasting model that far is a difficult undertaking. H-E-B, one of the largest supermarket chains in the US, uses Ray on Vertex AI to cut costs, increase speed, and improve reliability.

"Ray has enabled us to achieve transformative efficiencies that are critical to our company's operations. We particularly appreciate Ray's enterprise features and user-friendly API," said H-E-B Principal Data Scientist Philippe Dagher. "We chose Ray on Vertex as our production platform because of its greater accessibility to Vertex AI's ML infrastructure platform."

eDreams ODIGEO, the world's leading travel subscription platform and one of the largest e-commerce businesses in Europe, offers high-quality products across regular flights, budget airlines, hotels, dynamic packages, car rentals, and travel insurance to make travel easier, more affordable, and more valuable for customers worldwide. The company combines travel options from about 700 international airlines and 2.1 million hotels, processing 100 million customer searches per day, powered by 1.8 billion machine learning predictions daily.

The eDreams ODIGEO Data Science team is currently training its ranking models with Ray on Vertex AI in order to offer customers the best travel experiences at the lowest cost and with the least effort.

"Google Cloud is creating the best ranking models, personalised to the preferences of Google Cloud's 5.4 million Prime customers at scale, with the largest base of accommodation and flight options," stated José Luis González, Director of eDreams ODIGEO Data Science. Google Cloud are concentrating on creating the greatest experience to increase value for Google Cloud's clients, with Ray on Vertex AI handling the infrastructure for distributed hyper-parameter tuning.
