PaliGemma, Gemma 2, and Responsible AI Upgrades



The Google Cloud team is pleased to see that Gemma has accumulated millions of downloads in the short period since its launch, reflecting their belief that open research and collaboration foster innovation.


This response has been extremely inspiring and has led developers to build a wide range of projects. Work such as Octopus v2, an on-device action model, and Navarasa, a multilingual variant for Indic languages, demonstrates Gemma's potential to deliver practical and accessible AI solutions.

This spirit of discovery and innovation also drove Google Cloud to develop CodeGemma, with its powerful code completion and generation capabilities, and RecurrentGemma, which offers efficient inference and new research opportunities.

The Gemma family of open models consists of lightweight, state-of-the-art models built with the same research and technology as the Gemini models. With the release of PaliGemma, a powerful open vision-language model (VLM), and the introduction of Gemma 2, Google Cloud is sharing more about its plans to expand the Gemma family. It is also reinforcing its commitment to responsible AI through updates to the Responsible Generative AI Toolkit, which give developers new and improved tools for evaluating model safety and filtering objectionable content.

Introducing PaliGemma, the Open Vision-Language Model

PaliGemma is a powerful open vision-language model (VLM) inspired by PaLI-3. Built on open components such as the SigLIP vision model and the Gemma language model, PaliGemma is designed to deliver class-leading fine-tune performance on a wide range of vision-language tasks. These include image and short-video captioning, object detection and segmentation, visual question answering, and reading text in images.


Google Cloud is providing both pretrained and fine-tuned checkpoints at multiple resolutions, along with checkpoints tuned to specific tasks for immediate exploration.

PaliGemma can be accessed through a range of platforms and tools that encourage open exploration and learning. Free options such as Kaggle and Colab notebooks let you start experimenting right away, and academics who want to advance vision-language research can apply for Google Cloud credits to support their work.
Get started with PaliGemma today. PaliGemma is available on GitHub, Hugging Face Models, Kaggle, Vertex AI Model Garden, and ai.nvidia.com (accelerated with TensorRT-LLM). It integrates easily via JAX and Hugging Face Transformers, with Keras integration coming soon. You can also interact with the model through its Hugging Face Space.
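If you work in Python, a quick way to try PaliGemma is through Hugging Face Transformers. The snippet below is a minimal, illustrative sketch, not an official recipe: it assumes a recent transformers release (4.41 or newer) with PyTorch and Pillow installed, licensed access to the PaliGemma weights on Hugging Face, and placeholder values for the checkpoint name (google/paligemma-3b-mix-224) and image URL that you may want to swap for your own.

```python
# Minimal PaliGemma inference sketch with Hugging Face Transformers.
# Assumes: transformers>=4.41, torch, Pillow, requests, and accepted
# access terms for the PaliGemma weights on Hugging Face.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"  # assumed checkpoint; pick the one you need

# Default precision on CPU keeps the example simple; add torch_dtype and
# device_map arguments if you want to run on a GPU.
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Any RGB image works; this URL is only an illustrative placeholder.
image = Image.open(
    requests.get("https://example.com/cat.jpg", stream=True).raw
).convert("RGB")

# The mix checkpoints are steered with short task prefixes such as
# "caption en", "detect ...", or "answer en ...".
prompt = "caption en"
inputs = processor(text=prompt, images=image, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=30)

# Strip the prompt tokens before decoding the generated caption.
generated = output[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(generated, skip_special_tokens=True))
```

The same checkpoints can also be loaded through the JAX reference code, so you can pick whichever stack matches the rest of your tooling.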

Introducing Gemma 2: Next-Generation Performance and Efficiency

Google Cloud is happy to announce that Gemma 2, the next generation of Gemma models, will be available soon. Built on a new architecture designed for breakthrough performance and efficiency, Gemma 2 will come in new sizes to fit a broad range of AI developer use cases.

Advantages include:

Class-Leading Performance:

At 27 billion parameters, Gemma 2 delivers performance comparable to Llama 3 70B at less than half the size. This breakthrough efficiency sets a new standard for open models.

Reduced Deployment Costs:

Thanks to its efficient design, Gemma 2 requires less than half the compute of comparable models. The 27B model is optimized to run on NVIDIA GPUs or efficiently on a single TPU host in Vertex AI, making deployment more affordable and accessible to a wider range of users.

Versatile Toolchains for Tuning:

Gemma 2 will give developers robust tuning tools across a diverse ecosystem of platforms and resources. From cloud-based solutions like Google Cloud to community tools like Axolotl, fine-tuning Gemma 2 will be easier than ever. Partner integrations with Hugging Face and NVIDIA TensorRT-LLM, along with Google's own JAX and Keras, let you optimize performance and deploy efficiently across a variety of hardware configurations.
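Gemma 2 itself is not yet released, but the tuning workflow it will plug into already exists for the current Gemma checkpoints. Below is a minimal LoRA fine-tuning sketch with KerasNLP, offered as an illustration under a few assumptions: Keras 3 with a JAX (or TensorFlow/PyTorch) backend, the keras-nlp package installed, licensed access to the gemma_2b_en preset, and a tiny in-memory dataset used purely for demonstration.

```python
# Minimal LoRA fine-tuning sketch for a current Gemma checkpoint with KerasNLP.
# Assumes: keras>=3, keras-nlp, and licensed access to the "gemma_2b_en" preset.
import keras
import keras_nlp

# Load a pretrained Gemma causal language model from its preset.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

# Enable low-rank adapters so only a small set of weights is trained.
gemma_lm.backbone.enable_lora(rank=4)

# Keep sequences short so the example fits on modest hardware.
gemma_lm.preprocessor.sequence_length = 128

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=5e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

# Toy instruction-style data; replace with your own corpus.
data = [
    "Instruction:\nExplain what an open model is.\n\nResponse:\n"
    "An open model publishes its weights so developers can fine-tune it.",
]

gemma_lm.fit(data, epochs=1, batch_size=1)

# Quick sanity check of the tuned model.
print(gemma_lm.generate(
    "Instruction:\nExplain what an open model is.\n\nResponse:\n",
    max_length=128,
))
```

The same pattern, swapping presets and data, is the kind of workflow the Gemma 2 toolchains are intended to support.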

Expanding the Toolkit for Responsible Generative AI

To help developers carry out more in-depth model evaluations, Google Cloud is open-sourcing the LLM Comparator as part of its Responsible Generative AI Toolkit. The LLM Comparator is a new interactive, visual tool for running effective side-by-side evaluations to assess the quality and safety of model responses. To see it in action, check out the demo, which compares Gemma 1.1 against Gemma 1.0.
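For the LLM Comparator's exact input format and loading utilities, refer to its open-source repository. As a rough illustration of the side-by-side idea only, the sketch below gathers paired responses from two model variants into records that could later be adapted to the tool's expected schema; the field names and the generate_a/generate_b callables are hypothetical placeholders, not the tool's API.

```python
# Hedged sketch: collect paired responses from two model variants so they
# can be reviewed side by side. The record fields and the generate_a /
# generate_b callables are illustrative placeholders; consult the
# open-source LLM Comparator repository for its actual input schema.
import json
from typing import Callable, Dict, List


def build_comparison_records(
    prompts: List[str],
    generate_a: Callable[[str], str],  # e.g. wraps a Gemma 1.0 endpoint
    generate_b: Callable[[str], str],  # e.g. wraps a Gemma 1.1 endpoint
) -> List[Dict[str, str]]:
    """Pair up responses from two models for side-by-side review."""
    records = []
    for prompt in prompts:
        records.append(
            {
                "input_text": prompt,
                "output_text_a": generate_a(prompt),
                "output_text_b": generate_b(prompt),
            }
        )
    return records


if __name__ == "__main__":
    # Stand-in generators; swap in real calls to your two model variants.
    demo = build_comparison_records(
        ["Summarize the benefits of open models."],
        generate_a=lambda p: "Model A response to: " + p,
        generate_b=lambda p: "Model B response to: " + p,
    )
    with open("comparison_data.json", "w") as f:
        json.dump({"examples": demo}, f, indent=2)
```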


Google Cloud hopes this tool will advance the toolkit's goal of helping developers build AI applications that are responsible, creative, and safe.

As Google Cloud expands the Gemma family of open models, it remains dedicated to fostering a collaborative environment where cutting-edge AI technology and responsible development go hand in hand. Google Cloud is eager to see what you build with these new tools and how, together, you can shape the future of AI.
