Better malware analysis with Google Gemini 1.5 Flash

In an earlier post, Google Cloud looked at how Gemini 1.5 Pro can be used to automate the reverse engineering and code analysis of malware binaries. Google Cloud is now turning to Gemini 1.5 Flash, Google’s new lightweight and affordable model, to move that analysis out of the lab and into a production-ready system that can analyse malware at massive scale. Gemini 1.5 Flash is fast, can handle heavy workloads, and accepts a context of up to one million tokens.

To support this, Google Cloud built an architecture on Google Compute Engine with a multi-stage workflow that includes scalable unpacking and decompilation stages. Although the results are encouraging, this is only the beginning of a long road to address accuracy issues and realise AI’s full potential in malware analysis.

On average, VirusTotal analyses 1.2 million unique new files every day that have never been seen on the platform before. About half of these are binary files that could benefit from reverse engineering and code analysis. This volume of new threats is simply too much for traditional, manual methods to handle. Building a system that can automatically unpack, decompile, and analyse this much code quickly and effectively is a formidable task, and Gemini 1.5 Flash is intended to help with it.
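To put that volume in perspective, the figures above can be converted into a sustained arrival rate. This is only back-of-the-envelope arithmetic: the constant names are illustrative, and "about half" is taken as exactly 0.5 for simplicity.

```python
# Figures quoted in the text; BINARY_FRACTION = 0.5 is a simplification of "about half".
FILES_PER_DAY = 1_200_000
BINARY_FRACTION = 0.5
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = MINUTES_PER_DAY * 60


def arrival_rates(files_per_day: int, binary_fraction: float) -> dict:
    """Convert a daily file count into average binaries per minute and per second."""
    binaries_per_day = files_per_day * binary_fraction
    return {
        "per_day": binaries_per_day,
        "per_minute": binaries_per_day / MINUTES_PER_DAY,
        "per_second": binaries_per_day / SECONDS_PER_DAY,
    }


rates = arrival_rates(FILES_PER_DAY, BINARY_FRACTION)
print(f"{rates['per_minute']:.1f} binaries/minute, {rates['per_second']:.2f} binaries/second")
```

Under these assumptions the pipeline has to absorb roughly 417 new binaries per minute, or about 7 per second, around the clock.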

Building on the broad capabilities of Gemini 1.5 Pro, the Gemini 1.5 Flash model was designed to maximise speed and efficiency without sacrificing performance. Both models have strong multimodal capabilities and can handle a context window of more than a million tokens, but Gemini 1.5 Flash is built specifically for fast inference and low deployment cost. This is achieved through the parallel computation of the attention and feedforward components, together with online distillation.

Online distillation lets Flash learn directly from the larger and more complex Pro model during training. With these architectural improvements, Google Cloud can process up to 1,000 requests per minute and 4 million tokens per minute on Gemini 1.5 Flash.
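A client that submits work at this scale has to stay under both per-minute ceilings at once. The following is a minimal sketch of a sliding-window budget that tracks requests and tokens together. The class name `MinuteQuota` and its methods are hypothetical and not part of any Google SDK; the default limits are simply the figures quoted above.

```python
from collections import deque


class MinuteQuota:
    """Sliding 60-second window over both request count and token count."""

    def __init__(self, max_requests=1_000, max_tokens=4_000_000, window_s=60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_s = window_s
        self._events = deque()  # (timestamp, tokens) for each admitted request
        self._tokens_in_window = 0

    def _evict(self, now):
        # Drop admitted requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_s:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def try_acquire(self, tokens, now):
        """Return True and record the request if it fits both budgets, else False."""
        self._evict(now)
        if len(self._events) + 1 > self.max_requests:
            return False
        if self._tokens_in_window + tokens > self.max_tokens:
            return False
        self._events.append((now, tokens))
        self._tokens_in_window += tokens
        return True


# Small limits make the behaviour easy to follow.
quota = MinuteQuota(max_requests=2, max_tokens=100)
print(quota.try_acquire(60, now=0.0))  # admitted
print(quota.try_acquire(50, now=1.0))  # rejected: would exceed the token budget
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would typically supply `time.monotonic()` and queue or retry rejected requests.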

First, we’ll show some examples of Gemini 1.5 Flash analysing decompiled binaries to demonstrate how this pipeline works. After that, we’ll briefly go over the earlier stages of unpacking and large-scale decompilation.

What is Gemini 1.5 Flash?

Gemini 1.5 Flash is a large language model built by Google AI. It is positioned as the fastest and most cost-effective Gemini model for high-volume tasks at scale.

Key Gemini 1.5 Flash features:

Speed: It can process up to 1,000 requests and 4 million tokens per minute, which makes it well suited to real-time applications.

Cost-efficiency: Gemini 1.5 Flash is more cost-effective than other models, making it a budget-friendly choice for large projects.

Long context window: Despite being lightweight, 1.5 Flash can handle a context of one million tokens. This lets it take a large amount of data into account when responding to prompts, which supports more detailed answers.

Gemini 1.5 Flash balances speed, price, and performance, making it useful for large-scale language processing jobs.

Gemini 1.5 Flash Model

Analysis Speed and Illustrative Cases

To assess the real-world efficacy of the malware analysis pipeline, Google Cloud examined 1,000 Windows executables and DLLs chosen at random from VirusTotal’s incoming stream. This selection ensured a wide variety of samples, including both malware and legitimate applications. The first thing that stood out was the speed of Gemini 1.5 Flash.

This is consistent with the performance benchmarks reported in the Google Gemini team’s paper, where Gemini 1.5 Flash consistently outperformed other large language models in text generation speed across a variety of languages.

In Google Cloud’s observations, the fastest processing time was 1.51 seconds and the slowest was 59.60 seconds. On average, Gemini 1.5 Flash processed each file in 12.72 seconds. Note that these times do not include the unpacking and decompilation stages, which Google Cloud will discuss in more detail in a subsequent blog article.

These processing times depend on factors such as the size and complexity of the input code and the length of the resulting analysis. Importantly, the measurements cover the whole process end to end: sending the decompiled code to the Vertex AI Gemini 1.5 Flash API, having the model analyse it, and receiving the complete response back on a Google Compute Engine instance. This end-to-end view demonstrates the speed and low latency that Gemini 1.5 Flash can achieve in real production settings.
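An end-to-end measurement of this kind can be sketched as a small timing harness. This is not the harness used for the figures above, only an illustration of the idea: `time_analyses` and `summarize` are hypothetical names, and `analyze` stands in for whatever function sends the decompiled code to the model and waits for the complete response.

```python
import statistics
import time


def time_analyses(samples, analyze):
    """Time each analysis from request to complete response.

    samples: dict mapping a file name to its decompiled code (a string).
    analyze: callable taking the code and returning the finished report.
    """
    timings = {}
    for name, code in samples.items():
        start = time.perf_counter()
        analyze(code)  # blocks until the whole answer has been received
        timings[name] = time.perf_counter() - start
    return timings


def summarize(timings):
    """Report the fastest, slowest, and mean wall-clock time in seconds."""
    values = list(timings.values())
    return {
        "fastest": min(values),
        "slowest": max(values),
        "mean": statistics.mean(values),
    }


def fake_analyze(code):
    # Stand-in for a real model call, so the sketch runs offline.
    return f"report for {len(code)} characters of code"


timings = time_analyses({"sample.dll": "int main() { return 0; }"}, fake_analyze)
print(summarize(timings))
```

Because the timer wraps the whole call, network latency and response streaming are counted along with model inference, which matches the end-to-end view described above.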

Example 1: It Takes 1.51 Seconds to Dispel a False Positive

Of the 1,000 binaries Google Cloud examined, this one was processed the fastest, which shows how quick Gemini 1.5 Flash can be. The file goopdate.dll (103.52 KB) triggered one anti-virus detection on VirusTotal, a common occurrence that often requires a laborious manual review.

Imagine that your SIEM system raised an alert because of this file and you urgently need an answer. In just 1.51 seconds, Gemini 1.5 Flash analyses the decompiled code and gives a clear explanation: the file is a simple executable launcher for the “BraveUpdate.exe” application, which is probably a web browser component. With this quick, code-level insight, analysts can confidently dismiss the alert as a false positive, avoiding needless escalation and saving valuable time and resources.

Example 2: Taking Care of Another False Positive

Another file that called for closer investigation is BootstrapPackagedGame-Win64-Shipping.exe (302.50 KB), which two anti-virus engines on VirusTotal flagged as suspicious. In just 4.01 seconds, Gemini 1.5 Flash analyses the decompiled code and determines that the file is a game launcher.

Gemini describes the sample’s functionality, which includes locating and running redistributable installers, checking for dependencies such as DirectX and the Microsoft Visual C++ Runtime, and finally starting the main game executable. This level of understanding lets analysts classify the file as legitimate with confidence, saving the time and effort of investigating a likely false positive.

Example 3: Obfuscated Code and the Longest Processing Time

During Google Cloud’s investigation, the file svrwsc.exe (5.91 MB) stood out for requiring the longest processing time: 59.60 seconds. The longer analysis time was probably due to factors such as the size of the decompiled code and the use of obfuscation techniques like XOR encryption. Even so, Gemini 1.5 Flash finished its analysis in under a minute. Considering that manually reverse engineering such a binary could take a human analyst several hours, this is a noteworthy result.

Gemini correctly concluded that the sample was malicious and identified its backdoor functionality, which is designed to exfiltrate data and connect to command-and-control (C2) servers hosted on Russian domains. The analysis reveals numerous indicators of compromise (IOCs), including probable C2 server URLs, mutexes used for process synchronisation, modified registry entries, and suspicious file names. This information lets security teams quickly investigate and respond to the threat.
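For readers unfamiliar with the XOR obfuscation mentioned in this example, the generic technique looks like the sketch below. The article does not describe the key or scheme used by svrwsc.exe, so the single-byte key and the placeholder host name are purely illustrative assumptions.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


KEY = b"\x5a"                       # hypothetical single-byte key
plaintext = b"c2.example.invalid"   # placeholder host name, not a real indicator

hidden = xor_bytes(plaintext, KEY)     # what would be stored in the binary
recovered = xor_bytes(hidden, KEY)     # what the code computes at run time

print(hidden.hex())
print(recovered.decode("ascii"))
```

Strings hidden this way do not show up in a plain string search of the binary, which is one reason reading the decoding logic in the decompiled code helps surface indicators such as C2 addresses.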

Example 4: A Cryptocurrency Miner

In this example, Gemini 1.5 Flash analyses the decompiled code of a cryptominer called colto.exe. Note that the model receives only the decompiled code as input, with no additional metadata or context from VirusTotal. Gemini 1.5 Flash completed a thorough analysis in 12.95 seconds, identifying the malware as a cryptominer, highlighting its obfuscation techniques, and extracting key IOCs such as the file location, wallet address, download URL, and mining pool.

Example 5: Using an Agnostic Approach to Understand Legitimate Software

In this example, Gemini 1.5 Flash analyses 3DViewer2009.exe, a legitimate 3D viewer program, in 16.72 seconds. Understanding what a program does can be useful for security purposes even when it is goodware. As in the previous examples, the model receives only the decompiled code for analysis, with no extra metadata from VirusTotal such as whether the binary is digitally signed by a trusted entity. Standard malware detection systems often take this information into account, whereas Google Cloud is taking a code-centric approach.

Gemini 1.5 Flash identifies the application’s main purpose, which is to load and display 3D models, as well as the specific type of 3D data it works with (DTM). The analysis highlights the use of custom file classes for data management, the loading of configuration files, and rendering with OpenGL. This level of understanding can help security teams distinguish legitimate software from malware that tries to imitate its behaviour.

This functional-only, agnostic method of code analysis may prove especially helpful when examining digitally signed binaries, which may not necessarily receive the same level of security scrutiny as unsigned files. This creates new opportunities to spot possibly harmful activity even in software that is meant to be trusted.

Taking a Closer Look at a Zero-Hour Keylogger

This example demonstrates the real value of looking for malicious behaviour in code: it can identify threats that conventional security solutions miss. When the executable AdvProdTool.exe (87 KB) was first uploaded and analysed, it was not detected by any anti-virus engine, sandbox, or detection system on VirusTotal. Gemini 1.5 Flash, however, reveals its true nature. The model analyses the decompiled code in 4.7 seconds, identifies it as a keylogger, and even reveals the IP address and port to which stolen data is exfiltrated.

The analysis highlights how the code uses OpenSSL to establish a secure TLS connection to the IP address on port 443. Crucially, Gemini calls out the use of keyboard input capture functions and their link to data transmission over the secure channel.

This example demonstrates the ability of code analysis to detect zero-hour threats, even ones that, like this keylogger, appear to be in the early stages of development. It also highlights a key benefit of Gemini 1.5 Flash: examining the underlying functionality of code can uncover malicious intent even when it is concealed by metadata or detection evasion techniques.

Overview of Workflow

Google’s malware analysis pipeline consists of three essential stages: unpacking, decompilation, and code analysis with Gemini 1.5 Flash. The first two stages are driven by two key processes: automated unpacking and large-scale decompilation. Google Cloud uses Mandiant Backscatter, its in-house cloud-based malware analysis service, to dynamically unpack incoming binaries.

Next, a cluster of Hex-Rays Decompilers running on Google Compute Engine processes the unpacked binaries. Gemini can analyse both disassembled and decompiled code, but Google Cloud chose decompilation for this pipeline.

The deciding factor was that decompiled code is 5–10 times more concise than disassembled code, which makes it the more efficient choice given the token window limits of large language models. Finally, Gemini 1.5 Flash analyses the decompiled code.
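The effect of that 5–10x ratio on the context window can be seen with simple arithmetic. In the sketch below, the 150,000-token sample size is a made-up figure chosen for illustration, and the function names are hypothetical. The one-million-token window is the figure quoted earlier in this article.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # context size quoted earlier in the article


def disassembly_token_range(decompiled_tokens, low_ratio=5, high_ratio=10):
    """Estimate the token count of the same program as disassembly (5-10x larger)."""
    return decompiled_tokens * low_ratio, decompiled_tokens * high_ratio


def fits(tokens, window=CONTEXT_WINDOW_TOKENS):
    """True if a prompt of this many tokens fits in the context window."""
    return tokens <= window


decompiled = 150_000  # hypothetical size of one sample's decompiled code, in tokens
low, high = disassembly_token_range(decompiled)

print(f"decompiled:  {decompiled:>9,} tokens, fits: {fits(decompiled)}")
print(f"disassembly: {low:>9,} to {high:,} tokens, fits: {fits(low)} to {fits(high)}")
```

In this hypothetical case the decompiled form fits comfortably, while the disassembled form ranges from 750,000 to 1,500,000 tokens and may not fit at all. Smaller prompts also consume less of any per-minute token budget.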

By orchestrating this workflow on Google Cloud, Google can process a very large volume of binaries, including the full daily flow of over 500,000 new binaries submitted to VirusTotal.
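The three-stage flow described in this section can be sketched as a chain of pluggable functions. Everything here is a stand-in: the stub stages only mimic the shape of the real ones (Mandiant Backscatter, the Hex-Rays cluster, and the Gemini 1.5 Flash call), and none of the names below come from those products.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    report: str


def run_pipeline(binaries, unpack, decompile, analyze):
    """Run each (name, raw_bytes) pair through unpack -> decompile -> analyze."""
    results = []
    for name, raw in binaries:
        unpacked = unpack(raw)        # stage 1: recover the original binary
        source = decompile(unpacked)  # stage 2: produce C-like pseudocode
        results.append(Result(name, analyze(source)))  # stage 3: model analysis
    return results


# Stub stages so the sketch runs offline; real stages would call external services.
def stub_unpack(raw: bytes) -> bytes:
    prefix = b"PACKED:"
    return raw[len(prefix):] if raw.startswith(prefix) else raw


def stub_decompile(unpacked: bytes) -> str:
    return unpacked.decode("ascii")


def stub_analyze(source: str) -> str:
    return f"{len(source)} characters of decompiled code"


results = run_pipeline(
    [("a.exe", b"PACKED:int main(){}")],
    stub_unpack, stub_decompile, stub_analyze,
)
print(results[0])
```

Passing the stages in as callables keeps the orchestration independent of any one tool. At production scale each stage would more likely run as its own pool of workers connected by queues, rather than as the sequential loop shown here.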