Service Extensions Plugins and Application Load Balancers

A summary

Google Cloud has introduced a new feature for its Application Load Balancers: Service Extensions plugins. These plugins make web application delivery more programmable by letting users run custom code directly in the load balancer's request and response path. Because they are built on WebAssembly (Wasm), the plugins offer sandboxed execution, support for multiple programming languages, and near-native execution speed. Use cases include header manipulation, security policy enforcement, custom logging, and HTML rewriting. The plugins run in a fully managed environment that offers scalability and low latency.

Run your own code at the edge with Application Load Balancers

Application Load Balancers are essential for reliable web application delivery on Google Cloud. Although Google Cloud's load balancers already offer a significant level of customization, some situations call for even greater programmability.

Google Cloud recently made Service Extensions plugins for Application Load Balancers available in Preview. You can now run your own code directly in the request/response path, in a fully managed Google environment with optimal latency, to tailor load balancers to your business requirements. Simply provide the code, and Google Cloud handles the rest. If you would rather manage the compute yourself for heavier workloads, consider Service Extensions callouts, which are now generally available (GA) for Application Load Balancers.

Use cases for Service Extensions plugins

Service Extensions plugins cover the following use cases:

Header addition: Create new headers relevant to your applications or specific customers, or add extra headers to requests and responses.

Header manipulation: Rewrite request and response headers, or override client headers, as they travel to the backend or back to the client.

Security: Implement complex security policies in your plugin, such as custom token authentication, and make decisions based on client requests or response headers.

Custom logging: Add user-defined headers or custom data to Cloud Logging.

Exception handling: Direct clients to a customized error page for certain response types.

HTML rewriting: Rewrite HTML from your origin to add Google Analytics tagging or integrate Google reCAPTCHA.
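
As a rough illustration of the security, header, and HTML rewriting use cases above, the sketch below models the logic in plain Rust using only the standard library. Everything in it is an assumption made for illustration: the header names (x-api-token, x-request-source, x-plugin-processed), the Decision type, and the function names are hypothetical and do not come from Service Extensions or any SDK. In a real plugin, similar logic would sit behind the callbacks of a Proxy-Wasm SDK rather than operate on a local map.

```rust
use std::collections::BTreeMap;

/// Header map keyed by lower-cased header name. A stand-in for whatever
/// header API a plugin SDK exposes; not part of any real library.
type Headers = BTreeMap<String, String>;

/// Outcome of the request-side checks (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    /// Forward the request to the backend with these headers.
    Forward(Headers),
    /// Stop here and answer the client with this HTTP status code.
    Reject(u16),
}

/// Applies three use cases to one request: token authentication,
/// overriding a client-supplied header, and adding a new header.
fn process_request(mut headers: Headers, expected_token: &str) -> Decision {
    // Security: require a matching token in a custom header.
    // (Plain equality keeps the sketch short; it is not constant-time.)
    let authorized = headers
        .get("x-api-token")
        .map_or(false, |token| token.as_str() == expected_token);
    if !authorized {
        return Decision::Reject(403);
    }
    // Do not pass the token on to the backend.
    headers.remove("x-api-token");
    // Header manipulation: overwrite whatever value the client sent.
    headers.insert("x-request-source".to_string(), "load-balancer".to_string());
    // Header addition: mark the request as processed by the plugin.
    headers.insert("x-plugin-processed".to_string(), "true".to_string());
    Decision::Forward(headers)
}

/// HTML rewriting: inserts `snippet` just before the first `</head>`.
/// Assumes a lower-case closing tag; the input is returned unchanged
/// when no such tag is found.
fn inject_tag(html: &str, snippet: &str) -> String {
    html.replacen("</head>", &format!("{}</head>", snippet), 1)
}
```

Calling process_request with a matching token yields Decision::Forward with the rewritten headers, while a missing or wrong token yields Decision::Reject(403). Returning a decision value instead of sending a response directly keeps the logic easy to test outside a load balancer.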

Where can I run my code?

Service Extensions run in the request and response path at the edge of Google's globally distributed network. Service Extensions plugins are now supported on the global external Application Load Balancer as part of the traffic extension, which runs after Cloud CDN and Cloud Armor but before traffic reaches the backend. Support for Service Extensions with Cloud CDN is planned for a future release. The cross-region internal load balancer also supports Service Extensions plugins as part of the route and traffic extensions.

Service Extensions plugin architecture

Service Extensions plugins focus on lightweight compute operations that run as part of the Application Load Balancer's request/response flow. Plugins are built on WebAssembly (Wasm), which offers the following benefits:

Near-native execution speed and millisecond startup time
Support for a wide range of programming languages, such as C++ and Rust
Cross-platform portability, so you can test a plugin locally or run the same plugin across different deployments
Security protections, such as executing plugin logic in a sandbox

Service Extensions plugins use Proxy-Wasm, a Google-supported open source project that gives Wasm modules a standard API for interfacing with network proxies.
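
To make the idea of a standard plugin-to-proxy interface more concrete, here is a minimal Rust sketch of a proxy invoking plugin callbacks at fixed points in the request/response flow. The HttpPlugin trait, the serve function, and the Response and HeaderList types are hypothetical stand-ins written for this example. They are not the Proxy-Wasm API, which defines its own callbacks and host calls, and the example plugin (which swaps 5xx responses for a custom error page) is only an assumed use case.

```rust
/// Simplified request headers. Real plugins read this data through host
/// calls instead of owning it; this alias is illustrative only.
type HeaderList = Vec<(String, String)>;

/// Simplified HTTP response (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// Hypothetical callback interface: the proxy calls these hooks at fixed
/// points in the request/response flow.
trait HttpPlugin {
    fn on_request_headers(&mut self, headers: &mut HeaderList);
    fn on_response(&mut self, response: &mut Response);
}

/// Example plugin: tags requests and swaps 5xx responses for a custom page.
struct ErrorPagePlugin {
    /// Number of responses rewritten so far.
    rewritten: u32,
}

impl HttpPlugin for ErrorPagePlugin {
    fn on_request_headers(&mut self, headers: &mut HeaderList) {
        headers.push(("x-handled-by".to_string(), "error-page-plugin".to_string()));
    }

    fn on_response(&mut self, response: &mut Response) {
        if response.status >= 500 {
            // Keep the status code but hide backend details from the client.
            response.body = "<h1>Something went wrong</h1>".to_string();
            self.rewritten += 1;
        }
    }
}

/// Stand-in for the proxy: runs the request hook, calls the backend, then
/// runs the response hook before replying to the client.
fn serve<P, B>(plugin: &mut P, mut headers: HeaderList, backend: B) -> Response
where
    P: HttpPlugin,
    B: Fn(&HeaderList) -> Response,
{
    plugin.on_request_headers(&mut headers);
    let mut response = backend(&headers);
    plugin.on_response(&mut response);
    response
}
```

Because the plugin only sees the data handed to its callbacks, the same plugin code can be exercised locally by passing a closure as the backend, which mirrors the local-testing benefit listed earlier.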

To run Service Extensions plugins, Google built a compute platform that, much like its load balancers, is scalable and massively multi-tenant. Plugins are fully managed, with dynamic sharding and autoscaling to meet traffic demand. This design makes the following possible:

Scalability: The platform can scale out to a large number of Wasm hosts as needed when traffic patterns change.

Low latency: Because there are no additional proxies between the load balancer and the Wasm hosts, this proxyless, serverless design allows for more latency-optimized paths.