Envoy is a configurable proxy that serves a prominent role in modern cloud-native projects; for example, it's used in many k8s deployments to provide inter-service communication (through Istio). In this post, I'd like to describe Envoy's extension mechanism as a case study of using WebAssembly for plugins.

Some background

Imagine a complex service-based infrastructure, in which service A has to communicate with service B. This is typically done via HTTP/REST or some RPC mechanism, but there are a lot of complex networking details to deal with: service discovery (I just want to send a message to an instance of service B, but which address/port is it on?), load balancing, retries, etc. Instead of having each service deal with this complexity, we can set up Envoy to run as a sidecar, and ask it to handle all of it. Then the actual services can focus on our business logic.

Here's a handy diagram from https://istio.io that demonstrates this (Envoy is the Proxy boxes):

[Figure: system diagram of Envoy proxies communicating for microservices]

As expected - for such a sophisticated piece of software - Envoy users frequently need to customize it in various ways for their projects. For example, we may want to define custom filters, which act as a kind of middleware for the traffic passing through the proxy.

Envoy's original approach to extensions was to support writing C++ to link custom filters with Envoy itself. This, of course, is awkward for many reasons - such as having to distribute your own Envoy binaries instead of using the standard ones. Also, the filter API was not really designed to be stable so keeping up with changes was an issue; and finally, few people like writing C++ these days.

So the Envoy team came up with an alternative approach: Lua extensions.

The Lua programming language was designed for extensions and plugins; it's a small and simple language, and its implementation is also small and simple - making it easy to embed. You can write some Lua code either directly in your configuration file or a separate file it points to, and there's an API exposed to Lua that the extension can interact with.

The Lua extension method is fully supported in Envoy and is currently in a stable state, but some folks weren't too keen on learning yet another programming language just for the sake of writing filters for their proxy. Lua is not particularly prominent in the Cloud world (which is mostly dominated by Go, Python, Java and some other languages). Therefore, the Envoy maintainers have created yet another way to extend it [1] - with WebAssembly.

WASM extensions

WASM extensions are still experimental in Envoy at the time of writing, but it's an intriguing approach and the main subject of this post. WASM elegantly solves the problems of the other extension methods as follows:

  • The WASM extension is compiled into a .wasm file that the Envoy config can point at, and is loaded dynamically at runtime. It doesn't require recompiling and distributing a custom version of Envoy.
  • The extension can use any programming language that compiles down to WASM, and that covers a lot of languages these days. Your entire service infrastructure is written in Go and you don't want to wrangle C++ or learn Lua just for the proxy filters? No problem - Go compiles to WASM and there's even an SDK to help writing Envoy filters in it.

To this end, Envoy embeds V8 as a WASM VM. All that remains is to define the interface between these WASM extension modules and Envoy itself.

The Proxy-Wasm ABI

WebAssembly itself defines:

  • A bytecode format (with an equivalent text format) and its execution semantics
  • A way for WASM modules to export functions and data to the host environment
  • A way for WASM modules to import functions and data from the host environment

And that's about it. Everything else is left to the specific system implementer to figure out. Moreover, the data types WASM supports are very limited - essentially fixed-width integers and floats; users are expected to build their own higher-level data structures on top of these using addresses into WASM's linear heap memory, if needed.
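
To make the "addresses into linear memory" idea concrete, here's a minimal sketch in Go. It is not WASM code; the linearMemory type and its methods are hypothetical names that model linear memory as a byte slice, and show how a string turns into the (pointer, length) pair of integers that can actually cross the boundary.

```go
package main

import "fmt"

// linearMemory is a stand-in for a WASM module's linear memory: a flat,
// byte-addressable array in which "pointers" are just integer offsets.
// (Hypothetical type, used only for illustration.)
type linearMemory []byte

// writeString copies s into memory starting at offset ptr and returns the
// (pointer, length) pair that would cross the WASM boundary as two i32 values.
func (m linearMemory) writeString(ptr uint32, s string) (uint32, uint32) {
	copy(m[ptr:], s)
	return ptr, uint32(len(s))
}

// readString reconstructs a string from a (pointer, length) pair.
func (m linearMemory) readString(ptr, size uint32) string {
	return string(m[ptr : ptr+size])
}

func main() {
	mem := make(linearMemory, 64)
	ptr, size := mem.writeString(16, "user-agent")
	fmt.Println(ptr, size, mem.readString(ptr, size))
}
```

Both sides have to agree on this convention out of band; nothing in WASM itself says that two integers form a string.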

In a previous post I've talked about WASI - an API and ABI that enables OS-like functionality in WASM code. While WASI is useful for exposing WASM modules to the outside world in a vetted way, it's somewhat limited for complex host-wasm interactions, because at the moment the only way for this to happen is via interfaces like stdin/stdout [2].

Therefore, systems that require sophisticated interactions between the host and WASM extensions are left to define their own interfaces. Which is exactly what the Envoy developers ended up creating: the Proxy-Wasm ABI [3].

The ABI is fairly low level, and it has two parts. The first is "Functions implemented in the WASM module": functions exported from the WASM module (the custom extension) and imported by the host (Envoy or another proxy). For example, proxy_on_request_headers is exported by the WASM module as a callback to handle headers for HTTP requests sailing through the proxy.

This is the signature of proxy_on_request_headers:

params:
    i32 (uint32_t) context_id
    i32 (size_t) num_headers
    i32 (bool) end_of_stream
returns:
    i32 (proxy_action_t) next_action

The import is done in the proxy-wasm-cpp-host project, which is a dependency of Envoy. This project implements the host side of Proxy-Wasm for C++ hosts.

What should the extension do within proxy_on_request_headers, though? It can do things like ask Envoy about the actual HTTP headers it sees, using proxy_get_header_map_value. This function belongs to the second part of the ABI, "Functions implemented in the host environment". Its signature is:

params:
    i32 (proxy_map_type_t) map_type
    i32 (const char*) key_data
    i32 (size_t) key_size
    i32 (const char**) return_value_data
    i32 (size_t*) return_value_size
returns:
    i32 (proxy_result_t) call_result

As you can see this is a very low level ABI; all parameters are either pointers (addresses in WASM's linear memory) or constants of predefined types. Since WASM severely restricts the types of function parameters and return values, and both the WASM module and the host can be implemented in very diverse programming languages, there's not much choice here. Writing the glue code on the WASM-host interface is tedious and low-level.
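
To show what this style of call involves, here's a toy model in Go. It's only a sketch under several assumptions: the host type, its bump allocator, the status values and the lookup helper are all hypothetical, the map_type parameter is omitted, and linear memory is modeled as a byte slice. What it does illustrate is the shape of the call: the key goes in as (pointer, size), and the result comes back through two out-pointers.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Status codes for this sketch only; the names are illustrative.
const (
	statusOK       uint32 = 0
	statusNotFound uint32 = 1
)

// host is a toy stand-in for the proxy. It can read and write the module's
// linear memory, which is modeled here as a plain byte slice.
type host struct {
	mem     []byte
	next    uint32            // cursor of a hypothetical bump allocator
	headers map[string]string // request headers known to the host
}

// alloc reserves n bytes in linear memory and returns their offset.
func (h *host) alloc(n uint32) uint32 {
	p := h.next
	h.next += n
	return p
}

// getHeaderValue mimics the shape of the signature above: the key arrives as
// (pointer, size) and the result is returned through two out-pointers.
func (h *host) getHeaderValue(keyPtr, keySize, retDataPtr, retSizePtr uint32) uint32 {
	key := string(h.mem[keyPtr : keyPtr+keySize])
	val, ok := h.headers[key]
	if !ok {
		return statusNotFound
	}
	valPtr := h.alloc(uint32(len(val)))
	copy(h.mem[valPtr:], val)
	// Store the value's address and length where the caller asked for them.
	binary.LittleEndian.PutUint32(h.mem[retDataPtr:], valPtr)
	binary.LittleEndian.PutUint32(h.mem[retSizePtr:], uint32(len(val)))
	return statusOK
}

// lookup plays the module's side of the call: it places the key in memory,
// reserves two 4-byte slots for the out-parameters, and decodes the result.
func lookup(h *host, key string) (string, bool) {
	keyPtr := h.alloc(uint32(len(key)))
	copy(h.mem[keyPtr:], key)
	retDataPtr, retSizePtr := h.alloc(4), h.alloc(4)
	if h.getHeaderValue(keyPtr, uint32(len(key)), retDataPtr, retSizePtr) != statusOK {
		return "", false
	}
	valPtr := binary.LittleEndian.Uint32(h.mem[retDataPtr:])
	valSize := binary.LittleEndian.Uint32(h.mem[retSizePtr:])
	return string(h.mem[valPtr : valPtr+valSize]), true
}

func main() {
	h := &host{mem: make([]byte, 256), headers: map[string]string{":path": "/index.html"}}
	fmt.Println(lookup(h, ":path"))
}
```

The lookup helper is the kind of glue an SDK would write once, so that extension authors only ever see a function taking a string and returning a string.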

This is where the high-level SDKs come in.

The Go SDK for Proxy-wasm

Suppose we're writing our Envoy extension module in Go (a reasonable choice given the dominance of Go in the Cloud Native / k8s / Istio ecosystem). Working directly against the raw ABI, even a simple extension that snoops on all the HTTP traffic going through the proxy and logs the HTTP headers looks like quite a bit of work.

Luckily, the good folks at Tetrate created the Go SDK for Proxy-Wasm. This SDK handles all the Proxy-Wasm ABI mechanics and presents a clean, pure Go API, so extension writers don't have to worry about low-level WASM details.

Here's how the task of "snoop on HTTP traffic and log headers" looks using the Go SDK:

func (ctx *httpHeaders) OnHttpRequestHeaders(numHeaders int, endOfStream bool) types.Action {
  hs, err := proxywasm.GetHttpRequestHeaders()
  if err != nil {
    proxywasm.LogCriticalf("failed to get request headers: %v", err)
  }

  for _, h := range hs {
    proxywasm.LogInfof("request header --> %s: %s", h[0], h[1])
  }
  return types.ActionContinue
}

Let's explore how both sides of the ABI (host-implemented and module-implemented) are handled by the Go SDK. Starting with the WASM-calls-host side, this is proxywasm.GetHttpRequestHeaders:

func GetHttpRequestHeaders() ([][2]string, error) {
  return getMap(internal.MapTypeHttpRequestHeaders)
}

It's just a wrapper around a more general getMap function with a map type that the ABI defines. The return type is a slice of 2-element arrays (key/value).

func getMap(mapType internal.MapType) ([][2]string, error) {
  var rvs int
  var raw *byte

  st := internal.ProxyGetHeaderMapPairs(mapType, &raw, &rvs)
  if st != internal.StatusOK {
    return nil, internal.StatusToError(st)
  } else if raw == nil {
    return nil, types.ErrorStatusNotFound
  }

  bs := internal.RawBytePtrToByteSlice(raw, rvs)
  return internal.DeserializeMap(bs), nil
}

internal.ProxyGetHeaderMapPairs is actually an ABI-defined function that's imported from the host (as proxy_get_header_map_pairs). It writes raw pointers to its output parameters, so the rest of getMap deals with converting those into Go data types.
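
To give a feel for what a step like DeserializeMap has to do, here's a self-contained sketch. The byte layout is an assumption made for this example (a count, then a table of key/value lengths, then NUL-terminated key and value bytes, with all integers as little-endian uint32), and serializeMap and deserializeMap are hypothetical helpers rather than the SDK's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendU32 appends v to buf as a little-endian uint32.
func appendU32(buf []byte, v uint32) []byte {
	var tmp [4]byte
	binary.LittleEndian.PutUint32(tmp[:], v)
	return append(buf, tmp[:]...)
}

// serializeMap encodes pairs using the layout assumed in this sketch:
// [count] [keyLen valLen]... [key NUL value NUL]...
func serializeMap(pairs [][2]string) []byte {
	buf := appendU32(nil, uint32(len(pairs)))
	for _, p := range pairs {
		buf = appendU32(buf, uint32(len(p[0])))
		buf = appendU32(buf, uint32(len(p[1])))
	}
	for _, p := range pairs {
		buf = append(buf, p[0]...)
		buf = append(buf, 0)
		buf = append(buf, p[1]...)
		buf = append(buf, 0)
	}
	return buf
}

// deserializeMap is the inverse of serializeMap. It performs no bounds
// checking, so it is only safe on well-formed input.
func deserializeMap(bs []byte) [][2]string {
	count := int(binary.LittleEndian.Uint32(bs[0:4]))
	sizes := 4         // offset of the next (keyLen, valLen) entry
	data := 4 + count*8 // offset of the next key's bytes
	pairs := make([][2]string, 0, count)
	for i := 0; i < count; i++ {
		keyLen := int(binary.LittleEndian.Uint32(bs[sizes:]))
		valLen := int(binary.LittleEndian.Uint32(bs[sizes+4:]))
		sizes += 8
		key := string(bs[data : data+keyLen])
		data += keyLen + 1 // skip the NUL terminator
		val := string(bs[data : data+valLen])
		data += valLen + 1
		pairs = append(pairs, [2]string{key, val})
	}
	return pairs
}

func main() {
	raw := serializeMap([][2]string{{":method", "GET"}, {"accept", "*/*"}})
	for _, h := range deserializeMap(raw) {
		fmt.Printf("%s: %s\n", h[0], h[1])
	}
}
```

Whatever the exact layout, the point is that a whole header map crosses the boundary as a single blob of bytes, and each side needs code to pack and unpack it.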

On the host side, proxy_get_header_map_pairs is mapped to a C++ function in this file.

Now the host-calls-WASM side. The Go SDK has the following function:

//export proxy_on_request_headers
func proxyOnRequestHeaders(contextID uint32, numHeaders int, endOfStream bool) types.Action {
  if recordTiming {
    defer logTiming("proxyOnRequestHeaders", time.Now())
  }
  ctx, ok := currentState.httpContexts[contextID]
  if !ok {
    panic("invalid context on proxy_on_request_headers")
  }

  currentState.setActiveContextID(contextID)
  return ctx.OnHttpRequestHeaders(numHeaders, endOfStream)
}

Note the //export annotation that tells the compiler to export this function from the WASM module. To be clear, the entire SDK - along with our custom code - gets compiled into a .wasm file that the host loads, and the //export tag makes the Go compiler place this function in the WASM function export table that the host has access to.

Once the host invokes it, it calls the OnHttpRequestHeaders method on the context, which is user-defined as shown above. Hopefully this example gives a taste of what the SDK does for us - it provides a higher-level, language-idiomatic API on top of a low-level, language-agnostic ABI.
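
The dispatch pattern itself is easy to reproduce outside of WASM. Here's a minimal sketch with hypothetical names (action, httpContext, contexts, countingFilter): a low-level entry point receives a numeric context ID, looks up the user-defined object registered for it, and forwards the call to an idiomatic Go method.

```go
package main

import "fmt"

// action mirrors the idea of an ABI-level return code (illustrative values).
type action uint32

const (
	actionContinue action = 0
	actionPause    action = 1
)

// httpContext is the high-level interface that extension authors implement.
type httpContext interface {
	OnHttpRequestHeaders(numHeaders int, endOfStream bool) action
}

// contexts maps the numeric IDs chosen by the host to user-defined contexts.
var contexts = map[uint32]httpContext{}

// onRequestHeaders is the low-level entry point. In a real module this is the
// function that would be exported to the host; here it is an ordinary function.
func onRequestHeaders(contextID uint32, numHeaders int, endOfStream bool) action {
	ctx, ok := contexts[contextID]
	if !ok {
		panic("invalid context")
	}
	return ctx.OnHttpRequestHeaders(numHeaders, endOfStream)
}

// countingFilter is a user-defined context that counts the headers it sees.
type countingFilter struct{ seen int }

func (f *countingFilter) OnHttpRequestHeaders(numHeaders int, endOfStream bool) action {
	f.seen += numHeaders
	return actionContinue
}

func main() {
	f := &countingFilter{}
	contexts[7] = f
	fmt.Println(onRequestHeaders(7, 3, false), f.seen)
}
```

The integer ID is what lets the host refer to "the same request" across several callbacks without ever holding a Go pointer.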

The Go SDK is just an example; there are other SDKs that exist for developing WASM extensions for Envoy - for example in Rust or in C++.

One small wrinkle in this story is that the Go SDK only supports the TinyGo compiler at this time, not the default Go toolchain. This is because the default toolchain doesn't have sufficient WASM support yet, but the situation is changing: Go 1.21 added WASI support, and work is ongoing on additional features that should make it possible to develop Envoy extensions using the standard toolchain.

Fundamental plugin concepts in this case study

Let's see how this case study of Envoy extensions with WASM measures against the Fundamental plugin concepts that were covered several times on this blog.

Discovery

Envoy "discovers" available extensions trivially, because they have to be explicitly specified in its configuration file. The config file lists the extensions and where to find them; for WASM, this could be either a local .wasm file or a URL pointing to a file stored remotely (e.g. some cloud storage bucket).

Registration

The WASM extension registers functionality with Envoy by exporting certain functions from the WASM module. When Envoy loads an extension, it scans the list of exported functions for known names. For example, if the extension exports proxy_on_request_headers, Envoy will call it for HTTP headers. If the extension doesn't export such a function, Envoy will assume it's not interested in this particular callback.
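
Registration by export name can be sketched in a few lines. This is a simplified model, not Envoy's code: knownCallbacks is an illustrative subset of callback names and supportedCallbacks is a hypothetical helper that takes a module's export names as plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// knownCallbacks is an illustrative subset of the callback names a host
// might look for; the real ABI defines many more.
var knownCallbacks = map[string]bool{
	"proxy_on_request_headers":  true,
	"proxy_on_response_headers": true,
}

// supportedCallbacks returns the known callbacks found among a module's
// exports, sorted by name. Unknown exports are ignored.
func supportedCallbacks(exports []string) []string {
	var found []string
	for _, name := range exports {
		if knownCallbacks[name] {
			found = append(found, name)
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	exports := []string{"memory", "proxy_on_request_headers", "helper"}
	fmt.Println(supportedCallbacks(exports))
}
```

A callback that isn't in the result is simply never invoked, which matches the "not interested" behavior described above.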

Another interesting example of how this functionality is used is the proxy_abi_version_X_Y_Z function. An extension will export this function with an actual ABI version replacing X, Y and Z. Envoy will look for a function with the proxy_abi_version_* prefix, and from its name will determine which version of the ABI the WASM module was compiled against.
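
Here's a minimal sketch of how a host could recover the version from an export name. abiVersion is a hypothetical helper, and the parsing rules (exactly three underscore-separated non-negative integers after the prefix) are an assumption for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const abiPrefix = "proxy_abi_version_"

// abiVersion scans export names for one of the form proxy_abi_version_X_Y_Z
// and returns {X, Y, Z}. The boolean is false if no such export is found.
func abiVersion(exports []string) ([3]int, bool) {
	for _, name := range exports {
		if !strings.HasPrefix(name, abiPrefix) {
			continue
		}
		parts := strings.Split(strings.TrimPrefix(name, abiPrefix), "_")
		if len(parts) != 3 {
			continue
		}
		var v [3]int
		valid := true
		for i, p := range parts {
			n, err := strconv.Atoi(p)
			if err != nil || n < 0 {
				valid = false
				break
			}
			v[i] = n
		}
		if valid {
			return v, true
		}
	}
	return [3]int{}, false
}

func main() {
	exports := []string{"proxy_on_request_headers", "proxy_abi_version_0_2_1"}
	fmt.Println(abiVersion(exports))
}
```

Encoding the version in a function name is a neat trick: the host can learn it by inspecting the export table, without calling into the module at all.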

Hooks

This is mostly covered in the previous section. There are multiple callbacks a WASM extension can register by exporting them from the WASM module; proxy_on_request_headers is one example out of many defined in the ABI.

Exposing an application API to plugins

This is the "Functions implemented in the host environment" part of the Proxy-Wasm ABI; we've seen two examples already - proxy_get_header_map_value and proxy_get_header_map_pairs. The ABI defines others, like proxy_log for emitting log messages to Envoy's log. These functions let extensions call into Envoy.

Conclusion

As you can see from the string of posts this year, I'm pretty excited about the non-browser uses of WASM, particularly in the area of plugins. The FAAS post presented one interesting possibility - using the current (limited but functional) WASI for the host/plugin interface.

What this post shows is a case study of a much more advanced extension system; the capabilities and performance requirements of custom network filter plugins are just way beyond what WASI can provide, so the Envoy developers ended up creating their own ABI. It's fascinating to study how such an ABI affects plugin development and what kind of ecosystem it spawns.


[1]Note that I'm not trying to criticize the existing extension mechanisms in Envoy in any way. Both work, and are used to solve real business problems. As a project like Envoy grows in popularity and usage, it's inevitable that it will spawn more options for different people to accomplish their tasks with it. Such is the way of software.
[2]The WASI folks are working on extensions to allow sockets and also more complex data to be shared between WASM and hosts in an RPC-like manner; this may enable greatly improved wasm-host interfaces in the future.
[3]This all sounds great - the way things should be - until reality kicks in. While doing research for this post I discovered that the Proxy-Wasm ABI, while clearly and carefully specified, is in fact out of date, and the "real" definition lives within the Envoy source code. It's yet another case of "the ABI is whatever its main implementation does", even though other proxies implement it already (MOSN for example).

This happens often, especially in my favorite domain - systems programming. Sigh, such is life. The rest of the post talks about the de-facto specification, relying on the Envoy source code more than the written-down ABI. Hopefully at some future point the ABI is updated and I can rewrite this footnote.

A shout out to Adrian Cole and Takeshi Yoneda for confirming these findings, and the useful chats about all things WASM, WASI and Go in general.