justahero.de - The Magic behind Extractors & Handlers

The Rust web frameworks axum and actix-web and the game engine bevy use a language pattern that lets you register seemingly any function, regardless of its parameters, as a handler. This pattern is sometimes known as magic function parameters. Rest assured it's not magic, but rather a neat use of traits and Rust's static type system.

The main goal of this article is to build a good understanding of how this pattern works and what its building blocks are, and to show how you can implement it for your own purposes.

Example program

Let's start with the Hello World example from axum to see what the pattern looks like in practice. The following code example starts an HTTP web service with a single API route.

use axum::{response::Html, routing::get, Router};

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(handler));
    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    axum::serve(listener, app).await.unwrap();
}

async fn handler() -> Html<&'static str> {
    Html("<h1>Hello, World!</h1>")
}

Even though this example is fairly small due to the ergonomic API, there is a lot going on. First, a single endpoint is registered for the root path /, then a TcpListener is bound to localhost on port 3000, and finally the server is started. A request to the running HTTP server at 127.0.0.1:3000 calls the handler function, which returns a simple HTML page with the content Hello, World!. The interesting part for our purpose is how the function handler is registered in the Router.

The call of get(handler) is a shortcut for on(MethodFilter::GET, handler) which itself is expanded to:

MethodRouter::new().on(MethodFilter::GET, handler)

The MethodRouter is a type that stores separate handler functions associated with different HTTP methods (e.g. GET, POST, PUT) for the same API endpoint path, e.g. /. The elegant part is that handler functions can have completely different function parameters.

Motivation

The Rust language does not provide support for variadic arguments (ignoring macros), meaning the set of function parameters is fixed. Rust also does not support method overloading as Java does, where different methods with the same name can have different sets of parameters. Rust does not allow two functions in the same scope or impl block to share a name, which avoids any name collision.

To implement the pattern that axum and actix-web employ, let's build it step by step, first by starting with an example of what our API could look like, then adding concepts until the result matches the pattern. The code below is a minimal example of such an API:

fn handler(...) {
    // TODO:
}

fn extract_i32(value: i32) {
    println!("Value of i32 is {value}");
}

fn main() {
    handler(extract_i32);
}

In the example above the function extract_i32 is provided as an argument to the handler function, which somehow should accept it and later use it. The idea is to allow functions without fixed parameters to be passed in. Please note this code does not compile, as we have left a few things out for now.

Traits & functions

One important Rust feature we need to explain first is traits. A trait in Rust is roughly equivalent to an interface in Java. It abstracts some functionality over a type. Traits are used to share behavior between separate types. A trait has one or more methods associated with it, and it can also have associated types & type arguments, etc. For a more thorough introduction to traits check the chapter Traits: Defining Shared Behavior of the Rust Programming Language book or check out Rob Ede's excellent talk on YouTube from Rust Nation UK (2023), explaining different aspects of traits & actix-web's extractors.

What is the least we can do to implement the handler example above to satisfy the compiler? A naive approach is to set the function parameter in the signature of handler directly to the same type of the extract_i32 function, as given in the code below:

fn handler(func: fn(i32)) {
    func(42);
}

fn extract_i32(value: i32) {
    println!("Value of i32 is {value}");
}

fn main() {
    handler(extract_i32);
}

The code compiles and prints the output "Value of i32 is 42". The type fn(i32) in the handler function is a function pointer. This works well when the passed argument is itself a function (as defined above in extract_i32), but does not necessarily work for closures. A closure can have the same argument types, but may capture its environment. Let's see the difference between two kinds of closures.

fn main() {
    // this closure does not capture its environment and can be passed
    handler(|x| { let _ = x * 2; });
    // this closure captures its environment and cannot be passed.
    let y = 2;
    handler(|x| { let _ = x + y; });
}

The program does not compile. The first closure can be passed, because it does not capture its environment and therefore can be coerced into a function pointer to match the expected type fn(i32). The closure in the second handler call cannot be coerced, as it captures the variable y from its environment. The Rust compiler prints the following error:

13 |     handler(|x| { let _ = x + y; });
   |     ------- ^^^^^^^^^^^^^^^^^^^^^^ expected fn pointer, found closure
   |     |
   |     arguments to this function are incorrect
   |
   = note: expected fn pointer `fn(i32)`
                 found closure `{closure@src/main.rs:13:18: 13:21}`
note: closures can only be coerced to `fn` types if they do not capture any variables
  --> src/main.rs:13:36
   |
13 |     handler(|x| { let _ = x + y; });
   |                               ^ `y` captured here
note: function defined here
  --> src/main.rs:1:4
   |
1  | fn handler(func: fn(i32)) {
   |    ^^^^^^^ -------------

Let's refactor the signature of handler to alleviate this issue using the appropriate Fn definition:

fn handler(function: impl Fn(i32)) {
    function(123);
}

fn main() {
    handler(|x| { let _ = x * 2; });
    let y = 2;
    handler(|x| { let _ = x + y; });
}

With this change the same code compiles, accepting both closures. The Fn type is a trait itself. One property of it is that the Fn trait is implemented automatically for closures (with some restrictions) and function pointers that have the same function parameters. In this sense it's more flexible than fn(i32).
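To make the flexibility of the Fn trait concrete, here is a minimal sketch that passes four different kinds of callables to the same function. The helper apply, the function double and the i32 return value are assumptions added for illustration, so that the result of each call can be observed; they are not part of the handler function from this article.

```rust
// Hypothetical helper: accepts anything that implements `Fn(i32) -> i32`.
fn apply(function: impl Fn(i32) -> i32) -> i32 {
    function(10)
}

fn double(value: i32) -> i32 {
    value * 2
}

fn main() {
    let offset = 5;
    // A plain function, passed by name.
    println!("{}", apply(double));
    // The same function as an explicit function pointer.
    println!("{}", apply(double as fn(i32) -> i32));
    // A closure that captures nothing.
    println!("{}", apply(|x| x + 1));
    // A closure that borrows `offset` from its environment.
    println!("{}", apply(|x| x + offset));
}
```

All four calls type-check against the same impl Fn bound, whereas a parameter of type fn(i32) -> i32 would reject the last one.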

What happens when a new function with a different function signature is added that we also want to pass to the handler function as an argument? Let's add a new function named extract_f32 with a parameter type f32:

fn extract_f32(value: f32) {
    println!("Value of f32 is {value}")
}

fn main() {
    handler(extract_i32);
    handler(extract_f32); // Compile error
}

The updated program will not compile, because the function signature of extract_f32 is different from what the handler function expects: the argument does not match Fn(i32). The signature of handler needs to change in order to accept both functions. As a first step the handler function will be refactored to illustrate a more appropriate way. For now the call to handler(extract_f32) will be commented out. After the refactoring we'll come back to it.

The Handler

Until now the handler function accepted a function with the parameter i32. Instead of using the Fn trait, we'll introduce a new trait, named Handler. The name is chosen to reflect the name given in axum and actix-web for the similar purpose. First we define the new trait as:

pub trait Handler {
    fn call(&self);
}

The Handler trait has a single method, call. Please note that the trait currently does not have any generic type arguments; these will be added later. Let's update the handler function accordingly to accept the new trait instead:

pub trait Handler {
    fn call(&self);
}

fn handler(handler: impl Handler) {
    handler.call();
}

Instead of the former Fn(i32) trait, the Handler trait is defined as a parameter, therefore the function accepts a type that implements that trait. Compiling the code produces the following error:

18 |     handler(extract_i32);
   |     ------- ^^^^^^^^^ the trait `Handler` is not implemented for fn item `fn(i32) {extract_i32}`
   |     |
   |     required by a bound introduced by this call
   |
help: this trait has no implementations, consider adding one

This signals that the function item extract_i32 does not implement the trait Handler, therefore handler cannot accept it. Interestingly, Rust allows us to implement a trait for the function pointer type fn(i32) directly. This is basically an extension trait for the fn(i32) type. The implementation of the trait for fn(i32) looks as follows:

impl Handler for fn(i32) {
    fn call(&self) {
        self(123)
    }
}

The program still does not compile and displays the same error as before, but provides a suggestion:

   |
11 | fn handler(handler: impl Handler) {
   |                          ^^^^^^^ required by this bound in `handler`
help: the trait `Handler` is implemented for fn pointer `fn(i32)`, try casting using `as`
   |
24 |     handler(extract_i32 as fn(i32));
   |                         ++++++++++

When we update the call to:

fn main() {
    handler(extract_i32 as fn(i32));
    // handler(extract_f32);
}

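the code now compiles and prints Value of i32 is 123. Strictly speaking, extract_i32 is a function item with its own unique type, written fn(i32) {extract_i32} in the error message above. It has to be cast to the function pointer type fn(i32) via as in order for the handler function to find the matching Handler implementation.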
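A function name and a function pointer are different types in Rust, which is why the cast is needed. The following minimal sketch makes the difference visible by comparing their sizes. The helper functions item_size and pointer_size are hypothetical names introduced only for this illustration.

```rust
use std::mem::size_of_val;

fn extract_i32(value: i32) {
    println!("Value of i32 is {value}");
}

// Size in bytes of the function item `extract_i32`, a unique zero-sized type.
fn item_size() -> usize {
    size_of_val(&extract_i32)
}

// Size in bytes of the same function after casting it to a function pointer.
fn pointer_size() -> usize {
    let pointer = extract_i32 as fn(i32);
    size_of_val(&pointer)
}

fn main() {
    println!("function item: {} bytes", item_size());
    println!("function pointer: {} bytes", pointer_size());
}
```

The function item carries no data at all, because its type alone identifies the function, while the function pointer has the size of an address. An implementation written for fn(i32) therefore does not automatically apply to the function item.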

This brings us a step closer to specifying different implementations for different function signatures. When we re-enable the line handler(extract_f32) in main and compile the program again, it still fails to compile. We add the missing Handler implementation for the function pointer fn(f32) as well. Below is the full listing of our current program:

pub trait Handler {
    fn call(&self);
}

impl Handler for fn(i32) {
    fn call(&self) {
        self(123);
    }
}

impl Handler for fn(f32) {
    fn call(&self) {
        self(1.23);
    }
}

fn handler(handler: impl Handler) {
    handler.call();
}

fn extract_f32(value: f32) {
    println!("Value of f32 is {value}");
}

fn extract_i32(value: i32) {
    println!("Value of i32 is {value}");
}

fn main() {
    handler(extract_i32 as fn(i32));
    handler(extract_f32 as fn(f32));
}

The next useful step is to find a way to eliminate the function pointer casts & prepare the Handler trait to be more flexible. Let's update the implementations to implement Handler for functions of the form Fn(T):

pub trait Handler {
    fn call(&self);
}

impl<F: Fn(i32)> Handler for F {
    fn call(&self) {
        self(123);
    }
}

impl<F: Fn(f32)> Handler for F {
    fn call(&self) {
        self(1.23);
    }
}

The code above results in the following compile error:

   |
5  | impl<F: Fn(i32)> Handler for F {
   | ------------------------------ first implementation here
...
11 | impl<F: Fn(f32)> Handler for F {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation

This means there are two conflicting implementations of the Handler trait for the generic type parameter F. As far as the compiler is concerned, nothing prevents a single type from implementing both Fn(i32) and Fn(f32), in which case both implementations would apply to it. The compiler therefore rejects the overlap, even though the implementations are written for different function signatures.

To solve this situation we add a generic type parameter to the Handler trait itself. By providing distinct types for this parameter, the compiler is then able to determine the correct implementation, avoiding the conflict.

We add a generic type parameter T to the Handler trait. The updated version and its implementations change to:

pub trait Handler<T> {
    fn call(&self);
}

impl<F: Fn(i32)> Handler<i32> for F {
    fn call(&self) {
        self(123);
    }
}

impl<F: Fn(f32)> Handler<f32> for F {
    fn call(&self) {
        self(1.23);
    }
}

fn handler<T>(handler: impl Handler<T>) {
    handler.call();
}

fn main() {
    handler(extract_i32);
    handler(extract_f32);
}

This eliminates the casts to fn(i32) and fn(f32) in the main function. The updated implementations of the Handler trait are specific to their associated function signatures Fn(i32) and Fn(f32). The generic type parameter T in Handler<T> allows us to distinguish between separate implementations by providing different types. It's worth reiterating that Handler is implemented for functions of the form Fn(..), where Fn is the trait.
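One way to see why the type parameter removes the conflict: Handler<i32> and Handler<f32> are two different traits as far as the compiler is concerned, so a single type may implement both. The sketch below assumes a hypothetical unit struct Both and a call method that returns a String (instead of printing) so the chosen implementation can be observed.

```rust
pub trait Handler<T> {
    fn call(&self) -> String;
}

// Hypothetical type that implements the trait for two different `T`.
struct Both;

impl Handler<i32> for Both {
    fn call(&self) -> String {
        String::from("called as Handler<i32>")
    }
}

impl Handler<f32> for Both {
    fn call(&self) -> String {
        String::from("called as Handler<f32>")
    }
}

fn handler<T, H: Handler<T>>(handler: H) -> String {
    handler.call()
}

fn main() {
    // `handler(Both)` alone would be ambiguous, so `T` is spelled out here.
    println!("{}", handler::<i32, _>(Both));
    println!("{}", handler::<f32, _>(Both));
}
```

For extract_i32 and extract_f32 no such annotation is needed, because each function matches exactly one implementation and the compiler infers T from it.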

Unfortunately the types i32 & f32 are still hard-coded in the Handler implementations, and so are the values 123 and 1.23. Another limitation is that we would need to provide an implementation for each possible function signature. This is not really flexible, especially when more arguments are involved or the order of function parameters should not matter. It also offers no way to pass in client-side types that the library does not know about. In order to be more flexible, we need a way to capture & constrain the arguments that are passed to the handler function.

The Extractor

The Handler trait on its own is quite inflexible at the moment. The frameworks axum and actix-web use the concept of extractors, whose values are passed into the handler. An extractor is a type that knows how to extract itself, as some value or object, from a larger object. The web frameworks typically have a request type that contains information on URL, path, HTTP headers, body etc., for example HttpRequest in actix-web. Extractors are used to pick & transform information from this Request-like object into other, more specific types. Instead of dealing with the large object directly, extractors use dedicated types to contain specific data. For example some common extractors in axum are Json, Path, Query or State, all extracted or constructed from the same object. For a more detailed introduction to extractors read the Extractors chapter in the actix-web documentation.

An extractor formalizes the type that is passed into the Handler. Instead of using a Request-like object let's introduce our own custom type that represents the same idea for our existing data named Context:

pub struct Context {
    a: i32,
    b: f32,
}

The Context is our version of a complex data type (ok not too complex). It contains fields for i32 and f32. This new struct needs to work somehow with the Handler trait. As a first step we update the Handler trait to accept the Context as a function argument:

pub trait Handler<T> {
    fn call(&self, context: &Context);
}

We need to update the implementations and pass a reference to the new Context type. The updated code looks then as:

pub struct Context {
    a: i32,
    b: f32,
}

pub trait Handler<T> {
    fn call(&self, context: &Context);
}

impl<F: Fn(i32)> Handler<i32> for F {
    fn call(&self, context: &Context) {
        self(context.a);
    }
}

impl<F: Fn(f32)> Handler<f32> for F {
    fn call(&self, context: &Context) {
        self(context.b);
    }
}

fn handler<T>(context: &Context, handler: impl Handler<T>) {
    handler.call(context);
}
    
fn extract_i32(value: i32) {
    println!("Value of i32 is {value}");
}

fn extract_f32(value: f32) {
    println!("Value of f32 is {value}");
}

fn main() {
    let context = Context { a: 42, b: 1.23 };
    handler(&context, extract_i32);
    handler(&context, extract_f32);
}

This is somewhat better, because the values are not hard-coded anymore. Each Handler implementation delegates the call to a field of Context. The next step is to introduce the Extractor trait, to extract some data from Context, named FromContext in our case:

pub trait FromContext {
    fn from_context(context: &Context) -> Self;
}

The from_context method of a type that implements FromContext returns an instance of that type, for example an i32. It indicates that we extract some value or object from the given Context. Let's add implementations of FromContext for our existing types i32 and f32 as follows:

impl FromContext for i32 {
    fn from_context(context: &Context) -> Self {
        context.a
    }
}

impl FromContext for f32 {
    fn from_context(context: &Context) -> Self {
        context.b
    }
}

Both implementations are separate extractors. The code above simply delegates the calls to Context fields as before. Typically the internal logic of an extractor implementation is a lot more complex, but for our purposes it's sufficient to just return the field. For example a Json extractor would read the HTTP body and transform it into a JSON representation.

The generic type parameter T in the Handler trait serves an important purpose here. By using this parameter, we are able to support different sets of function signatures. Let's refactor the Handler implementation to require the type argument to implement the FromContext trait. Instead of having two separate implementations for i32 and f32, a single implementation then covers both:

impl<F, T> Handler<T> for F
where
    F: Fn(T),
    T: FromContext
{
    fn call(&self, context: &Context) {
        (self)(T::from_context(context))
    }
}

It may not be immediately clear what changed, therefore let's check the details of the implementation. The generic parameter F is the function the Handler is implemented for, in this case Fn(T), while the generic parameter T is required to implement FromContext. Inside the call method the value of type T is extracted from the given &Context and passed to the function. This concept is what makes handling different functions possible.
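Because the single implementation only asks for T: FromContext, a new type can take part without any change to Handler. The sketch below assumes a hypothetical client-side extractor named Sum that combines both Context fields; the name and its logic are made up for illustration.

```rust
pub struct Context {
    a: i32,
    b: f32,
}

pub trait FromContext {
    fn from_context(context: &Context) -> Self;
}

pub trait Handler<T> {
    fn call(&self, context: &Context);
}

impl<F, T> Handler<T> for F
where
    F: Fn(T),
    T: FromContext,
{
    fn call(&self, context: &Context) {
        (self)(T::from_context(context))
    }
}

// Hypothetical client-side extractor: combines both fields into one value.
pub struct Sum(pub f32);

impl FromContext for Sum {
    fn from_context(context: &Context) -> Self {
        Sum(context.a as f32 + context.b)
    }
}

fn handler<T>(context: &Context, handler: impl Handler<T>) {
    handler.call(context);
}

// A handler function that asks for the new extractor type.
fn print_sum(Sum(total): Sum) {
    println!("Sum is {total}");
}

fn main() {
    let context = Context { a: 42, b: 1.5 };
    handler(&context, print_sum);
}
```

The only code added for the new type is its FromContext implementation; the Handler implementation and the handler function stay untouched.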

One advantage of the change is that any type that implements FromContext can be used as a function argument. FromContext can also be easily implemented for more types, which is especially important for client-side code. One drawback is that the Handler implementation in the example above only works for functions with a single parameter. To allow functions that accept two parameters, for example both i32 and f32, a separate implementation of the Handler trait is required that accepts two arguments, as given in the following example:

impl<F, T1, T2> Handler<(T1, T2)> for F
where
    F: Fn(T1, T2),
    T1: FromContext,
    T2: FromContext,
{
    fn call(&self, context: &Context) {
        (self)(T1::from_context(context), T2::from_context(context))
    }
}

/// Supports a function with two arguments
fn extract_both(first: f32, second: i32) {
    println!("Both values are {first} and {second}");
}

As mentioned before, the generic type parameter T in Handler enables the compiler to distinguish between different implementations. The tuple (T1, T2) is treated as a single generic argument. To support functions with even more arguments, the number of entries in the tuple increases accordingly. The web frameworks axum and actix-web use macros to generate these implementations automatically for functions with 0 to N arguments, where every argument is required to implement the extractor trait.
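To give an idea of how such a macro might look, here is a minimal sketch for our own Handler and FromContext traits. The macro name impl_handler and the helper demo are hypothetical, and the macros in axum and actix-web differ in their details. Note one deliberate deviation from the code above: every argument list is encoded as a tuple, so a single parameter becomes Handler<(T1,)> rather than Handler<T>.

```rust
use std::cell::RefCell;

pub struct Context {
    a: i32,
    b: f32,
}

pub trait FromContext {
    fn from_context(context: &Context) -> Self;
}

impl FromContext for i32 {
    fn from_context(context: &Context) -> Self {
        context.a
    }
}

impl FromContext for f32 {
    fn from_context(context: &Context) -> Self {
        context.b
    }
}

pub trait Handler<T> {
    fn call(&self, context: &Context);
}

// Hypothetical macro: implements `Handler` for functions whose parameters
// are the listed type names, encoded as a tuple in `Handler<(..)>`.
macro_rules! impl_handler {
    ($($ty:ident),*) => {
        impl<F, $($ty,)*> Handler<($($ty,)*)> for F
        where
            F: Fn($($ty),*),
            $($ty: FromContext,)*
        {
            #[allow(unused_variables)] // `context` is unused for zero arguments
            fn call(&self, context: &Context) {
                (self)($($ty::from_context(context)),*)
            }
        }
    };
}

impl_handler!();
impl_handler!(T1);
impl_handler!(T1, T2);
impl_handler!(T1, T2, T3);

fn handler<T>(context: &Context, handler: impl Handler<T>) {
    handler.call(context);
}

// Runs handlers with zero to three parameters and records what each one saw.
fn demo() -> Vec<String> {
    let context = Context { a: 42, b: 1.5 };
    let log = RefCell::new(Vec::new());
    handler(&context, || log.borrow_mut().push(String::from("no arguments")));
    handler(&context, |a: i32| log.borrow_mut().push(format!("{a}")));
    handler(&context, |b: f32, a: i32| log.borrow_mut().push(format!("{b} {a}")));
    handler(&context, |a: i32, b: f32, c: i32| {
        log.borrow_mut().push(format!("{a} {b} {c}"))
    });
    log.into_inner()
}

fn main() {
    for line in demo() {
        println!("{line}");
    }
}
```

Each macro invocation expands to one implementation like the hand-written two-parameter version, so supporting more parameters only takes one more line.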

Combining the traits Handler and FromContext (the extractor) is what makes passing in functions with different signatures possible. Check the full code on Rust Playground.

Hopefully this sheds some light on how handlers and extractors work together. To illustrate how our Handler trait compares to other versions, let's have a look at the version in actix-web. Shown below is the Handler implementation for functions with two parameters:

impl<Func, Fut, T1, T2> Handler<(T1, T2)> for Func
where
    Func: Fn(T1, T2) -> Fut + Clone + 'static,
    Fut: Future,
{
    type Output = Fut::Output;
    type Future = Fut;

    fn call(&self, (T1, T2): (T1, T2)) -> Self::Future {
        (self)(T1, T2)
    }
}

There is an additional type parameter Fut that allows this Handler to be used in an async context, but otherwise the trait looks very similar. In this implementation the argument of call is a tuple (T1, T2) whose entries are extracted beforehand using the FromRequest trait and then passed to the call method, but the principle is the same.
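The "extract first, then call" arrangement can be sketched in a synchronous form with our own types. This is not actix-web's code: FromContext stands in for FromRequest, there is no Future involved, and the names run and describe are hypothetical helpers introduced for this illustration.

```rust
pub struct Context {
    a: i32,
    b: f32,
}

pub trait FromContext {
    fn from_context(context: &Context) -> Self;
}

impl FromContext for i32 {
    fn from_context(context: &Context) -> Self {
        context.a
    }
}

impl FromContext for f32 {
    fn from_context(context: &Context) -> Self {
        context.b
    }
}

// The extractor is also implemented for a pair, so a whole argument tuple
// can be built in one step.
impl<T1: FromContext, T2: FromContext> FromContext for (T1, T2) {
    fn from_context(context: &Context) -> Self {
        (T1::from_context(context), T2::from_context(context))
    }
}

// The handler no longer sees the `Context`, only the extracted arguments.
pub trait Handler<Args> {
    type Output;
    fn call(&self, args: Args) -> Self::Output;
}

impl<Func, R, T1, T2> Handler<(T1, T2)> for Func
where
    Func: Fn(T1, T2) -> R,
{
    type Output = R;

    fn call(&self, (first, second): (T1, T2)) -> R {
        (self)(first, second)
    }
}

// Hypothetical driver: extract the tuple first, then hand it to the handler.
fn run<H, Args>(context: &Context, handler: H) -> <H as Handler<Args>>::Output
where
    H: Handler<Args>,
    Args: FromContext,
{
    handler.call(Args::from_context(context))
}

fn describe(first: f32, second: i32) -> String {
    format!("Both values are {first} and {second}")
}

fn main() {
    let context = Context { a: 42, b: 1.5 };
    println!("{}", run(&context, describe));
}
```

With this split the Handler trait does not need to know the extractor trait at all; only the driver that calls the handler does.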

How to store handlers

The current code has one disadvantage compared to how handlers are used in axum or actix-web. The handler functions are called immediately in handler. This section explains how to store heterogeneous handler functions in the same collection.

Let's outline a new container type that registers and stores the handler functions:

#[derive(Default)]
struct HandlerContainer {
    pub list: Vec<...>,
}

impl HandlerContainer {
    pub fn register<H, T>(&mut self, handler: H)
    where
        H: Handler<T>
    {
        // TODO
    }
}

A few details have been left out, therefore the code does not compile yet. Rust requires that all elements in a Vec have to be of the same type. Defining the Vec in HandlerContainer as follows:

#[derive(Default)]
struct HandlerContainer {
    pub list: Vec<Handler<T>>,
}

won't compile, because type argument T is not declared. It results in a compiler error:

   |
51 |     pub list: Vec<Handler<T>>,
   |                           ^ not found in this scope
   |
help: you might be missing a type parameter
   |
50 | struct HandlerContainer<T> {
   |                        +++

If we added the generic type parameter T to HandlerContainer as well, it would allow us to declare different instances of the HandlerContainer for different types, for example HandlerContainer<i32> and HandlerContainer<f32>, but we still could not store mixed handlers in the same Vec.

There is a way to eliminate this restriction by storing the handlers without the specific type argument on Handler. The process of eliminating the type argument is named type erasure, where compile-time type information is erased. Check out Type-erasing trait parameters in Rust for an excellent technical introduction.

To store handler functions, we need to have a look at Rust's dynamic typing facilities in the std::any module, in particular std::any::Any. Any is a trait that emulates dynamic typing in Rust. Most types in Rust implement this trait. Using Any it's also possible to get the TypeId of a type, a unique type identifier. When Any is used as a borrowed trait object in the form of &dyn Any, it provides methods to check whether the contained value is of a specific type, and to cast the reference to a reference to that type. &dyn Any is limited to testing whether a value is of a specific concrete type; it cannot be used to test whether a type implements a trait.

Let's see a brief example:

use std::any::Any;

fn main() {
    let x: i32 = 42;
    println!("type id is {:?}", x.type_id());

    let y = &x as &dyn Any;
    match y.downcast_ref::<i32>() {
        Some(_value) => println!("X is an i32"),
        None => println!("X is not an i32"),
    }
}

This program produces output similar to the following (the exact TypeId value and its formatting vary between compiler versions):

type id is TypeId(0x56ced5e4a15bd89050bb9674fa2df013)
X is an i32

The TypeId is a long unique identifier, different for each type. The downcast_ref method casts a &dyn Any into a reference to the given target type, if the inner type is the same. The call to y.downcast_ref::<i32>() returns Some(&i32), while a call to y.downcast_ref::<f32>() would return None.
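The same mechanism also works for values of different types stored in one collection, which is the property we are after. The following minimal sketch uses a hypothetical helper named describe that tries one downcast after another.

```rust
use std::any::Any;

// Hypothetical helper: reports which concrete type hides behind `&dyn Any`.
fn describe(value: &dyn Any) -> String {
    if let Some(number) = value.downcast_ref::<i32>() {
        format!("i32: {number}")
    } else if let Some(number) = value.downcast_ref::<f32>() {
        format!("f32: {number}")
    } else {
        String::from("unknown type")
    }
}

fn main() {
    // Values of different types stored in the same `Vec`.
    let values: Vec<Box<dyn Any>> =
        vec![Box::new(42_i32), Box::new(1.5_f32), Box::new("text")];
    for value in &values {
        // `&**value` goes from `&Box<dyn Any>` to the inner `&dyn Any`.
        // Passing `value` directly would treat the `Box` itself as the `Any`.
        println!("{}", describe(&**value));
    }
}
```

The caller has to name every concrete type it wants to recover, which is why Any alone is not enough to call stored handlers with arbitrary signatures.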

How can the dynamic type logic help us store different function handlers? We will wrap the Handler trait implementation in a new struct type that will erase the specific type argument T. Let's introduce the new struct called ErasedHandler.

use std::any::Any;

struct ErasedHandler<T>
where
    T: Any,
{
    handler: Box<dyn Handler<T>>,
}

The struct takes a generic type parameter T and requires that it implements Any, which applies to most types in Rust. The handler instance is stored in a Box, because the trait object dyn Handler<T> is not Sized, meaning its size in memory is not known at compile time. Let's add a constructor to ErasedHandler to pass in an existing Handler type.

impl<T> ErasedHandler<T>
where
    T: 'static
{
    pub fn new<H>(handler: H) -> Self
    where
        H: Handler<T> + 'static,
    {
        Self {
            handler: Box::new(handler),
        }
    }
}

The generic type parameter T requires the 'static lifetime bound. An alternative is to declare it as T: Any. The latter also works, because Any has 'static as a supertrait bound. The inner generic parameter H for the Handler also requires 'static: a Box<dyn Handler<T>> stored in a struct is implicitly a Box<dyn Handler<T> + 'static>, so H must not contain borrowed data that could become invalid while the handler is stored. Functions and closures that own everything they capture satisfy this bound.
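What the 'static bound means in practice can be shown with a small sketch that is independent of our Handler trait. The functions store and make_stored are hypothetical names used only here.

```rust
// Hypothetical helper: boxing as a trait object requires `F: 'static`,
// because `Box<dyn Fn() -> i32>` is short for `Box<dyn Fn() -> i32 + 'static>`.
fn store<F>(function: F) -> Box<dyn Fn() -> i32>
where
    F: Fn() -> i32 + 'static,
{
    Box::new(function)
}

fn make_stored() -> Box<dyn Fn() -> i32> {
    let value = 41;
    // `move` copies `value` into the closure, so nothing is borrowed from this
    // stack frame and the closure satisfies `'static`.
    store(move || value + 1)
    // Without `move` the closure would borrow `value`, which is gone once this
    // function returns; the compiler rejects that version.
}

fn main() {
    let stored = make_stored();
    println!("{}", stored());
}
```

The bound says nothing about how long a value actually lives; it only rules out handlers that borrow from data with a shorter lifetime.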

We'll expand the HandlerContainer to store the Handler instances now. The updated type and implementation are given in the code below:

#[derive(Default)]
struct HandlerContainer {
    pub list: Vec<Box<dyn Handler<Box<dyn Any>>>>,
}

impl HandlerContainer {
    /// Store handler in `Vec`.
    pub fn register<H, T: 'static>(&mut self, handler: H)
    where
        H: Handler<T> + 'static,
    {
        self.list.push(Box::new(ErasedHandler::new(handler)));
    }
}

The field list now stores the Handler implementations erased of their specific types: every element has the same type, Box<dyn Handler<Box<dyn Any>>>, in which Box<dyn Any> stands in for the specific type argument T. The call to ErasedHandler::new(handler) wraps the handler in the erased type, and Box::new moves it to the heap. The type of list reads a bit unwieldy, therefore let's add a type alias to provide a better name.

type BoxedErasedHandler = Box<dyn Handler<Box<dyn Any>>>;

#[derive(Default)]
struct HandlerContainer {
    pub list: Vec<BoxedErasedHandler>,
}

The code still does not compile and returns with the following error:

   |
59 |         self.list.push(Box::new(ErasedHandler::new(handler)));
   |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Fn(Box<(dyn Any + 'static)>)` is not implemented for `ErasedHandler<T>`
   |
note: required for `ErasedHandler<T>` to implement `Handler<Box<(dyn Any + 'static)>>`
  --> src/main.rs:28:12
   |
28 | impl<F, T> Handler<T> for F
   |            ^^^^^^^^^^     ^
29 | where
30 |     F: Fn(T),
   |        ----- unsatisfied trait bound introduced here
   = note: required for the cast from `Box<ErasedHandler<T>>` to `Box<(dyn Handler<Box<(dyn Any + 'static)>> + 'static)>`

It's missing an implementation of Handler for the ErasedHandler<T> type. Remember, the list stores types that match Handler<Box<dyn Any>>, and ErasedHandler is not implementing it yet. Let's add an implementation of the Handler trait for ErasedHandler.

impl<T: 'static> Handler<Box<dyn Any>> for ErasedHandler<T> {
    fn call(&self, context: &Context) {
        self.handler.call(context);
    }
}

This implements Handler<Box<dyn Any>> (the expected type signature in field list) for any type argument T for the ErasedHandler struct. The call method simply delegates the call to the inner handler field.

At this point the HandlerContainer type can register different handlers. The current version of types ErasedHandler, HandlerContainer and the updated main function is shown below:

type BoxedErasedHandler = Box<dyn Handler<Box<dyn Any>>>;

#[derive(Default)]
struct HandlerContainer {
    pub list: Vec<BoxedErasedHandler>,
}

/// Implement Handler for all possible `T` that `ErasedHandler` encapsulates over.
impl<T: 'static> Handler<Box<dyn Any>> for ErasedHandler<T> {
    fn call(&self, context: &Context) {
        self.handler.call(context);
    }
}

impl HandlerContainer {
    pub fn register<H, T: 'static>(&mut self, handler: H)
    where
        H: Handler<T> + 'static,
    {
        self.list.push(Box::new(ErasedHandler::new(handler)));
    }
}

struct ErasedHandler<T>
where
    T: Any,
{
    handler: Box<dyn Handler<T>>,
}

impl<T: 'static> ErasedHandler<T> {
    pub fn new<H>(handler: H) -> Self
    where
        H: 'static + Handler<T>,
    {
        Self {
            handler: Box::new(handler),
        }
    }
}

fn main() {
    let mut container = HandlerContainer::default();
    container.register(extract_i32);
    container.register(extract_f32);
    container.register(extract_both);
    
    // then execute all handlers
    let context = Context { a: 42, b: 1.23 };
    for handler in container.list.iter() {
        handler.call(&context);
    }
}

First, all handler functions are registered in main, then executed afterwards to show how they can be called. Each call extracts the required fields from the given Context. Crates or programs employing the extractor and handler pattern may use their own technique to call a certain handler, for example axum stores one handler function for each HTTP method in its MethodRouter. Calling downcast_ref on a dyn Any instance can be used to cast it back to a concrete type at runtime.
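As one example of a different technique, the type argument can also be erased by wrapping each handler in a closure. This is an alternative sketch, not the ErasedHandler approach from this article and not how any particular framework does it; the names BoxedHandler, ClosureContainer, run_all and demo are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub struct Context {
    a: i32,
    b: f32,
}

pub trait FromContext {
    fn from_context(context: &Context) -> Self;
}

impl FromContext for i32 {
    fn from_context(context: &Context) -> Self {
        context.a
    }
}

impl FromContext for f32 {
    fn from_context(context: &Context) -> Self {
        context.b
    }
}

pub trait Handler<T> {
    fn call(&self, context: &Context);
}

impl<F, T> Handler<T> for F
where
    F: Fn(T),
    T: FromContext,
{
    fn call(&self, context: &Context) {
        (self)(T::from_context(context))
    }
}

// The boxed closure type only mentions `Context`; `T` no longer appears.
type BoxedHandler = Box<dyn Fn(&Context)>;

#[derive(Default)]
struct ClosureContainer {
    list: Vec<BoxedHandler>,
}

impl ClosureContainer {
    fn register<H, T>(&mut self, handler: H)
    where
        H: Handler<T> + 'static,
        T: 'static,
    {
        // The closure owns the handler and remembers which `T` to use.
        self.list
            .push(Box::new(move |context: &Context| handler.call(context)));
    }

    fn run_all(&self, context: &Context) {
        for handler in &self.list {
            handler(context);
        }
    }
}

// Registers two handlers that write into a shared log, then runs them.
fn demo() -> Vec<String> {
    let log: Rc<RefCell<Vec<String>>> = Rc::default();
    let mut container = ClosureContainer::default();

    let sink = Rc::clone(&log);
    container.register(move |value: i32| sink.borrow_mut().push(format!("i32 {value}")));
    let sink = Rc::clone(&log);
    container.register(move |value: f32| sink.borrow_mut().push(format!("f32 {value}")));

    container.run_all(&Context { a: 42, b: 1.5 });
    let lines = log.borrow().clone();
    lines
}

fn main() {
    for line in demo() {
        println!("{line}");
    }
}
```

The trade-off compared to a dedicated wrapper struct is that a closure cannot carry extra methods or metadata, which a framework may want to attach to its stored handlers.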

The complete listing of the code above can be found here.

Conclusion

Hopefully the article offers an insight into how this pattern works. It's a pattern that makes sense when the same workflow is applied to many different client-side functions. The web frameworks axum & actix-web make good use of the pattern, because they both call handler functions for API routes. Each API endpoint defines what it requires, for example one handler may read a submitted form request, other handlers require access to the database, etc. By implementing different extractors they allow other use cases as well, ones that may not be present in the framework directly, e.g. for authorization & authentication.

A more complex system using this pattern can also be found in the game engine bevy; the module documentation on its Entity Component System (ECS) provides a good overview. For a good introduction check out Pascal Hertleif's article The Rust features that make Bevy's systems work.

This kind of pattern requires more code on the library level, but in the end provides a more ergonomic & minimal API for client-side code. :)