Previously, we’ve made our version of std::move_only_function<> generic so that it can be used to store any function signature, regardless of the number of parameters or return type. The implementation we’ve ended up with is the following:
template<typename ReturnType, typename... Args>
struct MyFunctionInterface
{
    virtual ~MyFunctionInterface() = default;
    virtual ReturnType operator()(Args...) = 0;
};
 
template<typename Fn, typename ReturnType, typename... Args>
class MyFunctionImpl : public MyFunctionInterface<ReturnType, Args...>
{
    Fn fn;
 
public:
    MyFunctionImpl(Fn fn) : fn(std::move(fn)) { }
 
    ReturnType operator()(Args... args) override
    {
        return fn(std::forward<Args>(args)...);
    }
};
 
template<typename ReturnType, typename... Args>
class MyFunction;
 
template<typename ReturnType, typename... Args>
class MyFunction<ReturnType(Args...)>
{
    std::unique_ptr<MyFunctionInterface<ReturnType, Args...>> fn;
 
public:
    template<typename Func>
    MyFunction(Func function)
    {
        fn = std::make_unique<MyFunctionImpl<Func, ReturnType, Args...>>(
                          std::move(function));
    }
 
    ReturnType operator()(Args... args)
    {
        return fn->operator()(std::forward<Args>(args)...);
    }
};
This works for any movable function, for example:
    MyFunction<int(std::unique_ptr<int>)> pfn([] (auto v) {
        return *v + *v;
    });
    return pfn(std::make_unique<int>(9)); // 18
However, it will use dynamic memory allocation (via std::unique_ptr<>) to obtain/release memory to store the function. In this final post of the series, we’ll focus on using static storage. Note that, as we have no way of knowing how large the function is, this means we may not be able to store all possible functions since some will simply be too large.
Making the storage explicit
Our implementation directly uses std::unique_ptr<> within our MyFunction class. Let’s start by hiding this unique_ptr<> by wrapping it inside a MyFunctionStorage helper class:
template<typename T>
struct MyFunctionStorage
{
    std::unique_ptr<T> ptr;
};
We need to change MyFunction to allow this to compile:
template<typename ReturnType, typename... Args>
class MyFunction<ReturnType(Args...)>
{
    MyFunctionStorage<MyFunctionInterface<ReturnType, Args...>> fn;
  
public:
    template<typename Func>
    MyFunction(Func function)
    {
        fn.ptr = std::make_unique<...>(...);
    }
  
    ReturnType operator()(Args... args)
    {
        return fn.ptr->operator()(std::forward<Args>(args)...);
    }
};
Making the storage static
Let’s start by modifying MyFunctionStorage so that it’ll just hold an array of bytes, which cannot be copied. We’ll keep the template type T, which corresponds to the actual type of the interface we are storing:
template<typename T>
struct MyFunctionStorage
{
    std::byte buffer[16];
    auto ptr() {
        return reinterpret_cast<T*>(&buffer);
    }
    MyFunctionStorage() = default;
    MyFunctionStorage(const MyFunctionStorage&) = delete;
    MyFunctionStorage& operator=(const MyFunctionStorage&) = delete;
};
The ptr() function will give us a T*, which is the type we are actually storing. Next, we update our MyFunction as follows:
    template<typename Func>
    MyFunction(Func function)
    {
        std::construct_at(
            reinterpret_cast<MyFunctionImpl<Func, ReturnType, Args...>*>(fn.buffer),
            std::move(function)
        );
    }
    ~MyFunction()
    {
        std::destroy_at(fn.ptr());
    }
  
    ReturnType operator()(Args... args)
    {
        return fn.ptr()->operator()(std::forward<Args>(args)...);
    }
Instead of creating a std::unique_ptr<>, std::construct_at() is used to invoke the constructor of MyFunctionImpl – this is our type-erased implementation wrapper.
Because MyFunctionImpl implements MyFunctionInterface, we can use the ptr() member function to obtain the interface: both the operator() and the destructor are pure virtual, hence we will end up calling the implementations!
What happens if we capture too much data?
Our current implementation will be able to store any function implementation that occupies 16 bytes or less. What happens if we need more storage? Let’s give it a try:
    MyFunction<int(int)> pfn2([x = std::string("0")] (auto v) {
        return v + x[0];
    });
    return pfn2(3); // crashes! (should return 3 + '0' = 51)
Why does this crash? The likely culprit is that the captured std::string doesn’t fit in the 16 bytes we have available in MyFunctionStorage. And indeed, increasing this value to 64 gives us the correct result! But… how do we know whether 64 bytes is enough?
Ensuring the implementation will fit
Code that compiles fine but always crashes during execution is a bad thing. Let’s fix this by ensuring the implementation will fit in the allotted storage space. Let’s start by making the storage capacity explicit:
template<typename T>
struct MyFunctionStorage
{
    static constexpr auto capacity() { return 16; }
    std::byte buffer[capacity()];
    // ...
};
Now, we can update the constructor to ensure the object we’re creating fits in the storage buffer:
    template<typename Func>
    MyFunction(Func function)
    {      
      static_assert(
        sizeof(MyFunctionImpl<Func, ReturnType, Args...>) <= fn.capacity(),
        "insufficient storage for this implementation"
      );
      // ...
    }
We can clean up this code by using a type alias for the implementation:
    template<typename Func>
    MyFunction(Func function)
    {
        using Impl = MyFunctionImpl<Func, ReturnType, Args...>;
        static_assert(
            sizeof(Impl) <= fn.capacity(),
            "insufficient storage for this implementation");
        std::construct_at(
            reinterpret_cast<Impl*>(fn.buffer),
            std::move(function)
        );
    }
Dealing with alignment
A final caveat with our implementation is that, even though it seems to work, the alignment of the buffer and the stored function type may mismatch: the stored function may require stricter alignment than std::byte. This could lead to difficult to diagnose problems, and thus should be high on our list of problems to solve.
A concern is that we have no idea what the actual function will be – we are using type-erased functions, after all. However, we do know that we need to store at least a pointer. Continuing this train of thought, if no captures are used, we’d end up by simply calling a C-style function pointer. Finally, if we want to call a member function of some class, we can express this. This will already provide proper alignment requirements!
Let’s start by introducing an union that would contain any of these types:
struct MyFunctionSomeClass { };
union MyFunctionUnion
{
    void* some_ptr;
    void (*some_func)();
    void (MyFunctionSomeClass::*some_member_func)();
};
This union allows us to force our buffer to have a proper minimum alignment as follows:
template<typename T>
struct MyFunctionStorage
{
    // ...
    union {
        MyFunctionUnion mfu;
        std::byte buffer[capacity()];
    };
    // ...
};
It is worth pointing out that MyFunctionUnion can serve a second purpose: the size can be used to determine the minimum amount of storage needed for any reasonable function.
Closing words
It’s been quite a journey: we started with an move-only function implementation with fixed parameters and a fixed return type in part 1. Part 2 focused on making this generic by removing the fixed parameter/return type limitation, and in this final part we removed the need for dynamic memory allocation.
Even though the implementation provided here can be quite useful, it’s nowhere as complete as std::move_only_function. For example, we haven’t considered an empty MyFunction, swapping, exceptions, moving MyFunction itself, etc. This is why implementing such a function wrapper is so difficult: as the old saying goes, the devil is in the details – and they are easy to miss.
That being said, it’s been quite the journey and I hope we’ll all learned something valuable! If you require a fixed-storage function implementation, it’s worth taking a look at existing libraries (EASTL, Folly, Abseil, and many more) instead of writing your own existing implementations. Because, by the end of the day, your purpose was likely to address a real-world scenario instead of coming up with a perfect re-implementation.
