First steps in Zig (0.15.2)

Introduction

A few weeks ago, I was replaying an old ’90s adventure game and… I got stuck. I decided to look up hints online, and remembered the https://www.uhs-hints.com website (back in the early 2000s, this used to be a very relevant resource). Unfortunately, the hints for the game I was playing weren’t available online.

I thought a fun project would be to write a tool to convert the text-based .uhs files to an HTML page, with some embedded JavaScript to allow expanding each hint. A friend suggested doing this in the Zig programming language, and since I’ve been wondering about that language I decided to give it a try.

You can find the result at https://codeberg.org/zhmu/uhs2html (zlib license). Note that this is my first Zig code ever and there are likely many possible improvements. I’m sharing it in the hopes that it will be useful. It was written using Zig 0.15.2 and may not work in other versions.

I decided to describe my experiences in a blog post, hoping it helps someone. Keep in mind that these are my personal experiences and that I am not a Zig expert or even a proper user: I spent a few days writing a small tool that I needed, and wanted to learn something in the process. Zig is an ever-evolving language, and experiences will likely differ over time. I am mainly familiar with C++ and Rust and will likely draw comparisons based on these experiences.

First impressions: syntax and structure

One of the first things that stood out is the syntax. It’s pretty nice, even though it does take some getting used to. I’ve tried a bit of Go before and the Zig syntax reminds me of Go. A small sample:

pub fn decipher_uhs88a(alloc: std.mem.Allocator, s: []const u8) ![]const u8 {
    // `out` is never reassigned (only the bytes it points to are written), so it can be `const`.
    const out = try alloc.alloc(u8, s.len);
    for (s, 0..) |v, index| {
        var ch: u8 = undefined;
        if (v < 32) {
            ch = v;
        } else if (v < 80) {
            ch = 2 * v - 32;
        } else {
            ch = 2 * v - 127;
        }
        out[index] = ch;
    }
    return out;
}

The exclamation mark ! denotes this function can yield an error (the memory allocation could fail), and try is a shorthand for: use the result if it is available, or return the error otherwise – similar to the ? shorthand in Rust.
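To make the difference between propagating and handling an error concrete, here is a minimal sketch. It is not taken from uhs2html: ParseError, firstByte, firstByteDoubled and firstByteOrZero are names I made up for illustration.

```zig
const std = @import("std");

// Hypothetical error set, invented for this illustration.
const ParseError = error{Empty};

// Returns the first byte of `s`, or error.Empty if there is none.
fn firstByte(s: []const u8) ParseError!u8 {
    if (s.len == 0) return error.Empty;
    return s[0];
}

// `try` passes any error on to the caller, so this function's
// return type has to be an error union as well.
fn firstByteDoubled(s: []const u8) ParseError!u16 {
    const b = try firstByte(s);
    return @as(u16, b) * 2;
}

// `catch` handles the error on the spot, here by substituting a default,
// so the return type no longer needs the `!`.
fn firstByteOrZero(s: []const u8) u8 {
    return firstByte(s) catch 0;
}
```

The `try` version keeps the `!` in its signature, while the `catch` version absorbs the error and returns a plain u8.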

What particularly stood out to me was how to declare new structures, which thankfully can have functions:

const StringArray = struct {
    alloc: std.mem.Allocator,
    lines: std.ArrayList([]const u8),

    // ...

    pub fn add(self: *StringArray, s: []const u8) !void {
        const copy = try self.alloc.dupe(u8, s);
        try self.lines.append(self.alloc, copy);
    }
};

This creates a type StringArray with an add function that duplicates (think of strdup) the input string s and adds it to the dynamic array of strings. I like the idea of assigning the type to a variable: it feels very clean.

Variables are used to hold types

If you look more closely at self.alloc.dupe(u8, s), you notice that s is a variable and u8 is a type! This is one of Zig’s more interesting features: you can pass types as parameters to functions, provided that their declaration allows it.

This is known as comptime in Zig, and is a whole subject of its own. The gist of it is that such parameters are evaluated at compile time, similar to the constexpr, consteval and constinit keywords in C++.

I like the Zig notation, as it makes for vastly better reading than the <T> constructs I am used to in Rust and C++. comptime introduces another way of meta-programming. I find this fascinating and am curious to see how it will evolve over time.
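As a rough illustration of what such a declaration looks like from the other side, here is a small sketch of a function that takes a type as its first parameter, in the same style as dupe(u8, s). The function largest is hypothetical and not part of uhs2html or the standard library.

```zig
const std = @import("std");

// Hypothetical helper: `T` is an ordinary parameter that happens to be a type.
// The `comptime` keyword requires it to be known at compile time.
fn largest(comptime T: type, values: []const T) ?T {
    if (values.len == 0) return null;
    var best = values[0];
    for (values[1..]) |v| {
        if (v > best) best = v;
    }
    return best;
}

pub fn main() void {
    const bytes = [_]u8{ 3, 9, 4 };
    const floats = [_]f32{ 1.5, -2.0 };
    // The same function body is compiled once per type it is called with.
    std.debug.print("{any} {any}\n", .{ largest(u8, &bytes), largest(f32, &floats) });
}
```

Where C++ or Rust would write something like largest<T>(values), the type is simply the first argument here.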

Built-in features that help you

One of the key features of Zig is that memory allocation is very explicit: you need to have a variable of type std.mem.Allocator which you use to allocate and free memory when needed. This is nice in the sense that you know exactly when dynamic memory allocation is used, but it forces you to pass those allocators along to whatever code needs them.
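A minimal sketch of what passing an allocator along looks like in practice. The repeat function is a made-up example, and std.heap.page_allocator is just one possible allocator choice for the caller.

```zig
const std = @import("std");

// Hypothetical helper: it allocates, so it takes the allocator as a parameter.
// The caller owns the returned slice and is responsible for freeing it.
fn repeat(alloc: std.mem.Allocator, s: []const u8, n: usize) ![]u8 {
    const out = try alloc.alloc(u8, s.len * n);
    for (0..n) |i| {
        // Copy `s` into the i-th chunk of the output buffer.
        @memcpy(out[i * s.len ..][0..s.len], s);
    }
    return out;
}

pub fn main() !void {
    // The caller decides where the memory comes from.
    const alloc = std.heap.page_allocator;
    const text = try repeat(alloc, "ab", 3);
    defer alloc.free(text);
    std.debug.print("{s}\n", .{text});
}
```

The signature alone tells you that repeat may allocate, which is exactly the explicitness described above.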

I liked the type-safe unions, which are quite easy to use:

const ItemTagged = union(enum) {
    subject: *UhsSubject,
    hint: *UhsHints,
    credit: *UhsCredits,

    // ...
};

// ...

    for (subject.items.items) |item| {
        switch (item) {
            .subject => |s| { hint_num2 = try subject_print(s, alloc, w, hint_num2); },
            .hint => |h| { hint_num2 = try hint_print(h, alloc, w, hint_num2); },
            .credit => |c| { try credits_print(c, w); },
        }
    }

Again, the syntax is nice and it’s good to have this built into the language. I will admit I prefer the Rust enums over what Zig offers as they feel more powerful and natural.
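The snippet above depends on the rest of my tool, so here is a self-contained sketch of the same idea. Shape and area are invented for illustration and have nothing to do with the UHS format.

```zig
const std = @import("std");

// Hypothetical tagged union, invented for this illustration.
const Shape = union(enum) {
    circle: f64, // payload is the radius
    rect: struct { w: f64, h: f64 },
    empty, // a variant without a payload

    fn area(self: Shape) f64 {
        // The switch must cover every variant; |r| captures the payload.
        return switch (self) {
            .circle => |r| std.math.pi * r * r,
            .rect => |r| r.w * r.h,
            .empty => 0,
        };
    }
};

pub fn main() void {
    const s = Shape{ .rect = .{ .w = 2, .h = 3 } };
    std.debug.print("{d}\n", .{s.area()});
}
```

Leaving out a variant in the switch is a compile error, which is where the type safety comes from.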

An evolving language, or Zig is not quite there yet

When I was writing my first Zig code, I wanted to output formatted strings. No problem, I thought, this is part of most tutorials, so I went on copy/pasting the code. Only one problem: it did not compile.

It turns out Zig is a moving target: version 0.15.1 recently brought a big overhaul of the I/O subsystem, which was dubbed Writergate (https://github.com/ziglang/zig/pull/24329). Most resources focus on explaining how much more efficient and useful the new design is, which is understandable, but I just wanted to print things.

Don’t get me wrong: with some online searching, you’ll find sufficient resources and inspiration on how to get things working. But the official documentation just isn’t there yet, and a lot of examples are out of date.
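For reference, this is roughly the pattern that prints formatted text to standard output with the post-Writergate API, as far as I understand it for Zig 0.15.x. Treat it as a sketch: the printHint helper and the buffer size are my own choices, and the API may well change again in later versions.

```zig
const std = @import("std");

// Hypothetical helper: takes the new writer interface by pointer,
// so it does not care where the output ends up.
fn printHint(w: *std.Io.Writer, num: u32, text: []const u8) !void {
    try w.print("Hint {d}: {s}\n", .{ num, text });
}

pub fn main() !void {
    // The writer needs a caller-provided buffer (the size is an arbitrary choice).
    var buffer: [1024]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&buffer);
    const stdout = &stdout_writer.interface;

    try printHint(stdout, 1, "look under the rug");

    // Output is buffered, so flush before exiting or it may never appear.
    try stdout.flush();
}
```

The part that tripped me up is that the buffer and the flush are now the caller's responsibility.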

I also encountered some problems based on not knowing the language well and not understanding how you are supposed to write code in Zig. For example, it took me a while to understand that, given an ArrayList (Zig’s version of Rust’s Vec<T> or C++ std::vector<T>), you’re supposed to directly access the items member. In fairness, this is documented, it just took me a while to find it and it felt pretty unnatural to me.
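A small sketch of what accessing the items member looks like, assuming the allocator-passing ArrayList API of Zig 0.15.x (the same one used by StringArray above). The sumOf function is a made-up example.

```zig
const std = @import("std");

// Hypothetical helper: copies `values` into an ArrayList and sums them.
fn sumOf(alloc: std.mem.Allocator, values: []const u32) !u32 {
    var list: std.ArrayList(u32) = .empty;
    defer list.deinit(alloc);
    try list.appendSlice(alloc, values);

    // `items` is a plain slice of the current elements, so it can be
    // iterated, indexed and measured directly.
    std.debug.assert(list.items.len == values.len);
    var total: u32 = 0;
    for (list.items) |n| total += n;
    return total;
}

pub fn main() !void {
    const total = try sumOf(std.heap.page_allocator, &[_]u32{ 1, 2, 3 });
    std.debug.print("{d}\n", .{total});
}
```

Coming from Vec<T> or std::vector<T>, I kept looking for accessor functions where a field access is all it takes.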

A better C?

This brings me to my final thought: Zig feels like C. It is arguably a better C, but it’s still C. Note that I do not mean this in a condescending way: in fact, it follows from one of the design principles of Zig: no hidden control flow or memory allocations. With some experience, every line tells you exactly what it does, without any hidden functionality like destructors or exceptions.

Unfortunately, this also means we have to manage memory manually. For example, the StringArray I showed needs to have a function to free the memory of the strings it maintains, which is often called deinit:

const StringArray = struct {
    alloc: std.mem.Allocator,
    lines: std.ArrayList([]const u8),

    // ...

    pub fn deinit(self: *StringArray) void {
        for (self.lines.items) |line| {
            self.alloc.free(line);
        }
        self.lines.deinit(self.alloc);
    }
};

And this is what irks me: you need to ensure that deinit is called at the proper moments. Failing to do so results in memory leaks. The DebugAllocator tends to catch these, if you are using it, but only at run time, so detection depends on which code paths actually execute. There is a defer keyword to help with cleanup once the scope ends, but to me this feels… cumbersome.
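To show how these pieces fit together, here is a minimal sketch combining defer with the DebugAllocator, assuming the std.heap.DebugAllocator API as I know it from Zig 0.15.x. The lengthOfCopy function is invented for illustration.

```zig
const std = @import("std");

// Hypothetical helper: makes a temporary copy of `s` and frees it again.
fn lengthOfCopy(alloc: std.mem.Allocator, s: []const u8) !usize {
    const copy = try alloc.dupe(u8, s);
    // Runs whenever this scope is left, whether normally or through an error.
    defer alloc.free(copy);
    return copy.len;
}

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    // deinit() checks for leaks and returns .leak if something was never freed.
    defer std.debug.assert(gpa.deinit() == .ok);
    const alloc = gpa.allocator();

    const n = try lengthOfCopy(alloc, "hello");
    std.debug.print("{d}\n", .{n});
}
```

Removing the defer line in lengthOfCopy still compiles fine; the leak only shows up when deinit() runs, which is the run-time dependency I mean.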

I would like to emphasize that this is a decision of the language design. I’ve written quite a bit of C code and being able to use defer rather than relying on gotos and labels is a big improvement. However, I’m mostly using C++ and Rust these days and their scoping rules feel more natural to me at this point.

Closing thoughts

It’s nice to be able to choose languages. I like how Zig embraces modern features, such as built-in error propagation, type-safe unions and compile-time operations. That being said, I’m still debating whether this is the language for me, given that it feels like a “better C”, whereas I’ve moved past C towards C++ and Rust.

Furthermore, it’s a very new language: neither the core language nor the standard library is set in stone, and both are likely to change in the future, requiring you to update your code as new versions are released. As far as I know, there’s a single killer application written in it, Ghostty (https://ghostty.org/), which is something that will likely improve with language stability and time.

