The good, the bad, and the ugly of managing Sphinx projects with Bazel#

In the spirit of “Focus on decisions, not tasks”, I would like to share my experience of managing Sphinx projects with Bazel. My goal is to make it easier for you to decide whether or not Bazel would actually benefit your Sphinx project.

If you’re already sold on the idea of managing your Sphinx project with Bazel and just need setup guidance, check out A tutorial on managing Sphinx projects with Bazel.

Background#

Sphinx is a tool for authoring documentation. You write your docs in reStructuredText or Markdown and then use Sphinx to transform the docs into HTML and other output formats. It’s also common to hook in auto-generated API reference docs from tools like Doxygen alongside the reStructuredText or Markdown docs.

Bazel is primarily a tool for building software. Software engineering teams use it for a variety of reasons that mostly revolve around ensuring that software is built correctly and improving team productivity.

“Managing a Sphinx project with Bazel” means orchestrating core Sphinx workflows (such as transforming the docs into HTML) through Bazel’s build system.

My experience with Sphinx and Bazel#

I have about 5 years of experience with Sphinx. In my first technical writing job, I migrated my employer’s docs from Microsoft Word to Sphinx. For the last few years I’ve been leading the docs for Pigweed. Our main site, pigweed.dev, is powered by Sphinx. I spent most of Q4 2024 migrating pigweed.dev to Bazel. The site has over 600 pages of content and integrates with 3 different API reference auto-generation pipelines. In other words, I’ve got a pretty good sense of what it takes to manage a non-trivial Sphinx project with Bazel.

Why use a build system at all?#

Many Sphinx projects don’t use a build system whatsoever. They just have a little custom shell script that invokes sphinx-build directly. Or they use the minimal Makefile that sphinx-quickstart generates.

In my own small Sphinx projects (such as this site and kayce.basqu.es) I’m actually finding a Bazel-based build to be less work to maintain than the usual custom shell scripts that I previously cobbled together. I like not needing to futz around with virtual environments anymore. And I like that I’m continuing to build up experience with Bazel because it has more momentum than I realized.

In medium-to-large Sphinx projects that have a lot of contributors I think the benefits described in “Easier development environment setup” and “Unified CLI for a project”, plus the ability to keep docs close to their relevant code, are pretty compelling. They probably improve productivity by making it much easier to contribute to the project.

Why not use some other build system?#

I’m not really trying to push Bazel in particular. It’s not like I’ve done a systematic review of every build system and concluded that Bazel is the global minimum. I just happen to know a fair bit about managing Sphinx projects with Bazel now because my work required me to migrate pigweed.dev from a GN-based build to a Bazel one.

Well, I guess I do have an opinion on GN versus Bazel. If given a choice, I would choose Bazel over GN for the reasons mentioned in What went well. However, the switch from GN to Bazel was not motivated by any particular failing of the old GN-based docs build system. Pigweed adopted Bazel as its primary build system back in Q3 2023 because it can significantly improve embedded developer productivity. Our strategy is to dogfood every aspect of embedded development in Bazel, including our docs.

The good#

Here’s what I like about managing Sphinx projects with Bazel.

Unified CLI for a project#

I am phrasing the topic as “managing Sphinx projects with Bazel” rather than “building Sphinx projects with Bazel” because Bazel is not just about building software. You can run lots of other workflows through it. For example, the Tour of Pigweed demo uses Bazel to run tests, start a simulator, connect to a console, flash an embedded device, and more.

Easier development environment setup#

With Bazel, building the docs can become a literal three-step process like this:

$ git clone https://github.com/technicalwriting/dev.git
$ cd dev
$ ./bazelisk build //:docs

When Bazel attempts to build the //:docs target it detects that it doesn’t have all the tools and dependencies it needs, automatically fetches them, sets them all up, and then proceeds with the build.

(I’m cheating a little by assuming that the bazelisk executable is checked into the repo, which is an uncommon practice.)

No need for virtual environments#

One of the main problems that Bazel solves for software engineers is the “works on my machine” problem. E.g. the source code compiles for teammate A, yet the exact same source code doesn’t compile for teammate B. Many hours of debugging ensue to pinpoint the difference in their development environments. Through hermeticity Bazel can guarantee that a given set of inputs always produces the exact same outputs for all teammates. This is also known as reproducible builds.

Reproducible builds aren’t a hot button issue for Sphinx projects. If Sphinx doesn’t build the docs exactly the same for all teammates, it’s usually not a big deal.

However, hermeticity does bring one tangible benefit to Sphinx projects: no more need for virtual environments. Bazel always runs all Sphinx workflows from an isolated sandbox so there’s no need to also spin up a virtual environment.
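To make the hermeticity idea concrete, here is a minimal Python sketch of a content-addressed key computed over an action's declared inputs. It is an illustration only and not how Bazel actually computes its keys; `action_key`, the file names, and the version strings in the usage note are all hypothetical. The point is that anything left out of the key, such as whatever packages happen to be installed on a teammate's machine, cannot influence the result.

```python
import hashlib
import json


def action_key(command, inputs, toolchain):
    """Return a digest of everything that is allowed to influence the output.

    command:   list of strings, e.g. the sphinx-build invocation
    inputs:    dict mapping a declared input path to its bytes
    toolchain: dict mapping a tool name to a pinned version string
    """
    payload = {
        "command": command,
        # Hash file contents so the key depends on what the files say,
        # not on timestamps or on which machine they live on.
        "inputs": {
            path: hashlib.sha256(data).hexdigest()
            for path, data in inputs.items()
        },
        "toolchain": toolchain,
    }
    # sort_keys makes the serialization independent of dict ordering.
    serialized = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(serialized).hexdigest()
```

Two teammates who call `action_key` with the same command, the same file contents, and the same pinned tool versions get the same key, regardless of what else is installed on their machines. Bumping a single pinned version changes the key.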

Sidecar friendly#

In terms of docs-as-code topologies, a sidecar is when your docs live in the same repo as the rest of your source code. This is a powerful setup because it increases the chances that software engineers keep their docs up-to-date. In my experience most software engineers are actually fine with updating docs, so long as it’s easy to find the relevant docs. If an engineer changes an API in //src/logger/lib.cpp and they see docs.rst right next to lib.cpp, it’s very obvious that docs.rst might also need an update. On the other hand, if the relevant doc lives at //docs/guides/logging/docs.rst, then there’s less of a chance that the engineer will remember to update the doc. Out of sight, out of mind.

See Built-in support for reorganizing sources for more explanation of how Bazel makes it easier to keep your docs in sight. The gist of the idea is to prioritize keeping your docs right next to the code, and then use Bazel’s features to reorganize the docs into a usable information architecture on the docs website.
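The reorganizing idea can be sketched in plain Python as a mapping from where a doc lives in the repo to where it should appear on the site. This is a conceptual sketch, not the Bazel feature itself: the `RENAMED_SRCS` table, `site_path`, and `html_path` are hypothetical names, and the paths reuse the logger example from above.

```python
from pathlib import PurePosixPath

# Hypothetical table: repo location (next to the code) -> location in the
# docs site's information architecture.
RENAMED_SRCS = {
    "src/logger/docs.rst": "guides/logging/docs.rst",
}


def site_path(repo_path, docs_root="docs"):
    """Return where a source file should sit inside the Sphinx source tree."""
    if repo_path in RENAMED_SRCS:
        return RENAMED_SRCS[repo_path]
    parts = PurePosixPath(repo_path).parts
    # Files already under the central docs directory keep their relative path.
    if len(parts) > 1 and parts[0] == docs_root:
        return str(PurePosixPath(*parts[1:]))
    return repo_path


def html_path(repo_path):
    """Return the page path that the HTML build would produce for a source."""
    return str(PurePosixPath(site_path(repo_path)).with_suffix(".html"))
```

With this table, the doc that sits next to lib.cpp still ends up at guides/logging/docs.html on the site, so the engineer sees it in the code directory and the reader sees it in the guides section.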

Surprisingly robust ecosystem#

bzlmod (“Bazel mod”) is the main mechanism for sharing your Bazel rules (i.e. libraries) with others. When I migrated pigweed.dev to Bazel I was surprised to discover that most of the rules I needed were already available through community modules. For example, rules_python has extensive support for building Sphinx projects, including a built-in workflow for spinning up a server so that you can locally preview the HTML output in a browser. This is the main reason the pigweed.dev migration went faster than expected. People like rickeylev and TendTo had already built most everything I needed.

The bad#

Adopting Bazel requires some upfront investment and creates more complexity for docs authors.

Explicit build graphs#

As explained in “No need for virtual environments” and “A key Bazel concept”, Bazel builds your Sphinx project in an isolated sandbox. You need to explicitly declare all inputs in the build system. This can take a while to set up correctly and wrap your head around.

It’s not quite right to call this “bad”. I actually really like declaring the entire build graph explicitly. But it does take time and I imagine that some teammates will never “get it” and will find it needlessly complex.
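A rough way to picture why every input must be declared is the following Python sketch. It copies only the declared files into a fresh temporary directory and runs the build step there, so an undeclared file is simply not visible. Bazel's real sandbox works differently under the hood; `run_in_sandbox`, `list_visible_files`, `demo`, and the file names are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_sandbox(workspace, declared_inputs, action):
    """Copy only the declared inputs into a fresh directory and run action there."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp)
        for rel in declared_inputs:
            dest = sandbox / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(Path(workspace) / rel, dest)
        return action(sandbox)


def list_visible_files(sandbox):
    """Stand-in for a build step: report which files it can see."""
    return sorted(
        path.relative_to(sandbox).as_posix()
        for path in sandbox.rglob("*")
        if path.is_file()
    )


def demo():
    """Build a tiny workspace where logo.png exists but was never declared."""
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp) / "docs"
        docs.mkdir()
        (docs / "index.rst").write_text("Home\n====\n")
        (docs / "conf.py").write_text("project = 'demo'\n")
        (docs / "logo.png").write_bytes(b"placeholder bytes")
        declared = ["docs/index.rst", "docs/conf.py"]
        return run_in_sandbox(tmp, declared, list_visible_files)
```

Calling `demo()` returns only the two declared files. The logo exists in the workspace, but the build step cannot see it until it is added to the declared inputs, which is the kind of omission that takes time to track down when first setting up the build graph.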

More indirection#

Bazel necessarily introduces more complexity into a Sphinx project because it introduces new layers of indirection.

Suppose that you previously built the HTML docs directly like this:

$ sphinx-build -M html ./src ./_build

The generated HTML is easy to find: ./_build/html/…

When you build the HTML docs through Bazel with a command like this:

$ ./bazelisk build //:docs

You can still inspect the generated HTML. But it’s at a less-obvious path: ./bazel-bin/docs/_build/html/…

This is just one of many ways that Bazel introduces more indirection into the project.

The ugly#

These are the ways I’ve seen Bazel noticeably worsen developer experience.

Lack of incremental builds#

Suppose you have a medium-sized Sphinx project. You build the HTML docs directly with Sphinx’s build command:

$ sphinx-build -M html ./src ./_build

Sphinx builds everything and caches its intermediate state (the parsed doctrees and the build environment) under ./_build/doctrees. This command takes 10 seconds.

Now suppose that you change one line in your docs and run sphinx-build again. This subsequent build takes only 1 second. It’s fast because Sphinx only rebuilds the changed content and goes to its cache for the rest. This is what I mean by incremental builds.

Incremental builds don’t work out of the box when managing Sphinx projects through Bazel. Continuing with the example, every docs build takes 10 seconds, even if you only change one line in the docs source.

Sphinx and Bazel both support caching so I’m hopeful that there’s a solution here. But it definitely doesn’t work out of the box as far as I can tell.
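The difference between the two behaviors can be sketched in a few lines of Python. This is a simplified model that uses content hashes; it is not how Sphinx actually decides what is outdated, and `build` and the file names are hypothetical. The last case in the usage note models a build that starts without access to the previous build's cache. Whether that is exactly what happens under Bazel is an assumption here.

```python
import hashlib


def _digest(text):
    """Return a stable fingerprint of a source file's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build(sources, cache):
    """Rebuild only the sources whose content changed since the last build.

    sources: dict mapping a file name to its current text
    cache:   dict mapping a file name to the digest seen in the last build;
             updated in place
    Returns the sorted list of file names that had to be rebuilt.
    """
    rebuilt = []
    for name in sorted(sources):
        digest = _digest(sources[name])
        if cache.get(name) != digest:
            # Real work (parsing, rendering) would happen here.
            cache[name] = digest
            rebuilt.append(name)
    return rebuilt
```

With a cache that persists between runs, the first call rebuilds every file and a second call after a one-line edit rebuilds only the edited file. Passing an empty cache on every call rebuilds everything every time, which matches the constant 10-second builds in the example.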

Possibly incomplete docs#

The experience that I describe in the “Core utilities were hard to find” section of the pigweed.dev migration blog post suggests to me that the Bazel docs might be missing essential how-to guides and references. I haven’t thoroughly reviewed the Bazel docs though, so I don’t know for sure.

The Starlark guessing game#

If you ever need to write a custom rule, you’ll need to do so in Starlark. Starlark is a dialect of Python, meaning that it only supports a subset of Python syntax. The “Differences with Python” page explains pretty clearly how Starlark diverges from Python, but in practice I would write some code, scratch my head as I watched it silently fail, and then eventually figure out that I was trying to use a Python-ism that Starlark doesn’t support. See “Uncanny valley experiences with Starlark” for an example.
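As a small illustration of the kind of Python-ism that can trip you up, the sketch below shows the same helper written twice. Both versions run in ordinary Python. The first relies on a `while` loop and an f-string, neither of which Starlark supports; the second sticks to a list comprehension and `str.format()`, which exist in both languages. The function names and the `//docs:` label prefix are hypothetical and unrelated to the example in the linked post.

```python
def labels_python(names):
    """Python-only style: uses a while loop and an f-string."""
    labels = []
    i = 0
    while i < len(names):  # Starlark has no while statement
        labels.append(f"//docs:{names[i]}")  # Starlark has no f-strings
        i += 1
    return labels


def labels_starlark_style(names):
    """Same result using only constructs that Starlark also provides."""
    return ["//docs:{}".format(name) for name in names]
```

Writing helpers in the second style from the start keeps the logic portable, since the function body can be pasted into a .bzl file without changes.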