Developing Gnome Shell on Fedora Silverblue
Fedora Silverblue is one of the immutable operating systems. Somewhere the OS is being built, resulting in an ostree commit that gets distributed to the machines running Silverblue. On the machines this results in a new deployment which will become active on the next boot, while the previous deployment is still around, ready to be booted into if anything goes wrong. The deployments are read-only.
The resulting stability and fault tolerance are nothing short of incredible compared with the traditional single directory tree that is mutated by packages.
There are drawbacks, though. The read-only nature of the OS, and /usr
in particular, makes some development work-flows unusable. Want to install a -devel
package to build your software or just need a compiler? Technically you could manage to do so with rpm-ostree’s layering capabilities (man rpm-ostree
) but this is slow, requires a reboot (the apply-live
option exists but there are restrictions) and is generally frowned upon.
The general solution to this problem has been two-fold:
- Embrace containers
- Toolbx
Silverblue ships with podman and other container tools out of the box, which allows you to build your software using Dockerfiles or other container-native technologies and to run containers with podman run. For a lot of use cases this is sufficient.
When this is not sufficient or otherwise undesirable Toolbx will come to the rescue! It allows you to create a pet container with toolbox create $name
and to enter it with toolbox enter $name
. The great thing about it: it looks and feels just like any Fedora system (other operating systems are supported as well), so you can dnf install
your compiler and -devel
packages. It also shares your home directory and is integrated with the host.
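As a rough sketch of that workflow, the commands can be wrapped in a small helper. The container name and the package list are placeholders, and run and setup_dev_toolbox are hypothetical helpers that are not part of Toolbx. The sketch defaults to a dry run that only prints what it would execute.

```shell
# Print the command when DRY_RUN=1 (the default in this sketch),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: create a toolbox and install packages into it.
# $1: container name, remaining arguments: packages for dnf
setup_dev_toolbox() {
    name=$1
    shift
    run toolbox create "$name" &&
        run toolbox run --container "$name" sudo dnf install -y "$@"
}
```

Calling DRY_RUN=0 setup_dev_toolbox shell-dev gcc meson would create the container for real, after which toolbox enter shell-dev opens an interactive shell in it.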
Building and running mutter and gnome-shell inside a toolbox has been possible for a while now thanks to work both on Toolbx and mutter. I’ve continued making sure everything is working as smoothly as possible. It’s possible to run mutter on a TTY and even run all the complicated testing setups (some patches pending).
As great as this is, there are situations where the developed version of gnome-shell needs to be run in an actual, real session. Maybe even as part of an actual boot, maybe to run it long enough to find some issue that shows up very infrequently. In those cases Toolbx is not enough and we need to somehow get our gnome-shell running on the host.
systemd-sysext and rpm-ostree usroverlay
There are two more mechanisms to modify files in /usr
besides rpm-ostree
overlays:
- rpm-ostree usroverlay
- systemd-sysext
Conceptually they are very similar. They use some overlay filesystem on top of the otherwise read-only /usr
tree. With rpm-ostree usroverlay
this is really all there is to it. You enable it and now you can modify files in /usr
.
If we build gnome-shell for example in our toolbox with --prefix=/usr
and then call ninja install
we’re installing it into /usr
of the toolbox container. The host OS is mounted in /run/host
in the toolbox so if we just use DESTDIR=/run/host/usr ninja install
then it… still doesn’t work because we can’t get the right permissions to modify files on the host from inside the toolbox (root is not mapped to any user, /run/host
is nobody
). But with some clever use of DESTDIR=/tmp/gnome-shell-install ninja install
and rsyncing /tmp/gnome-shell-install
to /usr
, it is possible to install our gnome-shell from inside the toolbox onto our host OS. At least temporarily. The rpm-ostree usroverlay
disappears when the machine powers down.
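The copy step could look roughly like the sketch below. It assumes the build was configured with --prefix=/usr, so that the staged files end up under the usr/ subdirectory of the DESTDIR. install_staged_usr is a hypothetical helper, and the cp fallback is only there for systems without rsync.

```shell
# Hypothetical helper: copy a staged install tree into a target prefix.
# $1: staging directory that was used as DESTDIR
# $2: target directory (on the host this would be /usr, which needs
#     root and an active rpm-ostree usroverlay)
install_staged_usr() {
    stage=$1
    target=$2
    if [ ! -d "$stage/usr" ]; then
        echo "nothing staged under $stage/usr" >&2
        return 1
    fi
    mkdir -p "$target" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # Trailing slash: copy the contents of usr/, not the directory itself.
        rsync -a "$stage/usr/" "$target/"
    else
        # Fallback for systems without rsync.
        cp -a "$stage/usr/." "$target/"
    fi
}
```

On the host this would be used from a root shell after sudo rpm-ostree usroverlay, for example as install_staged_usr /tmp/gnome-shell-install /usr.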
If we need persistence across reboots, then systemd-sysext
seems exactly like what we need. We can just create a new folder in /var/lib/extensions
and drop our files in there (it needs a special file to be picked up as extension), use systemd-sysext list
to show the available extensions, status
to show if they are currently active and merge
and unmerge
to overlay and remove the overlay of the extensions. Enabling systemd-sysext.service
makes systemd overlay the extensions on boot. We can use the same process as before to get our installation from the toolbox to the right place on the host.
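The "special file" is an extension-release file inside the extension tree. A minimal sketch of creating that skeleton follows. make_sysext_skeleton is a hypothetical helper, and the extensions root is a parameter so that the sketch can be tried in a scratch directory. The real root is /var/lib/extensions and needs root permissions.

```shell
# Hypothetical helper: create the directory skeleton of a sysext.
# $1: extensions root (the real one is /var/lib/extensions)
# $2: extension name
make_sysext_skeleton() {
    ext_root=$1
    name=$2
    release_dir="$ext_root/$name/usr/lib/extension-release.d"
    mkdir -p "$release_dir" || return 1
    # ID=_any lets the extension match any host OS instead of one
    # specific distribution.
    printf 'ID=_any\n' > "$release_dir/extension-release.$name"
}
```

After dropping the installed files into the same tree, systemd-sysext list should show the extension and sudo systemd-sysext merge overlays it.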
The tiny problem
Now that we’ve successfully installed gnome-shell to our host, we can for example just run gnome-shell --nested
and see… missing dependencies?
In this GNOME development cycle mutter grew a dependency on libei
and this is no issue in the toolbox. We just dnf install libei-devel
and we’re done. The problem though: this is not our host and libei
will only be available in the toolbox. We can clone, build and install the dependency just like we did for gnome-shell but this gets ugly fast when the project you’re building has a lot of dependencies that are not installed on the host. In this case, not a big issue though, so we keep pushing ahead and install what we need.
This time gnome-shell starts, only to exit again. This cycle mutter gained the cancel-input-capture
keybinding which is defined in a gschema
file, which gets installed correctly, but needs to be compiled with all the other gschemas
on the host system and that’s not happening. It can’t be happening because the toolbox has its own set of gschemas
. As part of the build process, the compiler will build them into the toolbox, instead of compiling the host gschemas
and the new gschema
to the host. We can be clever and run the compiler on the host and install the results either to /usr
for rpm-ostree usroverlay
or the extension directory for systemd-sysext
.
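For the systemd-sysext case, one way this could look is sketched below. It is not necessarily what was done here. compile_schemas_for_extension is a hypothetical helper, the GLIB_COMPILE_SCHEMAS override exists only to make the sketch easy to try out, and the schema directories are assumed to be the usual share/glib-2.0/schemas locations.

```shell
# Hypothetical helper: compile the host schemas together with the
# schemas shipped in an extension into the extension's schema directory.
# $1: host schema dir (normally /usr/share/glib-2.0/schemas)
# $2: extension schema dir
#     (e.g. /var/lib/extensions/NAME/usr/share/glib-2.0/schemas)
compile_schemas_for_extension() {
    host_schemas=$1
    ext_schemas=$2
    compiler=${GLIB_COMPILE_SCHEMAS:-glib-compile-schemas}

    staging=$(mktemp -d) || return 1
    # Collect all schema sources; the extension's files come last so
    # they win over host files with the same name.
    for f in "$host_schemas"/*.xml "$host_schemas"/*.override \
             "$ext_schemas"/*.xml; do
        if [ -e "$f" ]; then
            cp -- "$f" "$staging/"
        fi
    done
    # --targetdir stores gschemas.compiled in a directory other than
    # the one holding the sources.
    status=0
    "$compiler" --targetdir="$ext_schemas" "$staging" || status=$?
    rm -rf -- "$staging"
    return "$status"
}
```

Once the extension is merged, its gschemas.compiled shadows the one from the host, which is exactly the kind of manual bookkeeping that makes this approach fragile.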
This doesn’t seem very robust.
Where is the root issue here? We’re building on one operating system and installing the result to another. This can’t work well.
Embracing containers again
We can’t build gnome-shell in our toolbox! Instead, we have to build it on our host. At least, something that looks exactly like our host. The nice thing about the immutable system here is that we know exactly what our host looks like and there is an ostree commit which describes it. Not only that, we can also build a native container (OCI) image from that ostree commit and someone already did that.
We can create our own Dockerfile
which builds gnome-shell and mutter based on the Silverblue image:
ARG IMAGE_NAME="${IMAGE_NAME:-silverblue}"
ARG SOURCE_IMAGE="${SOURCE_IMAGE:-silverblue}"
ARG BASE_IMAGE="quay.io/fedora-ostree-desktops/${SOURCE_IMAGE}"
ARG FEDORA_MAJOR_VERSION="${FEDORA_MAJOR_VERSION:-38}"
FROM ${BASE_IMAGE}:${FEDORA_MAJOR_VERSION} AS builder
ARG IMAGE_NAME="${IMAGE_NAME}"
ARG FEDORA_MAJOR_VERSION="${FEDORA_MAJOR_VERSION}"
# setup dnf
RUN rpm-ostree install -y dnf
RUN dnf install -y 'dnf-command(builddep)'
# install development packages
RUN dnf groupinstall -y 'Development Tools'
RUN ln -s /usr/bin/ld.gold /usr/bin/ld # ???
RUN dnf install -y meson strace gdb valgrind sysprof
# install gnome shell and mutter specific dependencies
RUN dnf builddep -y gnome-shell mutter
RUN dnf install -y libei-devel libeis-devel asciidoc sassc
# dnf won't work on the running system, let's not confuse ourselves
RUN rpm-ostree uninstall -y dnf
# build mutter
ADD ./mutter /tmp/mutter
RUN cd /tmp/mutter && \
meson setup _container_build/ --prefix=/usr && \
ninja -C _container_build/ && \
ninja -C _container_build/ install
# build gnome-shell
ADD ./gnome-shell /tmp/gnome-shell
RUN cd /tmp/gnome-shell && \
meson setup _container_build/ --prefix=/usr && \
ninja -C _container_build/ && \
ninja -C _container_build/ install
RUN rm -rf /tmp/* /var/*
RUN ostree container commit
RUN mkdir -p /var/tmp && chmod -R 1777 /var/tmp
And then build our own Silverblue based OS which includes our own gnome-shell: podman build -f GnomeShell.containerfile -t silverblue-gnome-shell-main
.
Pretty neat, but how do we get the changes between the Silverblue image and our own image deployed onto our host?
OCI images consist of layers. Each layer can add, remove and modify files from the previous layer. When we build our own image based on another image, our changes become new layers in the new image. With a bit of massaging we can extract those layers into a filesystem tree, such as /var/lib/extensions
:
#! /bin/bash
set -ouex pipefail
BASE_IMAGE=${BASE_IMAGE-quay.io/fedora-ostree-desktops/silverblue:38}
EXT_IMAGE=${EXT_IMAGE-localhost/silverblue-gnome-shell-main:latest}
SYSEXT_NAME=${SYSEXT_NAME-test-ext}
SYSEXT_PATH=/var/lib/extensions/$SYSEXT_NAME
function fail {
printf '%s\n' "$1" >&2 ## Send message to stderr.
exit "${2-1}" ## Return a code specified by $2, or 1 by default.
}
tmpdir="$(mktemp -d)"
trap 'rm -rf -- "$tmpdir"' EXIT
skopeo copy containers-storage:$BASE_IMAGE dir:$tmpdir/base-image
skopeo copy containers-storage:$EXT_IMAGE dir:$tmpdir/ext-image
base_info=$(skopeo inspect dir:$tmpdir/base-image --raw)
ext_info=$(skopeo inspect dir:$tmpdir/ext-image --raw)
base_layer_count=$(jq '.layers | length' <<< "$base_info")
ext_layer_count=$(jq '.layers | length' <<< "$ext_info")
if [[ $ext_layer_count -le $base_layer_count ]]; then
fail "ext image needs more layers than base image"
fi
for (( i=0; i<$base_layer_count; ++i)); do
base_digest=$(jq -r ".layers[${i}].digest" <<< "$base_info")
ext_digest=$(jq -r ".layers[${i}].digest" <<< "$ext_info")
if [[ "$base_digest" != "$ext_digest" ]]; then
fail "layer $i digest mismatch: base $base_digest, ext $ext_digest"
fi
done
sudo rm -rf "$SYSEXT_PATH"
sudo mkdir -p $SYSEXT_PATH/usr/lib/extension-release.d/
echo "ID=_any" | sudo tee $SYSEXT_PATH/usr/lib/extension-release.d/extension-release.$SYSEXT_NAME
for (( i=$base_layer_count; i<$ext_layer_count; ++i)); do
ext_digest=$(jq -r ".layers[${i}].digest" <<< "$ext_info")
ext_digest="${ext_digest#sha256:}"
sudo tar xf $tmpdir/ext-image/$ext_digest -C $SYSEXT_PATH
done
This script is horribly inefficient and copies all files around multiple times. It also doesn’t deal with so-called whiteout
files (the real filename, prefixed with .wh.
) which are used to indicate a file from the previous layers was removed.
One could expand on this concept and create a real program which doesn’t have those inefficiencies and handles removal of files properly. However, we can’t handle whiteout
files which remove files from the base image (i.e. our Silverblue system) with systemd-sysext
simply because it doesn’t support removing files from the base system. It only works with the temporary rpm-ostree usroverlay
.
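A first step towards handling removals would be to at least detect them. The sketch below lists the whiteout entries of a layer archive. list_whiteouts is a hypothetical helper and relies only on the .wh. file name prefix described above.

```shell
# Hypothetical helper: list whiteout entries in a layer tarball.
# $1: path to a layer tar archive
list_whiteouts() {
    # A whiteout is an entry whose file name starts with ".wh.".
    # grep exits non-zero when nothing matches, which is not an
    # error here, hence the "|| true".
    tar -tf "$1" | grep -E '(^|/)\.wh\.[^/]*$' || true
}
```

An entry such as usr/bin/.wh.old-tool means that usr/bin/old-tool from an earlier layer is gone in this layer, so a real tool would have to delete that file from the assembled tree instead of extracting the marker.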
All in all, this isn’t too bad. With a bit more investment this workflow could become usable. Can we do better?
Embracing new Operating Systems
Usually rpm-ostree
pulls updates/commits from an ostree repository but did you know that it can also pull OCI images from container registries?
Did you notice we have an OCI image lying around which contains exactly everything we want?
#! /bin/bash
set -ouex pipefail
IMAGE_NAME=$1
OCI_IMAGE_STORAGE="/var/development-images/${IMAGE_NAME}"
sudo rm -rf "${OCI_IMAGE_STORAGE}"
sudo mkdir -p "${OCI_IMAGE_STORAGE}"
podman save --format=oci-archive "${IMAGE_NAME}" | sudo tar -x -C "${OCI_IMAGE_STORAGE}"
sudo rpm-ostree rebase "ostree-unverified-image:oci:${OCI_IMAGE_STORAGE}"
# revert: rpm-ostree rebase fedora:fedora/38/x86_64/silverblue
We can deploy this image using the script above with rpm-ostree-deploy-image silverblue-gnome-shell-main
and reboot into it.
This is easy and robust! The only annoyance is that everything here takes a bit of time and requires a reboot. In a lot of cases this is fine. We develop in the toolbox, and if a specific case comes up that requires testing in the entire session or under real conditions for a long time, we can build our own OS like this and boot into it. When everything is done we can just rebase onto the upstream Silverblue repository using rpm-ostree rebase fedora:fedora/38/x86_64/silverblue
.
Can we do even better and get rid of the slow development loop?
Instead of installing gnome-shell directly into the image, we can stop after installing the dependencies and build tools, boot into this image, then build gnome-shell on our new host, install it into /var/lib/extensions
and then activate the extension with systemd-sysext refresh
. After the initial build of the image the development loop is almost the same as on a traditional system and just as fast.
Let’s start by copying the small helper scripts to ~/.local/bin
.
rpm-ostree-deploy-image
#! /bin/bash
set -ouex pipefail
if [ "$#" -ne 1 ]; then
    echo "Illegal number of parameters" >&2
    exit 1
fi
IMAGE_NAME=$1
OCI_IMAGE_STORAGE="/var/development-images/${IMAGE_NAME}"
sudo rm -rf "${OCI_IMAGE_STORAGE}"
sudo mkdir -p "${OCI_IMAGE_STORAGE}"
podman save --format=oci-archive "${IMAGE_NAME}" | sudo tar -x -C "${OCI_IMAGE_STORAGE}"
sudo rpm-ostree rebase "ostree-unverified-image:oci:${OCI_IMAGE_STORAGE}"
# revert: rpm-ostree rebase fedora:fedora/38/x86_64/silverblue
sysext-install
#! /bin/bash
set -eux
if [ "$#" -ne 2 ]; then
    echo "Illegal number of parameters" >&2
    exit 1
fi
BUILDDIR="$1"
EXTENSION_NAME="$2"
DESTDIR="/var/lib/extensions/$EXTENSION_NAME"
RELEASE_DIR="$DESTDIR/usr/lib/extension-release.d"
sudo mkdir -p "$DESTDIR"
sudo meson install --destdir="$DESTDIR" -C "$BUILDDIR" --no-rebuild
sudo mkdir -p "$RELEASE_DIR"
echo ID=_any | sudo tee "$RELEASE_DIR/extension-release.$EXTENSION_NAME"
Adjust the Dockerfile
a bit (make sure to put it in a directory containing both gnome-shell and mutter): GnomeShellDevelopment.containerfile
.
ARG IMAGE_NAME="${IMAGE_NAME:-silverblue}"
ARG SOURCE_IMAGE="${SOURCE_IMAGE:-silverblue}"
ARG BASE_IMAGE="quay.io/fedora-ostree-desktops/${SOURCE_IMAGE}"
ARG FEDORA_MAJOR_VERSION="${FEDORA_MAJOR_VERSION:-38}"
FROM ${BASE_IMAGE}:${FEDORA_MAJOR_VERSION} AS base
ARG IMAGE_NAME="${IMAGE_NAME}"
ARG FEDORA_MAJOR_VERSION="${FEDORA_MAJOR_VERSION}"
# setup dnf
RUN rpm-ostree install -y dnf
RUN dnf install -y 'dnf-command(builddep)'
# install development packages
RUN dnf groupinstall -y 'Development Tools'
RUN ln -s /usr/bin/ld.gold /usr/bin/ld # ???
RUN dnf install -y meson strace gdb valgrind sysprof
# install gnome shell and mutter specific dependencies
RUN dnf builddep -y gnome-shell mutter
RUN dnf install -y libei-devel libeis-devel asciidoc sassc
# dnf won't work on the running system, let's not confuse ourselves
RUN rpm-ostree uninstall -y dnf
RUN rm -rf /tmp/* /var/*
RUN ostree container commit
RUN mkdir -p /var/tmp && chmod -R 1777 /var/tmp
FROM base AS build
# build mutter
ADD ./mutter /tmp/mutter
RUN cd /tmp/mutter && \
meson setup _container_build/ --prefix=/usr && \
ninja -C _container_build/ && \
ninja -C _container_build/ install
# build gnome-shell
ADD ./gnome-shell /tmp/gnome-shell
RUN cd /tmp/gnome-shell && \
meson setup _container_build/ --prefix=/usr && \
ninja -C _container_build/ && \
ninja -C _container_build/ install
RUN rm -rf /tmp/*
Then run the entire Dockerfile
to make sure we actually got all the dependencies we need, tag the base stage as silverblue-38-gnome-shell-development
, deploy this image, and then finally boot into it.
podman build -f GnomeShellDevelopment.containerfile
podman build -f GnomeShellDevelopment.containerfile --target=base -t silverblue-38-gnome-shell-development
rpm-ostree-deploy-image silverblue-38-gnome-shell-development
systemctl reboot
After the reboot we can then actually build mutter and gnome-shell on the host, install it into the extension with the sysext-install
script and make it active with systemd-sysext refresh
. Finally we can enable systemd-sysext.service
to make it persist across reboots.
cd mutter
meson setup buildhost --prefix=/usr
ninja -C buildhost/
sysext-install buildhost/ gnome-shell-test
sudo systemd-sysext refresh
cd ../gnome-shell
meson setup buildhost --prefix=/usr
ninja -C buildhost/
sysext-install buildhost/ gnome-shell-test
sudo systemd-sysext refresh
gnome-shell --version
sudo systemctl enable systemd-sysext.service
Back to the basics
Did you know that rpm-ostree usroverlay
is not the only way to get a mutable overlay on top of the read-only /usr
tree? Me neither, until Ivan Molodetskikh pointed this out to me. ostree admin unlock --hotfix
gets us a persistent overlay even!
This way we can get around the biggest drawback of using systemd-sysext
: we can install directly into the filesystem tree and install-time build system integration such as the gschema
compiler will do its job perfectly.
So, instead of installing all the dependencies and build tools into our OS, we just layer dnf on top of it, enable the persistent overlay with ostree admin unlock --hotfix
, and start using it like a traditional system. We can install the dependencies with dnf builddep
, anything else we might need with dnf install
and then build and install everything as usual.
rpm-ostree install dnf
systemctl reboot
sudo ostree admin unlock --hotfix
sudo dnf install 'dnf-command(builddep)'
sudo dnf builddep gnome-shell mutter --allowerasing
sudo dnf install libei-devel libeis-devel asciidoc sassc
cd mutter
meson setup buildhost --prefix=/usr
ninja -C buildhost/
sudo ninja -C buildhost/ install
cd ../gnome-shell
meson setup buildhost --prefix=/usr
ninja -C buildhost/
sudo ninja -C buildhost/ install
The dependencies in the dnf repo and in the Silverblue image can differ slightly, which might require passing --allowerasing
when installing new stuff with dnf.
ostree admin unlock --hotfix
creates a new rollback target. To abandon the experiment and roll back to our previous state, we can simply use rpm-ostree rollback -r
and clean up our messy deployment with rpm-ostree cleanup -r
after successfully booting into the previous, good version.
Tomáš Popela pointed out that you can get dnf
even without overlaying it using rpm-ostree
by downloading and installing microdnf
and then installing dnf
with it. This way we can keep our base system clean, don’t have to reboot and don’t accidentally run dnf
when /usr
is read-only.
#!/bin/sh
set -e
tmp=$(mktemp -d)
cleanup () {
rm -rf -- "$tmp"
}
trap cleanup EXIT
set -x
curl --fail --silent --show-error --remote-name-all --output-dir "$tmp" \
https://kojipkgs.fedoraproject.org//packages/microdnf/3.9.0/2.fc38/x86_64/microdnf-3.9.0-2.fc38.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/libpeas/1.34.0/3.fc38/x86_64/libpeas-1.34.0-3.fc38.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/dnf/4.14.0/2.fc38/noarch/dnf-data-4.14.0-2.fc38.noarch.rpm \
https://kojipkgs.fedoraproject.org//packages/libdnf/0.68.0/2.fc38/x86_64/libdnf-0.68.0-2.fc38.x86_64.rpm
successful=$?
if [ $successful != 0 ] ; then
exit 1
fi
if ! sudo rpm -i "$tmp"/*.rpm; then
echo "Can't install microdnf!"
exit 1
fi
if ! sudo microdnf --assumeyes install dnf 'python3-dnf-plugins*'; then
exit 1
fi
All in all, this seems to be the best option. It temporarily transforms the Silverblue setup into a traditional, package-based system with a mutable directory tree which can be rolled back at any point. It’s easy to use, works well and doesn’t require any further trickery.