Transactional package manager, declarative GNU/Linux distribution, reproducible deployment tool, and more!
Reepca Russelstein c659f977bb
daemon: add seccomp filter for slirp4netns.
The container that slirp4netns runs in should already be quite difficult to do
anything malicious in beyond basic denial of service or sending of network
traffic.  There is, however, one hole remaining in the case in which there is
an adversary able to run code locally: abstract unix sockets.  Because these
are governed by network namespaces, not IPC namespaces, and slirp4netns is in
the root network namespace, any process in the root network namespace can
cooperate with the slirp4netns process to take over its user.

To close this, we use seccomp to block the creation of unix-domain sockets by
slirp4netns.  This requires some finesse, since slirp4netns absolutely needs
to be able to create other types of sockets - at minimum AF_INET and AF_INET6.

Seccomp has many, many pitfalls.  To name a few:

1. Seccomp provides you with an "arch" field, but this does not uniquely
   determine the ABI being used; the actual meaning of a system call number
   depends on both the number (which is often the result of ORing a related
   system call with a flag for an alternate ABI) and the architecture.

2. Seccomp provides no direct way of knowing what the native value for the
   arch field should be; the user must do configure/compile-time testing for
   every architecture+ABI combination they want to support.  Amusingly enough,
   the linux-internal header files have this exact information
   (SECCOMP_ARCH_NATIVE), but they aren't sharing it.

3. The only system call numbers we naturally have are the native ones in
   asm/unistd.h.  __NR_socket will always refer to the system call number for
   the target system's ABI.

4. Seccomp can only manipulate 32-bit words, but represents every system call
   argument as a uint64.

5. New system call numbers with as-yet-unknown semantics can be added to the
   kernel at any time.

6. Based on this comment in arch/x86/entry/syscalls/syscall_32.tbl:

   # 251 is available for reuse (was briefly sys_set_zone_reclaim)

   previously-invalid system call numbers may later be reused for new system
   calls.

7. Most architecture+ABI combinations have system call tables with many gaps
   in them.  arm-eabi, for example, has 35 such gaps (note: this is just the
   number of distinct gaps, not the number of system call numbers contained in
   those gaps).

8. Seccomp's BPF filters require a fully-acyclic control flow graph.
   Any operation on a data structure must therefore first be fully
   unrolled before it can be run.

9. Seccomp cannot dereference pointers.  Only the raw bits provided to the
   system calls can be inspected.

10. Some architecture+ABI combos have multiplexer system calls.  For example,
    socketcall can perform any socket-related system call.  The arguments to
    the multiplexed system call are passed indirectly, via a pointer to user
    memory.  They therefore cannot be inspected by seccomp.

11. Some valid system calls are not listed in any table in the kernel source.
    For example, __ARM_NR_cacheflush is an "ARM private" system call.  It does
    not appear in any *.tbl file.

12. Conditional branches are limited to relative jumps of at most 256
    instructions forward.

13. Prior to Linux 4.8, any process able to spawn another process and call
    ptrace could bypass seccomp restrictions.

To address (1), (2), and (3), we include preprocessor checks to identify the
native architecture value, and reject all system calls that don't use the
native architecture.

To address (4), we use the AC_C_BIGENDIAN autoconf check to conditionally
define WORDS_BIGENDIAN, and match up the proper portions of any uint64 we test
for with the value in the accumulator being tested against.
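
To illustrate the word selection described for (4), here is a minimal sketch in Python (not the C++ of the actual filter) of where the low and high 32-bit halves of a 64-bit argument sit for each byte order.  The helper name word_offsets and the sample value are assumptions made for this example only.

```python
import struct


def word_offsets(big_endian: bool) -> dict:
    """Return the byte offsets of the low and high 32-bit words of a uint64."""
    value = 0x1122334455667788  # high word 0x11223344, low word 0x55667788
    raw = struct.pack(">Q" if big_endian else "<Q", value)
    word_format = ">I" if big_endian else "<I"
    offsets = {}
    for offset in (0, 4):
        (word,) = struct.unpack_from(word_format, raw, offset)
        offsets["low" if word == 0x55667788 else "high"] = offset
    return offsets


print("little-endian:", word_offsets(big_endian=False))
print("big-endian:   ", word_offsets(big_endian=True))
```

A filter that loads 32 bits at a time therefore has to pick a different offset for "the low half of argument N" depending on the byte order, which is the kind of decision a compile-time flag such as WORDS_BIGENDIAN can drive.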

To address (5) and (6), we use system call pinning.  That is, we hardcode a
snapshot of all the valid system call numbers at the time of writing, and
reject any system call numbers not in the recorded set.  A set is recorded for
every architecture+ABI combo, and the native one is chosen at compile-time.
This ensures that not only are non-native architectures rejected, but so are
non-native ABIs.  For the sake of conciseness, we represent these sets as sets
of disjoint ranges.  Due to (7), checking each range in turn could add a lot
of overhead to each system call, so we instead binary search through the
ranges.  Due to (8), this binary search has to be fully unrolled, so we do
that too.
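
The range lookup can be sketched as follows.  This is an illustration in Python, not the generated BPF program: the ranges in PINNED_RANGES are made up, and the name is_pinned is hypothetical.  It only shows why a sorted list of disjoint ranges can be searched in a logarithmic number of comparisons.

```python
from bisect import bisect_right

# Made-up snapshot: inclusive (first, last) ranges, sorted and disjoint.
PINNED_RANGES = [(0, 134), (136, 181), (184, 214), (217, 334)]


def is_pinned(nr, ranges=PINNED_RANGES):
    """Return True if system call number NR falls inside one of RANGES."""
    starts = [first for first, _ in ranges]
    # Index of the last range whose first element is <= nr, or -1 if none.
    index = bisect_right(starts, nr) - 1
    return index >= 0 and nr <= ranges[index][1]


for nr in (0, 135, 136, 182, 334, 335):
    print(nr, is_pinned(nr))
```

In a seccomp filter the same comparisons have to be laid out ahead of time as a tree of forward jumps, since no loop is available to drive the search.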

It can be tedious and error-prone to manually produce the syscall ranges by
looking at linux's *.tbl files, since the gaps are often small and
uncommented.  To address this, a script, build-aux/extract-syscall-ranges.sh,
is added that will produce them given a *.tbl filename and an ABI regex (some
tables seem to abuse the ABI field with strange values like "memfd_secret").
Note that producing the final values still requires looking at the proper
asm/unistd.h file to find any private numbers and to identify any offsets and
ABI variants used.
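
The idea of collapsing a table into ranges can be sketched like this.  This is not build-aux/extract-syscall-ranges.sh itself, whose exact behavior is not reproduced here; it is a Python illustration that assumes a whitespace-separated "number abi name entry-point" layout, and both the function name syscall_ranges and the sample table are made up.

```python
import re


def syscall_ranges(tbl_text, abi_regex):
    """Collapse the numbers whose ABI field matches ABI_REGEX into ranges."""
    numbers = set()
    for line in tbl_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip blank lines, comments, and malformed entries
        if re.fullmatch(abi_regex, fields[1]):
            numbers.add(int(fields[0]))
    ranges = []
    for nr in sorted(numbers):
        if ranges and nr == ranges[-1][1] + 1:
            ranges[-1][1] = nr  # extend the current range
        else:
            ranges.append([nr, nr])  # a gap: start a new range
    return [tuple(r) for r in ranges]


# Made-up excerpt; the names and numbers are for illustration only.
SAMPLE = """\
# number abi name entry-point
0 common example_a sys_example_a
1 common example_b sys_example_b
2 x32 example_c sys_example_c
3 common example_d sys_example_d
5 64 example_e sys_example_e
"""

print(syscall_ranges(SAMPLE, r"common|64"))
```

Private numbers and ABI offsets defined only in asm/unistd.h would still have to be merged in by hand, as noted above.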

(10) used to have no good solution, but in the past decade most architectures
have gained dedicated system call alternatives to at least socketcall, so we
can (hopefully) just block it entirely.

To address (13), we block ptrace also.

* build-aux/extract-syscall-ranges.sh: new script.
* Makefile.am (EXTRA_DIST): register it.
* config-daemon.ac: use AC_C_BIGENDIAN.
* nix/libutil/spawn.cc (setNoNewPrivsAction, addSeccompFilterAction): new
  functions.
* nix/libutil/spawn.hh (setNoNewPrivsAction, addSeccompFilterAction): new
  declarations.
  (SpawnContext)[setNoNewPrivs, addSeccompFilter]: new fields.
* nix/libutil/seccomp.hh: new header file.
* nix/libutil/seccomp.cc: new file.
* nix/local.mk (libutil_a_SOURCES, libutil_headers): register them.
* nix/libstore/build.cc (slirpSeccompFilter, writeSeccompFilterDot):
  new functions.
  (spawnSlirp4netns): use them, set seccomp filter for slirp4netns.

Change-Id: Ic92c7f564ab12596b87ed0801b22f88fbb543b95
Signed-off-by: John Kehayias <john.kehayias@protonmail.com>
2025-06-24 10:07:58 -04:00

-*- mode: org -*-

GNU Guix (IPA: ɡiːks) is a purely functional package manager, and associated free software distribution, for the GNU system. In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection.

It provides Guile Scheme APIs, including high-level embedded domain-specific languages (EDSLs), to describe how packages are to be built and composed.

GNU Guix can be used on top of an already-installed GNU/Linux distribution, or it can be used standalone (we call that “Guix System”).

Guix is based on the Nix package manager.

Requirements

If you are building Guix from source, please see the manual for build instructions and requirements, either by running:

info -f doc/guix.info "Requirements"

or by checking the web copy of the manual.

Installation

See the manual for the installation instructions, either by running

info -f doc/guix.info "Installation"

or by checking the web copy of the manual.

Building from Git

For information on building Guix from a Git checkout, please see the relevant section in the manual, either by running

info -f doc/guix.info "Building from Git"

or by checking the web copy of the manual.

How It Works

Guix does the high-level preparation of a derivation. A derivation is the promise of a build; it is stored as a text file under /gnu/store/xxx.drv. The (guix derivations) module provides the `derivation' primitive, as well as higher-level wrappers such as `build-expression->derivation'.

Guix does remote procedure calls (RPCs) to the build daemon (the guix-daemon command), which in turn performs builds and accesses to the store on its behalf. The RPCs are implemented in the (guix store) module.

Contact

GNU Guix is hosted at https://savannah.gnu.org/projects/guix/.

Please email <help-guix@gnu.org> for questions and <bug-guix@gnu.org> for bug reports; email <gnu-system-discuss@gnu.org> for general issues regarding the GNU system.

Join #guix on irc.libera.chat.

Guix & Nix

GNU Guix is based on the Nix package manager. It implements the same package deployment paradigm, and in fact it reuses some of its code. Yet, different engineering decisions were made for Guix, as described below.

Nix is really two things: a package build tool, implemented by a library and daemon, and a special-purpose programming language. GNU Guix relies on the former, but uses Scheme as a replacement for the latter.

Using Scheme instead of a specific language allows us to get all the features and tooling that come with Guile (compiler, debugger, REPL, Unicode, libraries, etc.). It also means that we have a general-purpose language, on top of which we can have embedded domain-specific languages (EDSLs), such as the one used to define packages. This broadens what can be done in package recipes themselves, and what can be done around them.

Technically, Guix makes remote procedure calls to the nix-worker daemon to perform operations on the store. At the lowest level, Nix “derivations” represent promises of a build, stored in .drv files in the store. Guix produces such derivations, which are then interpreted by the daemon to perform the build. Thus, Guix derivations can use derivations produced by Nix (and vice versa).

With Nix and the Nixpkgs distribution, package composition happens at the Nix language level, but builders are usually written in Bash. Conversely, Guix encourages the use of Scheme for both package composition and builders. Likewise, the core functionality of Nix is written in C++ and Perl; Guix relies on some of the original C++ code, but exposes all the API as Scheme.

Related software

  • Nix, Nixpkgs, and NixOS, functional package manager and associated software distribution, are the inspiration for Guix
  • GNU Stow builds around the idea of one directory per prefix, and a symlink tree to create user environments
  • STORE shares the same idea
  • GNOME's OSTree allows bootable system images to be built from a specified set of packages
  • The GNU Source Release Collection (GSRC) is a user-land software distribution; unlike Guix, it relies on core tools available on the host system

Copyright Notices

GNU Guix is made available under the GNU GPL version 3 or later license, and authors retain their copyright. For copyright notices, we adhere to the guidance documented in (info "(maintain) Copyright Notices"), and explicitly allow ranges instead of individual years. Here's an example of the preferred style used for copyright notices in source file headers:

Copyright © 2019-2023, 2025 Your Name <your@email.com>

This means that copyrightable changes were made in the years 2019, 2020, 2021, 2022, 2023 and 2025.