The arch field is the same for x86_64 and x32, so checking it
is not enough: simply using x32 system calls would allow a bypass.
Thus, we must also check whether the system call number has
__X32_SYSCALL_BIT set.
This is admittedly a lazy solution; we could instead add each
system call number + __X32_SYSCALL_BIT to our black/whitelists.
For now, however, this will do.
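
A minimal sketch of such a check, assuming a classic seccomp BPF filter.
The names x32_check and is_x32_syscall() are hypothetical, and
SECCOMP_RET_KILL merely stands in for whatever action the real filter
takes:

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Normally provided by <asm/unistd.h> on x86; fallback for other setups. */
#ifndef __X32_SYSCALL_BIT
#define __X32_SYSCALL_BIT 0x40000000
#endif

/* Plain C equivalent of the BPF test below (hypothetical helper). */
static int is_x32_syscall(unsigned int nr)
{
    return (nr & __X32_SYSCALL_BIT) != 0;
}

/* Filter fragment (hypothetical name): reject any x32 system call.
 * It is meant to be placed in front of the per-syscall rules. */
static struct sock_filter x32_check[] = {
    /* Load the system call number into the accumulator. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* Bit set: fall through to the kill. Bit clear: skip over it. */
    BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, __X32_SYSCALL_BIT, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
};
```

Rejecting the whole x32 range up front keeps the lists unchanged, at the
cost of not supporting x32 callers at all.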

The filter was missing this check for arch, allowing bypasses
that use the calling conventions of other architectures.
A trivial example is the x86 execve() called from an x86_64 process.
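
For illustration, the arch check could look roughly like this in a
classic seccomp BPF filter. The array name arch_check is hypothetical,
and SECCOMP_RET_KILL is an assumption about the desired action:

```c
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Filter fragment (hypothetical name): only accept native x86_64 calls.
 * Without it, number 11 would be matched as munmap (x86_64) even when
 * the caller entered via the x86 ABI, where 11 is execve. */
static struct sock_filter arch_check[] = {
    /* Load the architecture of the calling convention in use. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* x86_64: skip the kill. Anything else: fall through to it. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
};
```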

The purpose of these new functions is to make it simpler for users
to add new syscalls to the whitelist and blacklist.
The current approach uses a user-supplied pointer, which
is difficult to manage with "no_fs", as that option may add system calls
to the blacklist. We must then resize the arrays, and suddenly
it's our job to free them.
As a bonus, implementing them here makes data structure changes
easier and decreases the chance that users of this API
do something wrong, like forgetting the -1 at the end.
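
A rough sketch of the idea, not the actual API: the library owns a
growable list, so callers neither resize arrays nor terminate them
with -1. All names here (struct syscall_list, syscall_list_append(),
syscall_list_free()) are hypothetical:

```c
#include <stdlib.h>

/* Library-owned, growable list of syscall numbers (hypothetical type). */
struct syscall_list {
    long *items;
    size_t len;
    size_t cap;
};

/* Append one syscall number; returns 0 on success, -1 if out of memory. */
static int syscall_list_append(struct syscall_list *list, long nr)
{
    if (list->len == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 8;
        long *tmp = realloc(list->items, new_cap * sizeof(*tmp));
        if (tmp == NULL)
            return -1; /* the old buffer stays valid */
        list->items = tmp;
        list->cap = new_cap;
    }
    list->items[list->len++] = nr;
    return 0;
}

/* Release the storage; the list can be reused afterwards. */
static void syscall_list_free(struct syscall_list *list)
{
    free(list->items);
    list->items = NULL;
    list->len = 0;
    list->cap = 0;
}
```

Because the length is tracked explicitly, no sentinel is needed, and an
option such as "no_fs" can append entries without touching user memory.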

Landlock can grant write access without implying read access,
in contrast to the existing bind mount solution. Hence, remove
ALLOW_READ from the ALLOW_WRITE bitmask.
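
The effect can be sketched as follows. The bit values and the helpers
may_read() and may_write() are assumptions for illustration, not the
real definitions:

```c
/* Bit values are assumptions for illustration only. */
#define ALLOW_READ  (1u << 0)
#define ALLOW_WRITE (1u << 1) /* previously: ((1u << 1) | ALLOW_READ) */

/* Hypothetical helpers that test a policy bitmask. */
static int may_read(unsigned int policy)
{
    return (policy & ALLOW_READ) != 0;
}

static int may_write(unsigned int policy)
{
    return (policy & ALLOW_WRITE) != 0;
}
```

With this definition, a policy of ALLOW_WRITE alone is write-only, and
callers who want both must request ALLOW_READ | ALLOW_WRITE explicitly.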

Previously, we needed chroot and bind mounts to enforce path_policies.
Therefore, in the presence of path policies, we had to explicitly create
a chroot dir.
With the coming Landlock support, this is no longer required.
One might still want to chroot and set bind mount flags, but
path policies don't dictate that anymore.