88 Commits

Author SHA1 Message Date
6b513f8339 pledge: Add prctl() default filter 2021-12-27 12:35:54 +01:00
d2357ac676 pledge: Introduce clone() filter and EXILE_SYSCALL_PLEDGE_THREAD 2021-12-27 12:35:54 +01:00
0b0dda0de1 pledge: Begin filter for setsockopt() args 2021-12-27 12:35:54 +01:00
7115ef8b4d Begin a pledge()-like implementation
This begins a pledge() implementation. This also
retires the previous syscall grouping approach,
as pledge() is the superior mechanism.

Squashed:
test: Begin basic pledge test
pledge: Begin EXILE_SYSCALL_PLEDGE_UNIX/EXILE_SYSCALL_PLEDGE_INET
test: Add pledge socket test
Introduce EXILE_SYSCALL_PLEDGE_DENY_ERROR, remove exile_policy->pledge_policy
pledge: Add PROT_EXEC
2021-12-27 12:35:54 +01:00
15a6850023 Begin low-level seccomp arg filter interface
Squashed:
test: Adjust existing to new API with arg filters
test: Add tests for low-level seccomp args filter API
test: Add seccomp_filter_mixed()
test: Switch to syscall() everywhere
append_syscall_to_bpf(): Apply EXILE_SYSCALL_EXIT_BPF_NO_MATCH also for sock_filter.jt
2021-12-27 12:35:54 +01:00
48deab0dde exile_enable_policy(): Only chdir() post chroot() 2021-12-27 12:35:35 +01:00
ce7eb57998 enter_namespaces(): Fix error message 2021-12-27 12:35:35 +01:00
3407fded04 Add EXILE_FS_ALLOW_ALL_{READ,WRITE}
Issue: #19
2021-12-27 00:30:52 +01:00
1b4c5477a5 rename to exile.h
qssb.h was a preliminary name and can't be pronounced smoothly.

exile.h is more fitting and it's also short. Something exiled is essentially
something isolated, which is pretty much what this library does (isolation from
resources such as file system, network and others accessible by system calls).
2021-11-30 18:19:15 +01:00
756b0fb421 rename qssb.h to exile.h 2021-11-30 17:40:36 +01:00
d150c2ecd9 Don't add any seccomp rules by default
Cannot be done properly on a pure syscall basis at this point.

A whitelist is almost certainly too restrictive, which means the user
has to manually adjust the policy anyway. Then the default is not
of much use. Or it is too permissive.

A blacklist has to play catch-up with new kernel versions. This may
be improved upon by blocking all unknown (too new) syscall
numbers. However, in light of the fact we drop caps and set no_new_privs,
it's debatable how much we can gain from a blacklist anyway.

So best to leave it to the user. We also need to allow checking args
in order to make it easier to build policies. Perhaps get
inspiration from pledge() in OpenBSD.
2021-11-20 20:54:28 +01:00
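The message above relies on no_new_privs already being set. As a minimal sketch of what that means on Linux (not the library's own code; the helper names `set_no_new_privs` and `has_no_new_privs` are hypothetical), the flag can be set and queried with prctl():

```c
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the calling thread.
 * Once set, execve() can no longer grant privileges (e.g. via setuid
 * binaries), and the flag cannot be unset again. Returns 0 on success. */
static int set_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Hypothetical helper: returns 1 if no_new_privs is set, 0 if not,
 * and -1 on error. */
static int has_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Setting the flag is also what lets an unprivileged process install a seccomp filter in the first place.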
435bcefa48 test: Skip landlock specific tests if unavailable during compile time 2021-11-20 19:25:30 +01:00
2a4cee2ece test: Use xqssb_enable_policy() throughout where reasonable 2021-11-20 16:56:19 +01:00
d847d0f996 qssb_append_group_syscall_policy(): Make QSSB_SYSCGROUP_NONE an invalid group 2021-11-14 21:46:47 +01:00
1a2443db18 qssb_append_syscalls_policy(): Fix mem leak on failure 2021-11-14 21:46:47 +01:00
db17e58deb Assign syscalls into groups. Add whitelist mode (default).
Classify syscalls into groups, for x86_64 only for now.
Up to date for 5.15, generate some #ifndef for syscalls
introduced since 5.10. Only support x86_64 therefore at this point.

Switch from blacklisting to a default whitelist.
2021-11-14 21:46:47 +01:00
0d7c5bd6d4 append_syscall_to_bpf(): Explicit type cast to fix (C++) warnings 2021-10-25 18:18:31 +02:00
55e1f42ca8 check_policy_sanity(): Initialize last_policy 2021-10-03 21:25:37 +02:00
11d64c6fcf enter_namespaces(): Check fopen/fprintf errors 2021-09-12 20:00:03 +02:00
ebe043c08d Fix missing \n in some error outputs 2021-09-12 19:50:05 +02:00
8bc0d1e73a Use overflow-safe operator builtins
As a precaution as it does not hurt
2021-09-12 19:47:45 +02:00
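The commit does not say where the builtins are used. As an illustration only, assuming GCC or Clang, a size computation guarded by `__builtin_add_overflow` and `__builtin_mul_overflow` might look like this (the helper name `checked_array_size` is hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: compute (count + extra) * elem_size into *out_bytes.
 * Returns false if either step would overflow size_t; in that case
 * *out_bytes must not be used. */
static bool checked_array_size(size_t count, size_t extra, size_t elem_size,
                               size_t *out_bytes)
{
    size_t total;

    /* Both builtins return true when the exact result does not fit. */
    if (__builtin_add_overflow(count, extra, &total))
        return false;
    if (__builtin_mul_overflow(total, elem_size, out_bytes))
        return false;
    return true;
}
```

A caller would check the return value before passing the size to malloc() or realloc().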
215032f32c enable_no_fs(): Fix corresponding test by adding missing default policy 2021-09-06 21:43:50 +02:00
411e00715d Rename qssb_append_default_syscall_policy() to better distinguish it from qssb_append_syscall_default_policy() 2021-09-05 17:24:42 +02:00
8a9b1730de test: Remove argc,argv from tests as there was no use for them 2021-09-05 17:12:25 +02:00
b2b501d97e test: Refactor: Put seccomp tests into child processes; Simplify .sh
Refactor the test logic. Seccomp tests that can be
killed run in their own subprocess now.

All test functions now return 0 on success. Therefore,
the shell script can be simplified.
2021-09-05 17:12:25 +02:00
26f391f736 test: implement test_seccomp_errno() 2021-09-05 17:12:25 +02:00
68fd1a0a87 test: test_seccomp_blacklisted_call_permitted(): Add missing default policy 2021-09-05 17:12:25 +02:00
b0d0beab22 README.md: Update 2021-09-05 17:12:25 +02:00
c44ce85628 test: Add test ensuring seccomp ends with default rule, minor fixes 2021-09-05 17:12:25 +02:00
25d8ed9bca check_policy_sanity(): Add syscall policy checks 2021-09-05 17:12:25 +02:00
e389140436 test.sh: Log exit code, print yes/no instead of 1/0 2021-09-05 17:12:25 +02:00
f6af1bb78f policy: Add disable_syscall_filter policy. Add defaults only on enable.
Only add default syscall policy when disable_syscall_filter is 0 (default)
and no user-custom policy has been added.
2021-09-05 17:12:25 +02:00
9192ec3aa4 Rewrite syscall policy logic
Instead of having a blacklist and whitelist, we now allow
setting a policy that runs as a chain.

This adds qssb_append_syscalls_policy()

Furthermore, add a feature to decide per syscall which action to take.
This now allows returning an error instead of just killing the process.

In the future, it may allow us to optimize/shrink the BPF filter.
2021-09-05 17:12:03 +02:00
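To illustrate the idea of a policy that runs as a chain with a per-syscall action, here is a small user-space model. It is a sketch, not the library's data structures: `struct rule`, `enum action` and `evaluate_chain` are hypothetical names, and the real decision is made by a BPF program in the kernel.

```c
#include <stddef.h>

/* Hypothetical actions a rule can select. */
enum action { ACTION_ALLOW, ACTION_DENY_ERRNO, ACTION_KILL };

/* Hypothetical rule: one syscall number and the action to take.
 * A syscall_nr of -1 marks a default rule that matches any syscall. */
struct rule {
    long syscall_nr;
    enum action action;
};

/* Walk the chain in order and return the action of the first match.
 * If the chain has no default rule, fail closed and kill. */
static enum action evaluate_chain(const struct rule *chain, size_t n, long nr)
{
    for (size_t i = 0; i < n; i++) {
        if (chain[i].syscall_nr == nr || chain[i].syscall_nr == -1)
            return chain[i].action;
    }
    return ACTION_KILL;
}
```

Because the first match wins, the order of the rules matters, and a trailing default rule decides what happens to everything not listed.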
51844ea3ab bpf: Deny x32 system calls for now
The arch field is the same for x86_64 and x32, thus checking it
is not enough.

Simply using x32 system calls would allow a bypass. Thus,
we must check whether the system call number has __X32_SYSCALL_BIT set.

This is of course a lazy solution; we could also add the
same system call number + __X32_SYSCALL_BIT to our black/whitelists.

For now, however, this will do.
2021-08-12 12:25:12 +02:00
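One way to express such a check in classic BPF is sketched below. It is not necessarily how the commit implements it: the bit test via BPF_JSET and the names `X32_SYSCALL_BIT`, `deny_x32` and `is_x32_syscall` are assumptions made for this example; the constant mirrors the kernel's __X32_SYSCALL_BIT.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Same value as the kernel's __X32_SYSCALL_BIT; own name to avoid clashes. */
#define X32_SYSCALL_BIT 0x40000000u

/* Hypothetical filter fragment: kill the process if the syscall number
 * has the x32 bit set, otherwise allow. */
static const struct sock_filter deny_x32[] = {
    /* A = seccomp_data.nr */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* if (A & X32_SYSCALL_BIT) go to the next instruction, else skip one */
    BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, X32_SYSCALL_BIT, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* The same test in plain C, for illustration. */
static int is_x32_syscall(unsigned int nr)
{
    return (nr & X32_SYSCALL_BIT) != 0;
}
```

In a real filter the final allow would be replaced by the remaining syscall checks.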
66c6d28dcd bpf: Check arch value
The filter was missing this check for arch, allowing bypasses
by using different calling conventions of other architectures.

A trivial example is execve() of x86 from an x86_64 process.
2021-08-12 11:57:13 +02:00
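The bypass works because syscall numbers differ between ABIs: execve() is 59 on x86_64 but 11 on 32-bit x86, so a filter that only compares numbers sees a different call. A minimal sketch of an arch check at the start of a filter, assuming an x86_64-only policy (the array name `arch_check` is hypothetical, and this is not the commit's actual code):

```c
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Hypothetical filter prologue: only continue if the syscall was made
 * through the x86_64 ABI, otherwise kill the process. */
static const struct sock_filter arch_check[] = {
    /* A = seccomp_data.arch */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* if (A == AUDIT_ARCH_X86_64) skip the kill, else fall through to it */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    /* The syscall-number checks would follow here; allow as a placeholder. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
```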
5cd45c09b7 bpf: Use SECCOMP_RET_KILL_PROCESS instead of SECCOMP_RET_KILL
We generally want to kill the process, not the thread.
2021-08-12 11:40:29 +02:00
fa06287b13 Use new qssb_append_*_syscall functions, remove old fields 2021-08-12 11:37:19 +02:00
68694723fe Begin qssb_append_*_syscall family of functions
The purpose of these new functions is to make it simpler for users
to add new syscalls to the whitelist and blacklist.

The current approach uses a user-supplied pointer, which however
was difficult to manage with "no_fs", which may add system calls
to the blacklist. Then we must resize arrays, and suddenly
it's our job to free them.

As a bonus, implementing them here allows easier data structure
changes and decreases the chances that users of this API
do something wrong, like forgetting -1 at the end, etc.
2021-08-12 11:37:19 +02:00
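The ownership argument above can be sketched with a list that the library grows itself, so callers never pass a -1-terminated array or free anything by hand. This is an illustration, not the qssb_append_*_syscall implementation; `struct syscall_list`, `syscall_list_append` and `syscall_list_free` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list owned by the library; the count replaces a terminator. */
struct syscall_list {
    long *items;
    size_t count;
};

/* Append n syscall numbers. Returns 0 on success, -1 on failure.
 * On failure the list is left unchanged and still owns its memory. */
static int syscall_list_append(struct syscall_list *list, const long *nrs,
                               size_t n)
{
    if (n == 0)
        return 0;
    /* Reject sizes whose byte count would overflow size_t. */
    if (n > SIZE_MAX / sizeof(long) - list->count)
        return -1;

    long *grown = realloc(list->items, (list->count + n) * sizeof(long));
    if (grown == NULL)
        return -1;

    memcpy(grown + list->count, nrs, n * sizeof(long));
    list->items = grown;
    list->count += n;
    return 0;
}

/* Release the list's memory and reset it to the empty state. */
static void syscall_list_free(struct syscall_list *list)
{
    free(list->items);
    list->items = NULL;
    list->count = 0;
}
```

Keeping the realloc() result in a temporary pointer means a failed append does not leak or lose the existing entries.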
4a4d551e75 Introduce "no_fs" and "no_new_fd" options.
no_fs is a simple way to take away all
FS access, without constructing path_policies etc.

no_new_fd disallows opening any new
file descriptors
2021-08-10 16:58:43 +02:00
57238b535c Expand disallowed system calls
Relevant: #8
2021-08-10 16:57:44 +02:00
b4e8116c20 seccomp_enable_whitelist(): Fix comment 2021-08-10 16:55:58 +02:00
75f607bc35 qssb_append_path_policies(): Add explicit type cast for c++ 2021-08-07 12:05:58 +02:00
a585db7778 qssb_free_policy(): Allow passing NULL 2021-06-08 22:04:46 +02:00
55ec51ba21 Improve and add functions comments 2021-06-08 22:04:46 +02:00
ade022ba62 update README 2021-06-08 22:04:26 +02:00
c57c79fa36 test: Log output of individual tests 2021-06-06 09:27:45 +02:00
5138d88b12 test: Count succeeded/failed tests 2021-06-06 09:02:30 +02:00
b8d6c78780 test: Rename fail(), echogreen() 2021-06-06 08:57:24 +02:00
a7c04537f7 Rename allowed_syscalls to whitelisted_syscalls for consistency 2021-06-05 20:15:09 +02:00
85c01899a9 Start implementing tests 2021-06-05 20:11:07 +02:00