Compare commits

...

2 commits

Author SHA1 Message Date
42d44b0cc1 README.md: Minor improvements throughout the file 2022-06-06 14:07:37 +02:00
bd3641981c Introduce EXILE_SYSCALL_DENY_RET_NOSYS for syscalls like clone3()
clone3() is used more and more, but we cannot filter it. We can either
allow it fully or return ENOSYS. Some libraries perform fallbacks to the
older clone() in that case, which we can filter again.
2022-06-06 14:07:37 +02:00
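The fallback behaviour that the second commit message relies on can be sketched roughly as follows. This is only an illustration, not code from exile.h or from any libc: `spawn_with_fallback`, `fake_clone3_nosys`, `fake_clone3_eacces` and `fake_clone` are hypothetical names, and the "system calls" are simulated so the sketch runs without a seccomp filter. It assumes a caller that retries with the older call only when the newer one reports ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in type for a process-creation call. */
typedef long (*spawn_fn)(void);

/* Simulates clone3() under a filter that answers with ENOSYS. */
static long fake_clone3_nosys(void) { errno = ENOSYS; return -1; }

/* Simulates clone3() under a filter that answers with EACCES. */
static long fake_clone3_eacces(void) { errno = EACCES; return -1; }

/* Simulates a successful legacy clone(); 1234 is an arbitrary fake PID. */
static long fake_clone(void) { return 1234; }

/*
 * Try the newer call first. Retry with the older call only if the newer one
 * looks unimplemented (ENOSYS); any other error is passed to the caller.
 */
static long spawn_with_fallback(spawn_fn newer, spawn_fn older, int *used_fallback)
{
    assert(newer != NULL && older != NULL && used_fallback != NULL);
    *used_fallback = 0;
    long ret = newer();
    if (ret == -1 && errno == ENOSYS) {
        *used_fallback = 1;
        ret = older();
    }
    return ret;
}
```

With `fake_clone3_nosys` the caller ends up in `fake_clone`, which a filter can inspect again. With `fake_clone3_eacces` no retry happens and the error surfaces directly, which is presumably why the commit switches the denied clone3() case from EACCES to ENOSYS.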
4 changed files with 39 additions and 21 deletions

27
README.md

@@ -11,7 +11,7 @@ a first impression.
 system() is used to keep the example C code short. It also demonstrates that subprocesses are also subject to restrictions imposed by exile.h.
-While the example show different features separately, it is generally possible to combine those.
+While the examples show different features separately, it is generally possible to combine those.
 ### Filesystem isolation
 ```c
@@ -43,8 +43,8 @@ int main(void)
 The assert() calls won't be fired, consistent with the policy.
 ### System call policies / vows
-exile.h allows specifying which syscalls are permitted or denied. In the folloing example,
-ls is never executed, as the specificed "vows" do not allow the execve() system call. The
+exile.h allows specifying which syscalls are permitted or denied. In the following example,
+ls is never executed, as the specified "vows" do not allow the execve() system call. The
 process will be killed.
 ```c
@@ -130,7 +130,7 @@ the subprocess "cat()" was launched in violated the policy (missing "rpath" vow)
 Naturally, there is a performance overhead. Certain challenges remain, such as the fact
 that being executed in a subprocess, we operate on copies, so handling references
 is not something that has been given much thought. There is also the fact
-that clone()ing from threads opens a can of worms. Hence, exile_launch()
+that clone()ing from threads opens a can of worms, particularly with locks. Hence, exile_launch()
 is best avoided in multi-threaded contexts.
 ## Status
@@ -139,7 +139,7 @@ No release yet, experimental, API is unstable, builds will break on updates of t
 Currently, it's mainly evolving from the needs of my other projects.
 ## Motivation and Background
-exile.h unlocks existing Linux mechanisms to facilite isolation of processes from resources. Limiting the scope of what programs can do helps defending the rest of the system when a process gets under attacker's control (when classic mitigations such as ASLR etc. failed). To this end, OpenBSD has the pledge() and unveil() functions available. Those functions are helpful mitigation mechanisms, but such accessible ways are unfortunately not readily available on Linux. This is where exile.h steps in.
+exile.h unlocks existing Linux mechanisms to facilitate isolation of processes from resources. Limiting the scope of what programs can do helps defending the rest of the system when a process gets under attacker's control (when classic mitigations such as ASLR etc. failed). To this end, OpenBSD has the pledge() and unveil() functions available. Those functions are helpful mitigation mechanisms, but such accessible ways are unfortunately not readily available on Linux. This is where exile.h steps in.
 Seccomp allows restricting the system calls available to a process and thus decrease the systems attack surface, but it generally is not easy to use. Requiring BPF filter instructions, you generally just can't make use of it right away. exile.h provides an API inspired by pledge(), building on top of seccomp. It also provides an interface to manually restrict the system calls that can be issued.
@@ -148,19 +148,9 @@ for daemons and not desktop applications, or are generally rather involved. As a
 Abstracting those details may help developers bring sandboxing into their applications.
-## Example: Archive extraction
-A programming uncompressing archives does not need network access, but should a bug allow code execution, obviously the payload may also access the network. Once the target path is known, it doesn't need access to the whole file system, only write-permissions to the target directory and read on the archive file(s).
-TODO example with exile.h applied on "tar" or "unzip". Link to repo.
-## Example: Web apps
-Those generally don't need access to the whole filesystem hierarchy, nor do they necessarily require the ability to execute other processes.
-Way more examples can be given, but we can put it in simple words: A general purpose OS allow a process to do more things than it actually needs to do.
 ## Features
 - Restricting file system access (using Landlock or Namespaces/chroot as fallback)
-- Systemcall filtering (using seccomp-bpf). An interface inspired by OpenBSD's pledge() is available, removing the need to specifc rules for syscalls.
+- Systemcall filtering (using seccomp-bpf). An interface inspired by OpenBSD's pledge() is available
 - Dropping privileges in general, such as capabilities
 - Isolating the application from the network, etc. through Namespaces
 - Helpers to isolate single functions
@@ -197,7 +187,7 @@ While mostly transparent to users of this API, kernel >= 5.13 is required to tak
 ## FAQ
-### Does the process need to be priviliged to utilize the library?
+### Does the process need to be privileged to utilize the library?
 No.
@@ -215,6 +205,9 @@ Outdated:
 - cgit sandboxed: https://gitea.quitesimple.org/crtxcr/cgitsb
 - qpdfviewsb sandboxed (quick and dirty): https://gitea.quitesimple.org/crtxcr/qpdfviewsb
+### Other projects
+- [sandbox2](https://developers.google.com/code-sandboxing/sandbox2/)
 ### Contributing

12
exile.c

@@ -280,7 +280,7 @@ static struct syscall_vow_map exile_vow_map[] =
 	{EXILE_SYS(copy_file_range), EXILE_SYSCALL_VOW_STDIO},
 	{EXILE_SYS(statx), EXILE_SYSCALL_VOW_RPATH},
 	{EXILE_SYS(rseq), EXILE_SYSCALL_VOW_THREAD},
-	{EXILE_SYS(clone3), EXILE_SYSCALL_VOW_CLONE},
+	{EXILE_SYS(clone3), EXILE_SYSCALL_VOW_CLONE|EXILE_SYSCALL_VOW_THREAD},
 	{EXILE_SYS(close_range), EXILE_SYSCALL_VOW_STDIO},
 	{EXILE_SYS(openat2), EXILE_SYSCALL_VOW_RPATH|EXILE_SYSCALL_VOW_WPATH},
 	{EXILE_SYS(faccessat2), EXILE_SYSCALL_VOW_RPATH},
@@ -521,7 +521,7 @@ int get_vow_argfilter(long syscall, uint64_t vow_promises, struct sock_filter *f
 		current_count = COUNT_EXILE_SYSCALL_FILTER(open_filter);
 		break;
 	case EXILE_SYS(openat2):
-		*policy = EXILE_SYSCALL_DENY_RET_ERROR;
+		*policy = EXILE_SYSCALL_DENY_RET_NOSYS;
 		return 0;
 		break;
 	case EXILE_SYS(socket):
@@ -539,7 +539,7 @@ int get_vow_argfilter(long syscall, uint64_t vow_promises, struct sock_filter *f
 	case EXILE_SYS(clone3):
 		if((vow_promises & EXILE_SYSCALL_VOW_CLONE) == 0)
 		{
-			*policy = EXILE_SYSCALL_DENY_RET_ERROR;
+			*policy = EXILE_SYSCALL_DENY_RET_NOSYS;
 			return 0;
 		}
 		break;
@@ -1075,6 +1075,10 @@ static struct sock_filter *append_syscall_to_bpf(struct exile_syscall_policy *sy
 	{
 		action = SECCOMP_RET_ERRNO|EACCES;
 	}
+	if(action == EXILE_SYSCALL_DENY_RET_NOSYS)
+	{
+		action = SECCOMP_RET_ERRNO|ENOSYS;
+	}
 	long syscall = syscallpolicy->syscall;
 	struct sock_filter syscall_load = BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr));
@@ -1141,7 +1145,7 @@ static struct sock_filter *append_syscall_to_bpf(struct exile_syscall_policy *sy
 static int is_valid_syscall_policy(unsigned int policy)
 {
-	return policy == EXILE_SYSCALL_ALLOW || policy == EXILE_SYSCALL_DENY_RET_ERROR || policy == EXILE_SYSCALL_DENY_KILL_PROCESS;
+	return policy == EXILE_SYSCALL_ALLOW || policy == EXILE_SYSCALL_DENY_RET_ERROR || policy == EXILE_SYSCALL_DENY_KILL_PROCESS || policy == EXILE_SYSCALL_DENY_RET_NOSYS;
 }
 /*

2
exile.h

@@ -75,6 +75,7 @@
 #define EXILE_UNSHARE_NETWORK 1<<1
 #define EXILE_UNSHARE_USER 1<<2
 #define EXILE_UNSHARE_MOUNT 1<<3
+#define EXILE_UNSHARE_AUTOMATIC 1<<4
 #ifndef EXILE_LOG_ERROR
 #define EXILE_LOG_ERROR(...) do { fprintf(stderr, "exile.h: %s(): Error: ", __func__); fprintf(stderr, __VA_ARGS__); } while(0)
@@ -273,6 +274,7 @@ struct exile_path_policy
 #define EXILE_SYSCALL_ALLOW 1
 #define EXILE_SYSCALL_DENY_KILL_PROCESS 2
 #define EXILE_SYSCALL_DENY_RET_ERROR 3
+#define EXILE_SYSCALL_DENY_RET_NOSYS 4
 #define EXILE_BPF_NOP \
 	BPF_STMT(BPF_JMP+BPF_JA,0)
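How the four policy constants could translate into seccomp return values is sketched below. Only two mappings are visible in this diff (SECCOMP_RET_ERRNO|EACCES, and SECCOMP_RET_ERRNO|ENOSYS for the new constant). The mappings for ALLOW and KILL_PROCESS, the fail-closed default, and the helper name `policy_to_seccomp_ret` are assumptions made for illustration. The sketch needs the Linux UAPI header `<linux/seccomp.h>` and a C11 compiler.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <linux/seccomp.h>

/* Values copied from the exile.h diff. */
#define EXILE_SYSCALL_ALLOW 1
#define EXILE_SYSCALL_DENY_KILL_PROCESS 2
#define EXILE_SYSCALL_DENY_RET_ERROR 3
#define EXILE_SYSCALL_DENY_RET_NOSYS 4

/* The errno value travels in the low 16 bits of the seccomp return value. */
static_assert(ENOSYS <= SECCOMP_RET_DATA, "errno must fit into SECCOMP_RET_DATA");
static_assert(EACCES <= SECCOMP_RET_DATA, "errno must fit into SECCOMP_RET_DATA");

/* Hypothetical helper: translate a policy constant into a seccomp action. */
static uint32_t policy_to_seccomp_ret(unsigned int policy)
{
    switch (policy) {
    case EXILE_SYSCALL_ALLOW:
        return SECCOMP_RET_ALLOW;               /* assumed mapping */
    case EXILE_SYSCALL_DENY_KILL_PROCESS:
        return SECCOMP_RET_KILL_PROCESS;        /* assumed mapping */
    case EXILE_SYSCALL_DENY_RET_ERROR:
        return SECCOMP_RET_ERRNO | (uint32_t)EACCES;
    case EXILE_SYSCALL_DENY_RET_NOSYS:
        return SECCOMP_RET_ERRNO | (uint32_t)ENOSYS;
    default:
        return SECCOMP_RET_KILL_PROCESS;        /* assumption: fail closed */
    }
}
```

From user space, a call filtered with SECCOMP_RET_ERRNO|ENOSYS returns -1 with errno set to ENOSYS, which is indistinguishable from a kernel that does not implement the system call at all.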

19
test.c

@@ -643,6 +643,24 @@ int test_vows_from_str()
 	return 0;
 }
+int test_clone3_nosys()
+{
+	struct exile_policy *policy = exile_init_policy();
+	policy->vow_promises = exile_vows_from_str("stdio rpath wpath cpath thread error");
+	exile_enable_policy(policy);
+	/* While args are invalid, it should never reach clone3 syscall handler, so it's irrelevant for
+	our test*/
+	long ret = syscall(__NR_clone3, NULL, 0);
+	if(ret == -1 && errno != ENOSYS)
+	{
+		LOG("clone3() was not allowed but did not return ENOSYS. It returned: %li, errno: %i\n", ret, errno);
+		return 1;
+	}
+	return 0;
+}
 struct dispatcher
 {
 	char *name;
@@ -670,6 +688,7 @@ struct dispatcher dispatchers[] = {
 	{ "launch", &test_launch},
 	{ "launch-get", &test_launch_get},
 	{ "vow_from_str", &test_vows_from_str},
+	{ "clone3_nosys", &test_clone3_nosys},
 };
 int main(int argc, char *argv[])