Linux security in 2011, or LKML’s yearly digest
<aside>
Disclaimer: I have nothing to do with the following; all credit goes to the respective authors.
I'm just publishing my 2011 bookmarks about Linux kernel security, each with a one-line summary based
on my (possibly wrong) understanding.
Do not hesitate to correct me (gently if possible :)) in the comments or by mail.
</aside>
- False boundaries of (certain) capabilities: Brad Spengler describes 19 capabilities (out of 35) that can be used to regain full privileges. Coincidentally, Vasiliy Kulikov discovered a “funny” behavior of CAP_NET_ADMIN, which allows loading any module available in /lib/modules/ instead of only network-related ones. AFAIK, this vulnerability was closed, but the fix was reverted a few weeks later because it broke some userspace programs.
- The PaX team introduced a new range of protections built on the new GCC plugin infrastructure. At compile time, proactive code is automatically added to potentially dangerous paths:
  - constify_plugin.c introduces new annotations (__do_const and __no_const) enforcing read-only permissions at compile time and at run time. PaX then makes use of these annotations by patching most of the “ops structures”. The plugin also automatically protects structures whose members are all function pointers; this on-the-fly patching is needed because the equivalent changes to the kernel source would never be integrated upstream.
  - stackleak_plugin.c adds instrumentation code before alloca() calls. This code checks that the resulting stack frame stays within the task's kernel stack. It defeats the techniques described in "Large memory management vulnerabilities" by Gaël Delalleau (2005) and "The stack is back" by Jon Oberheide (2012).
  - GCC 4.6 introduced named address spaces. The feature was initially specified for embedded processors, but the PaX team uses it to represent user and kernel space: checker_plugin.c thus introduces the __user, __kernel and __iomem address spaces to spot illegitimate flows between them.
  - kallocstat_plugin.c produces statistics about the sizes passed to the various memory allocation functions.
  - kernexec_plugin.c enforces non-executable pages like the KERNEXEC PaX feature, but without a huge performance impact on AMD64.
- pagexec also managed to compile the Linux kernel with clang by patching both Linux and clang. Now that GCC supports plugins, this is less interesting, but LLVM used to be the only compiler with easy access to its internal structures, allowing external applications to perform static analysis...
- A user-space interface to the kernel Crypto API was submitted to kernel developers. An interesting use case is to keep key material away from the process that does the actual work. Imagine a process A in possession of private keys and another one, B, actually performing the encryption/decryption. The idea is to initialize a “crypto socket” in A and pass this file descriptor to B (via a classic ancillary message).
- Pseudo-files in /proc/<pid>/ have a different security model than “normal” files because of their ephemeral nature: checks need to happen during each system call and not only at open() time, because permissions can change at any time. Halfdog discovered (and Kees Cook reported to LKML) that not all files were protected accordingly: if a program opens /proc/self/auxv and keeps the file descriptor open, then even after an execve() of a setuid binary the descriptor is still usable, leaking information! Fixing this vulnerability has been a long road, and a pretty solution came up with the introduction of revoke(), a new syscall invalidating file descriptors. Unfortunately, the thread didn’t survive and the ideas were lost... (By the way, it is funny that this kind of problem resurfaced lately as CVE-2012-0056...)
- Over time, execve() became almost magical: it has to support Set-User-ID, capabilities and file capabilities. Each feature added complexity and different legacy behaviors to maintain. Instead of dropping these POSIX features, Openwall (Owl 3.0) took a different approach by removing SUID binaries from its base install, thus avoiding execve’s voodoo. This change is just one line in Owl’s changelog but is in fact a major achievement: it required re-architecting important software such as crontab or the user management tools.
- /bin/ping is setuid-root because it opens a raw socket and injects its packets on the wire directly. A new kind of socket, a datagram socket using IPPROTO_ICMP (the “ping socket”), was developed by the Openwall team. It makes it possible to send ICMP Echo messages without special privileges (the caller’s GID has to fall within a range stored in a sysctl key). It is interesting to note that only the matching replies (based on the ICMP identifier field) are delivered to userspace, not the whole ICMP traffic as in Mac OS X.
- The TCP Initial Sequence Number (ISN) is now a 32-bit random number computed with MD5. The ISN was previously the concatenation of 24 random bits (MD4 of the TCP endpoints with a secret rekeyed every 5 minutes) and an 8-bit counter (the number of times the secret key had been regenerated).
- Vasiliy tried to push additional checks for copy_{to,from}_user() upstream (checking that the requested size fits within bounds known at compile time). This patch was a cut-down version of PAX_USERCOPY but was NACKed by Linus, who asked for more “balance and sanity”. However, he didn’t reject the idea itself, saying that a cleaner version might be accepted...
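
About the “crypto socket” item: the “classic ancillary message” used to hand a descriptor from A to B is SCM_RIGHTS over a Unix socket. Here is a minimal sketch of that hand-off in Python. It is only an illustration under assumptions: both “processes” run in one interpreter, and a plain pipe stands in for the crypto socket; `send_fd`, `recv_fd` and `demo` are hypothetical helper names.

```python
import array
import os
import socket


def send_fd(channel: socket.socket, fd: int) -> None:
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # A one-byte payload travels alongside the ancillary message.
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel: socket.socket) -> int:
    """Receive one file descriptor sent by send_fd()."""
    fds = array.array("i")
    _data, ancdata, _flags, _addr = channel.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo() -> bytes:
    """A and B are simulated in a single process for brevity."""
    a_side, b_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Assumption: a pipe stands in for the "crypto socket" owned by A.
    read_end, write_end = os.pipe()
    try:
        send_fd(a_side, write_end)     # A passes its descriptor to B
        received = recv_fd(b_side)     # B gets its own descriptor number
        os.write(received, b"hello")   # B uses the resource it was handed
        os.close(received)
        return os.read(read_end, 5)
    finally:
        os.close(read_end)
        os.close(write_end)
        a_side.close()
        b_side.close()
```

The point of the design is that B only ever holds a handle: whatever state sits behind the descriptor (the keys, in the crypto case) stays out of B's address space.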
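
About the /proc/<pid>/ item: the difference between checking at open() time and checking on every call can be shown with a toy model. This is not kernel code, just a sketch under assumptions: `Task`, `OpenTimeChecked`, `PerCallChecked` and `may_read` are hypothetical names, and the permission rule (same uid or root) is deliberately simplistic.

```python
class Task:
    """Toy process: an owner uid and some per-process data (think auxv)."""

    def __init__(self, uid: int, auxv: bytes):
        self.uid = uid
        self.auxv = auxv

    def execve_setuid(self, new_uid: int, new_auxv: bytes) -> None:
        """Model execve() of a setuid binary: same task, new credentials."""
        self.uid = new_uid
        self.auxv = new_auxv


def may_read(reader_uid: int, task: Task) -> bool:
    # Simplistic rule for the sketch: root or the owner may read.
    return reader_uid == 0 or reader_uid == task.uid


class OpenTimeChecked:
    """Flawed model: permission is verified once, when the file is opened."""

    def __init__(self, task: Task, reader_uid: int):
        if not may_read(reader_uid, task):
            raise PermissionError("denied at open()")
        self.task = task
        self.reader_uid = reader_uid

    def read(self) -> bytes:
        return self.task.auxv


class PerCallChecked(OpenTimeChecked):
    """Expected model: permission is verified again on every read()."""

    def read(self) -> bytes:
        if not may_read(self.reader_uid, self.task):
            raise PermissionError("denied at read()")
        return self.task.auxv
```

With a handle of each kind opened by uid 1000 on its own task, calling `execve_setuid(0, ...)` leaves the first handle readable (the leak) while the second one starts raising `PermissionError`.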
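
About the /bin/ping item: here is a hedged sketch of what an unprivileged ping could look like from userspace. It assumes the ping socket is reached as a datagram socket with IPPROTO_ICMP and that the sysctl in question is net.ipv4.ping_group_range, as in mainline Linux; the helper names are made up for the example, and nothing is actually sent on the wire.

```python
import socket
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for Echo Request, code is 0


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP Echo Request: type, code, checksum, identifier, sequence."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, csum, ident, seq) + payload


def open_ping_socket():
    """Try to open an unprivileged ICMP socket; return None if refused.

    Assumption: on Linux this succeeds only when the caller's GID falls
    within net.ipv4.ping_group_range.
    """
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP)
    except OSError:
        return None
```

A caller would then use `sendto()` with the packet from `build_echo_request()` and read the matching reply with `recvfrom()`, without ever needing a raw socket.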
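
About the TCP ISN item: the two layouts can be compared with a toy model. This is a sketch of the bit layout only, not the kernel's algorithm: MD5 is used as a stand-in for MD4 in the old scheme, the counter is assumed to sit in the top byte, and the input encoding and function names are invented for the example.

```python
import hashlib
import socket
import struct


def _digest32(secret: bytes, src: str, sport: int, dst: str, dport: int) -> int:
    """Hash the connection endpoints with a secret and keep 32 bits."""
    material = (
        secret
        + socket.inet_aton(src)
        + socket.inet_aton(dst)
        + struct.pack("!HH", sport, dport)
    )
    return struct.unpack("!I", hashlib.md5(material).digest()[:4])[0]


def old_style_isn(secret, rekey_count, src, sport, dst, dport) -> int:
    """Old layout: 8-bit rekey counter + 24 hashed bits.

    Assumptions: counter in the top byte, MD5 standing in for MD4.
    """
    hashed = _digest32(secret, src, sport, dst, dport) & 0x00FFFFFF
    return ((rekey_count & 0xFF) << 24) | hashed


def new_style_isn(secret, src, sport, dst, dport) -> int:
    """New layout: all 32 bits come from the keyed hash."""
    return _digest32(secret, src, sport, dst, dport)
```

In the old layout a quarter of the bits are a predictable counter, so only 24 bits depend on the secret; in the new layout all 32 do.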
by Nicolas Bareil (noreply@blogger.com) on 24 January 2012 at 15:16




















