permission oddity in fs/proc/proc_sysctl.c

Discuss and suggest new grsecurity features

permission oddity in fs/proc/proc_sysctl.c

Postby khaos » Thu Jun 18, 2015 2:47 pm

Hello @all,

while I was diagnosing a SELinux avc denial today, I dug deeper into the kernel sysctl code to understand what I was seeing since it wasn't making any sense.

I stumbled over the following added bit in fs/proc/proc_sysctl.c which seems very odd to me:

Code: Select all
@@ -501,6 +513,27 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
    if (!table->proc_handler)
       goto out;
+   error = -EPERM;
+   if (gr_handle_chroot_sysctl(op))
+      goto out;
+   dget(filp->f_path.dentry);
+   if (gr_handle_sysctl_mod(filp->f_path.dentry->d_parent->, table->procname, op)) {
+      dput(filp->f_path.dentry);
+      goto out;
+   }
+   dput(filp->f_path.dentry);
+   if (!gr_acl_handle_open(filp->f_path.dentry, filp->f_path.mnt, op))
+      goto out;
+   if (write) {
+      if (current->nsproxy->net_ns != table->extra2) {
+         if (!capable(CAP_SYS_ADMIN))
+            goto out;
+      } else if (!ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_ADMIN))
+         goto out;
+   }

The part that stands out is:

Code: Select all
   if (write) {
      if (current->nsproxy->net_ns != table->extra2) {
         if (!capable(CAP_SYS_ADMIN))
            goto out;
      } else if (!ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_ADMIN))
         goto out;

What is odd, this part of the code should actually not make any assumptions about what extra1 or extra2 are about since they are defined by the handlers themselves. And here it is apparently assumed that extra2 can contain the network namespace (net_ns is a pointer to a struct). Since proc_sys_call_handler is the main entry point for all calls and since I haven't been able to find any indication that any code is actually putting a pointer to net_ns in extra2, this, imho, effectively means all modifications require the CAP_SYS_ADMIN capability, no matter what. This obviously cannot be what the author of this code intended since that could have been achieved with less code. :-)

For my problem that means: A daemon (running as root w/o CAP_SYS_ADMIN but w/ CAP_NET_ADMIN) cannot change /proc/sys/net/core/xfrm_acq_expires even though it should.

Is this intended behavior? If yes, could someone point me to the right location where extra2 is set accordingly so that it makes sense in this context?

Thanks a lot in advance,
Posts: 1
Joined: Thu Jun 18, 2015 2:26 pm

Re: permission oddity in fs/proc/proc_sysctl.c

Postby spender » Mon Jun 22, 2015 7:00 pm

Commit de322e5dc505ccfc15428aebd72dfad5d84b9a02
Author: Brad Spengler <>
Date: Sun Nov 10 13:54:25 2013 -0500

Compatibility fix for LXC:
Don't require CAP_SYS_ADMIN to modify our own net namespace's sysctl values,
use a CAP_NET_ADMIN check within the user namespace of the process performing the modification
CAP_SYS_ADMIN is still required for any other sysctl modification, including modification
of sysctls of a net namespace other than our own

This allows for LXC containers to not need CAP_SYS_ADMIN to be able to set up their namespace's

Thanks to ncopa from IRC for testing

See code in net/ipv4/devinet.c, for instance. It mostly applies to /proc/sys/net/ipv[4|6]/conf/*, but not to your example path, where CAP_SYS_ADMIN would still be required.

Posts: 2185
Joined: Wed Feb 20, 2002 8:00 pm

Return to grsecurity development