
repeating kernel panic - is this grsec or an upstream issue?

PostPosted: Mon Oct 27, 2014 11:52 am
by jowr
(I wrote this once, then the fucking forum ate it. phpBB is my mortal enemy.)

Since the netfilter folks seem a bit hostile to things grsecurity-related (see https://twitter.com/grsecurity/status/5 ... 9110685696), I'm running this by you guys.

Basically, I'm not a kernel developer. I can read the kernel panic and see what syscall is causing shit to go down, and can plainly see how it's happening across two entirely different kernel builds. My expectation is that this is an actual bug, either in the netfilter code itself or in something grsecurity is doing in conjunction with it.

For background, this runs as a tor exit node which quite happily pushes 20-50 thousand packets per second. The firewall rules are reasonably simple, and I am only invoking the conntrack module twice:

# iptables-save | grep conn
-A INPUT -m comment --comment "001-v4 drop invalid traffic" -m conntrack --ctstate INVALID -j DROP
-A INPUT -m comment --comment "990-v4 accept existing connections" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

Unfortunately, something on the order of one in a billion packets is murdering this server. And it's happening every 12 hours or so, which is "annoying".

This panic is happening on 3.17.1, and also on 3.16.5, but for brevity's sake I'm only pasting the former. I'd file a bug report with Gentoo hardened, but this feels a bit out of their depth at the moment.

Note:

* Ignore the xt_* modules. They are not in use and the panics predate them. Besides, I haven't yet made the CHAOS target behave the way I want.
* netconsole produces staggered output, but at least it works. Yes, you can trap kernel panics remotely with netconsole! (Setup sketch below this list.)
* Kernel config Cliffs Notes: grsec automatic config, usage type server, no virtualization, security priority. SELinux is in use but permissive due to "tuning issues".
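
For reference, the netconsole setup here is nothing exotic; it's roughly the stock module aimed at a box running a UDP listener. Something like this (addresses and MAC obviously made up):

Code: Select all
# on the receiving box: listen for the UDP log stream
# (or your netcat's equivalent)
nc -u -l 6666

# on this box: load netconsole aimed at the receiver
# format: netconsole=src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac
modprobe netconsole netconsole=6665@192.0.2.10/eth0,6666@192.0.2.20/00:11:22:33:44:55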

Now, I'm more than happy to participate in squishing this and will provide whatever is needed. I just need a solid push in the right direction.

Oct 27 09:52:53 REDACTED [23041.341354] general protection fault: 0000 [#4] SMP
Oct 27 09:52:53 REDACTED [23041.341413] Modules linked in: xt_DELUDE(O) xt_CHAOS(O) xt_TARPIT(O)
Oct 27 09:52:53 REDACTED [23041.341476] CPU: 6 PID: 3052 Comm: tor Tainted: G D O 3.17.1-hardened #1
Oct 27 09:52:53 REDACTED [23041.341538] Hardware name: Supermicro A1SA2-2750F/A1SA2-2750F, BIOS 1.0a 07/14/2014
Oct 27 09:52:53 REDACTED [23041.341600] task: ffff880276ed6b10 ti: ffff880276ed6f60 task.ti: ffff880276ed6f60
Oct 27 09:52:53 REDACTED [23041.341660] RIP: 0010:[<ffffffff814b58ce>]  [<ffffffff814b58ce>] __nf_conntrack_find_get+0x6e/0x290
Oct 27 09:52:53 REDACTED [23041.341732] RSP: 0018:ffffc90006073930 EFLAGS: 00010246
Oct 27 09:52:53 REDACTED [23041.341770] RAX: 0000000000014230 RBX: fefefefefefefefe RCX: 0000000000014a70
Oct 27 09:52:53 REDACTED [23041.341811] RDX: 000000000000294e RSI: 00000000000266e2 RDI: 00000000fefefefe
Oct 27 09:52:53 REDACTED [23041.341852] RBP: ffffc90006073958 R08: 0000000073a1bccf R09: 00000000bd127271
Oct 27 09:52:53 REDACTED [23041.341894] R10: ffffc900060739c0 R11: ffff880273943f08 R12: ffffc900060739a8
Oct 27 09:52:53 REDACTED [23041.341935] R13: 0000000000000000 R14: 00000000a538c88a R15: ffffffff81a7e240
Oct 27 09:52:53 REDACTED [23041.341976] FS: 0000031cb9d65700(0000) GS:ffff88027fd80000(0000) knlGS:0000000000000000
Oct 27 09:52:53 REDACTED [23041.342037] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 09:52:53 REDACTED [23041.342077] CR2: 000002faa3f38000 CR3: 0000000001654000 CR4: 00000000001007f0
Oct 27 09:52:53 REDACTED [23041.342117] Stack:
Oct 27 09:52:53 REDACTED [23041.342147] ffff880211a540e0
Oct 27 09:52:53 REDACTED ffffffff81a7e240
Oct 27 09:52:53 REDACTED 0000000000000014
Oct 27 09:52:53 REDACTED ffffffff81a9e660
Oct 27 09:52:53 REDACTED [23041.342225] 0000000000000000
Oct 27 09:52:53 REDACTED
Oct 27 09:52:53 REDACTED ffffc90006073a28
Oct 27 09:52:53 REDACTED ffffffff814b6d2c
Oct 27 09:52:53 REDACTED ffffffff81a9e660
Oct 27 09:52:53 REDACTED [23041.342303] ffffffff81a904a0
Oct 27 09:52:53 REDACTED
Oct 27 09:52:53 REDACTED ffff880079cabc4c
Oct 27 09:52:53 REDACTED ffff8802a538c88a
Oct 27 09:52:53 REDACTED ffffffff81a904a0
Oct 27 09:52:53 REDACTED
Oct 27 09:52:53 REDACTED [23041.342380] Call Trace:
Oct 27 09:52:53 REDACTED [23041.342416] [<ffffffff814b6d2c>] nf_conntrack_in+0x1fc/0x990
Oct 27 09:52:53 REDACTED [23041.342459] [<ffffffff8158bcab>] ipv4_conntrack_local+0x4b/0x50
Oct 27 09:52:53 REDACTED [23041.342501] [<ffffffff814ae7f8>] nf_iterate+0xa8/0xc0
Oct 27 09:52:53 REDACTED [23041.342543] [<ffffffff8152ffe0>] ? ip_forward_options+0x1f0/0x1f0
Oct 27 09:52:53 REDACTED [23041.342585] [<ffffffff814ae885>] nf_hook_slow+0x75/0x120
Oct 27 09:52:53 REDACTED [23041.342625] [<ffffffff8152ffe0>] ? ip_forward_options+0x1f0/0x1f0
Oct 27 09:52:53 REDACTED [23041.342667] [<ffffffff81532503>] __ip_local_out+0xa3/0xb0
Oct 27 09:52:53 REDACTED [23041.342708] [<ffffffff81532525>] ip_local_out_sk+0x15/0x50
Oct 27 09:52:53 REDACTED [23041.342749] [<ffffffff815328cf>] ip_queue_xmit+0x14f/0x400
Oct 27 09:52:53 REDACTED [23041.342791] [<ffffffff8154b99b>] tcp_transmit_skb+0x48b/0x930
Oct 27 09:52:53 REDACTED [23041.342832] [<ffffffff8154bf82>] tcp_write_xmit+0x142/0xd10
Oct 27 09:52:53 REDACTED [23041.342873] [<ffffffff8154cdb9>] __tcp_push_pending_frames+0x29/0x90
Oct 27 09:52:53 REDACTED [23041.342915] [<ffffffff8153b737>] tcp_push+0xe7/0x120
Oct 27 09:52:53 REDACTED [23041.342954] [<ffffffff8153d027>] tcp_sendmsg+0x107/0x11d0
Oct 27 09:52:53 REDACTED [23041.342995] [<ffffffff8126e1ce>] ? selinux_socket_sendmsg+0x1e/0x30
Oct 27 09:52:53 REDACTED [23041.343037] [<ffffffff8126dbc3>] ? avc_has_perm+0xa3/0x190
Oct 27 09:52:53 REDACTED [23041.343079] [<ffffffff8142b02f>] ? sock_sendmsg+0x9f/0xd0
Oct 27 09:52:53 REDACTED [23041.343120] [<ffffffff8156955e>] inet_sendmsg+0x6e/0xc0
Oct 27 09:52:53 REDACTED [23041.343160] [<ffffffff8126e1ce>] ? selinux_socket_sendmsg+0x1e/0x30
Oct 27 09:52:53 REDACTED [23041.343203] [<ffffffff81429d38>] sock_aio_write+0x118/0x150
Oct 27 09:52:53 REDACTED [23041.343243] [<ffffffff8126fd72>] ? inode_has_perm.isra.28+0x22/0x40
Oct 27 09:52:53 REDACTED [23041.343285] [<ffffffff8126febe>] ? file_has_perm+0x8e/0x90
Oct 27 09:52:53 REDACTED [23041.343327] [<ffffffff81186fd3>] do_sync_write+0x63/0x90
Oct 27 09:52:53 REDACTED [23041.343367] [<ffffffff81187ee2>] vfs_write+0x242/0x2b0
Oct 27 09:52:53 REDACTED [23041.343407] [<ffffffff81188a47>] SyS_write+0x47/0xb0
Oct 27 09:52:53 REDACTED [23041.343448] [<ffffffff81632dfe>] system_call_fastpath+0x16/0x1b
Oct 27 09:52:53 REDACTED [23041.343487] Code: 00 00 48 8b 18 f6 c3 01 74 21 e9 56 01 00 00 66 0f 1f 44 00 00 49 8b 87 58 0d 00 00 65 ff 00 48 8b 1b f6 c3 01 0f 85 3a 01 00 00 <0f> b6 43 37 8b 7b 10 41 39 3c 24 75 dd 8b 73 14 41 39 74 24 04
Oct 27 09:52:53 REDACTED [23041.343964] RIP  [<ffffffff814b58ce>] __nf_conntrack_find_get+0x6e/0x290
Oct 27 09:52:53 REDACTED [23041.344011] RSP <ffffc90006073930>
Oct 27 09:52:53 REDACTED [23041.344609] ---[ end trace 874c3cf41b00aa37 ]---
Oct 27 09:52:53 REDACTED [23041.344717] Kernel panic - not syncing: Fatal exception in interrupt
Oct 27 09:52:53 REDACTED [23041.344832] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
Oct 27 09:52:53 REDACTED [23041.344965] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Mon Oct 27, 2014 12:46 pm
by PaX Team
this looks like another use-after-free bug (rbx was dereferenced, and it holds the pattern used by SANITIZE), but i'm afraid we won't be able to figure it out either. someone with deep knowledge of how rcu locking is supposed to work in conntrack would have to look at it (the triggering function seems to have the proper locking, so it's probably some other user of conntrack data that is missing some rcu locking). you could also run "addr2line -e vmlinux -fip ffffffff814b58ce" on your kernel to get better location info for the actual triggering code line (best is if you build your kernel with CONFIG_DEBUG_INFO and CONFIG_DEBUG_INFO_REDUCED).

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Mon Oct 27, 2014 1:39 pm
by jowr
Thanks for the quick reply.

On the affected:

Code: Select all
# addr2line -e vmlinux -fip ffffffff814b58ce
nf_ct_tuplehash_to_ctrack at /usr/src/linux/include/net/netfilter/nf_conntrack.h:122
 (inlined by) nf_ct_key_equal at /usr/src/linux/net/netfilter/nf_conntrack_core.c:393
 (inlined by) ____nf_conntrack_find at /usr/src/linux/net/netfilter/nf_conntrack_core.c:422
 (inlined by) __nf_conntrack_find_get at /usr/src/linux/net/netfilter/nf_conntrack_core.c:453


Offending code @ /usr/src/linux/include/net/netfilter/nf_conntrack.h:122 (the hash->tuple.dst.dir dereference; with RBX holding the 0xfefefefefefefefe poison, "hash" itself pointed into an already freed-and-sanitized object):
Code: Select all
static inline struct nf_conn *
nf_ct_tuplehash_to_ctrack(const struct nf_conntrack_tuple_hash *hash)
{
        return container_of(hash, struct nf_conn,
                            tuplehash[hash->tuple.dst.dir]);
}


The option CONFIG_DEBUG_INFO is already set, and it appears that DEBUG_INFO_REDUCED only reduces (imagine that) how much information gets emitted; I wouldn't have gotten any addr2line output otherwise, if I'm understanding the debugging properly.

Would any of the RCU debugging options (or anything else for that matter) in the kernel help isolate this thing further?
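
For reference, these are the knobs I'm eyeing (names from the 3.17 Kconfig, assuming I'm reading it right):

Code: Select all
# lockdep-based RCU usage checking:
CONFIG_PROVE_LOCKING=y            # also turns on CONFIG_PROVE_RCU
# catch call_rcu() on rcu_heads inside already-freed memory:
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
# vanilla slab poisoning can also be had at boot time with: slub_debug=P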

I don't mind going on a bughunt. It's just that my kernel debugging skills are very limited.

Where next with this?

Mostly I am trying to suss out whether this is a grsecurity bug, a Gentoo bug (a small amount of the Gentoo hardened patchset is Gentoo-specific and not just grsecurity), or an issue with the base kernel itself. Or some bizarre interaction between a combination thereof.

If this is a netfilter issue, or at least convincingly not a grsecurity issue, what would be my best way to proceed?

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Tue Oct 28, 2014 12:33 pm
by jowr
This is a grsecurity issue.

A vanilla 3.17.1 doesn't exhibit this issue with equivalent traffic loads over nearly 24h, where I couldn't get 12h before.

I still very badly want to run grsecurity, so I'll probably spend the next week slowly enabling feature sets, starting with a patched kernel but with nothing new enabled, to isolate this bullshit.

My current hunch says this is some PaX memory protection thing that's going haywire.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Tue Oct 28, 2014 3:43 pm
by PaX Team
what you are seeing is the SANITIZE feature catching a use-after-free bug which must exist in the vanilla kernel already, as we don't touch conntrack code beyond some trivial constification related changes. if you ported SANITIZE to vanilla, or enabled slab debugging with the proper magic values (the free poison value must have its LSB set to 0 to trigger this particular bug, and we were (un)lucky to choose such a value for our poison), then i'm sure you'd trigger it on vanilla too. if you want to keep SANITIZE but sweep this particular bug under the carpet, you can add SLAB_NO_SANITIZE to the kmem_cache_create call in net/netfilter/nf_conntrack_core.c:nf_conntrack_init_net.
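
i.e. something along these lines (a sketch from memory of the 3.17 tree, the exact arguments in your sources may differ slightly):

Code: Select all
/* net/netfilter/nf_conntrack_core.c, in nf_conntrack_init_net(): */
net->ct.nf_conntrack_cachep = kmem_cache_create(net->ct.slabname,
					sizeof(struct nf_conn), 0,
					SLAB_DESTROY_BY_RCU | SLAB_NO_SANITIZE,
					NULL);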

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Tue Oct 28, 2014 4:56 pm
by minipli
Can you please try the following patch on top of a v3.17.1 vanilla kernel and see if it triggers the panic again?

http://r00tworld.net/~minipli/grsec/v3.17-pax_slab_sanitize_only.diff

If so, there's a good chance this can be reported upstream as a use-after-free bug, referring to the patch, which essentially just does a memset(0xfe) on every kmem_cache_free(ptr). After that function has been called, no code should access *ptr any more, but apparently some code does, and triggers the #GP you're seeing.
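
Conceptually, what the patch does on every slab free boils down to this (a simplified illustration, not the literal patch, which hooks the allocator internals):

Code: Select all
/* simplified: a sanitizing free poisons the object before releasing it */
static void sanitizing_free(struct kmem_cache *cachep, void *obj)
{
	memset(obj, 0xfe, cachep->object_size);	/* fill with the poison pattern */
	kmem_cache_free(cachep, obj);		/* then actually free it */
}
/* any later dereference through a stale pointer now chases
 * 0xfefefefefefefefe and takes a #GP, as in the trace above */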

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Wed Oct 29, 2014 4:00 pm
by jowr
Ah, OK, thanks for the explanation. The SANITIZE sidebar makes sense now.

The feature isn't critically important, but it's nice to have; other stuff is far more important. Still, squishing this would be best, especially since it has revealed an issue in the vanilla kernel. Kernel bugs in networking code make me nervous.

Just for clarity, this is the build plan for that patch, for reproducibility's sake:

* 3.17.1 from kernel.org
* https://grsecurity.net/test/grsecurity- ... 1754.patch
* minipli's patch

Once that's done, I'll use the previous kernel config (including the memory sanitization feature) and see if the issue recurs.

Sound good?

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Wed Oct 29, 2014 4:31 pm
by PaX Team
minipli's patch is against vanilla, not grsec, and it unconditionally enables the extracted portion of SANITIZE; you don't need to configure anything special for it (a make oldconfig on the grsec config should produce a good config for the vanilla kernel).
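
i.e. roughly this (paths are just examples):

Code: Select all
$ tar xf linux-3.17.1.tar.xz && cd linux-3.17.1
$ patch -p1 < ../v3.17-pax_slab_sanitize_only.diff
$ cp /path/to/your/grsec/.config .config
$ make oldconfig    # the grsec-only options simply drop out
$ make -j8 bzImage modules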

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Wed Oct 29, 2014 5:00 pm
by jowr
Ah, I misunderstood.

That's why I asked! I'll do that and report tomorrow or thereabouts.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Thu Oct 30, 2014 12:09 pm
by jowr
Patch applied and running. With the packet workload this machine gets, it takes 2-12h to get a panic. I'll update if/when it dies, including the ever-lovable crash output.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Thu Oct 30, 2014 2:48 pm
by jowr
We have a winner; it only took like two hours to panic.

https://imgur.com/4li7ePm

What I find interesting is that you don't always get netconsole output. This panic, for example, did not output via netconsole. But I did get other trash messages before the panic, indicating the logging method worked.

I wonder why.

Anyway, issue confirmed. What's next?

I'm going back to the grsec kernel with sanitize disabled in the short term, but getting this squished would be optimal.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Thu Oct 30, 2014 3:10 pm
by PaX Team
a full log with register content would still be nice but it's the same instruction that triggers so i'm fairly sure it's the same bug. as for what's next, it's clearly an upstream problem so you have the honour to report it on lkml/netdev and get it fixed ;). probably confer with minipli as well since it's his code in SANITIZE that found this one and the other one somewhere in UDP.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Thu Oct 30, 2014 3:18 pm
by jowr
I know you'd like better output, but how ARE you supposed to get full output in situations like this?

I've battled this for years! I've gone through the sysrq functions and there isn't anything I can use to "scroll up" in the buffer. The only option seems to be some form of directed console output (serial, network), which as you can see is a bit finicky in the latter case. I mean, I can enable a "bigger" framebuffer device so I can screenshot more output, but that's a halfassed solution at best. I've worked in Linux administration for years and I've yet to see a good solution to this; netconsole for panic tracking is something I've never seen anyone else do.
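
(For completeness, the other directed-output option I know of is a serial console on the kernel command line, something like:

Code: Select all
# log to both the local VGA console and the first serial port
console=tty0 console=ttyS0,115200n8

...but that assumes you have somewhere for the serial line to go, which most of my machines don't.)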

I'll work on a kernel bug report later.

Though I'm sitting here wondering why I'm the one who found this, because my kernel use cases are almost always bog standard and well traveled. Probably the unique combination of grsec, the memory sanitization option (I specifically enabled it), huge packet throughput, and netfilter connection tracking.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Thu Oct 30, 2014 3:23 pm
by minipli
As pipacs said, next would be to report this to netdev@vger.kernel.org. You should mention the above patch, which sanitizes freed slab objects with the pattern '\xfe' that can be seen in the registers when the panic happens -- assuming you can trigger the panic again with a full backtrace and register dump. Given how readily it has happened in the past, that should be very likely.

Feel free to cc me (minipli@googlemail.com) when reporting this upstream.

It's definitely a use-after-free bug in the netfilter code.

Re: repeating kernel panic - is this grsec or an upstream is

PostPosted: Fri Oct 31, 2014 11:38 am
by jowr