A VM built with dlmopen(3C). Hajime Tazaki (@thehajime). Kernel/VM Tankentai (カーネル/VM探検隊), 2013/12/8

Kernelvm 201312-dlmopen


Page 1: Kernelvm 201312-dlmopen

A VM built with dlmopen(3C)

Hajime Tazaki (@thehajime)

Kernel/VM Tankentai (カーネル/VM探検隊), 2013/12/8

Page 2: Kernelvm 201312-dlmopen

Demo

Page 3: Kernelvm 201312-dlmopen

What is dlmopen?

• dlmopen(3)
• A variant of dlopen
• Separates namespaces by link-map ID (lmid)
• Nobody has touched it since 2006
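To make the lmid idea concrete, here is a minimal sketch that calls glibc's dlmopen through Python's ctypes and loads the same library into two fresh link-map namespaces. It is an illustration under assumptions, not code from the talk: the constant values are the ones glibc defines on Linux, `load_in_new_namespace` is a hypothetical helper name, and libm.so.6 is just a convenient library to load.

```python
import ctypes

# Values from glibc's <dlfcn.h> on Linux (an assumption on other platforms).
LM_ID_NEWLM = -1   # ask the loader to create a fresh link-map namespace
RTLD_NOW = 0x2     # resolve all symbols at load time


def load_in_new_namespace(path):
    """Load `path` into a new link-map namespace; return the handle or None."""
    try:
        dlmopen = ctypes.CDLL(None).dlmopen
    except (AttributeError, OSError, TypeError):
        return None  # dlmopen is not available (for example, a non-glibc libc)
    dlmopen.restype = ctypes.c_void_p
    dlmopen.argtypes = [ctypes.c_long, ctypes.c_char_p, ctypes.c_int]
    return dlmopen(LM_ID_NEWLM, path.encode(), RTLD_NOW)


first = load_in_new_namespace("libm.so.6")
second = load_in_new_namespace("libm.so.6")
print("first handle: ", first)
print("second handle:", second)
```

Two plain dlopen calls on the same library return the same handle. With LM_ID_NEWLM each call creates a separate namespace, so the two handles are expected to differ.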


Page 4: Kernelvm 201312-dlmopen

Why dlmopen?

• I want to bring all of this up as VMs!
• fs.inotify.max_user_instances = 128 (with LXC)
• Load average reaches 30 or so
• Even though all I need is the network part...

Page 5: Kernelvm 201312-dlmopen

Why dlmopen?

• Debugging across nodes...
• With one or two machines it is still manageable
• Running gdb on an application spread over 30 machines is not feasible

Page 6: Kernelvm 201312-dlmopen

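Why dlmopen?

• Load the same ELF binary multiple times with dlmopen
• Use a different lmid (link-map ID) for each load
• But won't global variables collide?
• Save and restore them
• What about information that differs between nodes, such as IP addresses?
• Give each node its own network stack as well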
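The question about colliding globals can be explored with a small experiment. The sketch below loads two private copies of libc with dlmopen and checks whether the hidden global state behind `rand()` is shared between them. It only illustrates namespace isolation in glibc and is not how DCE itself handles globals; `open_private_copy` and `demo` are hypothetical names, and the constant values are assumed from glibc on Linux.

```python
import ctypes

LM_ID_NEWLM = -1   # glibc: create a new link-map namespace (assumed value)
RTLD_NOW = 0x2     # glibc/Linux: resolve symbols immediately (assumed value)


def open_private_copy(path):
    """Return a ctypes wrapper around a private copy of `path`, or None."""
    try:
        dlmopen = ctypes.CDLL(None).dlmopen
    except (AttributeError, OSError, TypeError):
        return None  # dlmopen is not available on this platform
    dlmopen.restype = ctypes.c_void_p
    dlmopen.argtypes = [ctypes.c_long, ctypes.c_char_p, ctypes.c_int]
    handle = dlmopen(LM_ID_NEWLM, path.encode(), RTLD_NOW)
    if not handle:
        return None
    # Wrap the raw handle so symbol lookups go to that namespace's copy.
    return ctypes.CDLL(path, handle=handle)


def demo():
    node_a = open_private_copy("libc.so.6")
    node_b = open_private_copy("libc.so.6")
    if node_a is None or node_b is None:
        return None
    node_a.srand(42)
    a1 = node_a.rand()   # first value of node A's sequence
    node_b.srand(42)     # seeds node B's generator only, if state is separate
    a2 = node_a.rand()   # second value of A's sequence, if A was not reseeded
    b1 = node_b.rand()   # first value of node B's sequence
    return a1, a2, b1


result = demo()
if result is not None:
    a1, a2, b1 = result
    print("generator state is per-namespace:", b1 == a1 and a2 != a1)
```

If the two copies shared one generator, seeding through `node_b` would reset the sequence seen through `node_a`, and `a2` would equal `a1`. With separate namespaces each copy keeps its own state.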

Page 7: Kernelvm 201312-dlmopen

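Direct Code Execution (Japanese reading: でぃれくと こーど えぐじぇきゅーじょん)

• An extension of ns-3 (a network simulator)
• Uses only the networking part of the kernel code
• No file system: a chroot-like directory is assigned instead
• No /proc either (only a sysctl-like interface and sockets of some address families)
• Loadable kernel modules (LKMs) are not usable yet
• struct net_device <=> ns3::NetDevice
• jiffies <=> ns3::Time (virtual clock)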
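The jiffies <=> ns3::Time mapping can be pictured as deriving the kernel tick counter from the simulator's clock instead of from a hardware timer. The following toy sketch shows that idea only. `VirtualClock`, `sim_time_to_jiffies`, and the tick rate HZ = 100 are assumptions made for illustration, not names or values taken from DCE.

```python
HZ = 100                     # assumed tick rate; real kernels configure this
NS_PER_SEC = 1_000_000_000


def sim_time_to_jiffies(sim_time_ns, hz=HZ):
    """Convert simulated time in nanoseconds to a jiffies count."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return sim_time_ns * hz // NS_PER_SEC


class VirtualClock:
    """Toy stand-in for a simulator clock that advances only on events."""

    def __init__(self):
        self.now_ns = 0

    def advance(self, delta_ns):
        self.now_ns += delta_ns

    @property
    def jiffies(self):
        return sim_time_to_jiffies(self.now_ns)


clock = VirtualClock()
clock.advance(1_500_000_000)   # jump 1.5 simulated seconds in one step
print(clock.jiffies)           # 150 with HZ = 100
```

Because the tick count is a pure function of simulated time, it advances exactly as far as the simulator does, regardless of how long the host takes to run the code.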


Page 8: Kernelvm 201312-dlmopen

How does it work?

• main (ns-3 scenario)
• dlmopen a PIE binary
• Override the required syscalls/libcalls (weak_alias)
• Kernel entry points are redirected to liblinux.so

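#!/usr/bin/python

from ns.dce import *
from ns.core import *
from ns.network import *  # NodeContainer lives here in the ns-3 Python bindings

# Create 100 simulated nodes and give each one the Linux network stack.
nodes = NodeContainer()
nodes.Create(100)
dce = DceManagerHelper()
dce.SetNetworkStack("liblinux.so")
dce.Install(nodes)

# Run the unmodified ospfd binary on every node.
app = DceApplicationHelper()
app.SetBinary("ospfd")
app.Install(nodes)

# Simulate 1000 seconds of virtual time.
Simulator.Stop(Seconds(1000.0))
Simulator.Run()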

Page 9: Kernelvm 201312-dlmopen

liblinux.so

• Build the kernel source as a shared library
• Glue code lives in arch/sim

[Figure: DCE architecture diagram. Labeled layers: Application (ip, iptables, quagga); POSIX layer; Kernel layer (ARP, Qdisc, TCP, UDP, DCCP, SCTP, ICMP, IPv4, IPv6, Netlink, Bridging, Netfilter, IPSec, Tunneling); Virtualization Core layer (heap, stack, memory; bottom halves/rcu/timer/interrupt; struct net_device); network simulation core. jiffies/gettimeofday() is synchronized with the simulated clock, and struct net_device is connected to ns3::NetDevice.]

Page 10: Kernelvm 201312-dlmopen

(gdb) bt
#0 rumpcomp_sockin_sendmsg (s=7, msg=0x703010, flags=0, snd=0x7ffffffed178) at buildrump.sh/src/sys/rump/net/lib/libsockin/rumpcomp_user.c:426
#1 0x00007ffff7df8526 in sockin_usrreq (so=so@entry=0x6fedb0, req=req@entry=9, m=0x6cce00, nam=nam@entry=0x0, control=control@entry=0x0, l=<optimized out>) at buildrump.sh/src/sys/rump/net/lib/libsockin/sockin.c:510
#2 0x00007ffff7be4e79 in sosend (so=0x6fedb0, addr=0x0, uio=0x7ffffffed500, top=0x6cce00, control=0x0, flags=0, l=0x700800) at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/uipc_socket.c:1048
#3 0x00007ffff7be7b4c in soo_write (fp=<optimized out>, offset=<optimized out>, uio=0x7ffffffed500, cred=<optimized out>, flags=<optimized out>) at /home/tazaki/gitworks/buildrump.sh/src/lib/librumpnet/../../sys/rump/../kern/sys_socket.c:116
#4 0x00007ffff788f620 in dofilewrite (fd=fd@entry=3, fp=0x6f8e80, buf=0x400e88, nbyte=37, offset=0x6f8e80, flags=flags@entry=1, retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../kern/sys_generic.c:355
#5 0x00007ffff788f72f in sys_write (l=<optimized out>, uap=0x7ffffffed5f0, retval=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../kern/sys_generic.c:323
#6 0x00007ffff78de3cd in sy_call (rval=0x7ffffffed5e0, uap=0x7ffffffed5f0, l=0x700800, sy=<optimized out>) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/../sys/syscallvar.h:61
#7 rump_syscall (num=num@entry=4, data=data@entry=0x7ffffffed5f0, dlen=dlen@entry=24, retval=retval@entry=0x7ffffffed5e0) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:1024
#8 0x00007ffff78d573b in rump___sysimpl_write (fd=<optimized out>, buf=<optimized out>, nbyte=<optimized out>) at /home/tazaki/gitworks/buildrump.sh/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:121
#9 0x0000000000400d08 in main () at webbrowser.c:86
(gdb)

Annotated layers: BSD stack, glue, apps. Target: rump (NetBSD)

Page 11: Kernelvm 201312-dlmopen

(gdb) bt
#0 if_transmit (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/sys/net/if.c:3082
#1 0x0000000000252a57 in ether_output_frame (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100) at ../../bsd/sys/net/if_ethersubr.c:387
#2 0x0000000000252a0a in ether_output (ifp=0xffffc0003fdfa800, m=0xffffc00005bfe100, dst=0xffffc0003e9e8db0, ro=0x2000059102a0) at ../../bsd/sys/net/if_ethersubr.c:356
#3 0x0000000000277982 in ip_output (m=0xffffc00005bfe100, opt=0x0, ro=0x2000059102a0, flags=0, imo=0x0, inp=0xffffc00009ea6400) at ../../bsd/sys/netinet/ip_output.c:612
#4 0x000000000028cb49 in tcp_output (tp=0xffffc00009eafc00) at ../../bsd/sys/netinet/tcp_output.c:1219
#5 0x0000000000296276 in tcp_output_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140) at ../../bsd/sys/netinet/tcp_offload.h:270
#6 0x0000000000296b25 in tcp_usr_connect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at ../../bsd/sys/netinet/tcp_usrreq.c:453
#7 0x000000000023503e in soconnect (so=0xffffc0000a5a0800, nam=0xffffc00005a8e140, td=0x0) at ../../bsd/sys/kern/uipc_socket.c:744
#8 0x000000000023ad0e in kern_connect (fd=46, sa=0xffffc00005a8e140) at ../../bsd/sys/kern/uipc_syscalls.c:364
#9 0x00000000002511fa in linux_connect (s=46, name=0x200005910660, namelen=16) at ../../bsd/sys/compat/linux/linux_socket.c:712
#10 0x000000000023c088 in connect (fd=46, addr=0x200005910660, len=16) at ../../bsd/sys/kern/uipc_syscalls_wrap.c:104
#11 0x000010000220c65a in NET_Connect ()
#12 0x000010000220d0fa in Java_java_net_PlainSocketImpl_socketConnect ()
#13 0x000020000021cd8e in ?? ()
#14 0x00002000059106d8 in ?? ()
(snip)
(gdb)

Annotated layers: BSD stack, glue, apps (java). Target: OSv

Page 12: Kernelvm 201312-dlmopen

(dce:node0) bt
#0 sim_dev_xmit (dev=0x7ffff5587020, data=0x7ffff3e0688a "", len=105) at arch/sim/sim.c:349
#1 kernel_dev_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020) at arch/sim/sim-device.c:20
#2 dev_hard_start_xmit (skb=0x7ffff5ccaa68, dev=0x7ffff5587020, txq=0x7ffff5571a90) at net/core/dev.c:2580
#3 dev_queue_xmit (skb=0x7ffff5ccaa68) at net/core/dev.c:2830
#4 neigh_hh_output (skb=0x7ffff5ccaa68, hh=0x7ffff5ce8850) at include/net/neighbour.h:357
#5 dst_neigh_output (skb=0x7ffff5ccaa68, n=0x7ffff5ce8790, dst=0x7ffff3e045d0) at include/net/dst.h:409
#6 ip_finish_output2 (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:201
#7 ip_finish_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:234
#8 ip_output (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:307
#9 dst_output (skb=0x7ffff5ccaa68) at include/net/dst.h:448
#10 ip_local_out (skb=0x7ffff5ccaa68) at net/ipv4/ip_output.c:110
#11 ip_queue_xmit (skb=0x7ffff5ccaa68, fl=0x7ffff3e04e78) at net/ipv4/ip_output.c:403
#12 tcp_transmit_skb (sk=0x7ffff3e04bd0, skb=0x7ffff5ccaa68, clone_it=1, gfp_mask=32) at net/ipv4/tcp_output.c:1021
#13 mptcp_write_xmit (meta_sk=0x7ffff3e053d0, mss_now=1428, nonagle=0, push_one=0, gfp=32) at net/mptcp/mptcp_output.c:1182
#14 tcp_write_xmit (sk=0x7ffff3e053d0, mss_now=516, nonagle=0, push_one=0, gfp=32) at net/ipv4/tcp_output.c:1930
#15 __tcp_push_pending_frames (sk=0x7ffff3e053d0, cur_mss=516, nonagle=0) at net/ipv4/tcp_output.c:2154
#16 tcp_push_pending_frames (sk=0x7ffff3e053d0) at include/net/tcp.h:1610
#17 do_tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/ipv4/tcp.c:2625
#18 tcp_setsockopt (sk=0x7ffff3e053d0, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/ipv4/tcp.c:2762
#19 sock_common_setsockopt (sock=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78 "", optlen=4) at net/core/sock.c:2455
#20 sim_sock_setsockopt (socket=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at arch/sim/sim-socket.c:167
#21 sim_sock_setsockopt_forwarder (v0=0x7ffff3e03850, v1=6, v2=3, v3=0x7ffff439cc78, v4=4) at arch/sim/sim.c:97
#22 ns3::LinuxSocketFdFactory::Setsockopt (this=0x64f000, socket=0x7ffff3e03850, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/linux-socket-fd-factory.cc:947
#23 ns3::LinuxSocketFd::Setsockopt (this=0x815f20, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/linux-socket-fd.cc:89
#24 dce_setsockopt (fd=11, level=6, optname=3, optval=0x7ffff439cc78, optlen=4) at ../model/dce-fd.cc:529
#25 setsockopt () at ../model/libc-ns3.h:179
#26 sockopt_cork (sock=11, onoff=0) at sockunion.c:534
#27 bgp_write (thread=0x7ffff439ce10) at bgp_packet.c:691
#28 thread_call (thread=0x7ffff439ce10) at thread.c:1177
#29 main (argc=5, argv=0x658100) at bgp_main.c:455
#30 ns3::DceManager::DoStartProcess (context=0x6fa970) at ../model/dce-manager.cc:281
#31 ns3::TaskManager::Trampoline (context=0x6fab50) at ../model/task-manager.cc:274
#32 ns3::UcontextFiberManager::Trampoline (a0=32767, a1=-139668064, a2=0, a3=7318352) at ../model/ucontext-fiber-manager.cc:199
#33 ?? () from /lib64/libc.so.6
#34 ?? ()

Annotated layers: Linux stack, apps, glue, glue (POSIX), glue (linux). Target: DCE

Page 13: Kernelvm 201312-dlmopen

valgrind

==5864== Memcheck, a memory error detector
==5864== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==5864== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==5864== Command: ../build/bin/ns3test-dce-vdl --verbose
==5864==
==5864== Conditional jump or move depends on uninitialised value(s)
==5864==    at 0x7D5AE32: tcp_parse_options (tcp_input.c:3782)
==5864==    by 0x7D65DCB: tcp_check_req (tcp_minisocks.c:532)
==5864==    by 0x7D63B09: tcp_v4_hnd_req (tcp_ipv4.c:1496)
==5864==    by 0x7D63CB4: tcp_v4_do_rcv (tcp_ipv4.c:1576)
==5864==    by 0x7D6439C: tcp_v4_rcv (tcp_ipv4.c:1696)
==5864==    by 0x7D447CC: ip_local_deliver_finish (ip_input.c:226)
==5864==    by 0x7D442E4: ip_rcv_finish (dst.h:318)
==5864==    by 0x7D2313F: process_backlog (dev.c:3368)
==5864==    by 0x7D23455: net_rx_action (dev.c:3526)
==5864==    by 0x7CF2477: do_softirq (softirq.c:65)
==5864==    by 0x7CF2544: softirq_task_function (softirq.c:21)
==5864==    by 0x4FA2BE1: ns3::TaskManager::Trampoline(void*) (task-manager.cc:261)
==5864==  Uninitialised value was created by a stack allocation
==5864==    at 0x7D65B30: tcp_check_req (tcp_minisocks.c:522)
==5864==

Page 14: Kernelvm 201312-dlmopen

gdb

(gdb) b mip6_mh_filter if dce_debug_nodeid()==0
Breakpoint 1 at 0x7ffff287c569: file net/ipv6/mip6.c, line 88.
<continue>
(gdb) bt 4
#0 mip6_mh_filter (sk=0x7ffff7f69e10, skb=0x7ffff7cde8b0) at net/ipv6/mip6.c:109
#1 0x00007ffff2831418 in ipv6_raw_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:199
#2 0x00007ffff2831697 in raw6_local_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:232
#3 0x00007ffff27e6068 in ip6_input_finish (skb=0x7ffff7cde8b0) at net/ipv6/ip6_input.c:197

[Figure: network topology with a Home Agent, two Wi-Fi access points (AP1, AP2), a mobile node that hands off between them, and a correspondent node; ping6 traffic is shown.]

Page 15: Kernelvm 201312-dlmopen

CI

• Linux kernel tests
• gcov

• nightly build/tests

Page 16: Kernelvm 201312-dlmopen

Thank you

Direct Code Execution

http://bit.ly/ns-3-dce
https://github.com/direct-code-execution