Introduction to Linux namespaces, containers.
Namespaces in Linux
Ľubomír Rintel
GoodData Q1 off-site, Harrachov 2014
UNIX processes
● Virtualization
  – Virtual CPU and memory
  – Consistently accessible devices
● Shared resources
  – Runtime configuration
  – Communication channels
  – Filesystem
  – Privileges, credentials
pid_t pid = fork ();
if (pid < 0) {
    <error>
} else if (pid > 0) {
    <parent>
} else {
    <child>
}
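A minimal sketch of what the fork() slide implies: the child gets its own copy of memory, so a write in the child is not visible to the parent. The function name fork_keeps_memory_private is made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that modifies its own copy of `value`.
 * Returns the parent's copy afterwards, or -1 on error.
 * (Hypothetical helper, not part of the original slides.) */
int fork_keeps_memory_private(void)
{
    int value = 1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0) {
        value = 42;                 /* child: changes only its private copy */
        _exit(value == 42 ? 0 : 1);
    }

    /* parent: wait for the child, then look at our own copy */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                   /* still 1: the child's write stayed in the child */
}
```

Calling fork_keeps_memory_private() from main() should return 1, not 42.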
What about threads?
Sharing more
● Sharing resources and state
  – Address space
  – Signal handlers
  – Open file handles
  – CWD, umask(), ...
Linux processes
● Threads are processes
● Process: own resources & state
● Thread: shared resources & state
pid_t pid = clone (<what_to_share>);
CLONE_VM     Address space
CLONE_FILES  Open files
CLONE_FS     CWD, umask(), ...
...
SEE ALSO: unshare(2)
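As a rough counterpart to fork(), here is a sketch of clone() with CLONE_VM: the child runs in the parent's address space, so its write is visible to the parent. The names clone_shares_memory and write_shared, and the 256 KiB stack size, are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (256 * 1024)     /* arbitrary size for the child's stack */

/* Child entry point: writes through a pointer into the shared address space. */
static int write_shared(void *arg)
{
    *(volatile int *)arg = 42;
    return 0;
}

/* Returns the value the parent sees after the child ran, or -1 on error.
 * (Hypothetical helper, not part of the original slides.) */
int clone_shares_memory(void)
{
    volatile int value = 1;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on common architectures, so pass its top. */
    pid_t pid = clone(write_shared, stack + STACK_SIZE,
                      CLONE_VM | SIGCHLD, (void *)&value);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited < 0)
        return -1;
    return value;                   /* 42: the child wrote into our memory */
}
```

Dropping CLONE_VM from the flags turns this back into fork()-like behaviour, and the parent keeps seeing 1.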
...and what about containers?
Containers
● Virtualization
● Less sharing
● More separation
Sharing is not caring. Your mother was wrong!
Namespaces
● Containers are to processes what processes are to threads
pid_t pid = clone (<what_to_share>);
CLONE_NEWUTS   Hostname, domainname
CLONE_NEWIPC   SysV IPC objects
CLONE_NEWPID   Process IDs
CLONE_NEWNET   Network configuration
CLONE_NEWNS    File system mounts
CLONE_NEWUSER  User and Group IDs
SEE ALSO: setns(2)
UTS namespace
● CLONE_NEWUTS
● CONFIG_UTS_NS since Linux 2.6.19
● needs CAP_SYS_ADMIN
● hostname
● domainname
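A minimal sketch of the UTS namespace in use: a child process unshares CLONE_NEWUTS and changes the hostname there, leaving the parent's hostname alone. Without CAP_SYS_ADMIN the unshare() call is expected to fail, which the function reports as -1. The name set_hostname_in_new_uts is made up for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process: enter a new UTS namespace and set `name` as the
 * hostname there. Returns 0 if the child succeeded, -1 otherwise
 * (typically because CAP_SYS_ADMIN is missing).
 * (Hypothetical helper, not part of the original slides.) */
int set_hostname_in_new_uts(const char *name)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char seen[256];
        if (unshare(CLONE_NEWUTS) != 0)
            _exit(1);               /* usually EPERM when unprivileged */
        if (sethostname(name, strlen(name)) != 0)
            _exit(2);
        if (gethostname(seen, sizeof seen) != 0)
            _exit(3);
        seen[sizeof seen - 1] = '\0';
        _exit(strcmp(seen, name) == 0 ? 0 : 4);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Whether or not the call succeeds, gethostname() in the calling process should return the same value before and after.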
SysV IPC namespace
● CLONE_NEWIPC
● CONFIG_IPC_NS since 2.6.19
● Obsolete System V UNIX IPC mechanisms:
  – semaphores
  – shared memory
  – message queues
PID namespace
● CLONE_NEWPID
● CONFIG_PID_NS since Linux 2.6.24
● a different PID visible from within namespace than from outside
● new PID 1
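The "new PID 1" point can be sketched with clone() and CLONE_NEWPID: the child sees itself as PID 1, while the parent sees an ordinary PID. This normally needs CAP_SYS_ADMIN, so the function returns -1 when clone() is refused. The names child_is_pid1 and report_pid, and the stack size, are assumptions for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (256 * 1024)     /* arbitrary size for the child's stack */

/* Child entry point: exit status 0 means "I am PID 1 in here". */
static int report_pid(void *arg)
{
    (void)arg;
    return getpid() == 1 ? 0 : 1;
}

/* Start a child in a new PID namespace and store the PID the parent sees
 * in *outer_pid. Returns 1 if the child saw itself as PID 1, 0 if it did
 * not, and -1 if the child could not be created (for example EPERM).
 * (Hypothetical helper, not part of the original slides.) */
int child_is_pid1(pid_t *outer_pid)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t pid = clone(report_pid, stack + STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    *outer_pid = pid;               /* the PID as seen from outside */

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```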
Network namespace
● CLONE_NEWNET
● CONFIG_NET_NS since Linux 2.6.29
● separate network stack
  – network addresses
  – nftables/netfilter rules
  – loopback interface for namespace
● veth interface (CONFIG_VETH), ip netns
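To illustrate the separate network stack, the sketch below unshares CLONE_NEWNET in a child and counts the interfaces it finds there. A fresh namespace usually contains just its own lo, although some kernels also create fallback tunnel devices, so the code only checks that lo is present. The function name is made up, and the call is expected to fail without CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <net/if.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process: enter a new network namespace and count its
 * interfaces. Returns the count (at least 1, for "lo"), or -1 if the
 * namespace could not be created or inspected.
 * (Hypothetical helper, not part of the original slides.) */
int count_interfaces_in_new_netns(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (unshare(CLONE_NEWNET) != 0)
            _exit(255);             /* usually EPERM when unprivileged */

        struct if_nameindex *ifs = if_nameindex();
        if (ifs == NULL)
            _exit(255);

        int count = 0, has_lo = 0;
        for (struct if_nameindex *i = ifs; i->if_index != 0; i++) {
            count++;
            if (strcmp(i->if_name, "lo") == 0)
                has_lo = 1;
        }
        if_freenameindex(ifs);

        /* The count travels back in the exit status; 255 means failure. */
        _exit(has_lo && count < 255 ? count : 255);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);
}
```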
Mount namespace
● CLONE_NEWNS
● First namespace, since 2.4.19
● /proc/<pid>/mounts instead of /proc/mounts
● In Fedora, run mount --make-private / or create new user NS
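A sketch of a private mount: the child unshares CLONE_NEWNS, marks / as private (here recursively, the MS_REC variant of mount --make-private /) so nothing propagates back, and mounts a tmpfs that only it can see. The function name and the marker file name are assumptions for this example, and the calls are expected to fail without CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process: mount a tmpfs on `dir` inside a new mount namespace
 * and create `dir`/ns-demo-marker there. Returns 0 on success, -1 on
 * failure (typically because CAP_SYS_ADMIN is missing).
 * (Hypothetical helper, not part of the original slides.) */
int private_tmpfs(const char *dir)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char marker[PATH_MAX];

        if (unshare(CLONE_NEWNS) != 0)
            _exit(1);
        /* Stop mount events from propagating to the parent namespace. */
        if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0)
            _exit(2);
        if (mount("tmpfs", dir, "tmpfs", 0, NULL) != 0)
            _exit(3);

        int n = snprintf(marker, sizeof marker, "%s/ns-demo-marker", dir);
        if (n < 0 || (size_t)n >= sizeof marker)
            _exit(4);
        int fd = open(marker, O_CREAT | O_WRONLY, 0600);
        _exit(fd >= 0 ? 0 : 5);     /* the mount vanishes with the namespace */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After private_tmpfs("/tmp") returns, the marker file should not exist in the caller's /tmp, because it was created on a tmpfs that lived only in the child's namespace.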
User namespace
● CLONE_NEWUSER
● CONFIG_USER_NS since 2.6.23
● Unprivileged since 3.8, still disabled by default
● a different UID/GID visible from within namespace than from outside
● all capabilities within namespace
  – limited by capabilities in parent namespace
● can be combined with other namespaces
● Mapping of ranges via /proc/<pid>/uid_map /proc/<pid>/gid_map
  – Unprivileged users can map themselves
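The uid_map mechanism can be sketched as follows: a child creates a new user namespace and maps its own outside UID to UID 0 inside. Only uid_map is written here; a gid_map write would additionally need "deny" written to /proc/self/setgroups on newer kernels. The function name is an assumption, and the call returns -1 where unprivileged user namespaces are disabled.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In a child process: create a user namespace and map the caller's UID
 * to 0 inside it. Returns 0 if the child ended up with effective UID 0,
 * -1 otherwise.
 * (Hypothetical helper, not part of the original slides.) */
int become_root_in_userns(void)
{
    uid_t outer = geteuid();
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char map[64];

        if (unshare(CLONE_NEWUSER) != 0)
            _exit(1);               /* user namespaces unavailable or disabled */

        /* Format: "<first UID inside> <first UID outside> <length>". */
        int n = snprintf(map, sizeof map, "0 %lu 1", (unsigned long)outer);
        int fd = open("/proc/self/uid_map", O_WRONLY);
        if (fd < 0)
            _exit(2);
        if (write(fd, map, (size_t)n) != n)
            _exit(3);
        close(fd);

        _exit(geteuid() == 0 ? 0 : 4);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```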
LXC: Lightweight containers
● Container management toolset
● Create namespaces
● Configure networking
● Resource management with control groups
● Integrated with libvirt
Docker
systemd-nspawn
● Quick way to boot a container
● Can be run from a service unit in a separate cgroup
Future
● CONFIG_USER_NS=y by default
● Userspace for multiple UIDs (ranges) per user
● Syslog namespace
Questions?
What else?
● Auditing & SELinux
● Checkpoint & Restore in userspace
● fakeroot
Further reading
● Configuring network namespaces with iproute2's ip netns: http://blog.scottlowe.org/2013/09/04/introducing-linux-network-namespaces/
● Michael Kerrisk's LWN series on namespaces: http://lwn.net/Articles/531114/
● Rami Rosen's great Namespaces/Cgroups lecture: http://www.haifux.org/lectures/299/netLec7.pdf