Introduction to Linux Control Groups and Namespaces Andre Ferraz @deferraz Luiz Viana @luizxx Delivery Engineering Team


Luiz Viana and André Ferraz (Cazé) give an overview of the native technologies for resource isolation in Linux environments.


Introduction to Linux Control Groups and Namespaces

Andre Ferraz @deferraz

Luiz Viana @luizxx

Delivery Engineering Team

What is it?

• Control groups (cgroups) are a kernel feature that lets you allocate resources among groups of tasks running on a system.

• Provides a way to hierarchically group and label processes, and to apply resource limits to them.

Resource allocation

• CPU time and scheduling

• System memory / swap area

• Network bandwidth and namespaces

• Block device bandwidth and IOPS

• Device access and isolation

Hierarchy

(Diagram omitted; copyright Red Hat Inc.)

Relationship

(Diagram omitted; copyright Red Hat Inc.)

Implications

• Because a task can belong to only a single cgroup in any one hierarchy, there is only one way that a task can be limited or affected by any single subsystem. This is logical: a feature, not a limitation.

• You can group several subsystems together so that they affect all tasks in a single hierarchy. Because cgroups in that hierarchy have different parameters set, those tasks will be affected differently.

• Conversely, if the need for splitting subsystems among separate hierarchies is reduced, you can remove a hierarchy and attach its subsystems to an existing one.

• The design allows for simple cgroup usage, for example setting a few parameters for specific tasks in a single hierarchy that has just the cpu and memory subsystems attached.

• The design also allows for highly specific configuration: each task (process) on a system could be a member of each hierarchy, each of which has a single attached subsystem. Such a configuration would give the system administrator absolute control over all parameters for every single task.

• If you limit the resources available to a user, more of that user's processes will be waiting for resources, so the load average on your server will stay consistently higher.
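The single-membership rule above can be illustrated with a toy model. This is only a sketch: the `Hierarchy` class and its methods are hypothetical names invented for illustration and do not talk to the kernel. They mirror the rule that attaching a task to a cgroup moves it out of its previous cgroup in the same hierarchy, while membership in other hierarchies is unaffected.

```python
class Hierarchy:
    """Toy model of one cgroup hierarchy (hypothetical, not a kernel API)."""

    def __init__(self, subsystems):
        self.subsystems = tuple(subsystems)
        self._cgroup_of = {}  # pid -> cgroup path; exactly one entry per task

    def attach(self, pid, cgroup):
        # Overwriting the entry models the move: the task leaves its old cgroup.
        self._cgroup_of[pid] = cgroup

    def cgroup_of(self, pid):
        # Tasks that were never moved sit in the root cgroup "/".
        return self._cgroup_of.get(pid, "/")


cpu_mem = Hierarchy(["cpu", "memory"])
blkio = Hierarchy(["blkio"])

cpu_mem.attach(1234, "/users/alice")
cpu_mem.attach(1234, "/users/limited")  # moves the task, does not duplicate it
blkio.attach(1234, "/databases")        # independent membership

print(cpu_mem.cgroup_of(1234))  # /users/limited
print(blkio.cgroup_of(1234))    # /databases
```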

Using control groups: hard way
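A minimal sketch of the manual approach, assuming a cgroup v1 hierarchy is already mounted (the path `/sys/fs/cgroup/cpu` in the comments is an assumption, and the helper names are hypothetical). In v1 a cgroup is a directory, each parameter is a file inside it, and a task joins the group when its PID is written to the `tasks` file.

```python
import os


def create_cgroup(hierarchy_root, name):
    """Create a cgroup by making a directory under a mounted hierarchy."""
    path = os.path.join(hierarchy_root, name)
    os.makedirs(path, exist_ok=True)
    return path


def set_parameter(cgroup_path, parameter, value):
    """Set a parameter such as cpu.shares by writing to its control file."""
    with open(os.path.join(cgroup_path, parameter), "w") as f:
        f.write(str(value))


def add_task(cgroup_path, pid):
    """Move a task into the cgroup by writing its PID to the tasks file."""
    with open(os.path.join(cgroup_path, "tasks"), "w") as f:
        f.write(str(pid))


# Requires root and a mounted cpu hierarchy (path is an assumption):
# group = create_cgroup("/sys/fs/cgroup/cpu", "limited")
# set_parameter(group, "cpu.shares", 512)
# add_task(group, os.getpid())
```

The functions take the hierarchy root as an argument, so they can be tried against a scratch directory before being pointed at a real mount.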

Using control groups: command line
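A minimal sketch of the command-line approach, assuming the libcgroup tools `cgcreate`, `cgset` and `cgexec` are installed. The helper functions and the group name `limited` are hypothetical; they only build the argument lists, which can then be passed to `subprocess.run` as root.

```python
import shlex


def cgcreate_cmd(controllers, group):
    """cgcreate -g <controllers>:<group> creates the group in each hierarchy."""
    return ["cgcreate", "-g", f"{','.join(controllers)}:{group}"]


def cgset_cmd(group, parameter, value):
    """cgset -r <parameter>=<value> <group> sets one parameter."""
    return ["cgset", "-r", f"{parameter}={value}", group]


def cgexec_cmd(controllers, group, command):
    """cgexec -g <controllers>:<group> <command...> starts a command in the group."""
    return ["cgexec", "-g", f"{','.join(controllers)}:{group}", *command]


steps = [
    cgcreate_cmd(["cpu", "memory"], "limited"),
    cgset_cmd("limited", "cpu.shares", 512),
    cgexec_cmd(["cpu", "memory"], "limited", ["sleep", "60"]),
]
for argv in steps:
    print(shlex.join(argv))
    # subprocess.run(argv, check=True)  # uncomment to execute (needs root)
```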

Using control groups: cgconfig

• The cgconfig service installed with the libcgroup package provides a convenient way to create hierarchies, attach subsystems to hierarchies, and manage cgroups within those hierarchies.

• It is recommended that you use cgconfig to manage hierarchies and cgroups on your system.

• The default /etc/cgconfig.conf file installed with the libcgroup package creates and mounts an individual hierarchy for each subsystem, and attaches the subsystems to these hierarchies. The cgconfig service also lets you create configuration files in the /etc/cgconfig.d/ directory and invoke them from /etc/cgconfig.conf.

• If you stop the cgconfig service (with the service cgconfig stop command), it unmounts all the hierarchies that it mounted.

Configuration example
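As an illustration of the file format, the sketch below renders a `group` stanza of the kind cgconfig.conf accepts. The function name, the group name `limited` and the parameter values are assumptions chosen for the example; values are always quoted, which the format permits.

```python
def render_group(name, controllers):
    """Render one cgconfig.conf group stanza.

    controllers maps a controller name to a dict of parameter -> value.
    """
    lines = [f"group {name} {{"]
    for controller, params in controllers.items():
        lines.append(f"    {controller} {{")
        for key, value in params.items():
            lines.append(f'        {key} = "{value}";')
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


stanza = render_group("limited", {
    "cpu": {"cpu.shares": 512},
    "memory": {"memory.limit_in_bytes": 1073741824},
})
print(stanza)
```

The output could be saved as a file under /etc/cgconfig.d/ and picked up when the cgconfig service is restarted.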

Using control groups: cgred

• Cgred (cgrulesengd daemon) is a service that moves tasks into cgroups according to parameters set in the /etc/cgrules.conf file.

• Entries in the /etc/cgrules.conf file can take one of the two forms:

user subsystems control_group

user:command subsystems control_group

• A group name can be given in place of a user by prefixing it with the "@" character.

• More than one subsystem can be specified in a comma-separated list.

• Commands are identified by the process name or full command path of a process.

Configuration example
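The two entry forms described above can be made concrete with a small parser. This is a simplified sketch: the `Rule` type and `parse_rule` function are hypothetical, the sample entries are invented, and special syntax such as wildcards or continuation lines is not interpreted.

```python
from typing import NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    user: str                    # user name, or "@group" for a group
    command: Optional[str]       # process name or full path, None if absent
    subsystems: Tuple[str, ...]  # one or more subsystems
    control_group: str           # destination cgroup


def parse_rule(line):
    """Parse one cgrules.conf entry; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    who, subsystems, control_group = line.split()
    user, _, command = who.partition(":")
    return Rule(user, command or None, tuple(subsystems.split(",")), control_group)


for entry in ("maria devices /usergroup/staff",
              "@students:/bin/cp cpu,memory /usergroup/students"):
    print(parse_rule(entry))
```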

Using control groups: reaper

• Reaper allows you to manage groups dynamically in shared multi-user environments.

• Can be extended to work on any environment by creating a function to validate users.

• Entirely written in Python and easy to modify.

• Limit exceptions can be created using the command line interface.

• Does not depend on external agents.

• Uses standard components from libcgroup, available in most Linux distributions.

Available on GitHub: https://github.com/lviana/reaper

Obtaining cgroups information

• Listing controllers

• # lssubsys -m controllers

• # cat /proc/cgroups

• Finding control groups

• # lscgroup

• # lscgroup cpuset:adminusers

• Display parameters

• # cgget -r parameter list_of_cgroups

• # cgget -g cpuset /
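The same controller information can be read programmatically. The sketch below parses the four-column table that /proc/cgroups exposes (subsystem name, hierarchy ID, number of cgroups, enabled flag); the function name and the sample text are assumptions made for illustration.

```python
def parse_proc_cgroups(text):
    """Return {controller: {"hierarchy", "num_cgroups", "enabled"}}."""
    controllers = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and the "#subsys_name ..." header
        name, hierarchy, num_cgroups, enabled = line.split()
        controllers[name] = {
            "hierarchy": int(hierarchy),
            "num_cgroups": int(num_cgroups),
            "enabled": enabled == "1",
        }
    return controllers


# Illustrative sample; on a real system use open("/proc/cgroups").read()
SAMPLE = (
    "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
    "cpuset\t2\t1\t1\n"
    "cpu\t3\t10\t1\n"
    "memory\t0\t1\t0\n"
)

info = parse_proc_cgroups(SAMPLE)
print(sorted(name for name, c in info.items() if c["enabled"]))
```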

Future...

Actually, not so future anymore!

Systemd

• System service manager for Linux that provides parallelization capabilities, keeps track of processes using Linux control groups, offers on-demand starting of daemons and implements an elaborate transactional dependency-based service control logic.

• A cgroup is bound to a systemd unit, which is configurable with a unit file and manageable with systemd's command-line utilities.

• Cgroups in systemd can be transient or persistent.

Transient cgroups

• Using transient cgroups, you can set limits on resources consumed by the service during its runtime.

• Applications can create transient cgroups dynamically by using API calls to systemd.

• When run with systemd-run --scope, commands are started directly from the systemd-run process and thus inherit the execution environment of the caller.

• Commands in scope units are executed synchronously.
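A minimal sketch of starting a command in a transient scope with resource limits, assuming `systemd-run` is available. The helper name is hypothetical, and the property names `CPUShares` and `MemoryLimit` are assumptions that depend on the systemd version in use (newer releases prefer other names).

```python
def systemd_run_scope(command, properties=None, unit=None, slice_name=None):
    """Build a systemd-run --scope command line with optional -p properties."""
    argv = ["systemd-run", "--scope"]
    if unit:
        argv.append(f"--unit={unit}")
    if slice_name:
        argv.append(f"--slice={slice_name}")
    for key, value in (properties or {}).items():
        argv += ["-p", f"{key}={value}"]
    return argv + list(command)


argv = systemd_run_scope(
    ["sleep", "60"],
    properties={"CPUShares": 512, "MemoryLimit": "500M"},
    unit="demo",
)
print(" ".join(argv))
# subprocess.run(argv, check=True)  # blocks until the command exits
```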

Persistent cgroups

• You can assign a persistent cgroup to a systemd service by editing its unit configuration file.

• It can be used to manage services that are started automatically.

• Unit configuration files are located in the /usr/lib/systemd/system/ directory.

• Temporary changes can be applied with the systemctl set-property command and its --runtime option.
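For the persistent case, the resource settings live in the `[Service]` section of the unit file. The sketch below renders such a section; the function name and the property values are assumptions, and the property names again depend on the systemd version.

```python
def service_section(properties):
    """Render a [Service] section holding resource-control properties."""
    lines = ["[Service]"]
    lines += [f"{key}={value}" for key, value in properties.items()]
    return "\n".join(lines) + "\n"


section = service_section({"CPUShares": 1500, "MemoryLimit": "1G"})
print(section, end="")
```

After changing a unit file, `systemctl daemon-reload` followed by a restart of the service makes the new limits take effect.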

Where the f*ck do I use it?

• Prioritizing database I/O

• Limit resources available to end users

• Optimizing processor usage

• Control network access

• Isolate processes from devices

• Optimize available physical resources

• Set network traffic priority

Projects using it

• Linux Containers / LXC (https://linuxcontainers.org/)

• Docker (http://docker.io)

• Apache Mesos (http://mesos.apache.org)

• OpenStack (http://www.openstack.org)

• Locaweb (http://github.com/locaweb)

Namespaces!

Namespaces, what is it?

• Lightweight process isolation

• Processes can have different views of the system than other processes

• Old concept: 1992, in Plan 9 (http://www.cs.bell-labs.com/sys/doc/names.html)

• No hypervisor

• setns() syscall

Namespaces, types

• Mount points / filesystems (MNT) [first created in 2002 by Al Viro]

• processes (PID)

• network (NET)

• System V IPC (IPC)

• Hostname (UTS)

• User and group IDs (USER)
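On kernels that expose /proc/<pid>/ns/ (3.8 and later), each namespace type appears there as a symbolic link whose target identifies the namespace instance, so two processes share a namespace when the targets match. A minimal sketch, with hypothetical function names:

```python
import os


def namespaces_of(pid="self", proc_root="/proc"):
    """Map namespace type (e.g. "net") to its identifier (e.g. "net:[4026531992]")."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def shared_namespaces(pid_a, pid_b, proc_root="/proc"):
    """List the namespace types that two processes have in common."""
    a = namespaces_of(pid_a, proc_root)
    b = namespaces_of(pid_b, proc_root)
    return sorted(name for name in a if a[name] == b.get(name))


# On a Linux host:
# print(namespaces_of())            # namespaces of the current process
# print(shared_namespaces("self", 1))  # reading PID 1 may need privileges
```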

Namespaces, flags

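• CLONE_NEWNS: since kernel 2.4.19, requires CAP_SYS_ADMIN

• CLONE_NEWUTS: since kernel 2.6.19, requires CAP_SYS_ADMIN

• CLONE_NEWIPC: since kernel 2.6.19, requires CAP_SYS_ADMIN

• CLONE_NEWPID: since kernel 2.6.24, requires CAP_SYS_ADMIN

• CLONE_NEWNET: since kernel 2.6.29, requires CAP_SYS_ADMIN

• CLONE_NEWUSER: since kernel 3.8, no capability required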

Namespaces, syscalls

clone() - creates a new process in a new namespace

unshare() - creates a new namespace and moves the calling process into it

setns() - joins an existing namespace
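A minimal sketch of calling unshare() and setns() from Python through ctypes, assuming a Linux system with glibc or a compatible libc. The wrapper function names are hypothetical; the flag values are the ones defined in the Linux `<sched.h>` header.

```python
import ctypes
import os

# Namespace flags as defined in <sched.h>
CLONE_NEWNS = 0x00020000
CLONE_NEWUTS = 0x04000000
CLONE_NEWIPC = 0x08000000
CLONE_NEWUSER = 0x10000000
CLONE_NEWPID = 0x20000000
CLONE_NEWNET = 0x40000000


def unshare(flags):
    """Call unshare(2) through libc; raise OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def join_namespace(path, nstype=0):
    """Call setns(2) on a namespace file such as /proc/<pid>/ns/net."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = os.open(path, os.O_RDONLY)
    try:
        if libc.setns(fd, nstype) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)


# Flags can be combined with bitwise OR. Needs CAP_SYS_ADMIN, so not run here:
# unshare(CLONE_NEWUTS | CLONE_NEWNET)
```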

Namespaces, network ns example

# ip netns add newnet

# ip link add veth0 type veth peer name veth1

# ip link set veth1 netns newnet

# ip netns exec newnet ip link list

# ip netns exec newnet bash

Namespaces, security

CVE-2013-1858

http://lwn.net/Articles/543273/

Namespaces, application server support

• uWSGI gained full namespace support in versions 1.9/2.0

• Additional isolated filesystems

• You can detach single components to increase isolation

More information:

http://uwsgi-docs.readthedocs.org/en/latest/Namespaces.html

Reference

• https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt

• http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/cgroups/

• https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/

• https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/

• http://uwsgi-docs.readthedocs.org/en/latest/Namespaces.html

Questions?

Thank you!