susant-sahani
Overview
User Space
Kernel Space
netlink socket, rtnetlink socket
include/linux/pkt_cls.h, include/linux/pkt_sched.h
net/netlink
tc
struct sockaddr_nl, struct nlmsghdr
net/core/rtnetlink.c, linux/include/rtnetlink.h
Boot Time
__initfunc
pktsched_init
net/core/dev.c
net/sched/sch_api.c
• declarations
• binding
pktsched_init
struct rtnetlink_link *link_p;
if (link_p) {
    link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_DELQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
    link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;
}
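The binding above can be pictured as a table of function pointers indexed by message type minus RTM_BASE. The following is a minimal user-space sketch of that idea, not the kernel code: the struct name rtnetlink_link_model, the handler stubs and their return values, the function names pktsched_init_model and dispatch, and the table bound are all assumptions made for illustration. The RTM_* numbers are copied from linux/rtnetlink.h.

```c
#include <assert.h>
#include <stddef.h>

/* Message type numbers as found in linux/rtnetlink.h. */
enum {
    RTM_BASE      = 16,
    RTM_NEWQDISC  = 36,
    RTM_DELQDISC  = 37,
    RTM_GETQDISC  = 38,
    RTM_NEWTCLASS = 40,
    RTM_DELTCLASS = 41,
    RTM_GETTCLASS = 42,
    MODEL_SLOTS   = 48   /* hypothetical table size for this sketch */
};

typedef int (*handler_fn)(int msg_type);

/* Simplified stand-in for the kernel's struct rtnetlink_link. */
struct rtnetlink_link_model {
    handler_fn doit;    /* handles a single request */
    handler_fn dumpit;  /* handles a dump request */
};

/* Stub handlers; each returns a distinct tag so a caller can tell them apart. */
static int tc_ctl_qdisc(int t)   { (void)t; return 1; }
static int tc_dump_qdisc(int t)  { (void)t; return 2; }
static int tc_ctl_tclass(int t)  { (void)t; return 3; }
static int tc_dump_tclass(int t) { (void)t; return 4; }

static struct rtnetlink_link_model link_tbl[MODEL_SLOTS];

/* Mirrors the assignments shown on the pktsched_init slide. */
void pktsched_init_model(void)
{
    struct rtnetlink_link_model *link_p = link_tbl;

    link_p[RTM_NEWQDISC - RTM_BASE].doit    = tc_ctl_qdisc;
    link_p[RTM_DELQDISC - RTM_BASE].doit    = tc_ctl_qdisc;
    link_p[RTM_GETQDISC - RTM_BASE].doit    = tc_ctl_qdisc;
    link_p[RTM_GETQDISC - RTM_BASE].dumpit  = tc_dump_qdisc;
    link_p[RTM_NEWTCLASS - RTM_BASE].doit   = tc_ctl_tclass;
    link_p[RTM_DELTCLASS - RTM_BASE].doit   = tc_ctl_tclass;
    link_p[RTM_GETTCLASS - RTM_BASE].doit   = tc_ctl_tclass;
    link_p[RTM_GETTCLASS - RTM_BASE].dumpit = tc_dump_tclass;
}

/* Look up the handler for msg_type; returns -1 if the slot is empty. */
int dispatch(int msg_type, int want_dump)
{
    int idx = msg_type - RTM_BASE;
    handler_fn fn;

    if (idx < 0 || idx >= MODEL_SLOTS)
        return -1;
    fn = want_dump ? link_tbl[idx].dumpit : link_tbl[idx].doit;
    return fn ? fn(msg_type) : -1;
}
```

After pktsched_init_model() has run, dispatch(RTM_NEWQDISC, 0) reaches the tc_ctl_qdisc stub, while a slot that was never bound reports -1.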
User level Application
Create netlink socket
sendto
netlink_sendmsg
rtnetlink_rcv_msg
call function in rtnetlink_link
net/core/rtnetlink.c
net/netlink/af_netlink.c
nl_table
nl_table : array of INET socket linked list
rtnetlink_links
rtnetlink_links : array of pointers to rtnetlink_link
rtnetlink_link : command
TC program
do_qdisc
do_class
do_filter
tc_qdisc_modify
tc_qdisc_list
usage
tc_qdisc_modify
allocate “req”
initialize it
tc_qdisc_modify (cont’d)
rtnl_open : create ‘rtnetlink’ socket
  family = AF_NETLINK
  type = SOCK_RAW
  protocol = NETLINK_ROUTE
set up and bind local address, sockaddr_nl local
call “rtnl_talk”
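The two tc_qdisc_modify slides can be sketched in user space as follows. This is an illustration under assumptions, not the tc source: the names struct tc_req, tc_req_init and rtnl_open_sketch are hypothetical, the attribute buffer size is arbitrary, and the message type and flags shown correspond to an "add" style request chosen for the example. The socket parameters are the ones listed on the slide.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* "req": a netlink header followed by the traffic-control message. */
struct tc_req {
    struct nlmsghdr n;
    struct tcmsg    t;
    char            buf[1024];  /* room for attributes; size is arbitrary here */
};

/* Allocate-and-initialize step: zero the request and fill the fixed fields. */
void tc_req_init(struct tc_req *req, int ifindex, unsigned int parent)
{
    memset(req, 0, sizeof(*req));
    req->n.nlmsg_len   = NLMSG_LENGTH(sizeof(struct tcmsg));
    req->n.nlmsg_type  = RTM_NEWQDISC;                  /* assumed command */
    req->n.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
    req->t.tcm_family  = AF_UNSPEC;
    req->t.tcm_ifindex = ifindex;
    req->t.tcm_parent  = parent;
}

/* rtnl_open step: create the rtnetlink socket and bind the local address.
 * Returns the descriptor, or -1 on failure. */
int rtnl_open_sketch(void)
{
    struct sockaddr_nl local;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&local, 0, sizeof(local));
    local.nl_family = AF_NETLINK;
    local.nl_groups = 0;          /* no multicast groups in this sketch */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would fill the request with tc_req_init, open the socket with rtnl_open_sketch, and then hand both to a send routine playing the role of rtnl_talk.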
rtnl_talk
allocate “msghdr msg”
call “sendmsg” (sys_sendmsg)
sys_sendmsg
Copy req and msg from user space to kernel space
• sock_sendmsg
sock_sendmsg
scm_cookie scm
call ‘scm_send’
call socket’s ‘sendmsg’ = netlink_ops
netlink_sendmsg
netlink_sendmsg
skbuff
memcpy_from_iovec : copy msg into skbuff
• netlink_broadcast
• netlink_unicast
dst groups
netlink_unicast
socket’s protocol
find ‘linked list’ in nl_table
pid
add_wait_queue
socket’s receive queue
call ‘data_ready’ = rtnetlink_rcv
skbuff
rtnetlink_rcv
socket’s receive queue, skbuff
invoke ‘rtnetlink_rcv_skb’
rtnetlink_rcv_skb
nlh, skbuff
invoke ‘rtnetlink_rcv_msg’
passing ‘nlh’
rtnetlink_rcv_msg
invoke ‘doit’ in ‘rtnetlink_link’
In this case, doit = tc_modify_qdisc
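The receive path above takes each nlmsghdr out of a buffer and passes it on. A rough user-space analogue of that walk, assuming the standard NLMSG_OK and NLMSG_NEXT macros from linux/netlink.h, might look like the sketch below. The kernel works on an skbuff rather than a flat buffer, and the helper names put_msg and walk_messages are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Append one header-only message of the given type at offset `used`.
 * Returns the new used length. The caller must provide enough room. */
size_t put_msg(char *buf, size_t used, unsigned short type)
{
    struct nlmsghdr *nlh = (struct nlmsghdr *)(buf + used);

    memset(nlh, 0, NLMSG_HDRLEN);
    nlh->nlmsg_len  = NLMSG_LENGTH(0);   /* header only, no payload */
    nlh->nlmsg_type = type;
    return used + NLMSG_ALIGN(nlh->nlmsg_len);
}

/* Visit every complete message in the buffer and record its type,
 * the way a receiver would hand each `nlh` to a per-message handler.
 * Returns the number of messages seen (at most `max`). */
int walk_messages(void *buf, size_t len, unsigned short *types, int max)
{
    struct nlmsghdr *nlh = buf;
    int remaining = (int)len;
    int n = 0;

    while (n < max && NLMSG_OK(nlh, remaining)) {
        types[n++] = nlh->nlmsg_type;
        nlh = NLMSG_NEXT(nlh, remaining);  /* also shrinks `remaining` */
    }
    return n;
}
```

In a fuller version, the loop body would look up nlh->nlmsg_type in a handler table and call the matching ‘doit’ function instead of recording the type.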
Middle summary
User Space
Kernel Space
tc
netlink, rtnetlink
nlmsghdr, tcmsg
rtnetlink_rcv
tc_modify_qdisc, tc_ctl_tfilter
tc_get_qdisc
tc_modify_qdisc
dev_get_by_index, index = tcm->tcm_ifindex
if qdisc parent is set, call ‘qdisc_lookup’ : find parent Q, call ‘qdisc_leaf’
tc_modify_qdisc (cont’d)
if tcm->tcm_handle is not empty, call ‘qdisc_lookup’ for band Q
graft
create_n_graft
fail
tc_modify_qdisc (cont’d)
if tcm->tcm_handle is empty, if q is empty
else create_n_graft
create graft
tc_modify_qdisc (cont’d)
if (tcm->tcm_parent is not specified), if (tcm->tcm_handle is not empty)
then call ‘qdisc_lookup’
call qdisc_change(q,tca)
‘qdisc_change’ calls ‘prio_tune’
create_n_graft
call qdisc_create(dev, tcm->tcm_handle, tca, &err)
qdisc_create
find qdisc’s kind
using kind, get ‘Qdisc_ops’
allocate space for the queueing discipline
call ‘skb_queue_head_init’
set up ‘enqueue’, ‘dequeue’
call ‘ops->init’ = prio_init
insert new Q into qdisc_list
graft
call ‘qdisc_graft’
connect ‘new’ to parent’s class or dev
if parent queueing discipline is empty, call ‘dev_graft_qdisc(dev,new)’
else call ‘get’ from class
call ‘qdisc_notify’
dev_graft_qdisc
dev_deactivate
put old ‘qdisc_sleeping’ into ‘oqdisc’
if new Q is empty, set new Q to noop_qdisc
then, set dev’s qdisc_sleeping to new Q, dev->qdisc to noop_qdisc
Reactivate device
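The dev_graft_qdisc steps can be followed in a small user-space model. Everything here is a simplified stand-in: struct qdisc_model, struct dev_model, noop_qdisc_model and dev_graft_qdisc_model are hypothetical names, device activation is reduced to a flag, and what the kernel does to dev->qdisc when the device is reactivated is outside this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's Qdisc and net device. */
struct qdisc_model {
    const char *id;
};

struct dev_model {
    struct qdisc_model *qdisc;           /* discipline currently in use */
    struct qdisc_model *qdisc_sleeping;  /* discipline configured for the device */
    int active;
};

struct qdisc_model noop_qdisc_model = { "noop" };

/* Follows the slide's steps in order; returns the old sleeping qdisc. */
struct qdisc_model *dev_graft_qdisc_model(struct dev_model *dev,
                                          struct qdisc_model *new_q)
{
    struct qdisc_model *oqdisc;
    int was_active = dev->active;

    dev->active = 0;                    /* deactivate the device */
    oqdisc = dev->qdisc_sleeping;       /* remember the old qdisc */
    if (new_q == NULL)
        new_q = &noop_qdisc_model;      /* empty new Q becomes noop */
    dev->qdisc_sleeping = new_q;
    dev->qdisc = &noop_qdisc_model;
    dev->active = was_active;           /* reactivate (flag only in this model) */
    return oqdisc;
}
```

Returning the old qdisc lets the caller decide what to do with it, for example reporting it in a notification or releasing it.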
prio_get
get minor class ID
prio_graft
use the minor class ID as the index of the band
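As a sketch of how a minor class ID can select a band: a traffic-control handle is a 32-bit value whose lower 16 bits are the minor number. The code below assumes minor numbers are 1-based while band indices are 0-based, and treats an out-of-range minor as "no such class". The names MODEL_TC_H_MIN, prio_get_model and prio_band_index are hypothetical.

```c
#include <assert.h>

/* Lower 16 bits of a handle are the minor number. */
#define MODEL_TC_H_MIN(h) ((h) & 0x0000FFFFu)

/* Returns the 1-based class number for classid, or 0 if it does not
 * name one of the `bands` bands. */
unsigned long prio_get_model(unsigned int classid, unsigned int bands)
{
    unsigned long band = MODEL_TC_H_MIN(classid);

    if (band == 0 || band > bands)
        return 0;
    return band;
}

/* Converts the 1-based class number into a 0-based band array index
 * (assumed convention for this sketch). */
int prio_band_index(unsigned long cl)
{
    return (int)cl - 1;
}
```

With three bands, class 1:2 (handle 0x00010002) maps to band index 1, and class 1:4 is rejected.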
qdisc_change
directly call ‘sch->ops->change’, change = prio_tune
prio_tune
argument opt contains ‘bands’
bands outside the range are set to ‘noop_qdisc’
update child Q by ‘prio2band array’
if Q == noop_qdisc, call qdisc_create_dflt
qdisc_create_dflt
set up child Q
set up operator to ‘pfifo_qdisc_ops’