linux/net/sched/Kconfig
Eric Dumazet 4b549a2ef4 fq_codel: Fair Queue Codel AQM
Fair Queue Codel packet scheduler

Principles:

- Packets are classified into flows (by an internal or external classifier).
- This is a stochastic model (as we use a hash, several flows might
  be hashed to the same slot).
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority on old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- Head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)
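
The two-list round robin described above can be sketched as follows. This is
an illustrative model only, not the kernel code: the class and method names
are hypothetical, zlib.crc32 stands in for the real flow hash, and each flow
uses a plain FIFO where the real qdisc runs CoDel.

```python
import zlib
from collections import deque


class Flow:
    """One hash slot: a FIFO of (key, length) packets plus a byte deficit."""

    def __init__(self):
        self.queue = deque()
        self.deficit = 0
        self.linked = False  # True while the flow sits on new_flows or old_flows


class FqSketch:
    """Hypothetical model of the new/old flow lists (no CoDel, no limits)."""

    def __init__(self, flows=1024, quantum=1514):
        self.flows = [Flow() for _ in range(flows)]
        self.quantum = quantum
        self.new_flows = deque()
        self.old_flows = deque()

    def classify(self, flow_key):
        # Assumption: crc32 replaces the kernel's flow hash. Distinct flows
        # may still land in the same slot, which is the stochastic part.
        return zlib.crc32(flow_key.encode()) % len(self.flows)

    def enqueue(self, flow_key, length):
        flow = self.flows[self.classify(flow_key)]
        flow.queue.append((flow_key, length))
        if not flow.linked:
            # A flow that was idle starts on the new list with a full quantum.
            flow.linked = True
            flow.deficit = self.quantum
            self.new_flows.append(flow)

    def dequeue(self):
        while True:
            if self.new_flows:
                head = self.new_flows
            elif self.old_flows:
                head = self.old_flows
            else:
                return None
            flow = head[0]
            if flow.deficit <= 0:
                # Quantum used up: refill and go to the tail of the old list.
                flow.deficit += self.quantum
                head.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.queue:
                head.popleft()
                if head is self.new_flows and self.old_flows:
                    # An emptied new flow waits one round on the old list.
                    self.old_flows.append(flow)
                else:
                    flow.linked = False
                continue
            pkt = flow.queue.popleft()
            flow.deficit -= pkt[1]
            return pkt
```

With a bulk flow already queued, a packet from a newly active flow is served
as soon as the bulk flow has spent its quantum, and packets within one flow
always leave in arrival order.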

tc qdisc ... fq_codel [ limit PACKETS ] [ flows NUMBER ]
                      [ target TIME ] [ interval TIME ] [ noecn ]
                      [ quantum BYTES ]

defaults : 1024 flows, 10240 packets limit, quantum : device MTU
           target : 5ms (CoDel default)
           interval : 100ms (CoDel default)
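
To show how the interval default shapes drop timing, here is a minimal sketch
of the CoDel control law as published: while a queue stays in the dropping
state, the gap before the next drop is interval / sqrt(count). The function
name is hypothetical, and the floating-point arithmetic is an assumption made
for readability; the kernel is not expected to compute it this way.

```python
import math


def codel_drop_gaps_ms(interval_ms=100.0, drops=5):
    """Return the gap (in ms) before each successive drop, for count = 1..drops."""
    return [interval_ms / math.sqrt(count) for count in range(1, drops + 1)]


for count, gap in enumerate(codel_drop_gaps_ms(), start=1):
    print(f"count={count} next drop after {gap:.1f} ms")
```

The gaps shrink as count grows, so a flow that keeps its queueing delay above
target is dropped (or ECN-marked) more and more often until the delay falls
back below target.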

Impressive results under load:

class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
 Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
 rate 201691Kbit 28595pps backlog 0b 312p requeues 0
 lended: 33063109 borrowed: 0 giants: 0
 tokens: -912 ctokens: -912

class fq_codel 10:1735 parent 10:
 (dropped 1292, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4524 parent 10:
 (dropped 1291, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4e74 parent 10:
 (dropped 1290, overlimits 0 requeues 0)
 backlog 6056b 4p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
class fq_codel 10:628a parent 10:
 (dropped 1289, overlimits 0 requeues 0)
 backlog 7570b 5p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
class fq_codel 10:a4b3 parent 10:
 (dropped 302, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:c3c2 parent 10:
 (dropped 1284, overlimits 0 requeues 0)
 backlog 13626b 9p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:d331 parent 10:
 (dropped 299, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:d526 parent 10:
 (dropped 12160, overlimits 0 requeues 0)
 backlog 35870b 211p requeues 0
  deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
class fq_codel 10:e2c6 parent 10:
 (dropped 1288, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:eab5 parent 10:
 (dropped 1285, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:f220 parent 10:
 (dropped 1289, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms

qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
 rate 201697Kbit 28602pps backlog 0b 260p requeues 71
qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
 Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
 rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
  maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
  new_flows_len 0 old_flows_len 11
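
As a rough reading of the fq_codel counters above, the drop and ECN-mark
rates can be worked out as below. This assumes "Sent" counts only packets
that were actually transmitted, so offered load is sent plus dropped; the
variable names are illustrative.

```python
# Counters copied from the "qdisc fq_codel 10:" statistics above.
sent_pkts = 33092812
dropped_pkts = 949359
ecn_marked_pkts = 125593

# Assumption: dropped packets are not included in the "Sent" counter.
offered_pkts = sent_pkts + dropped_pkts
drop_fraction = dropped_pkts / offered_pkts
mark_fraction = ecn_marked_pkts / sent_pkts

print(f"dropped: {drop_fraction:.2%} of offered packets")
print(f"ECN-marked: {mark_fraction:.2%} of sent packets")
```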

PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms

10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms

Much better than SFQ because of the priority given to new flows, and a
fast path that dirties fewer cache lines.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-12 15:53:42 -04:00


#
# Traffic control configuration.
#
menuconfig NET_SCHED
bool "QoS and/or fair queueing"
select NET_SCH_FIFO
---help---
When the kernel has several packets to send out over a network
device, it has to decide which ones to send first, which ones to
delay, and which ones to drop. This is the job of the queueing
disciplines; several different algorithms for how to do this
"fairly" have been proposed.
If you say N here, you will get the standard packet scheduler, which
is a FIFO (first come, first served). If you say Y here, you will be
able to choose from among several alternative algorithms which can
then be attached to different network devices. This is useful for
example if some of your network devices are real time devices that
need a certain minimum data flow rate, or if you need to limit the
maximum data flow rate for traffic which matches specified criteria.
This code is considered to be experimental.
To administer these schedulers, you'll need the user-level utilities
from the package iproute2+tc at <ftp://ftp.tux.org/pub/net/ip-routing/>.
That package also contains some documentation; for more, check out
<http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2>.
This Quality of Service (QoS) support will enable you to use
Differentiated Services (diffserv) and Resource Reservation Protocol
(RSVP) on your Linux router if you also say Y to the corresponding
classifiers below. Documentation and software are at
<http://diffserv.sourceforge.net/>.
If you say Y here and to "/proc file system" below, you will be able
to read status information about packet schedulers from the file
/proc/net/psched.
The available schedulers are listed in the following questions; you
can say Y to as many as you like. If unsure, say N now.
if NET_SCHED
comment "Queueing/Scheduling"
config NET_SCH_CBQ
tristate "Class Based Queueing (CBQ)"
---help---
Say Y here if you want to use the Class-Based Queueing (CBQ) packet
scheduling algorithm. This algorithm classifies the waiting packets
into a tree-like hierarchy of classes; the leaves of this tree are
in turn scheduled by separate algorithms.
See the top of <file:net/sched/sch_cbq.c> for more details.
CBQ is a commonly used scheduler, so if you're unsure, you should
say Y here. Then say Y to all the queueing algorithms below that you
want to use as leaf disciplines.
To compile this code as a module, choose M here: the
module will be called sch_cbq.
config NET_SCH_HTB
tristate "Hierarchical Token Bucket (HTB)"
---help---
Say Y here if you want to use the Hierarchical Token Buckets (HTB)
packet scheduling algorithm. See
<http://luxik.cdi.cz/~devik/qos/htb/> for complete manual and
in-depth articles.
HTB is very similar to CBQ regarding its goals; however, it has
different properties and a different algorithm.
To compile this code as a module, choose M here: the
module will be called sch_htb.
config NET_SCH_HFSC
tristate "Hierarchical Fair Service Curve (HFSC)"
---help---
Say Y here if you want to use the Hierarchical Fair Service Curve
(HFSC) packet scheduling algorithm.
To compile this code as a module, choose M here: the
module will be called sch_hfsc.
config NET_SCH_ATM
tristate "ATM Virtual Circuits (ATM)"
depends on ATM
---help---
Say Y here if you want to use the ATM pseudo-scheduler. This
provides a framework for invoking classifiers, which in turn
select classes of this queuing discipline. Each class maps
the flow(s) it is handling to a given virtual circuit.
See the top of <file:net/sched/sch_atm.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_atm.
config NET_SCH_PRIO
tristate "Multi Band Priority Queueing (PRIO)"
---help---
Say Y here if you want to use an n-band priority queue packet
scheduler.
To compile this code as a module, choose M here: the
module will be called sch_prio.
config NET_SCH_MULTIQ
tristate "Hardware Multiqueue-aware Multi Band Queuing (MULTIQ)"
---help---
Say Y here if you want to use an n-band queue packet scheduler
to support devices that have multiple hardware transmit queues.
To compile this code as a module, choose M here: the
module will be called sch_multiq.
config NET_SCH_RED
tristate "Random Early Detection (RED)"
---help---
Say Y here if you want to use the Random Early Detection (RED)
packet scheduling algorithm.
See the top of <file:net/sched/sch_red.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_red.
config NET_SCH_SFB
tristate "Stochastic Fair Blue (SFB)"
---help---
Say Y here if you want to use the Stochastic Fair Blue (SFB)
packet scheduling algorithm.
See the top of <file:net/sched/sch_sfb.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_sfb.
config NET_SCH_SFQ
tristate "Stochastic Fairness Queueing (SFQ)"
---help---
Say Y here if you want to use the Stochastic Fairness Queueing (SFQ)
packet scheduling algorithm.
See the top of <file:net/sched/sch_sfq.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_sfq.
config NET_SCH_TEQL
tristate "True Link Equalizer (TEQL)"
---help---
Say Y here if you want to use the True Link Equalizer (TEQL) packet
scheduling algorithm. This queueing discipline allows the combination
of several physical devices into one virtual device.
See the top of <file:net/sched/sch_teql.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_teql.
config NET_SCH_TBF
tristate "Token Bucket Filter (TBF)"
---help---
Say Y here if you want to use the Token Bucket Filter (TBF) packet
scheduling algorithm.
See the top of <file:net/sched/sch_tbf.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_tbf.
config NET_SCH_GRED
tristate "Generic Random Early Detection (GRED)"
---help---
Say Y here if you want to use the Generic Random Early Detection
(GRED) packet scheduling algorithm for some of your network devices
(see the top of <file:net/sched/sch_red.c> for details and
references about the algorithm).
To compile this code as a module, choose M here: the
module will be called sch_gred.
config NET_SCH_DSMARK
tristate "Differentiated Services marker (DSMARK)"
---help---
Say Y if you want to schedule packets according to the
Differentiated Services architecture proposed in RFC 2475.
Technical information on this method, with pointers to associated
RFCs, is available at <http://www.gta.ufrj.br/diffserv/>.
To compile this code as a module, choose M here: the
module will be called sch_dsmark.
config NET_SCH_NETEM
tristate "Network emulator (NETEM)"
---help---
Say Y if you want to emulate network delay, loss, and packet
re-ordering. This is often useful to simulate networks when
testing applications or protocols.
To compile this driver as a module, choose M here: the module
will be called sch_netem.
If unsure, say N.
config NET_SCH_DRR
tristate "Deficit Round Robin scheduler (DRR)"
help
Say Y here if you want to use the Deficit Round Robin (DRR) packet
scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_drr.
If unsure, say N.
config NET_SCH_MQPRIO
tristate "Multi-queue priority scheduler (MQPRIO)"
help
Say Y here if you want to use the Multi-queue Priority scheduler.
This scheduler allows QOS to be offloaded on NICs that have support
for offloading QOS schedulers.
To compile this driver as a module, choose M here: the module will
be called sch_mqprio.
If unsure, say N.
config NET_SCH_CHOKE
tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
help
Say Y here if you want to use the CHOKe packet scheduler (CHOose
and Keep for responsive flows, CHOose and Kill for unresponsive
flows). This is a variation of RED which tries to penalize flows
that monopolize the queue.
To compile this code as a module, choose M here: the
module will be called sch_choke.
config NET_SCH_QFQ
tristate "Quick Fair Queueing scheduler (QFQ)"
help
Say Y here if you want to use the Quick Fair Queueing Scheduler (QFQ)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_qfq.
If unsure, say N.
config NET_SCH_CODEL
tristate "Controlled Delay AQM (CODEL)"
help
Say Y here if you want to use the Controlled Delay (CODEL)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_codel.
If unsure, say N.
config NET_SCH_FQ_CODEL
tristate "Fair Queue Controlled Delay AQM (FQ_CODEL)"
help
Say Y here if you want to use the FQ Controlled Delay (FQ_CODEL)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_fq_codel.
If unsure, say N.
config NET_SCH_INGRESS
tristate "Ingress Qdisc"
depends on NET_CLS_ACT
---help---
Say Y here if you want to use classifiers for incoming packets.
If unsure, say Y.
To compile this code as a module, choose M here: the
module will be called sch_ingress.
config NET_SCH_PLUG
tristate "Plug network traffic until release (PLUG)"
---help---
This queuing discipline allows userspace to plug/unplug a network
output queue, using the netlink interface. When it receives an
enqueue command it inserts a plug into the outbound queue that
causes following packets to enqueue until a dequeue command arrives
over netlink, causing the plug to be removed and resuming the normal
packet flow.
This module also provides a generic "network output buffering"
functionality (aka output commit), wherein upon arrival of a dequeue
command, only packets up to the first plug are released for delivery.
The Remus HA project uses this module to enable speculative execution
of virtual machines by allowing the generated network output to be rolled
back if needed.
For more information, please refer to <http://wiki.xensource.com/xenwiki/Remus>.
Say Y here if you are using this kernel for Xen dom0 and
want to protect Xen guests with Remus.
To compile this code as a module, choose M here: the
module will be called sch_plug.
comment "Classification"
config NET_CLS
boolean
config NET_CLS_BASIC
tristate "Elementary classification (BASIC)"
select NET_CLS
---help---
Say Y here if you want to be able to classify packets using
only extended matches and actions.
To compile this code as a module, choose M here: the
module will be called cls_basic.
config NET_CLS_TCINDEX
tristate "Traffic-Control Index (TCINDEX)"
select NET_CLS
---help---
Say Y here if you want to be able to classify packets based on
traffic control indices. You will want this feature if you want
to implement Differentiated Services together with DSMARK.
To compile this code as a module, choose M here: the
module will be called cls_tcindex.
config NET_CLS_ROUTE4
tristate "Routing decision (ROUTE)"
depends on INET
select IP_ROUTE_CLASSID
select NET_CLS
---help---
If you say Y here, you will be able to classify packets
according to the route table entry they matched.
To compile this code as a module, choose M here: the
module will be called cls_route.
config NET_CLS_FW
tristate "Netfilter mark (FW)"
select NET_CLS
---help---
If you say Y here, you will be able to classify packets
according to netfilter/firewall marks.
To compile this code as a module, choose M here: the
module will be called cls_fw.
config NET_CLS_U32
tristate "Universal 32bit comparisons w/ hashing (U32)"
select NET_CLS
---help---
Say Y here to be able to classify packets using a universal
32bit pieces based comparison scheme.
To compile this code as a module, choose M here: the
module will be called cls_u32.
config CLS_U32_PERF
bool "Performance counters support"
depends on NET_CLS_U32
---help---
Say Y here to make u32 gather additional statistics useful for
fine tuning u32 classifiers.
config CLS_U32_MARK
bool "Netfilter marks support"
depends on NET_CLS_U32
---help---
Say Y here to be able to use netfilter marks as u32 key.
config NET_CLS_RSVP
tristate "IPv4 Resource Reservation Protocol (RSVP)"
select NET_CLS
---help---
The Resource Reservation Protocol (RSVP) permits end systems to
request a minimum and maximum data flow rate for a connection; this
is important for real time data such as streaming sound or video.
Say Y here if you want to be able to classify outgoing packets based
on their RSVP requests.
To compile this code as a module, choose M here: the
module will be called cls_rsvp.
config NET_CLS_RSVP6
tristate "IPv6 Resource Reservation Protocol (RSVP6)"
select NET_CLS
---help---
The Resource Reservation Protocol (RSVP) permits end systems to
request a minimum and maximum data flow rate for a connection; this
is important for real time data such as streaming sound or video.
Say Y here if you want to be able to classify outgoing packets based
on their RSVP requests and you are using the IPv6 protocol.
To compile this code as a module, choose M here: the
module will be called cls_rsvp6.
config NET_CLS_FLOW
tristate "Flow classifier"
select NET_CLS
---help---
If you say Y here, you will be able to classify packets based on
a configurable combination of packet keys. This is mostly useful
in combination with SFQ.
To compile this code as a module, choose M here: the
module will be called cls_flow.
config NET_CLS_CGROUP
tristate "Control Group Classifier"
select NET_CLS
depends on CGROUPS
---help---
Say Y here if you want to classify packets based on the control
cgroup of their process.
To compile this code as a module, choose M here: the
module will be called cls_cgroup.
config NET_EMATCH
bool "Extended Matches"
select NET_CLS
---help---
Say Y here if you want to use extended matches on top of classifiers
and select the extended matches below.
Extended matches are small classification helpers not worth writing
a separate classifier for.
A recent version of the iproute2 package is required to use
extended matches.
config NET_EMATCH_STACK
int "Stack size"
depends on NET_EMATCH
default "32"
---help---
Size of the local stack variable used while evaluating the tree of
ematches. Limits the depth of the tree, i.e. the number of
encapsulated precedences. Every level requires 4 bytes of additional
stack space.
config NET_EMATCH_CMP
tristate "Simple packet data comparison"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets based on
simple packet data comparisons for 8, 16, and 32bit values.
To compile this code as a module, choose M here: the
module will be called em_cmp.
config NET_EMATCH_NBYTE
tristate "Multi byte comparison"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets based on
multiple byte comparisons mainly useful for IPv6 address comparisons.
To compile this code as a module, choose M here: the
module will be called em_nbyte.
config NET_EMATCH_U32
tristate "U32 key"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets using
the famous u32 key in combination with logic relations.
To compile this code as a module, choose M here: the
module will be called em_u32.
config NET_EMATCH_META
tristate "Metadata"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets based on
metadata such as load average, netfilter attributes, socket
attributes and routing decisions.
To compile this code as a module, choose M here: the
module will be called em_meta.
config NET_EMATCH_TEXT
tristate "Textsearch"
depends on NET_EMATCH
select TEXTSEARCH
select TEXTSEARCH_KMP
select TEXTSEARCH_BM
select TEXTSEARCH_FSM
---help---
Say Y here if you want to be able to classify packets based on
textsearch comparisons.
To compile this code as a module, choose M here: the
module will be called em_text.
config NET_CLS_ACT
bool "Actions"
---help---
Say Y here if you want to use traffic control actions. Actions
get attached to classifiers and are invoked after a successful
classification. They are used to overwrite the classification
result, instantly drop or redirect packets, etc.
A recent version of the iproute2 package is required to use
actions.
config NET_ACT_POLICE
tristate "Traffic Policing"
depends on NET_CLS_ACT
---help---
Say Y here if you want to do traffic policing, i.e. strict
bandwidth limiting. This action replaces the existing policing
module.
To compile this code as a module, choose M here: the
module will be called act_police.
config NET_ACT_GACT
tristate "Generic actions"
depends on NET_CLS_ACT
---help---
Say Y here to take generic actions such as dropping and
accepting packets.
To compile this code as a module, choose M here: the
module will be called act_gact.
config GACT_PROB
bool "Probability support"
depends on NET_ACT_GACT
---help---
Say Y here to use the generic action randomly or deterministically.
config NET_ACT_MIRRED
tristate "Redirecting and Mirroring"
depends on NET_CLS_ACT
---help---
Say Y here to allow packets to be mirrored or redirected to
other devices.
To compile this code as a module, choose M here: the
module will be called act_mirred.
config NET_ACT_IPT
tristate "IPtables targets"
depends on NET_CLS_ACT && NETFILTER && IP_NF_IPTABLES
---help---
Say Y here to be able to invoke iptables targets after successful
classification.
To compile this code as a module, choose M here: the
module will be called act_ipt.
config NET_ACT_NAT
tristate "Stateless NAT"
depends on NET_CLS_ACT
---help---
Say Y here to do stateless NAT on IPv4 packets. You should use
netfilter for NAT unless you know what you are doing.
To compile this code as a module, choose M here: the
module will be called act_nat.
config NET_ACT_PEDIT
tristate "Packet Editing"
depends on NET_CLS_ACT
---help---
Say Y here if you want to mangle the content of packets.
To compile this code as a module, choose M here: the
module will be called act_pedit.
config NET_ACT_SIMP
tristate "Simple Example (Debug)"
depends on NET_CLS_ACT
---help---
Say Y here to add a simple action for demonstration purposes.
It is meant as an example and for debugging purposes. It will
print a configured policy string followed by the packet count
to the console for every packet that passes by.
If unsure, say N.
To compile this code as a module, choose M here: the
module will be called act_simple.
config NET_ACT_SKBEDIT
tristate "SKB Editing"
depends on NET_CLS_ACT
---help---
Say Y here to change skb priority or queue_mapping settings.
If unsure, say N.
To compile this code as a module, choose M here: the
module will be called act_skbedit.
config NET_ACT_CSUM
tristate "Checksum Updating"
depends on NET_CLS_ACT && INET
---help---
Say Y here to update some common checksums after some direct
packet alterations.
To compile this code as a module, choose M here: the
module will be called act_csum.
config NET_CLS_IND
bool "Incoming device classification"
depends on NET_CLS_U32 || NET_CLS_FW
---help---
Say Y here to extend the u32 and fw classifiers to support
classification based on the incoming device. This option is
likely to disappear in favour of the metadata ematch.
endif # NET_SCHED
config NET_SCH_FIFO
bool