Changeset 11159

Show
Ignore:
Timestamp:
06/23/09 13:45:47 (9 months ago)
Author:
arvid
Message:

updates to bep 29

Files:
1 modified

Legend:

Unmodified
Added
Removed
  • dotorg/trunk/html/beps/bep_0029.rst

    r11158 r11159  
    1616------- 
    1717 
     18The uTorrent transport protocol was designed by Ludvig Strigeus, 
     19Greg Hazel, Stanislav Shalunov and Arvid Norberg 
     20 
    1821rationale 
    1922--------- 
    2023 
     24The motivation for uTP is for BitTorrent clients to not disrupt 
     25internet connections, while still utilizing the unused bandwidth 
     26fully. 
     27 
     28The problem is that DSL and cable modems typically have a send 
     29buffer disproportional to their max send rate, which can hold several 
     30seconds worth of packets. BitTorrent traffic is typically background 
     31transfers, and should have lower priority than checking email, 
     32phone calls and browsing the web, but when using regular TCP 
     33connections BitTorrent quickly fills up the send buffer, adding 
     34multiple seconds delay to all interactive traffic. 
     35 
     36The fact that BitTorrent uses multiple TCP connections gives it an 
     37unfair advantage when competing with other services for bandwidth, 
     38which exaggerates the effect of BitTorrent filling the upload pipe. 
     39 
     40The traditional solution to this problem is to cap the upload rate 
     41of the bittorrent client to 80% of the uplink capacity. 80% leaves 
     42some head room for interactive traffic. 
     43 
     44The main drawbacks with this solution are: 
     45 
     461. The user needs to configure his/her bittorrent client, it won't 
     47   work out-of-the-box. 
     482. The user needs to know his/her internet connection's upload 
     49   capacity. This capacity may change, especially on laptops that 
     50   may connect to a large number of different networks. 
     513. The headroom of 20% is arbitrary and wastes bandwidth. Whenever 
     52   there is no interactive traffic competing with bittorrent, the 
     53   extra 20% are wasted. Whenever there is competing interactive 
     54   traffic, it cannot use more than 20% of the capacity. 
     55 
     56uTP solves this problem by using the modem queue size as a controller 
     57for its send rate. When the queue grows too large, it throttles back. 
     58 
     59This lets it utilize the full upload capacity when there is no 
     60competition for it, and it lets it throttle back to virtually nothing 
     61when there is a lot of interactive traffic. 
     62 
    2163overview 
    2264-------- 
    2365 
    24 connection setup 
    25 ---------------- 
     66This document assumes some knowledge of how TCP and window based 
     67congestion control works. 
     68 
     69The main addition compared to TCP is the delay based congestion 
     70control. See the `congestion control`_ section. 
     71 
     72Like TCP, uTP uses window based congestion control. Each socket 
     73has a ``max_window`` which determines the maximum number of bytes 
     74the socket may have *in-flight* at any given time. Any packet that 
     75has been sent, but not yet acked, is considered to be in-flight. 
     76 
     77The number of bytes in-flight is ``cur_window``. 
     78 
     79A socket may only send a packet if ``cur_window`` + ``packet_size`` 
     80is less than or equal to min(``max_window``, ``wnd_size``). 
     81The packet size may vary, see the `packet sizes`_ section. 
     82 
     83``wnd_size`` is the advertised window from the other end. It sets 
     84an upper limit on the number of packets in-flight. 
     85 
     86An implementation MAY violate the above rule if the ``max_window`` 
     87is smaller than the packet size, and it paces the packets so that 
     88the average ``cur_window`` is less than or equal to ``max_window``. 
     89 
     90Each socket keeps a state for the last delay measurement from the 
     91other endpoint (``reply_micro``). Whenever a packet is received, 
     92this state is updated by subtracting ``timestamp_microseconds`` 
     93from the hosts current time, in microseconds (see `header format`_). 
     94 
     95Every time a packet is sent, the sockets ``reply_micro`` value is 
     96put in the ``timestamp_difference_microseconds`` field of the packet 
     97header. 
     98 
     99Unlike TCP, sequence numbers and ACKs in uTP refers to packets, not 
     100bytes. This means uTP cannot *repackage* data when resending it. 
     101 
     102Each socket keeps a state of the next sequence number to use when 
     103sending a packet, ``seq_nr``. It also keeps a state of the sequence 
     104number that was last received, ``ack_nr``. The oldest unacked packet 
     105is ``seq_nr`` - ``cur_window``. 
    26106 
    27107header format 
     
    50130....... 
    51131 
    52 This is the protcol version. The protocol is detected by inspecting the first 
    53 two bytes. If the first byte is 1 and the second is 0, the packet is version 
    54 1. Otherwise it's version 0. 
     132This is the protcol version. The current version is 1. 
    55133 
    56134connection_id 
     
    79157........ 
    80158 
    81 Advertised receive window. In v1, this is specified in the number of segments 
    82 (packets) of 300 bytes. In v2 it is 16 bits wide and specified in bytes. 
     159Advertised receive window. This is 32 bits wide and specified in bytes. 
    83160 
    84161The window size is the number of bytes currently in-flight, i.e. sent but not 
    85 acked. This is not the same as the next sequence number minus the last acked 
    86 sequence number. Packets can be acked in the middle of the send window using 
    87 the Selective ACK extension. Any acked packet is deducted from the current 
    88 window size, and allows the sender to send more packets to fill up the window. 
     162acked. The advertised receive window lets the other end cap the window size 
     163if it cannot receive any faster, if its receive buffer is filling up. 
     164 
     165When sending packets, this should be set to the number of bytes left in the 
     166socket's receive buffer. 
    89167 
    90168extension 
     
    209287direction. 
    210288 
     289connection setup 
     290---------------- 
     291 
     292*describe the initial connection setup sequence* 
     293 
     294packet loss 
     295----------- 
     296 
     297If the packet with sequence number (``seq_nr`` - ``cur_window``) has not been acked 
     298(this is the oldest packet in the send buffer, and the next one expected to be acked) 
     299has not been acked, but 3 or more packets have been acked passed it (through Selective 
     300ACK), the packet is assumed to have been lost. Similarly, when receiving 3 duplicate 
     301acks, ``ack_nr`` + 1 is assumed to have been lost (if a packet with that sequence number 
     302has been sent). 
     303 
     304When a packet is lost, the ``max_window`` is multiplied by 0.78. 
     305 
    211306timeouts 
    212307-------- 
    213308 
    214309Every packet that is ACKed, either by falling in the range (last_ack_nr, ack_nr] 
    215 or by explicitly being acked by an Selective ACK message, should be used to update 
     310or by explicitly being acked by a Selective ACK message, should be used to update 
    216311an ``rtt`` (round trip time) and ``rtt_var`` (rtt variance) measurement. 
    217312last_ack_nr here is the last ack_nr received on the socket before the current packet, 
     
    236331Where timeout is specified in milliseconds. i.e. the minimum timeout for a packet is 
    2373321/2 second. 
     333 
     334Every time a socket sends or receives a packet, it updates its timeout counter. If 
     335no packet has arrived within ``timeout`` number of milliseconds from the last 
     336timeout counter reset, the socket triggers a timeout. It will set its ``packet_size`` 
     337and ``max_window`` to the smallest packet size (150 bytes). This allows it to send 
     338one more packet, and this is how the socket gets started again if the window size 
     339goes down to zero. 
     340 
     341The initial timeout is set to 1000 milliseconds, and later updated according to 
     342the formula above. For each packet that times out in a row, the timeout is 
     343doubled. 
     344 
     345packet sizes 
     346------------ 
     347 
     348In order to have as little impact as possible on slow congested links, uTP adjusts 
     349its packet size down to as small as 150 bytes per packet. Using packets that small 
     350has the benefit of not clogging up a slow uplink, with long serialization delay. 
     351The cost of using packets that small is that the overhead from the packet headers 
     352become significant. At high rates, large packet sizes are used, at slow rates, 
     353small packet sizes are used. 
    238354 
    239355congestion control 
     
    255371over uTP, and the receiving end calculates the difference between its own high resolution 
    256372timer and the timestamp in the packet it received. This difference is then fed back to the 
    257 original sender of the packet (timestamp_difference_microseconds). This value is meaningful 
     373original sender of the packet (timestamp_difference_microseconds). This value is not meaningful 
    258374as an absolute value. The clocks in the machines are most likely not synchronized, 
    259375especially not down to micorsecond resolution, and the time the packet is in transit is 
     
    289405off target is less than 0. 
    290406 
    291 The uTP congestion control is implemented in the ``UTP_ApplyPlictoCControl`` function. 
    292 This function is called for a socket every time a packet is received on the socket. 
     407If max_window becomes less than 0, it is set to 0. A window size of zero means that the 
     408socket may not send any packets. In this state, the socket will trigger a timeout and 
     409force the window size to one packet size, and send one packet. See the section on timeouts 
     410for more information. 
    293411 
    294412