Changeset 11164

Show
Ignore:
Timestamp:
06/25/09 11:53:53 (9 months ago)
Author:
arvid
Message:

clarifications and minor fixes

Files:
1 modified

Legend:

Unmodified
Added
Removed
  • dotorg/trunk/html/beps/bep_0029.rst

    r11163 r11164  
    3737unfair advantage when competing with other services for bandwidth, 
    3838which exaggerates the effect of BitTorrent filling the upload pipe. 
     39The reason for this is because TCP distributes the available bandwidth 
     40evenly across connections, and the more connections one application 
     41uses, the larger share of the bandwidth it gets. 
    3942 
    4043The traditional solution to this problem is to cap the upload rate 
     
    6770congestion control works. 
    6871 
    69 The main addition compared to TCP is the delay based congestion 
     72uTP is a transport protocol layered on top of UDP. As such, it must 
     73(and has the ability to) implement its own congestion control. 
     74 
     75The main difference compared to TCP is the delay based congestion 
    7076control. See the `congestion control`_ section. 
    7177 
     
    140146connection decides which ID to use, and the return path has the same ID + 1. 
    141147 
    142 timestamp_seconds, timestamp_microseconds 
    143 ......................................... 
    144  
    145 This is the 'seconds' and 'microseconds' parts of the timestamp of when this 
    146 packet was sent. This is set using gettimeofday() on posix and QueryPerformanceTimer() 
    147 on windows. The higher resolution this timestamp has, the better. 
     148timestamp_microseconds 
     149...................... 
     150 
     151This is the 'microseconds' parts of the timestamp of when this packet was sent. 
     152This is set using gettimeofday() on posix and QueryPerformanceTimer() 
     153on windows. The higher resolution this timestamp has, the better. The closer 
     154to the actual transmit time it is set, the better. 
    148155 
    149156timestamp_difference_microseconds 
     
    192199,,,,,,,,,,,,, 
    193200 
    194 Selective ack is an extension that can selectively ACK packets non-sequentially. 
    195 Its payload is a bitmask of 32 bits, representing the first 32 packets in the 
    196 send window. A set bit specifies that packet has been received, a cleared bit 
     201Selective ACK is an extension that can selectively ACK packets non-sequentially. 
     202Its payload is a bitmask of at least 32 bits, in multiples of 32 bits. Each bit 
     203represents one packet in the send window. Bits that are outside of the send window 
     204are ignored. A set bit specifies that packet has been received, a cleared bit 
    197205specifies that the packet has not been received. The header looks like this:: 
    198206 
     
    204212        +---------------+---------------+ 
    205213 
    206 The selective ack is only sent when at least one sequence number was skipped in 
     214Note that the len field of extensions refer to bytes, which in this extension 
     215must be at least 4, and in multiples of 4. 
     216 
     217The selective ACK is only sent when at least one sequence number was skipped in 
    207218the received stream. The first bit in the mask therefore represents ack_nr + 2. 
    208 ack_nr +1 is assumed to have been dropped or be missing when this packet was sent. 
     219ack_nr + 1 is assumed to have been dropped or be missing when this packet was sent. 
    209220A set bit represents a packet that has been received, a cleared bit represents 
    210221a packet that has not yet been received. 
     
    215226next byte in the mask represents [ack_nr + 2 + 8, ack_nr + 2 + 15] in reverse order, 
    216227and so on. The bitmask is not limited to 32 bits but can be of any size. 
     228 
     229Here is the layout of a bitmask representing the first 32 packet acks 
     230represented in a selective ACK bitfield:: 
     231 
     232        0               8               16 
     233        +---------------+---------------+---------------+---------------+ 
     234        | 9 8 ...   3 2 | 17   ...   10 | 25   ...   18 | 33   ...   26 | 
     235        +---------------+---------------+---------------+---------------+ 
     236 
     237The number in the diagram maps the bit in the bitmask to the offset to add to 
     238``ack_nr`` in order to calculate the sequence number that the bit is ACKing. 
    217239 
    218240Extension bits 
     
    272294        ID in the packet header. The send ID for the socket should be initialized 
    273295        to the ID + 1. The sequence number for the return channel is initialized 
    274         to a random number. The other end expects an ST_STATE packet (only an ack) 
     296        to a random number. The other end expects an ST_STATE packet (only an ACK) 
    275297        in response. 
    276298 
     
    350372             V                                               V 
    351373 
     374Connections are identified by their ``conn_id`` header. If the connection ID of a new 
     375connection collides with an existing connection, the connection attempt will fails, since 
     376the ST_SYN packet will be unexpected in the existing stream, and ignored. 
    352377 
    353378packet loss 
     
    356381If the packet with sequence number (``seq_nr`` - ``cur_window``) has not been acked 
    357382(this is the oldest packet in the send buffer, and the next one expected to be acked) 
    358 has not been acked, but 3 or more packets have been acked passed it (through Selective 
     383has not been acked, but 3 or more packets have been acked past it (through Selective 
    359384ACK), the packet is assumed to have been lost. Similarly, when receiving 3 duplicate 
    360385acks, ``ack_nr`` + 1 is assumed to have been lost (if a packet with that sequence number 
    361386has been sent). 
    362387 
    363 When a packet is lost, the ``max_window`` is multiplied by 0.78. 
     388When a packet is lost, the ``max_window`` is multiplied by 0.78. TCP multiplies by 
     3890.5, but since this is a much less likely event in uTP, and since the uTP ramp-up 
     390is slower than TCP, this is a reasonable optimization. 
    364391 
    365392timeouts 
     
    372399and ack_nr is the field in the currently received packet. 
    373400 
    374 The ``rtt`` and ``rtt_var`` is only updated for packets that where sent only once. 
     401The ``rtt`` and ``rtt_var`` is only updated for packets that were sent only once. 
    375402This avoids problems with figuring out which packet was acked, the first or the 
    376403second one. 
     
    399426 
    400427The initial timeout is set to 1000 milliseconds, and later updated according to 
    401 the formula above. For each packet that times out in a row, the timeout is 
    402 doubled. 
     428the formula above. For every packet consecutive subsequent packet that times out, 
     429the timeout is doubled. 
    403430 
    404431packet sizes 
     
    407434In order to have as little impact as possible on slow congested links, uTP adjusts 
    408435its packet size down to as small as 150 bytes per packet. Using packets that small 
    409 has the benefit of not clogging up a slow up-link, with long serialization delay. 
     436has the benefit of not clogging a slow up-link, with long serialization delay. 
    410437The cost of using packets that small is that the overhead from the packet headers 
    411438become significant. At high rates, large packet sizes are used, at slow rates, 
     
    417444The overall goal of the uTP congestion control is to use one way buffer delay as the 
    418445main congestion measurement, as well as packet loss, like TCP. The point is to avoid 
    419 running at full send buffers whenever data is being sent. This is specifically a 
     446running with full send buffers whenever data is being sent. This is specifically a 
    420447problem for DSL/Cable modems, where the send buffer in the modem often has room for 
    421448multiple seconds worth of data. The ideal buffer utilization for uTP (or any background 
     
    452479``max_window``. Its size is controlled, roughly, by the following expression:: 
    453480 
    454         scaled_gain = (off_target / CCONTROL_TARGET) 
    455                 * (outstanding_packet * MAX_CWND_INCREASE_PACKETS_PER_RTT / max_window); 
     481        delay_factor = off_target / CCONTROL_TARGET; 
     482        window_factor = outstanding_packet / max_window; 
     483        scaled_gain = MAX_CWND_INCREASE_PACKETS_PER_RTT * delay_factor * window_factor; 
    456484 
    457485Where the first factor scales the *off_target* to units of target delays.