Changeset 11164
- Timestamp: 06/25/09 11:53:53 (9 months ago)
- Files: 1 modified
dotorg/trunk/html/beps/bep_0029.rst (modified) (14 diffs)
dotorg/trunk/html/beps/bep_0029.rst
r11163 r11164
   37  37  unfair advantage when competing with other services for bandwidth,
   38  38  which exaggerates the effect of BitTorrent filling the upload pipe.
+      39  The reason for this is because TCP distributes the available bandwidth
+      40  evenly across connections, and the more connections one application
+      41  uses, the larger share of the bandwidth it gets.
   39  42
   40  43  The traditional solution to this problem is to cap the upload rate
… …
   67  70  congestion control works.
   68  71
-  69      The main addition compared to TCP is the delay based congestion
+      72  uTP is a transport protocol layered on top of UDP. As such, it must
+      73  (and has the ability to) implement its own congestion control.
+      74
+      75  The main difference compared to TCP is the delay based congestion
   70  76  control. See the `congestion control`_ section.
   71  77
… …
  140 146  connection decides which ID to use, and the return path has the same ID + 1.
  141 147
- 142      timestamp_seconds, timestamp_microseconds
- 143      .........................................
- 144
- 145      This is the 'seconds' and 'microseconds' parts of the timestamp of when this
- 146      packet was sent. This is set using gettimeofday() on posix and QueryPerformanceTimer()
- 147      on windows. The higher resolution this timestamp has, the better.
+     148  timestamp_microseconds
+     149  ......................
+     150
+     151  This is the 'microseconds' parts of the timestamp of when this packet was sent.
+     152  This is set using gettimeofday() on posix and QueryPerformanceTimer()
+     153  on windows. The higher resolution this timestamp has, the better. The closer
+     154  to the actual transmit time it is set, the better.
  148 155
  149 156  timestamp_difference_microseconds
… …
  192 199  ,,,,,,,,,,,,,
  193 200
- 194      Selective ack is an extension that can selectively ACK packets non-sequentially.
- 195      Its payload is a bitmask of 32 bits, representing the first 32 packets in the
- 196      send window. A set bit specifies that packet has been received, a cleared bit
+     201  Selective ACK is an extension that can selectively ACK packets non-sequentially.
+     202  Its payload is a bitmask of at least 32 bits, in multiples of 32 bits. Each bit
+     203  represents one packet in the send window. Bits that are outside of the send window
+     204  are ignored. A set bit specifies that packet has been received, a cleared bit
  197 205  specifies that the packet has not been received. The header looks like this::
  198 206
… …
  204 212  +---------------+---------------+
  205 213
- 206      The selective ack is only sent when at least one sequence number was skipped in
+     214  Note that the len field of extensions refer to bytes, which in this extension
+     215  must be at least 4, and in multiples of 4.
+     216
+     217  The selective ACK is only sent when at least one sequence number was skipped in
  207 218  the received stream. The first bit in the mask therefore represents ack_nr + 2.
- 208      ack_nr + 1 is assumed to have been dropped or be missing when this packet was sent.
+     219  ack_nr + 1 is assumed to have been dropped or be missing when this packet was sent.
  209 220  A set bit represents a packet that has been received, a cleared bit represents
  210 221  a packet that has not yet been received.
… …
  215 226  next byte in the mask represents [ack_nr + 2 + 8, ack_nr + 2 + 15] in reverse order,
  216 227  and so on. The bitmask is not limited to 32 bits but can be of any size.
+     228
+     229  Here is the layout of a bitmask representing the first 32 packet acks
+     230  represented in a selective ACK bitfield::
+     231
+     232  0               8               16
+     233  +---------------+---------------+---------------+---------------+
+     234  | 9 8 ...   3 2 | 17   ...   10 | 25   ...   18 | 33   ...   26 |
+     235  +---------------+---------------+---------------+---------------+
+     236
+     237  The number in the diagram maps the bit in the bitmask to the offset to add to
+     238  ``ack_nr`` in order to calculate the sequence number that the bit is ACKing.
  217 239
  218 240  Extension bits
… …
  272 294  ID in the packet header. The send ID for the socket should be initialized
  273 295  to the ID + 1. The sequence number for the return channel is initialized
- 274      to a random number. The other end expects an ST_STATE packet (only an ack)
+     296  to a random number. The other end expects an ST_STATE packet (only an ACK)
  275 297  in response.
  276 298
… …
  350 372  V V
  351 373
+     374  Connections are identified by their ``conn_id`` header. If the connection ID of a new
+     375  connection collides with an existing connection, the connection attempt will fails, since
+     376  the ST_SYN packet will be unexpected in the existing stream, and ignored.
  352 377
  353 378  packet loss
… …
  356 381  If the packet with sequence number (``seq_nr`` - ``cur_window``) has not been acked
  357 382  (this is the oldest packet in the send buffer, and the next one expected to be acked)
- 358      has not been acked, but 3 or more packets have been acked passed it (through Selective
+     383  has not been acked, but 3 or more packets have been acked past it (through Selective
  359 384  ACK), the packet is assumed to have been lost. Similarly, when receiving 3 duplicate
  360 385  acks, ``ack_nr`` + 1 is assumed to have been lost (if a packet with that sequence number
  361 386  has been sent).
  362 387
- 363      When a packet is lost, the ``max_window`` is multiplied by 0.78.
+     388  When a packet is lost, the ``max_window`` is multiplied by 0.78. TCP multiplies by
+     389  0.5, but since this is a much less likely event in uTP, and since the uTP ramp-up
+     390  is slower than TCP, this is a reasonable optimization.
  364 391
  365 392  timeouts
… …
  372 399  and ack_nr is the field in the currently received packet.
  373 400
- 374      The ``rtt`` and ``rtt_var`` is only updated for packets that where sent only once.
+     401  The ``rtt`` and ``rtt_var`` is only updated for packets that were sent only once.
  375 402  This avoids problems with figuring out which packet was acked, the first or the
  376 403  second one.
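The selective ACK bit layout described in the hunks above (first mask bit is ``ack_nr + 2``, each byte read in reverse order, length a multiple of 4 bytes) can be sketched roughly as follows. This is only an illustration, not code from the changeset: the function name ``sack_acked_seq_nrs`` is hypothetical, the reading of "reverse order" as "least significant bit first" is taken from the diagram added in this revision, and sequence number wrap-around is ignored for brevity.

```python
from typing import List


def sack_acked_seq_nrs(ack_nr: int, bitmask: bytes) -> List[int]:
    """Return the sequence numbers marked as received in a selective ACK mask.

    Hypothetical helper for illustration; wrap-around of sequence
    numbers is deliberately not handled here.
    """
    # The extension's len field counts bytes: at least 4, in multiples of 4.
    if len(bitmask) < 4 or len(bitmask) % 4 != 0:
        raise ValueError("selective ACK mask must be a non-empty multiple of 4 bytes")

    acked = []
    for byte_index, byte in enumerate(bitmask):
        for bit in range(8):
            # Assumed reading of the diagram: the least significant bit of
            # byte 0 is ack_nr + 2, its most significant bit is ack_nr + 9,
            # byte 1 covers ack_nr + 10 .. ack_nr + 17, and so on.
            if byte & (1 << bit):
                acked.append(ack_nr + 2 + byte_index * 8 + bit)
    return acked
```

A receiver of such a mask would then discard any returned sequence numbers that fall outside its send window, since the text says those bits are ignored.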
… …
  399 426
  400 427  The initial timeout is set to 1000 milliseconds, and later updated according to
- 401      the formula above. For each packet that times out in a row, the timeout is
- 402      doubled.
+     428  the formula above. For every packet consecutive subsequent packet that times out,
+     429  the timeout is doubled.
  403 430
  404 431  packet sizes
… …
  407 434  In order to have as little impact as possible on slow congested links, uTP adjusts
  408 435  its packet size down to as small as 150 bytes per packet. Using packets that small
- 409      has the benefit of not clogging up a slow up-link, with long serialization delay.
+     436  has the benefit of not clogging a slow up-link, with long serialization delay.
  410 437  The cost of using packets that small is that the overhead from the packet headers
  411 438  become significant. At high rates, large packet sizes are used, at slow rates,
… …
  417 444  The overall goal of the uTP congestion control is to use one way buffer delay as the
  418 445  main congestion measurement, as well as packet loss, like TCP. The point is to avoid
- 419      running at full send buffers whenever data is being sent. This is specifically a
+     446  running with full send buffers whenever data is being sent. This is specifically a
  420 447  problem for DSL/Cable modems, where the send buffer in the modem often has room for
  421 448  multiple seconds worth of data. The ideal buffer utilization for uTP (or any background
… …
  452 479  ``max_window``. Its size is controlled, roughly, by the following expression::
  453 480
- 454      scaled_gain = (off_target / CCONTROL_TARGET)
- 455      * (outstanding_packet * MAX_CWND_INCREASE_PACKETS_PER_RTT / max_window);
+     481  delay_factor = off_target / CCONTROL_TARGET;
+     482  window_factor = outstanding_packet / max_window;
+     483  scaled_gain = MAX_CWND_INCREASE_PACKETS_PER_RTT * delay_factor * window_factor;
  456 484
  457 485  Where the first factor scales the *off_target* to units of target delays.
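The last hunk rewrites the ``scaled_gain`` expression into named factors without changing its value. As a rough check, the sketch below computes both forms side by side. The function names are hypothetical, the constants are passed in as parameters because their values are not shown in this changeset, and the numbers in the usage lines are made up for illustration.

```python
def scaled_gain_old(off_target, outstanding_packet, max_window,
                    ccontrol_target, max_increase_per_rtt):
    """Expression as written before r11164 (removed lines 454-455)."""
    return (off_target / ccontrol_target) * (
        outstanding_packet * max_increase_per_rtt / max_window)


def scaled_gain_new(off_target, outstanding_packet, max_window,
                    ccontrol_target, max_increase_per_rtt):
    """Expression as written in r11164 (added lines 481-483)."""
    # How far the measured delay is from the target, in units of target delays.
    delay_factor = off_target / ccontrol_target
    # Fraction of the current window represented by outstanding_packet.
    window_factor = outstanding_packet / max_window
    return max_increase_per_rtt * delay_factor * window_factor


# Hypothetical inputs: delay halfway to target, half the window outstanding.
example = dict(off_target=50_000, outstanding_packet=1500, max_window=3000,
               ccontrol_target=100_000, max_increase_per_rtt=3000)
gain_old = scaled_gain_old(**example)
gain_new = scaled_gain_new(**example)
```

With these inputs both forms give the same gain, and a negative ``off_target`` makes the gain negative in either form, so the rewrite reads as a readability change rather than a behavioural one.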