Changeset 11165

Show
Ignore:
Timestamp:
06/25/09 11:54:14 (9 months ago)
Author:
arvid
Message:

refreshed html

Location:
dotorg/trunk/html/beps
Files:
2 modified

Legend:

Unmodified
Added
Removed
  • dotorg/trunk/html/beps/bep_0003.html

    r11097 r11165  
    4141<tr class="field"><th class="field-name">Version:</th><td class="field-body">11031</td> 
    4242</tr> 
    43 <tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference external" href="http://bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0003.rst">2008-02-28 16:43:58 -0800 (Thu, 28 Feb 2008)</a></td> 
     43<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0003.rst">2008-02-28 16:43:58 -0800 (Thu, 28 Feb 2008)</a></td> 
    4444</tr> 
    4545<tr class="field"><th class="field-name">Author:</th><td class="field-body">Bram Cohen &lt;bram&#32;&#97;t&#32;bittorrent.com&gt;</td> 
     
    5151<tr class="field"><th class="field-name">Created:</th><td class="field-body">10-Jan-2008</td> 
    5252</tr> 
    53 <tr class="field"><th class="field-name">Post-History:</th><td class="field-body"></td> 
     53<tr class="field"><th class="field-name">Post-History:</th><td class="field-body">24-Jun-2009, clarified the encoding of strings in torrent files</td> 
    5454</tr> 
    5555</tbody> 
     
    5959<p class="topic-title first">Contents</p> 
    6060<ul class="simple"> 
    61 <li><a class="reference internal" href="#a-bittorrent-file-distribution-consists-of-these-entities" id="id2">A BitTorrent file distribution consists of these entities:</a></li> 
    62 <li><a class="reference internal" href="#to-start-serving-a-host-goes-through-the-following-steps" id="id3">To start serving, a host goes through the following steps:</a></li> 
    63 <li><a class="reference internal" href="#to-start-downloading-a-user-does-the-following" id="id4">To start downloading, a user does the following:</a></li> 
    64 <li><a class="reference internal" href="#the-connectivity-is-as-follows" id="id5">The connectivity is as follows:</a></li> 
    65 <li><a class="reference internal" href="#metainfo-files-are-bencoded-dictionaries-with-the-following-keys" id="id6">Metainfo files are bencoded dictionaries with the following keys:</a></li> 
    66 <li><a class="reference internal" href="#tracker-get-requests-have-the-following-keys" id="id7">Tracker GET requests have the following keys:</a></li> 
    67 <li><a class="reference internal" href="#all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type" id="id8">All non-keepalive messages start with a single byte which gives their type.</a></li> 
    68 <li><a class="reference internal" href="#the-possible-values-are" id="id9">The possible values are:</a></li> 
    69 <li><a class="reference internal" href="#copyright" id="id10">Copyright</a></li> 
     61<li><a class="reference" href="#a-bittorrent-file-distribution-consists-of-these-entities" id="id2">A BitTorrent file distribution consists of these entities:</a></li> 
     62<li><a class="reference" href="#to-start-serving-a-host-goes-through-the-following-steps" id="id3">To start serving, a host goes through the following steps:</a></li> 
     63<li><a class="reference" href="#to-start-downloading-a-user-does-the-following" id="id4">To start downloading, a user does the following:</a></li> 
     64<li><a class="reference" href="#the-connectivity-is-as-follows" id="id5">The connectivity is as follows:</a></li> 
     65<li><a class="reference" href="#metainfo-files-are-bencoded-dictionaries-with-the-following-keys" id="id6">Metainfo files are bencoded dictionaries with the following keys:</a></li> 
     66<li><a class="reference" href="#tracker-get-requests-have-the-following-keys" id="id7">Tracker GET requests have the following keys:</a></li> 
     67<li><a class="reference" href="#all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type" id="id8">All non-keepalive messages start with a single byte which gives their type.</a></li> 
     68<li><a class="reference" href="#the-possible-values-are" id="id9">The possible values are:</a></li> 
     69<li><a class="reference" href="#copyright" id="id10">Copyright</a></li> 
    7070</ul> 
    7171</div> 
     
    139139<dt>info</dt> 
    140140<dd><p class="first">This maps to a dictionary, with keys described below.</p> 
    141 <p>The <tt class="docutils literal"><span class="pre">name</span></tt> key maps to a string which is the suggested name 
    142 to save the file (or directory) as. It is purely advisory.</p> 
     141<p>The <tt class="docutils literal"><span class="pre">name</span></tt> key maps to a UTF-8 encoded string which is the 
     142suggested name to save the file (or directory) as. It is purely advisory.</p> 
    143143<p><tt class="docutils literal"><span class="pre">piece</span> <span class="pre">length</span></tt> maps to the number of bytes in each piece 
    144144the file is split into. For the purposes of transfer, files are 
     
    163163the following keys:</p> 
    164164<p><tt class="docutils literal"><span class="pre">length</span></tt> - The length of the file, in bytes.</p> 
    165 <p><tt class="docutils literal"><span class="pre">path</span></tt> - A list of strings corresponding to subdirectory 
     165<p><tt class="docutils literal"><span class="pre">path</span></tt> - A list of UTF-8 encoded strings corresponding to subdirectory 
    166166names, the last of which is the actual file name (a zero length list 
    167167is an error case).</p> 
    168 <p class="last">In the single file case, the name key is the name of a file, in the 
     168<p>In the single file case, the name key is the name of a file, in the 
    169169muliple file case, it's the name of a directory.</p> 
     170<p class="last">All strings in a .torrent file that contains text must be UTF-8 
     171encoded.</p> 
    170172</dd> 
    171173</dl> 
  • dotorg/trunk/html/beps/bep_0029.html

    r11160 r11165  
    3939<tr class="field"><th class="field-name">Title:</th><td class="field-body">uTorrent transport protocol</td> 
    4040</tr> 
    41 <tr class="field"><th class="field-name">Version:</th><td class="field-body">11159</td> 
    42 </tr> 
    43 <tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0029.rst">2009-06-23 13:45:47 -0700 (Tue, 23 Jun 2009)</a></td> 
     41<tr class="field"><th class="field-name">Version:</th><td class="field-body">11164</td> 
     42</tr> 
     43<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0029.rst">2009-06-25 11:53:53 -0700 (Thu, 25 Jun 2009)</a></td> 
    4444</tr> 
    4545<tr class="field"><th class="field-name">Author:</th><td class="field-body">Arvid Norberg &lt;arvid&#32;&#97;t&#32;bittorrent.com&gt;</td> 
     
    6868<li><a class="reference" href="#version" id="id7">version</a></li> 
    6969<li><a class="reference" href="#connection-id" id="id8">connection_id</a></li> 
    70 <li><a class="reference" href="#timestamp-seconds-timestamp-microseconds" id="id9">timestamp_seconds, timestamp_microseconds</a></li> 
     70<li><a class="reference" href="#timestamp-microseconds" id="id9">timestamp_microseconds</a></li> 
    7171<li><a class="reference" href="#timestamp-difference-microseconds" id="id10">timestamp_difference_microseconds</a></li> 
    7272<li><a class="reference" href="#wnd-size" id="id11">wnd_size</a></li> 
     
    111111<p>The fact that BitTorrent uses multiple TCP connections gives it an 
    112112unfair advantage when competing with other services for bandwidth, 
    113 which exaggerates the effect of BitTorrent filling the upload pipe.</p> 
     113which exaggerates the effect of BitTorrent filling the upload pipe. 
     114The reason for this is because TCP distributes the available bandwidth 
     115evenly across connections, and the more connections one application 
     116uses, the larger share of the bandwidth it gets.</p> 
    114117<p>The traditional solution to this problem is to cap the upload rate 
    115 of the bittorrent client to 80% of the uplink capacity. 80% leaves 
     118of the BitTorrent client to 80% of the up-link capacity. 80% leaves 
    116119some head room for interactive traffic.</p> 
    117120<p>The main drawbacks with this solution are:</p> 
    118121<ol class="arabic simple"> 
    119 <li>The user needs to configure his/her bittorrent client, it won't 
     122<li>The user needs to configure his/her BitTorrent client, it won't 
    120123work out-of-the-box.</li> 
    121124<li>The user needs to know his/her internet connection's upload 
     
    123126may connect to a large number of different networks.</li> 
    124127<li>The headroom of 20% is arbitrary and wastes bandwidth. Whenever 
    125 there is no interactive traffic competing with bittorrent, the 
     128there is no interactive traffic competing with BitTorrent, the 
    126129extra 20% are wasted. Whenever there is competing interactive 
    127130traffic, it cannot use more than 20% of the capacity.</li> 
     
    137140<p>This document assumes some knowledge of how TCP and window based 
    138141congestion control works.</p> 
    139 <p>The main addition compared to TCP is the delay based congestion 
     142<p>uTP is a transport protocol layered on top of UDP. As such, it must 
     143(and has the ability to) implement its own congestion control.</p> 
     144<p>The main difference compared to TCP is the delay based congestion 
    140145control. See the <a class="reference" href="#congestion-control">congestion control</a> section.</p> 
    141146<p>Like TCP, uTP uses window based congestion control. Each socket 
     
    186191<div class="section" id="version"> 
    187192<h3>version</h3> 
    188 <p>This is the protcol version. The current version is 1.</p> 
     193<p>This is the protocol version. The current version is 1.</p> 
    189194</div> 
    190195<div class="section" id="connection-id"> 
     
    195200connection decides which ID to use, and the return path has the same ID + 1.</p> 
    196201</div> 
    197 <div class="section" id="timestamp-seconds-timestamp-microseconds"> 
    198 <h3>timestamp_seconds, timestamp_microseconds</h3> 
    199 <p>This is the 'seconds' and 'microseconds' parts of the timestamp of when this 
    200 packet was sent. This is set using gettimeofday() on posix and QueryPerformanceTimer() 
    201 on windows. The higher resultion this timstamp has, the better.</p> 
     202<div class="section" id="timestamp-microseconds"> 
     203<h3>timestamp_microseconds</h3> 
     204<p>This is the 'microseconds' parts of the timestamp of when this packet was sent. 
     205This is set using gettimeofday() on posix and QueryPerformanceTimer() 
     206on windows. The higher resolution this timestamp has, the better. The closer 
     207to the actual transmit time it is set, the better.</p> 
    202208</div> 
    203209<div class="section" id="timestamp-difference-microseconds"> 
     
    238244<div class="section" id="selective-ack"> 
    239245<h4>Selective ACK</h4> 
    240 <p>Selective ack is an extension that can selectively ACK packets non-sequentially. 
    241 Its payload is a bitmask of 32 bits, representing the first 32 packets in the 
    242 send window. A set bit specifies that packet has been received, a cleared bit 
     246<p>Selective ACK is an extension that can selectively ACK packets non-sequentially. 
     247Its payload is a bitmask of at least 32 bits, in multiples of 32 bits. Each bit 
     248represents one packet in the send window. Bits that are outside of the send window 
     249are ignored. A set bit specifies that packet has been received, a cleared bit 
    243250specifies that the packet has not been received. The header looks like this:</p> 
    244251<pre class="literal-block"> 
     
    250257+---------------+---------------+ 
    251258</pre> 
    252 <p>The selective ack is only sent when at least one sequence number was skipped in 
     259<p>Note that the len field of extensions refer to bytes, which in this extension 
     260must be at least 4, and in multiples of 4.</p> 
     261<p>The selective ACK is only sent when at least one sequence number was skipped in 
    253262the received stream. The first bit in the mask therefore represents ack_nr + 2. 
    254 ack_nr +1 is assumed to have been dropped or be missing when this packet was sent. 
     263ack_nr + 1 is assumed to have been dropped or be missing when this packet was sent. 
    255264A set bit represents a packet that has been received, a cleared bit represents 
    256265a packet that has not yet been received.</p> 
    257 <p>The bitmask has reverse byteorder. The first byte represents packets [ack_nr + 2, 
     266<p>The bitmask has reverse byte order. The first byte represents packets [ack_nr + 2, 
    258267ack_nr + 2 + 7] in reverse order. The least significant bit in the byte represents 
    259268ack_nr + 2, the most significant bit in the byte represents ack_nr + 2 + 7. The 
    260269next byte in the mask represents [ack_nr + 2 + 8, ack_nr + 2 + 15] in reverse order, 
    261270and so on. The bitmask is not limited to 32 bits but can be of any size.</p> 
     271<p>Here is the layout of a bitmask representing the first 32 packet acks 
     272represented in a selective ACK bitfield:</p> 
     273<pre class="literal-block"> 
     2740               8               16 
     275+---------------+---------------+---------------+---------------+ 
     276| 9 8 ...   3 2 | 17   ...   10 | 25   ...   18 | 33   ...   26 | 
     277+---------------+---------------+---------------+---------------+ 
     278</pre> 
     279<p>The number in the diagram maps the bit in the bitmask to the offset to add to 
     280<tt class="docutils literal"><span class="pre">ack_nr</span></tt> in order to calculate the sequence number that the bit is ACKing.</p> 
    262281</div> 
    263282<div class="section" id="extension-bits"> 
     
    295314the ST_FIN packet.</dd> 
    296315<dt>ST_STATE = 2</dt> 
    297 <dd>State packet. Used to transmit an ACK with no data.</dd> 
     316<dd>State packet. Used to transmit an ACK with no data. Packets that don't 
     317include any payload do not increase the <tt class="docutils literal"><span class="pre">seq_nr</span></tt>.</dd> 
    298318<dt>ST_RESET = 3</dt> 
    299319<dd>Terminate connection forcefully. Similar to TCP RST flag. The remote 
     
    304324The sequence number is initialized to 1. The connection ID is initialized 
    305325to a random number. The syn packet is special, all subsequent packets sent 
    306 on this connection (except for resends of the ST_SYN) are sent with the 
     326on this connection (except for re-sends of the ST_SYN) are sent with the 
    307327connection ID + 1. The connection ID is what the other end is expected to 
    308328use in its responses.</p> 
     
    310330ID in the packet header. The send ID for the socket should be initialized 
    311331to the ID + 1. The sequence number for the return channel is initialized 
    312 to a random number. The other end expects an ST_STATE packet (only an ack) 
     332to a random number. The other end expects an ST_STATE packet (only an ACK) 
    313333in response.</p> 
    314334</dd> 
     
    329349<div class="section" id="connection-setup"> 
    330350<h2>connection setup</h2> 
    331 <p><em>describe the initial connection setup sequence</em></p> 
     351<p>Here is a diagram illustrating the exchanges and states to initiate 
     352a connection. The c.* refers to a state in the socket itself, pkt.* 
     353refers to a field in the packet header.</p> 
     354<pre class="literal-block"> 
     355initiating endpoint                           accepting endpoint 
     356 
     357          | c.state = CS_SYN_SENT                         | 
     358          | c.seq_nr = 1                                  | 
     359          | c.conn_id_recv = rand()                       | 
     360          | c.conn_id_send = c.conn_id_recv + 1           | 
     361          |                                               | 
     362          |                                               | 
     363          | ST_SYN                                        | 
     364          |   seq_nr=c.seq_nr++                           | 
     365          |   ack_nr=*                                    | 
     366          |   conn_id=c.rcv_conn_id                       | 
     367          | &gt;-------------------------------------------&gt; | 
     368          |             c.receive_conn_id = pkt.conn_id+1 | 
     369          |             c.send_conn_id = pkt.conn_id      | 
     370          |             c.seq_nr = rand()                 | 
     371          |             c.ack_nr = pkt.seq_nr             | 
     372          |             c.state = CS_CONNECTED            | 
     373          |                                               | 
     374          |                                               | 
     375          |                                               | 
     376          |                                               | 
     377          |                     ST_STATE                  | 
     378          |                       seq_nr=c.seq_nr++       | 
     379          |                       ack_nr=c.ack_nr         | 
     380          |                       conn_id=c.send_conn_id  | 
     381          | &lt;------------------------------------------&lt;  | 
     382          | c.state = CS_CONNECTED                        | 
     383          | c.ack_nr = pkt.seq_nr                         | 
     384          |                                               | 
     385          |                                               | connection established 
     386     .. ..|.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..|.. .. 
     387          |                                               | 
     388          |                                               | 
     389          | ST_DATA                                       | 
     390          |   seq_nr=c.seq_nr++                           | 
     391          |   ack_nr=c.ack_nr                             | 
     392          |   conn_id=c.conn_id_send                      | 
     393          | &gt;-------------------------------------------&gt; | 
     394          |                         c.ack_nr = pkt.seq_nr | 
     395          |                                               | 
     396          |                                               | 
     397          |                                               | 
     398          |                     ST_DATA                   | 
     399          |                       seq_nr=c.seq_nr++       | 
     400          |                       ack_nr=c.ack_nr         | 
     401          |                       conn_id=c.send_conn_id  | 
     402          | &lt;------------------------------------------&lt;  | 
     403          | c.ack_nr = pkt.seq_nr                         | 
     404          |                                               | 
     405          |                                               | 
     406          V                                               V 
     407</pre> 
     408<p>Connections are identified by their <tt class="docutils literal"><span class="pre">conn_id</span></tt> header. If the connection ID of a new 
     409connection collides with an existing connection, the connection attempt will fails, since 
     410the ST_SYN packet will be unexpected in the existing stream, and ignored.</p> 
    332411</div> 
    333412<div class="section" id="packet-loss"> 
     
    335414<p>If the packet with sequence number (<tt class="docutils literal"><span class="pre">seq_nr</span></tt> - <tt class="docutils literal"><span class="pre">cur_window</span></tt>) has not been acked 
    336415(this is the oldest packet in the send buffer, and the next one expected to be acked) 
    337 has not been acked, but 3 or more packets have been acked passed it (through Selective 
     416has not been acked, but 3 or more packets have been acked past it (through Selective 
    338417ACK), the packet is assumed to have been lost. Similarly, when receiving 3 duplicate 
    339418acks, <tt class="docutils literal"><span class="pre">ack_nr</span></tt> + 1 is assumed to have been lost (if a packet with that sequence number 
    340419has been sent).</p> 
    341 <p>When a packet is lost, the <tt class="docutils literal"><span class="pre">max_window</span></tt> is multiplied by 0.78.</p> 
     420<p>When a packet is lost, the <tt class="docutils literal"><span class="pre">max_window</span></tt> is multiplied by 0.78. TCP multiplies by 
     4210.5, but since this is a much less likely event in uTP, and since the uTP ramp-up 
     422is slower than TCP, this is a reasonable optimization.</p> 
    342423</div> 
    343424<div class="section" id="timeouts"> 
     
    348429last_ack_nr here is the last ack_nr received on the socket before the current packet, 
    349430and ack_nr is the field in the currently received packet.</p> 
    350 <p>The <tt class="docutils literal"><span class="pre">rtt</span></tt> and <tt class="docutils literal"><span class="pre">rtt_var</span></tt> is only updated for packets that where sent only once. 
     431<p>The <tt class="docutils literal"><span class="pre">rtt</span></tt> and <tt class="docutils literal"><span class="pre">rtt_var</span></tt> is only updated for packets that were sent only once. 
    351432This avoids problems with figuring out which packet was acked, the first or the 
    352433second one.</p> 
     
    372453goes down to zero.</p> 
    373454<p>The initial timeout is set to 1000 milliseconds, and later updated according to 
    374 the formula above. For each packet that times out in a row, the timeout is 
    375 doubled.</p> 
     455the formula above. For every packet consecutive subsequent packet that times out, 
     456the timeout is doubled.</p> 
    376457</div> 
    377458<div class="section" id="packet-sizes"> 
     
    379460<p>In order to have as little impact as possible on slow congested links, uTP adjusts 
    380461its packet size down to as small as 150 bytes per packet. Using packets that small 
    381 has the benefit of not clogging up a slow uplink, with long serialization delay. 
     462has the benefit of not clogging a slow up-link, with long serialization delay. 
    382463The cost of using packets that small is that the overhead from the packet headers 
    383464become significant. At high rates, large packet sizes are used, at slow rates, 
     
    388469<p>The overall goal of the uTP congestion control is to use one way buffer delay as the 
    389470main congestion measurement, as well as packet loss, like TCP. The point is to avoid 
    390 running at full send buffers whenever data is being sent. This is specifically a 
     471running with full send buffers whenever data is being sent. This is specifically a 
    391472problem for DSL/Cable modems, where the send buffer in the modem often has room for 
    392473multiple seconds worth of data. The ideal buffer utilization for uTP (or any background 
     
    401482original sender of the packet (timestamp_difference_microseconds). This value is not meaningful 
    402483as an absolute value. The clocks in the machines are most likely not synchronized, 
    403 especially not down to micorsecond resolution, and the time the packet is in transit is 
     484especially not down to microsecond resolution, and the time the packet is in transit is 
    404485also included in the difference of these timestamps. However, the value is useful in 
    405486comparison to previous values.</p> 
    406487<p>Each socket keeps a sliding minimum of the lowest value for the last two minutes. This value 
    407 is called <em>base_delay</em>, and is used as a baseline, the minumum delay between the hosts. 
     488is called <em>base_delay</em>, and is used as a baseline, the minimum delay between the hosts. 
    408489When subtracting the base_delay from the timestamp difference in each packet you get a 
    409490measurement of the current buffering delay on the socket. This measurement is called <em>our_delay</em>. 
    410491It has a lot of noise it it, but is used as the driver to determine whether to increase or 
    411492decrease the send window (which controls the send rate).</p> 
    412 <p>The <em>CCONTROL_TARGET</em> is the buffering delay that the uTP accepts on the uplink. Currently the 
     493<p>The <em>CCONTROL_TARGET</em> is the buffering delay that the uTP accepts on the up-link. Currently the 
    413494delay target is set to 100 ms. <em>off_target</em> is how far the actual measured delay is from the 
    414495target delay (calculated from CCONTROL_TARGET - our_delay).</p> 
     
    418499<tt class="docutils literal"><span class="pre">max_window</span></tt>. Its size is controlled, roughly, by the following expression:</p> 
    419500<pre class="literal-block"> 
    420 scaled_gain = (off_target / CCONTROL_TARGET) 
    421         * (outstanding_packet * MAX_CWND_INCREASE_PACKETS_PER_RTT / max_window); 
     501delay_factor = off_target / CCONTROL_TARGET; 
     502window_factor = outstanding_packet / max_window; 
     503scaled_gain = MAX_CWND_INCREASE_PACKETS_PER_RTT * delay_factor * window_factor; 
    422504</pre> 
    423505<p>Where the first factor scales the <em>off_target</em> to units of target delays.</p>