Timestamp:
04/10/07 17:52:31 (3 years ago)
Author:
mike
Message:

bringing dotorg up to date with the live site

File:
1 edited

  • dotorg/trunk/html/protocol.html

    r4448 r4460  
    1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> 
    2 <html><head><title>BitTorrent Community Forum - Protocol</title> 
    3  
    4 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">  
    5  
    6 <link href="community.css" rel="stylesheet" type="text/css"></head> 
    7  
    8 <body> 
    9 <!-- LEFT STARTS --> 
    10 <div align="left"> 
    11   <div style="margin-left: 20px;"> 
    12     <a href="http://www.bittorrent.org/index.html"> 
    13       <span class="org"> bittorrent.org </span> 
    14       <!--<img src="images/bittorrent_lg.gif" alt="BitTorrent" border="0" height="54" width="153">--> 
    15     </a> 
    16     <!--<span class="title"> Community Forum </span>--> 
    17   </div> 
    18  
    19  
    20 <!-- DOWNLINK STARTS --> 
    21 <div style="margin-left:12px;"> 
    22   <div id="downlinks"><a href="index.html">Home</a></div> 
    23   <div id="downlinks"><a href="introduction.html">For Users</a></div> 
    24   <div id="downunlink"><a href="developer.html">For Developers</a></div> 
    25   <div id="downlinks"><a href="donate.html">Donate!</a></div> 
    26   <div style="clear:left; background-color:#f0f0f0;height:5px;"><PRE></PRE> 
    27   </div> 
     1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"  
     2        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
     3<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 
     4<head> 
     5<meta http-equiv="Content-type" content="text/html; charset=utf-8" /> 
     6<title>BitTorrent.org » For Developers » Protocol Specification</title> 
     7<link rel="stylesheet" type="text/css" href="./css/screen.css" media="screen" /> 
     8</head> 
     9<body id="www-bittorrent-org"> 
     10<div id="upper" class="clear"> 
     11<div id="wrap"> 
     12<div id="header"> 
     13<h1><a href="./index.html">BitTorrent<span>.org</span></a></h1> 
    28 14 </div> 
    29  
    30 <!-- DOWNLINK ENDS --> 
    31  
    32 <!-- CONTENT STARTS --> 
    33         <div id="content" align="justify"> 
    34 <!--                    <span class="welcome">Documentation: Protocol</span><br> 
    35                     <a href="http://www.bittorrent.com/guide.html">Guide</a> | <a href="http://www.bittorrent.com/protocol.html">Protocol</a> | <a href="http://www.bittorrent.com/bittorrentecon.pdf">Paper</a> <br> 
    36 --> 
    37                     <p>BitTorrent 
    38 is a protocol for distributing files. It identifies content by URL and 
    39 is designed to integrate seamlessly with the web. Its advantage over 
    40 plain HTTP is that when multiple downloads of the same file happen 
    41 concurrently, the downloaders upload to each other, making it possible 
    42 for the file source to support very large numbers of downloaders with 
    43 only a modest increase in its load. </p> 
    44                     <p class="header">A BitTorrent file distribution consists of these entities: </p> 
    45                     <ul> 
    46                       <li>An ordinary web server 
    47                       </li><li>A static 'metainfo' file 
    48                       </li><li>A BitTorrent tracker 
    49                       </li><li>An 'original' downloader 
    50                       </li><li>The end user web browsers 
    51                       </li><li>The end user downloaders </li> 
    52                     </ul> 
    53  
    54                     <p>There are ideally many end users for a single file. </p> 
    55                     <p class="header">To start serving, a host goes through the following steps: </p> 
    56                     <ol> 
    57                       <li>Start running a tracker (or, more likely, have one running already). 
    58                       </li><li>Start running an ordinary web server, such as apache, or have one already. 
    59                       </li><li>Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already). 
    60                       </li><li>Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker. 
    61                       </li><li>Put the metainfo file on the web server. 
    62                       </li><li>Link to the metainfo (.torrent) file from some other web page. 
    63                       </li><li>Start a downloader which already has the complete file (the 'origin'). </li> 
    64  
    65                     </ol> 
    66                     <p class="header">To start downloading, a user does the following: </p> 
    67                     <ol> 
    68                       <li>Install BitTorrent (or have done so already). 
    69                       </li><li>Surf the web. 
    70                       </li><li>Click on a link to a .torrent file. 
    71                       </li><li>Select where to save the file locally, or select a partial download to resume. 
    72                       </li><li>Wait for download to complete. 
    73                       </li><li>Tell downloader to exit (it keeps uploading until this happens). </li> 
    74                     </ol> 
    75  
    76                     <p class="header">The connectivity is as follows: </p> 
    77                     <ul> 
    78                       <li>The web site is serving up static files as normal, but kicking off the BitTorrent helper app on the clients. 
    79                       </li><li>The 
    80 tracker is receiving information from all downloaders and giving them 
    81 random lists of peers. This is done over HTTP or HTTPS. </li><li>Downloaders are periodically checking 
    82 in with the tracker to keep it informed of their progress, and are 
    83 uploading to and downloading from each other via direct connections. 
    84 These connections use the BitTorrent peer protocol, which operates over 
    85 TCP. </li><li>The origin is uploading but not 
    86 downloading at all, since it has the entire file. The origin is 
    87 necessary to get the entire file into the network. Often for popular 
    88 downloads the origin can be taken down after a while since several 
    89 downloads may have completed and been left running indefinitely. </li> 
    90                     </ul> 
    91                     <p>Metainfo 
    92 file and tracker responses are both sent in a simple, efficient, and 
    93 extensible format called bencoding (pronounced 'bee encoding'). 
    94 Bencoded messages are nested dictionaries and lists (as in Python), 
    95 which can contain strings and integers. Extensibility is supported by 
    96 ignoring unexpected dictionary keys, so additional optional ones can be 
    97 added later. </p> 
    98                     <p class="header">Bencoding is done as follows: </p> 
    99  
    100                     <ul> 
    101                       <li>Strings are length-prefixed base ten followed by a colon and the string. For example 4:spam corresponds to 'spam'. 
    102                       </li><li>Integers 
    103 are represented by an 'i' followed by the number in base 10 followed by 
    104 an 'e'. For example i3e corresponds to 3 and i-3e corresponds to -3. 
    105 Integers have no size limitation. i-0e is invalid. All encodings with a 
    106 leading zero, such as i03e , are invalid, other than i0e , which of 
    107 course corresponds to 0. </li><li>Lists are encoded as an 'l' followed 
    108 by their elements (also bencoded) followed by an 'e'. For example 
    109 l4:spam4:eggse corresponds to ['spam', 'eggs']. </li><li>Dictionaries are encoded as a 'd' 
    110 followed by a list of alternating keys and their corresponding values 
    111 followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds 
    112 to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to 
    113 {'spam': ['a', 'b']} . Keys must be strings and appear in sorted order 
    114 (sorted as raw strings, not alphanumerics). </li> 
    115                     </ul> 
    116                     <p class="header">Metainfo files are bencoded dictionaries with the following keys: </p> 
    117                     announce 
    118                     <blockquote> 
    119                       <p>The URL of the tracker. </p> 
    120  
    121                     </blockquote> 
    122                     info 
    123                     <blockquote> 
    124                       <p>This maps to a dictionary, with keys described below. </p> 
    125                       <p>The name key maps to a string which is the suggested name to save the file (or directory) as. It is purely advisory. </p> 
    126                       <p>piece 
    127 length maps to the number of bytes in each piece the file is split 
    128 into. For the purposes of transfer, files are split into fixed-size 
    129 pieces which are all the same length except for possibly the last one 
    130 which may be truncated. Piece length is almost always a power of two, 
    131 most commonly 2<sup>18</sup> = 256 K (BitTorrent prior to version 3.2 uses 2<sup>20</sup> = 
    132 1 M as default). </p> 
    133                       <p>pieces maps to a string 
    134 whose length is a multiple of 20. It is to be subdivided into strings 
    135 of length 20, each of which is the SHA1 hash of the piece at the 
    136 corresponding index. </p> 
    137  
    138                       <p>There is also a 
    139 key length or a key files , but not both or neither. If length is 
    140 present then the download represents a single file, otherwise it 
    141 represents a set of files which go in a directory structure. </p> 
    142                       <p>In the single file case, length maps to the length of the file in bytes. </p> 
    143                       <p>For 
    144 the purposes of the other keys, the multi-file case is treated as only 
    145 having a single file by concatenating the files in the order they 
    146 appear in the files list. The files list is the value files maps to, 
    147 and is a list of dictionaries containing the following keys: </p> 
    148                       <p>length 
    149 The length of the file, in bytes. path A list of strings corresponding 
    150 to subdirectory names, the last of which is the actual file name (a 
    151 zero length list is an error case). </p> 
    152                       <p>In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory. </p> 
    153                     </blockquote> 
    154  
    155                     <p>Tracker 
    156 queries are two way. The tracker receives information via HTTP GET 
    157 parameters and returns a bencoded message. Note that although the 
    158 current tracker implementation has its own web server, the tracker 
    159 could run very nicely as, for example, an apache module. </p> 
    160                     <p class="header">Tracker GET requests have the following keys:</p> 
    161                     info_hash 
    162                     <blockquote> 
    163                       <p>The 
    164 20 byte sha1 hash of the bencoded form of the info value from the 
    165 metainfo file. Note that this is a substring of the metainfo file. This 
    166 value will almost certainly have to be escaped. </p> 
    167                     </blockquote> 
    168                     peer_id 
    169                     <blockquote> 
    170  
    171                       <p>A 
    172 string of length 20 which this downloader uses as its id. Each 
    173 downloader generates its own id at random at the start of a new 
    174 download. This value will also almost certainly have to be escaped. </p> 
    175                     </blockquote> 
    176                     ip 
    177                     <blockquote> 
    178                       <p>An 
    179 optional parameter giving the IP (or dns name) which this peer is at. 
    180 Generally used for the origin if it's on the same machine as the 
    181 tracker. </p> 
    182                     </blockquote> 
    183                     port 
    184                     <blockquote> 
    185                       <p>The 
    186 port number this peer is listening on. Common behavior is for a 
    187 downloader to try to listen on port 6881 and if that port is taken try 
    188 6882, then 6883, etc. and give up after 6889. </p> 
    189  
    190                     </blockquote> 
    191                     uploaded 
    192                     <p>The total amount uploaded so far, encoded in base ten ascii. </p> 
    193                     downloaded 
    194                     <blockquote> 
    195                       <p>The total amount downloaded so far, encoded in base ten ascii. </p> 
    196                     </blockquote> 
    197                     left 
    198                     <blockquote> 
    199  
    200                       <p>The 
    201 number of bytes this peer still has to download, encoded in base ten 
    202 ascii. Note that this can't be computed from downloaded and the file 
    203 length since it might be a resume, and there's a chance that some of 
    204 the downloaded data failed an integrity check and had to be 
    205 re-downloaded. </p> 
    206                     </blockquote> 
    207                     event 
    208                     <blockquote> 
    209                       <p>This 
    210 is an optional key which maps to started , completed , or stopped (or 
    211 empty, which is the same as not being present). If not present, this is 
    212 one of the announcements done at regular intervals. An announcement 
    213 using started is sent when a download first begins, and one using 
    214 completed is sent when the download is complete. No completed is sent 
    215 if the file was complete when started. Downloaders send an announcement 
    216 using 'stopped' when they cease downloading. </p> 
    217                     </blockquote> 
    218                     <p>Tracker 
    219 responses are bencoded dictionaries. If a tracker response has a key 
    220 failure reason , then that maps to a human readable string which 
    221 explains why the query failed, and no other keys are required. 
    222 Otherwise, it must have two keys: interval , which maps to the number 
    223 of seconds the downloader should wait between regular rerequests, and 
    224 peers . peers maps to a list of dictionaries corresponding to peers, 
    225 each of which contains the keys peer id , ip , and port , which map to 
    226 the peer's self-selected ID, IP address or dns name as a string, and 
    227 port number, respectively. Note that downloaders may rerequest on 
    228 nonscheduled times if an event happens or they need more peers. </p> 
    229                     <p>If 
    230 you want to make any extensions to metainfo files or tracker queries, 
    231 please coordinate with Bram Cohen to make sure that all extensions are 
    232 done compatibly. </p> 
    233  
    234                     <p>BitTorrent's peer protocol operates over TCP. It performs efficiently without setting any socket options. </p> 
    235                     <p>Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction. </p> 
    236                     <p>The 
    237 peer protocol refers to pieces of the file by index as described in the 
    238 metainfo file, starting at zero. When a peer finishes downloading a 
    239 piece and checks that the hash matches, it announces that it has that 
    240 piece to all of its peers. </p> 
    241                     <p>Connections 
    242 contain two bits of state on either end: choked or not, and interested 
    243 or not. Choking is a notification that no data will be sent until 
    244 unchoking happens. The reasoning and common techniques behind choking 
    245 are explained later in this document. </p> 
    246                     <p>Data 
    247 transfer takes place whenever one side is interested and the other side 
    248 is not choking. Interest state must be kept up to date at all times - 
    249 whenever a downloader doesn't have something they currently would ask a 
    250 peer for if unchoked, they must express lack of interest, despite being 
    251 choked. Implementing this properly is tricky, but makes it possible for 
    252 downloaders to know which peers will start downloading immediately if 
    253 unchoked. </p> 
    254                     <p>Connections start out choked and not interested. </p> 
    255  
    256                     <p>When 
    257 data is being transferred, downloaders should keep several piece 
    258 requests queued up at once in order to get good TCP performance (this 
    259 is called 'pipelining'.) On the other side, requests which can't be 
    260 written out to the TCP buffer immediately should be queued up in memory 
    261 rather than kept in an application-level network buffer, so they can 
    262 all be thrown out when a choke happens. </p> 
    263                     <p>The 
    264 peer wire protocol consists of a handshake followed by a never-ending 
    265 stream of length-prefixed messages. The handshake starts with character 
    266 nineteen (decimal) followed by the string 'BitTorrent protocol'. The 
    267 leading character is a length prefix, put there in the hope that other 
    268 new protocols may do the same and thus be trivially distinguishable 
    269 from each other. </p> 
    270                     <p>All later integers sent in the protocol are encoded as four bytes big-endian. </p> 
    271                     <p>After 
    272 the fixed headers come eight reserved bytes, which are all zero in all 
    273 current implementations. If you wish to extend the protocol using these 
    274 bytes, please coordinate with Bram Cohen to make sure all extensions 
    275 are done compatibly. </p> 
    276                     <p>Next comes the 20 
    277 byte sha1 hash of the bencoded form of the info value from the metainfo 
    278 file. (This is the same value which is announced as info_hash to the 
    279 tracker, only here it's raw instead of quoted). If both sides 
    280 don't send the same value, they sever the connection. The one possible 
    281 exception is if a downloader wants to do multiple downloads over a 
    282 single port, they may wait for incoming connections to give a download 
    283 hash first, and respond with the same one if it's in their list. </p> 
    284                     <p>After 
    285 the download hash comes the 20-byte peer id which is reported in 
    286 tracker requests and contained in peer lists in tracker responses. If 
    287 the receiving side's peer id doesn't match the one the initiating side 
    288 expects, it severs the connection. </p> 
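<p>The handshake layout described above can be sketched in Python as follows. This is a minimal illustration, not the reference client's code; the names <code>build_handshake</code> and <code>parse_handshake</code> are hypothetical, and the reserved bytes are assumed to be all zero as in current implementations.</p>

```python
PSTR = b"BitTorrent protocol"  # 19 bytes, preceded by its own length


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Return the 68-byte handshake: length prefix, protocol string,
    eight reserved bytes, 20-byte info hash, 20-byte peer id."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash and peer_id are 20 bytes, reserved is 8")
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data):
    """Split a received handshake into (reserved, info_hash, peer_id)."""
    if len(data) != 68 or data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]
```

<p>A receiver would compare the returned info hash against the downloads it is serving and sever the connection on a mismatch.</p>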
    289  
    290                     <p>That's 
    291 it for handshaking, next comes an alternating stream of length prefixes 
    292 and messages. Messages of length zero are keepalives, and ignored. 
    293 Keepalives are generally sent once every two minutes, but note that 
    294 timeouts can be done much more quickly when data is expected. </p> 
    295                     <p class="header">All non-keepalive messages start with a single byte which gives their type. The possible values are: </p> 
    296                     <ul> 
    297                       <li>0 - choke 
    298                       </li><li>1 - unchoke 
    299                       </li><li>2 - interested 
    300                       </li><li>3 - not interested 
    301                       </li><li>4 - have 
    302                       </li><li>5 - bitfield 
    303                       </li><li>6 - request 
    304                       </li><li>7 - piece 
    305                       </li><li>8 - cancel </li> 
    306  
    307                     </ul> 
    308                     <p>'choke', 'unchoke', 'interested', and 'not interested' have no payload. </p> 
    309                     <p>'bitfield' 
    310 is only ever sent as the first message. Its payload is a bitfield with 
    311 each index that downloader has sent set to one and the rest set to 
    312 zero. Downloaders which don't have anything yet may skip the 'bitfield' 
    313 message. The first byte of the bitfield corresponds to indices 0 - 7 
    314 from high bit to low bit, respectively. The next one 8-15, etc. Spare 
    315 bits at the end are set to zero. </p> 
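<p>As a sketch of the bit ordering described above (high bit first, spare bits zero), the following Python helpers convert between a set of piece indices and a bitfield payload. The function names are hypothetical and chosen only for this example.</p>

```python
def encode_bitfield(have, num_pieces):
    """Build a bitfield payload from an iterable of completed piece indices."""
    field = bytearray((num_pieces + 7) // 8)  # spare trailing bits stay zero
    for index in have:
        # Index 0 is the high bit of the first byte, index 7 the low bit.
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(payload, num_pieces):
    """Return the set of piece indices whose bit is set in the payload."""
    return {
        index
        for index in range(num_pieces)
        if payload[index // 8] & (0x80 >> (index % 8))
    }
```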
    316                     <p>The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of. </p> 
    317                     <p>'request' 
    318 messages contain an index, begin, and length. The last two are byte 
    319 offsets. Length is generally a power of two unless it gets truncated by 
    320 the end of the file. All current implementations use 2<sup>15</sup>, and close 
    321 connections which request an amount greater than 2<sup>17</sup>. </p> 
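<p>The message framing can be illustrated with a short Python sketch. It assumes, as the text implies, that the four-byte big-endian length prefix counts the type byte plus the payload; the helper names are hypothetical.</p>

```python
import struct

REQUEST = 6  # message type for 'request'
KEEPALIVE = struct.pack(">I", 0)  # a message of length zero


def encode_message(msg_type, payload=b""):
    """Prefix the type byte and payload with a four-byte big-endian length."""
    return struct.pack(">IB", 1 + len(payload), msg_type) + payload


def encode_request(index, begin, length):
    """A 'request' carries index, begin, and length as big-endian integers."""
    return encode_message(REQUEST, struct.pack(">III", index, begin, length))


def split_messages(buffer):
    """Return ([(type, payload), ...], unconsumed_bytes), skipping keepalives."""
    messages = []
    pos = 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break  # the rest of this message has not arrived yet
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue  # keepalive, ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[pos:]
```

<p>Returning the unconsumed tail lets a caller append newly received bytes and call <code>split_messages</code> again.</p>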
    322                     <p>'cancel' 
    323 messages have the same payload as request messages. They are generally 
    324 only sent towards the end of a download, during what's called 'endgame 
    325 mode'. When a download is almost complete, there's a tendency for the 
    326 last few pieces to all be downloaded off a single hosed modem line, 
    327 taking a very long time. To make sure the last few pieces come in 
    328 quickly, once requests for all pieces a given downloader doesn't have 
    329 yet are currently pending, it sends requests for everything to everyone 
    330 it's downloading from. To keep this from becoming horribly inefficient, 
    331 it sends cancels to everyone else every time a piece arrives. </p> 
    332  
    333                     <p>'piece' 
    334 messages contain an index, begin, and piece. Note that they are 
    335 correlated with request messages implicitly. It's possible for an 
    336 unexpected piece to arrive if choke and unchoke messages are sent in 
    337 quick succession and/or transfer is going very slowly. </p> 
    338                     <p>Downloaders 
    339 generally download pieces in random order, which does a reasonably good 
    340 job of keeping them from having a strict subset or superset of the 
    341 pieces of any of their peers. </p> 
    342                     <p>Choking is 
    343 done for several reasons. TCP congestion control behaves very poorly 
    344 when sending over many connections at once. Also, choking lets each 
    345 peer use a tit-for-tat-ish algorithm to ensure that they get a 
    346 consistent download rate. </p> 
    347                     <p>The choking 
    348 algorithm described below is the currently deployed one. It is very 
    349 important that all new algorithms work well both in a network 
    350 consisting entirely of themselves and in a network consisting mostly of 
    351 this one. </p> 
    352                     <p>There are several criteria a 
    353 good choking algorithm should meet. It should cap the number of 
    354 simultaneous uploads for good TCP performance. It should avoid choking 
    355 and unchoking quickly, known as 'fibrillation'. It should reciprocate 
    356 to peers who let it download. Finally, it should try out unused 
    357 connections once in a while to find out if they might be better than 
    358 the currently used ones, known as optimistic unchoking. </p> 
    359                     <p>The 
    360 currently deployed choking algorithm avoids fibrillation by only 
    361 changing who's choked once every ten seconds. It does reciprocation and 
    362 number of uploads capping by unchoking the four peers which it has the 
    363 best download rates from and are interested. Peers which have a better 
    364 upload rate but aren't interested get unchoked and if they become 
    365 interested the worst uploader gets choked. If a downloader has a 
    366 complete file, it uses its upload rate rather than its download rate to 
    367 decide who to unchoke. </p> 
    368  
    369                     <p>For optimistic 
    370 unchoking, at any one time there is a single peer which is unchoked 
    371 regardless of its upload rate (if interested, it counts as one of the 
    372 four allowed downloaders.) Which peer is optimistically unchoked 
    373 rotates every 30 seconds. To give them a decent chance of getting a 
    374 complete piece to upload, new connections are three times as likely to 
    375 start as the current optimistic unchoke as anywhere else in the 
    376 rotation. </p> 
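<p>One way to read the selection rule above is sketched below in Python. This is an interpretation under stated assumptions, not the deployed implementation: <code>peers</code> is a hypothetical mapping from peer name to a <code>(rate, interested)</code> pair, and the ten-second and thirty-second timers are left to the caller.</p>

```python
def choose_unchoked(peers, optimistic=None, slots=4):
    """Return the set of peers to unchoke for the next interval.

    peers maps a peer name to (rate, interested). Peers are unchoked in
    order of decreasing rate until `slots` interested peers are unchoked;
    faster peers that are not interested are unchoked along the way.
    """
    unchoked = set()
    used = 0
    if optimistic in peers:
        # The optimistic unchoke ignores rate; if interested it uses a slot.
        unchoked.add(optimistic)
        if peers[optimistic][1]:
            used += 1
    ranked = sorted(
        (name for name in peers if name != optimistic),
        key=lambda name: peers[name][0],
        reverse=True,
    )
    for name in ranked:
        if used >= slots:
            break
        unchoked.add(name)
        if peers[name][1]:
            used += 1
    return unchoked
```

<p>A downloader with a complete file would pass its upload rates instead of download rates, as the text notes.</p>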
     15<div id="nav"> 
     16<ul> 
     17<li><a href="./index.html">Home</a></li> 
     18<li><a href="./introduction.html">For Users</a></li> 
     19<li><span>For Developers</span></li> 
     20<!-- <li><a href="./blog.html">Blog</a></li> --> 
     21<!-- <li><a href="./donate.html">Donate!</a></li> --> 
     22</ul> 
    377 23 </div> 
    378 <!-- CONTENT ENDS --> 
    379  
     24<!-- ### Begin Content ### --> 
     25<div id="second"> 
     26<p>BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.</p> 
     27<h3>A BitTorrent file distribution consists of these entities:</h3> 
     28<ul> 
     29<li>An ordinary web server</li> 
     30<li>A static 'metainfo' file</li> 
     31<li>A BitTorrent tracker</li> 
     32<li>An 'original' downloader</li> 
     33<li>The end user web browsers</li> 
     34<li>The end user downloaders</li> 
     35</ul> 
     36<p>There are ideally many end users for a single file.</p> 
     37<h3>To start serving, a host goes through the following steps:</h3> 
     38<ol> 
     39<li>Start running a tracker (or, more likely, have one running already).</li> 
     40<li>Start running an ordinary web server, such as apache, or have one already.</li> 
     41<li>Associate the extension .torrent with mimetype <code>application/x-bittorrent</code> on their web server (or have done so already).</li> 
     42<li>Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.</li> 
     43<li>Put the metainfo file on the web server.</li> 
     44<li>Link to the metainfo (.torrent) file from some other web page.</li> 
     45<li>Start a downloader which already has the complete file (the 'origin').</li> 
     46</ol> 
     47<h3>To start downloading, a user does the following:</h3> 
     48<ol> 
     49<li>Install BitTorrent (or have done so already).</li> 
     50<li>Surf the web.</li> 
     51<li>Click on a link to a .torrent file.</li> 
     52<li>Select where to save the file locally, or select a partial download to resume.</li> 
     53<li>Wait for download to complete.</li> 
     54<li>Tell downloader to exit (it keeps uploading until this happens).</li> 
     55</ol> 
     56<h3>The connectivity is as follows:</h3> 
     57<ul> 
     58<li>Strings are length-prefixed base ten followed by a colon and the string. For example <code>4:spam</code> corresponds to 'spam'.</li> 
     59<li>Integers are represented by an 'i' followed by the number in base 10 followed by an 'e'. For example <code>i3e</code> corresponds to 3 and <code>i-3e</code> corresponds to -3. Integers have no size limitation. <code>i-0e</code> is invalid. All encodings with a leading zero, such as <code>i03e</code>, are invalid, other than <code>i0e</code>, which of course corresponds to 0.</li> 
     60<li>Lists are encoded as an 'l' followed by their elements (also bencoded) followed by an 'e'. For example <code>l4:spam4:eggse</code> corresponds to ['spam', 'eggs'].</li> 
     61<li>Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, <code>d3:cow3:moo4:spam4:eggse</code> corresponds to {'cow': 'moo', 'spam': 'eggs'} and <code>d4:spaml1:a1:bee</code> corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).</li> 
     62</ul> 
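<p>The bencoding rules listed above can be sketched as a small Python encoder. This is a minimal illustration rather than the reference implementation; the function name <code>bencode</code> and the choice to encode text strings as UTF-8 are assumptions made for the example.</p>

```python
def bencode(value):
    """Encode integers, strings, lists, and dictionaries as bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of bencoding")
    if isinstance(value, int):
        # str() never produces a leading zero or '-0'.
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys must be strings and appear sorted as raw byte strings.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))
```

<p>For instance, <code>bencode({"spam": ["a", "b"]})</code> produces the bytes of <code>d4:spaml1:a1:bee</code>, matching the example above.</p>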
     63<h3>Metainfo files are bencoded dictionaries with the following keys:</h3> 
     64<dl> 
     65<dt><code>announce</code></dt> 
     66<dd>The URL of the tracker.</dd> 
     67<dt><code>info</code></dt> 
     68<dd>This maps to a dictionary, with keys described below.</dd> 
     69<dd>The <code>name</code> key maps to a string which is the suggested name to save the file (or directory) as. It is purely advisory.</dd> 
     70<dd><code>piece length</code> maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one which may be truncated. <code>piece length</code> is almost always a power of two, most commonly 2<sup>18</sup> = 256 K (BitTorrent prior to version 3.2 uses 2<sup>20</sup> = 1 M as default).</dd> 
     71<dd><code>pieces</code> maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.</dd> 
     72<dd>There is also a key <code>length</code> or a key <code>files</code>, but not both or neither. If <code>length</code> is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.</dd> 
     73<dd>In the single file case, <code>length</code> maps to the length of the file in bytes.</dd> 
     74<dd>For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value <code>files</code> maps to, and is a list of dictionaries containing the following keys:</dd> 
     75<dd><code>length</code> - The length of the file, in bytes.</dd> 
     76<dd><code>path</code> - A list of strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).</dd> 
     77<dd>In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.</dd> 
     78</dl> 
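<p>To illustrate how the <code>pieces</code>, <code>length</code>, and <code>files</code> keys are used, here is a hedged Python sketch. It assumes the <code>info</code> dictionary has already been decoded into a Python dict with text keys; the helper names are hypothetical.</p>

```python
import hashlib


def split_piece_hashes(pieces):
    """Cut the 'pieces' string into 20-byte SHA1 digests, one per piece."""
    if len(pieces) % 20 != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def piece_is_valid(index, data, pieces):
    """Check downloaded piece data against the digest at the same index."""
    return hashlib.sha1(data).digest() == split_piece_hashes(pieces)[index]


def total_length(info):
    """Return the total byte count for a single-file or multi-file info dict."""
    if ("length" in info) == ("files" in info):
        raise ValueError("exactly one of 'length' and 'files' must be present")
    if "length" in info:
        return info["length"]
    # Multi-file: treated as one file formed by concatenating in list order.
    return sum(entry["length"] for entry in info["files"])
```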
     79<h3>Tracker GET requests have the following keys:</h3> 
     80<dl> 
     81<dt><code>info_hash</code></dt> 
     82<dd>The 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.</dd> 
     83<dt><code>peer_id</code></dt> 
     84<dd>A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.</dd> 
     85<dt><code>ip</code></dt> 
     86<dd>An optional parameter giving the IP (or dns name) which this peer is at. Generally used for the origin if it's on the same machine as the tracker.</dd> 
     87<dt><code>port</code></dt> 
     88<dd>The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and if that port is taken try 6882, then 6883, etc. and give up after 6889.</dd> 
     89<dt><code>uploaded</code></dt> 
     90<dd>The total amount uploaded so far, encoded in base ten ascii.</dd> 
     91<dt><code>downloaded</code></dt> 
     92<dd>The total amount downloaded so far, encoded in base ten ascii.</dd> 
     93<dt><code>left</code></dt> 
     94<dd>The number of bytes this peer still has to download, encoded in base ten ascii. Note that this can't be computed from downloaded and the file length since it might be a resume, and there's a chance that some of the downloaded data failed an integrity check and had to be re-downloaded.</dd> 
     95<dt><code>event</code></dt> 
     96<dd>This is an optional key which maps to <code>started</code>, <code>completed</code>, or <code>stopped</code> (or <code>empty</code>, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using <code>started</code> is sent when a download first begins, and one using <code>completed</code> is sent when the download is complete. No <code>completed</code> is sent if the file was complete when started. Downloaders send an announcement using <code>stopped</code> when they cease downloading.</dd> 
     97</dl> 
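<p>As a rough illustration of the keys above, the following sketch builds an announce URL with the Python standard library. The function names, the tracker URL, and the example peer id are placeholders invented for this example; the only point being shown is that the raw 20-byte <code>info_hash</code> and <code>peer_id</code> values must be percent-escaped, while the counters are sent as base ten ASCII.</p>

```python
import os
from urllib.parse import quote, urlencode


def new_peer_id() -> bytes:
    """Generate a random 20-byte id, as done at the start of a new download."""
    return os.urandom(20)


def announce_url(tracker: str, info_hash: bytes, peer_id: bytes, port: int,
                 uploaded: int, downloaded: int, left: int, event: str = "") -> str:
    """Build a tracker GET request URL (hypothetical helper, not a library API)."""
    params = {
        "info_hash": info_hash,    # raw bytes, escaped by urlencode
        "peer_id": peer_id,        # raw bytes, escaped by urlencode
        "port": port,
        "uploaded": uploaded,      # integers are rendered in base ten
        "downloaded": downloaded,
        "left": left,
    }
    if event:                      # omitted for the regular periodic announcements
        params["event"] = event
    # quote_via=quote escapes a space as %20 rather than "+".
    return tracker + "?" + urlencode(params, quote_via=quote)


url = announce_url("http://tracker.example/announce", bytes(range(20)),
                   b"-XX0001-abcdefghijkl", 6881, 0, 0, 100, "started")
print(url)
```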
     98<p>Tracker responses are bencoded dictionaries. If a tracker response has a key <code>failure reason</code>, then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: <code>interval</code>, which maps to the number of seconds the downloader should wait between regular rerequests, and <code>peers</code>. <code>peers</code> maps to a list of dictionaries corresponding to peers, each of which contains the keys <code>peer id</code>, <code>ip</code>, and <code>port</code>, which map to the peer's self-selected ID, IP address or dns name as a string, and port number, respectively. Note that downloaders may rerequest at nonscheduled times if an event happens or they need more peers.</p> 
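<p>A minimal sketch of reading such a response follows. The names <code>bdecode</code> and <code>parse_response</code> are hypothetical, the decoder does no validation beyond what is needed to walk the structure, and raising <code>RuntimeError</code> on <code>failure reason</code> is just one possible choice.</p>

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at index i; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                       # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                       # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            result[key], i = bdecode(data, i)
        return result, i + 1
    colon = data.index(b":", i)         # string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def parse_response(raw: bytes):
    """Return (interval, [(ip, port, peer_id), ...]) or raise on a failure reason."""
    response, _ = bdecode(raw)
    if b"failure reason" in response:
        raise RuntimeError(response[b"failure reason"].decode("utf-8", "replace"))
    peers = [(p[b"ip"].decode(), p[b"port"], p[b"peer id"]) for p in response[b"peers"]]
    return response[b"interval"], peers


raw = (b"d8:intervali1800e5:peersld2:ip9:127.0.0.17:peer id20:"
       + b"A" * 20 + b"4:porti6881eeee")
print(parse_response(raw))
```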
     99<p>If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.</p> 
     100<p>BitTorrent's peer protocol operates over TCP. It performs efficiently without setting any socket options.</p> 
     101<p>Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.</p> 
     102<p>The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.</p> 
     103<p>Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.</p> 
     104<p>Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times - whenever a downloader doesn't have something they currently would ask a peer for if unchoked, they must express lack of interest, despite being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.</p> 
     105<p>Connections start out choked and not interested.</p> 
     106<p>When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called 'pipelining'.) On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.</p> 
     107<p>The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.</p> 
     108<p>All later integers sent in the protocol are encoded as four bytes big-endian.</p> 
     109<p>After the fixed headers come eight reserved bytes, which are all zero in all current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.</p> 
     110<p>Next comes the 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as <code>info_hash</code> to the tracker, only here it's raw instead of quoted.) If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.</p> 
     111<p>After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.</p> 
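<p>The handshake layout described above (length prefix, protocol string, eight reserved bytes, download hash, peer id) adds up to 68 bytes. The sketch below builds and parses it; the function names and the choice of <code>ValueError</code> for a malformed handshake are assumptions made for this example.</p>

```python
PSTR = b"BitTorrent protocol"   # 19 bytes, preceded by its own length


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Return the 68-byte handshake for the given download hash and peer id."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must both be 20 bytes")
    reserved = bytes(8)         # all zero in current implementations
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data: bytes):
    """Split a received handshake into (reserved, info_hash, peer_id)."""
    if len(data) != 68 or data[0] != 19 or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]


handshake = build_handshake(b"\x11" * 20, b"-XX0001-abcdefghijkl")
reserved, info_hash, peer_id = parse_handshake(handshake)
print(len(handshake), info_hash.hex(), peer_id)
```

<p>A receiver would compare the returned download hash against the one it is serving (or its list of downloads) and sever the connection on a mismatch.</p>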
     112<p>That's it for handshaking, next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.</p> 
     113<h3>All non-keepalive messages start with a single byte which gives their type.<br /> 
     114The possible values are:</h3> 
     115<ul> 
     116<li>0 - choke</li> 
     117<li>1 - unchoke</li> 
     118<li>2 - interested</li> 
     119<li>3 - not interested</li> 
     120<li>4 - have</li> 
     121<li>5 - bitfield</li> 
     122<li>6 - request</li> 
     123<li>7 - piece</li> 
     124<li>8 - cancel</li> 
     125</ul> 
     126<p>'choke', 'unchoke', 'interested', and 'not interested' have no payload.</p> 
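<p>The framing described above (a four-byte big-endian length prefix, then a type byte and payload, with zero-length messages acting as keepalives) can be sketched as follows. The function names are hypothetical, and the sketch assumes the caller accumulates bytes read from the socket into a buffer.</p>

```python
import struct


def encode_message(msg_type=None, payload: bytes = b"") -> bytes:
    """Frame one message; calling with no type produces a keepalive."""
    if msg_type is None:
        return struct.pack(">I", 0)
    body = bytes([msg_type]) + payload
    return struct.pack(">I", len(body)) + body


def split_messages(buffer: bytes):
    """Return ([(type, payload), ...], unconsumed_bytes) from a receive buffer."""
    messages = []
    pos = 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break                       # message not fully received yet
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue                    # keepalive: ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[pos:]


# keepalive, unchoke (1), have (4) for piece 7, then two bytes of a partial prefix
stream = (encode_message() + encode_message(1)
          + encode_message(4, struct.pack(">I", 7)) + b"\x00\x00")
print(split_messages(stream))
```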
     127<p>'bitfield' is only ever sent as the first message. Its payload is a bitfield with each index that downloader has sent set to one and the rest set to zero. Downloaders which don't have anything yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0 - 7 from high bit to low bit, respectively. The next one 8-15, etc. Spare bits at the end are set to zero.</p> 
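<p>The bit ordering is easy to get backwards, so here is a small sketch of it. The function names are made up for this example; <code>num_pieces</code> is assumed to be known from the metainfo file.</p>

```python
def encode_bitfield(have, num_pieces: int) -> bytes:
    """Build a bitfield payload: index 0 is the high bit of the first byte."""
    field = bytearray((num_pieces + 7) // 8)    # spare bits at the end stay zero
    for index in have:
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field: bytes, num_pieces: int) -> set:
    """Return the set of piece indices whose bit is set."""
    return {i for i in range(num_pieces) if field[i // 8] & (0x80 >> (i % 8))}


payload = encode_bitfield({0, 7, 9}, 10)
print(payload.hex(), sorted(decode_bitfield(payload, 10)))
```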
     128<p>The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.</p> 
     129<p>'request' messages contain an index, begin, and length. The last two are byte offsets. Length is generally a power of two unless it gets truncated by the end of the file. All current implementations use 2<sup>15</sup>, and close connections which request an amount greater than 2<sup>17</sup>.</p> 
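<p>As an illustration, the sketch below splits one piece into framed 'request' messages using the block size mentioned above, and shows the size check a receiving side might apply. The names <code>piece_requests</code> and <code>acceptable_request</code> are hypothetical, and <code>piece_length</code> stands for the actual length of the piece being requested (the final piece may be shorter than the others).</p>

```python
import struct

BLOCK = 2 ** 15          # request length used by current implementations
MAX_REQUEST = 2 ** 17    # larger requests cause the connection to be closed
REQUEST = 6              # message type for 'request'


def piece_requests(index: int, piece_length: int):
    """Return the framed 'request' messages needed to fetch one whole piece."""
    messages = []
    for begin in range(0, piece_length, BLOCK):
        length = min(BLOCK, piece_length - begin)   # truncated at the end
        payload = struct.pack(">III", index, begin, length)
        messages.append(struct.pack(">IB", 1 + len(payload), REQUEST) + payload)
    return messages


def acceptable_request(length: int) -> bool:
    """Size check applied to an incoming request."""
    return 0 < length <= MAX_REQUEST


for message in piece_requests(3, 2 ** 15 + 100):
    print(message.hex())
```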
     130<p>'cancel' messages have the same payload as request messages. They are generally only sent towards the end of a download, during what's called 'endgame mode'. When a download is almost complete, there's a tendency for the last few pieces to all be downloaded off a single hosed modem line, taking a very long time. To make sure the last few pieces come in quickly, once requests for all pieces a given downloader doesn't have yet are currently pending, it sends requests for everything to everyone it's downloading from. To keep this from becoming horribly inefficient, it sends cancels to everyone else every time a piece arrives.</p> 
     131<p>'piece' messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. It's possible for an unexpected piece to arrive if choke and unchoke messages are sent in quick succession and/or transfer is going very slowly.</p> 
     132<p>Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.</p> 
     133<p>Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure that they get a consistent download rate.</p> 
     134<p>The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.</p> 
     135<p>There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.</p> 
     136<p>The currently deployed choking algorithm avoids fibrillation by only changing who's choked once every ten seconds. It does reciprocation and number of uploads capping by unchoking the four peers which it has the best download rates from and are interested. Peers which have a better upload rate but aren't interested get unchoked and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.</p> 
     137<p>For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders.) Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.</p> 
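<p>One possible reading of the selection step is sketched below; it is not taken from any deployed client. The function name, the <code>peers</code> mapping, and the <code>slots</code> parameter are assumptions made for this example. The sketch covers a single decision round only: the caller is assumed to run it once every ten seconds, to rotate the optimistic peer every 30 seconds, and to supply upload rates instead of download rates once it has a complete file.</p>

```python
def choose_unchoked(peers: dict, optimistic=None, slots: int = 4) -> set:
    """Pick which peers to unchoke in one round.

    `peers` maps a peer key to (rate, interested), where rate is the transfer
    rate used for ranking. Returns the set of peer keys to unchoke.
    """
    unchoked = set()
    used = 0                                  # interested peers unchoked so far
    if optimistic in peers:
        unchoked.add(optimistic)              # unchoked regardless of its rate
        if peers[optimistic][1]:
            used += 1                         # counts as one of the downloaders
    ranked = sorted((p for p in peers if p != optimistic),
                    key=lambda p: peers[p][0], reverse=True)
    for peer in ranked:
        if used >= slots:
            break
        # Faster peers that are not interested are unchoked too, but do not
        # use up a slot; if one becomes interested, the next round drops the
        # slowest interested peer.
        unchoked.add(peer)
        if peers[peer][1]:
            used += 1
    return unchoked


peers = {"a": (50, True), "b": (40, False), "c": (30, True),
         "d": (20, True), "e": (10, True), "f": (5, True)}
print(sorted(choose_unchoked(peers, optimistic="f")))
```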
    380138</div> 
    381 <!-- LEFT ENDS --> 
    382  
    383 <!-- FOOTER STARTS --> 
    384                 <div class="dashlines"> 
    385 <br>Copyright &copy 2006 bittorrent.org </A> 
    386                 </div> 
    387 <!-- FOOTER ENDS --> 
    388 </body></html> 
     139<!-- ### End Content ### --> 
     140</div> 
     141</div> 
     142<div id="footer"> 
     143<hr /> 
     144<p>Copyright © 2006 BitTorrent.org</p> 
     145</div> 
     146</body> 
     147</html> 