Changeset 10506 for dotorg/trunk/html/beps/bep_0002.html
- Timestamp:
- 02/02/08 03:04:50 (2 years ago)
- File:
-
- 1 edited
-
dotorg/trunk/html/beps/bep_0002.html (modified) (1 diff)
Legend:
- Unmodified
- Added
- Removed
-
dotorg/trunk/html/beps/bep_0002.html
r10505 r10506 1 <?xml version="1.0" encoding="utf-8" ?> 2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 4 <head> 5 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 6 <meta name="generator" content="Docutils 0.5: http://docutils.sourceforge.net/" /> 7 <title></title> 8 <link rel="stylesheet" href="../css/screen.css" type="text/css" /> 9 </head> 10 <body> 11 <div class="document"> 12 13 <div id="upper" class="clear"> 14 <div id="wrap"> 15 <div id="header"> 16 <h1><a href="./index.html">BitTorrent<span>.org</span></a></h1> 17 </div> 18 <div id="nav"> 19 <ul> 20 <li><a href="./index.html">Home</a></li> 21 <li><a href="./introduction.html">For Users</a></li> 22 <li><span>For Developers</span></li> 23 <!-- <li><a href="./blog">Blog</a></li> --> 24 <li><a href="./donate.html">Donate!</a></li> 25 </ul> 26 </div> <!-- nav --> 27 <!-- ### Begin Content ### --> 28 <div id="second"> 29 30 31 32 <table class="rfc2822 docutils field-list" frame="void" rules="none"> 33 <col class="field-name" /> 34 <col class="field-body" /> 35 <tbody valign="top"> 36 <tr class="field"><th class="field-name">BEP:</th><td class="field-body">2</td> 37 </tr> 38 <tr class="field"><th class="field-name">Title:</th><td class="field-body">The BitTorrent Protocol Specification</td> 39 </tr> 40 <tr class="field"><th class="field-name">Version:</th><td class="field-body">2</td> 41 </tr> 42 <tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference external" href="https://svn.bittorrent.com/trac.cgi/browser/dotorg/trunk/html/beps/bep_0002.rst">11-Jan-2008</a></td> 43 </tr> 44 <tr class="field"><th class="field-name">Author:</th><td class="field-body">Bram Cohen <bram at bittorrent.com></td> 45 </tr> 46 <tr class="field"><th class="field-name">Status:</th><td class="field-body">Final</td> 47 </tr> 48 <tr class="field"><th class="field-name">Type:</th><td class="field-body">Standard</td> 49 </tr> 50 <tr class="field"><th class="field-name">Created:</th><td class="field-body">10-Jan-2008</td> 51 </tr> 52 <tr class="field"><th class="field-name">Post-History:</th><td class="field-body"></td> 53 </tr> 54 </tbody> 55 </table> 56 <hr /> 57 <div class="contents topic" id="contents"> 58 <p class="topic-title first">Contents</p> 59 <ul class="simple"> 60 <li><a class="reference internal" href="#a-bittorrent-file-distribution-consists-of-these-entities" id="id2">A BitTorrent file distribution consists of these entities:</a></li> 61 <li><a class="reference internal" href="#to-start-serving-a-host-goes-through-the-following-steps" id="id3">To start serving, a host goes through the following steps:</a></li> 62 <li><a class="reference internal" href="#to-start-downloading-a-user-does-the-following" id="id4">To start downloading, a user does the following:</a></li> 63 <li><a class="reference internal" href="#the-connectivity-is-as-follows" id="id5">The connectivity is as follows:</a></li> 64 <li><a class="reference internal" href="#metainfo-files-are-bencoded-dictionaries-with-the-following-keys" id="id6">Metainfo files are bencoded dictionaries with the following keys:</a></li> 65 <li><a class="reference internal" href="#tracker-get-requests-have-the-following-keys" id="id7">Tracker GET requests have the following keys:</a></li> 66 <li><a class="reference internal" href="#all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type" id="id8">All non-keepalive messages start with a single byte which gives their type.</a></li> 67 <li><a class="reference internal" href="#the-possible-values-are" id="id9">The possible values are:</a></li> 68 </ul> 69 </div> 70 <p>BitTorrent is a protocol for distributing files. It identifies content 71 by URL and is designed to integrate seamlessly with the web. Its 72 advantage over plain HTTP is that when multiple downloads of the same 73 file happen concurrently, the downloaders upload to each other, making 74 it possible for the file source to support very large numbers of 75 downloaders with only a modest increase in its load.</p> 76 <div class="section" id="a-bittorrent-file-distribution-consists-of-these-entities"> 77 <h1>A BitTorrent file distribution consists of these entities:</h1> 78 <ul class="simple"> 79 <li>An ordinary web server</li> 80 <li>A static 'metainfo' file</li> 81 <li>A BitTorrent tracker</li> 82 <li>An 'original' downloader</li> 83 <li>The end user web browsers</li> 84 <li>The end user downloaders</li> 85 </ul> 86 <p>There are ideally many end users for a single file.</p> 87 </div> 88 <div class="section" id="to-start-serving-a-host-goes-through-the-following-steps"> 89 <h1>To start serving, a host goes through the following steps:</h1> 90 <ol class="arabic simple"> 91 <li>Start running a tracker (or, more likely, have one running already).</li> 92 <li>Start running an ordinary web server, such as apache, or have one already.</li> 93 <li>Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).</li> 94 <li>Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.</li> 95 <li>Put the metainfo file on the web server.</li> 96 <li>Link to the metainfo (.torrent) file from some other web page.</li> 97 <li>Start a downloader which already has the complete file (the 'origin').</li> 98 </ol> 99 </div> 100 <div class="section" id="to-start-downloading-a-user-does-the-following"> 101 <h1>To start downloading, a user does the following:</h1> 102 <ol class="arabic simple"> 103 <li>Install BitTorrent (or have done so already).</li> 104 <li>Surf the web.</li> 105 <li>Click on a link to a .torrent file.</li> 106 <li>Select where to save the file locally, or select a partial download to resume.</li> 107 <li>Wait for download to complete.</li> 108 <li>Tell downloader to exit (it keeps uploading until this happens).</li> 109 </ol> 110 </div> 111 <div class="section" id="the-connectivity-is-as-follows"> 112 <h1>The connectivity is as follows:</h1> 113 <ul class="simple"> 114 <li>Strings are length-prefixed base ten followed by a colon and the string. For example <code>4:spam</code> corresponds to 'spam'.</li> 115 <li>Integers are represented by an 'i' followed by the number in base 10 116 followed by an 'e'. For example <code>i3e</code> corresponds to 3 and 117 <code>i-3e </code>corresponds to -3. Integers have no size 118 limitation. <code>i-0e</code> is invalid. All encodings with a leading 119 zero, such as <code>i03e</code>, are invalid, other than 120 <code>i0e</code>, which of course corresponds to 0.</li> 121 <li>Lists are encoded as an 'l' followed by their elements (also 122 bencoded) followed by an 'e'. For example <code>l4:spam4:eggse</code> 123 corresponds to ['spam', 'eggs'].</li> 124 <li>Dictionaries are encoded as a 'd' followed by a list of alternating 125 keys and their corresponding values followed by an 'e'. For example, 126 <code>d3:cow3:moo4:spam4:eggse</code> corresponds to {'cow': 'moo', 127 'spam': 'eggs'} and <code>d4:spaml1:a1:bee</code> corresponds to 128 {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order 129 (sorted as raw strings, not alphanumerics).</li> 130 </ul> 131 </div> 132 <div class="section" id="metainfo-files-are-bencoded-dictionaries-with-the-following-keys"> 133 <h1>Metainfo files are bencoded dictionaries with the following keys:</h1> 134 <dl class="docutils"> 135 <dt>announce</dt> 136 <dd>The URL of the tracker.</dd> 137 <dt>info</dt> 138 <dd><p class="first">This maps to a dictionary, with keys described below.</p> 139 <p>The <code>name</code> key maps to a string which is the suggested name 140 to save the file (or directory) as. It is purely advisory.</p> 141 <p><code>piece length</code> maps to the number of bytes in each piece 142 the file is split into. For the purposes of transfer, files are 143 split into fixed-size pieces which are all the same length except for 144 possibly the last one which may be truncated. <code>piece 145 length</code> is almost always a power of two, most commonly 2 18 = 146 256 K (BitTorrent prior to version 3.2 uses 2 20 = 1 M as 147 default).</p> 148 <p><code>pieces</code> maps to a string whose length is a multiple of 149 20. It is to be subdivided into strings of length 20, each of which is 150 the SHA1 hash of the piece at the corresponding index.</p> 151 <p>There is also a key <code>length</code> or a key <code>files</code>, 152 but not both or neither. If <code>length</code> is present then the 153 download represents a single file, otherwise it represents a set of 154 files which go in a directory structure.</p> 155 <p>In the single file case, <code>length</code> maps to the length of 156 the file in bytes.</p> 157 <p>For the purposes of the other keys, the multi-file case is treated as 158 only having a single file by concatenating the files in the order they 159 appear in the files list. The files list is the value 160 <code>files</code> maps to, and is a list of dictionaries containing 161 the following keys:</p> 162 <p><code>length</code> - The length of the file, in bytes.</p> 163 <p><code>path</code> - A list of strings corresponding to subdirectory 164 names, the last of which is the actual file name (a zero length list 165 is an error case).</p> 166 <p class="last">In the single file case, the name key is the name of a file, in the 167 muliple file case, it's the name of a directory.</p> 168 </dd> 169 </dl> 170 </div> 171 <div class="section" id="tracker-get-requests-have-the-following-keys"> 172 <h1>Tracker GET requests have the following keys:</h1> 173 <dl class="docutils"> 174 <dt>info_hash</dt> 175 <dd>The 20 byte sha1 hash of the bencoded form of the info value from the 176 metainfo file. Note that this is a substring of the metainfo 177 file. This value will almost certainly have to be escaped.</dd> 178 <dt>peer_id</dt> 179 <dd>A string of length 20 which this downloader uses as its id. Each 180 downloader generates its own id at random at the start of a new 181 download. This value will also almost certainly have to be escaped.</dd> 182 <dt>ip</dt> 183 <dd>An optional parameter giving the IP (or dns name) which this peer is 184 at. Generally used for the origin if it's on the same machine as the 185 tracker.</dd> 186 <dt>port</dt> 187 <dd>The port number this peer is listening on. Common behavior is for a 188 downloader to try to listen on port 6881 and if that port is taken try 189 6882, then 6883, etc. and give up after 6889.</dd> 190 <dt>uploaded</dt> 191 <dd>The total amount uploaded so far, encoded in base ten ascii.</dd> 192 <dt>downloaded</dt> 193 <dd>The total amount downloaded so far, encoded in base ten ascii.</dd> 194 <dt>left</dt> 195 <dd>The number of bytes this peer still has to download, encoded in 196 base ten ascii. Note that this can't be computed from downloaded and 197 the file length since it might be a resume, and there's a chance that 198 some of the downloaded data failed an integrity check and had to be 199 re-downloaded.</dd> 200 <dt>event</dt> 201 <dd>This is an optional key which maps to <code>started</code>, 202 <code>completed</code>, or <code>stopped</code> (or 203 <code>empty</code>, which is the same as not being present). If not 204 present, this is one of the announcements done at regular 205 intervals. An announcement using <code>started</code> is sent when a 206 download first begins, and one using <code>completed</code> is sent 207 when the download is complete. No <code>completed</code> is sent if 208 the file was complete when started. Downloaders send an announcement 209 using <code>stopped</code> when they cease downloading.</dd> 210 </dl> 211 <p>Tracker responses are bencoded dictionaries. If a tracker response 212 has a key <code>failure reason</code>, then that maps to a human 213 readable string which explains why the query failed, and no other keys 214 are required. Otherwise, it must have two keys: <code>interval</code>, 215 which maps to the number of seconds the downloader should wait between 216 regular rerequests, and <code>peers</code>. <code>peers</code> maps to 217 a list of dictionaries corresponding to <code>peers</code>, each of 218 which contains the keys <code>peer id</code>, <code>ip</code>, and 219 <code>port</code>, which map to the peer's self-selected ID, IP 220 address or dns name as a string, and port number, respectively. Note 221 that downloaders may rerequest on nonscheduled times if an event 222 happens or they need more peers.</p> 223 <p>If you want to make any extensions to metainfo files or tracker 224 queries, please coordinate with Bram Cohen to make sure that all 225 extensions are done compatibly.</p> 226 <p>BitTorrent's peer protocol operates over TCP. It performs efficiently 227 without setting any socket options.</p> 228 <p>Peer connections are symmetrical. Messages sent in both directions 229 look the same, and data can flow in either direction.</p> 230 <p>The peer protocol refers to pieces of the file by index as 231 described in the metainfo file, starting at zero. When a peer finishes 232 downloading a piece and checks that the hash matches, it announces 233 that it has that piece to all of its peers.</p> 234 <p>Connections contain two bits of state on either end: choked or not, 235 and interested or not. Choking is a notification that no data will be 236 sent until unchoking happens. The reasoning and common techniques 237 behind choking are explained later in this document.</p> 238 <p>Data transfer takes place whenever one side is interested and the 239 other side is not choking. Interest state must be kept up to date at 240 all times - whenever a downloader doesn't have something they 241 currently would ask a peer for in unchoked, they must express lack of 242 interest, despite being choked. Implementing this properly is tricky, 243 but makes it possible for downloaders to know which peers will start 244 downloading immediately if unchoked.</p> 245 <p>Connections start out choked and not interested.</p> 246 <p>When data is being transferred, downloaders should keep several 247 piece requests queued up at once in order to get good TCP performance 248 (this is called 'pipelining'.) On the other side, requests which can't 249 be written out to the TCP buffer immediately should be queued up in 250 memory rather than kept in an application-level network buffer, so 251 they can all be thrown out when a choke happens.</p> 252 <p>The peer wire protocol consists of a handshake followed by a 253 never-ending stream of length-prefixed messages. The handshake starts 254 with character ninteen (decimal) followed by the string 'BitTorrent 255 protocol'. The leading character is a length prefix, put there in the 256 hope that other new protocols may do the same and thus be trivially 257 distinguishable from each other.</p> 258 <p>All later integers sent in the protocol are encoded as four bytes 259 big-endian.</p> 260 <p>After the fixed headers come eight reserved bytes, which are all 261 zero in all current implementations. If you wish to extend the 262 protocol using these bytes, please coordinate with Bram Cohen to make 263 sure all extensions are done compatibly.</p> 264 <p>Next comes the 20 byte sha1 hash of the bencoded form of the info 265 value from the metainfo file. (This is the same value which is 266 announced as <code>info_hash</code> to the tracker, only here it's raw 267 instead of quoted here). If both sides don't send the same value, they 268 sever the connection. The one possible exception is if a downloader 269 wants to do multiple downloads over a single port, they may wait for 270 incoming connections to give a download hash first, and respond with 271 the same one if it's in their list.</p> 272 <p>After the download hash comes the 20-byte peer id which is reported 273 in tracker requests and contained in peer lists in tracker 274 responses. If the receiving side's peer id doesn't match the one the 275 initiating side expects, it severs the connection.</p> 276 <p>That's it for handshaking, next comes an alternating stream of 277 length prefixes and messages. Messages of length zero are keepalives, 278 and ignored. Keepalives are generally sent once every two minutes, but 279 note that timeouts can be done much more quickly when data is 280 expected.</p> 281 </div> 282 <div class="section" id="all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type"> 283 <h1>All non-keepalive messages start with a single byte which gives their type.</h1> 284 </div> 285 <div class="section" id="the-possible-values-are"> 286 <h1>The possible values are:</h1> 287 <ul class="simple"> 288 <li>0 - choke</li> 289 <li>1 - unchoke</li> 290 <li>2 - interested</li> 291 <li>3 - not interested</li> 292 <li>4 - have</li> 293 <li>5 - bitfield</li> 294 <li>6 - request</li> 295 <li>7 - piece</li> 296 <li>8 - cancel</li> 297 </ul> 298 <p>'choke', 'unchoke', 'interested', and 'not interested' have no payload.</p> 299 <p>'bitfield' is only ever sent as the first message. Its payload is a 300 bitfield with each index that downloader has sent set to one and the 301 rest set to zero. Downloaders which don't have anything yet may skip 302 the 'bitfield' message. The first byte of the bitfield corresponds to 303 indices 0 - 7 from high bit to low bit, respectively. The next one 304 8-15, etc. Spare bits at the end are set to zero.</p> 305 <p>The 'have' message's payload is a single number, the index which 306 that downloader just completed and checked the hash of.</p> 307 <p>'request' messages contain an index, begin, and length. The last 308 two are byte offsets. Length is generally a power of two unless it 309 gets truncated by the end of the file. All current implementations use 310 2 15 , and close connections which request an amount greater than 2 311 17.</p> 312 <p>'cancel' messages have the same payload as request messages. They 313 are generally only sent towards the end of a download, during what's 314 called 'endgame mode'. When a download is almost complete, there's a 315 tendency for the last few pieces to all be downloaded off a single 316 hosed modem line, taking a very long time. To make sure the last few 317 pieces come in quickly, once requests for all pieces a given 318 downloader doesn't have yet are currently pending, it sends requests 319 for everything to everyone it's downloading from. To keep this from 320 becoming horribly inefficient, it sends cancels to everyone else every 321 time a piece arrives.</p> 322 <p>'piece' messages contain an index, begin, and piece. Note that they 323 are correlated with request messages implicitly. It's possible for an 324 unexpected piece to arrive if choke and unchoke messages are sent in 325 quick succession and/or transfer is going very slowly.</p> 326 <p>Downloaders generally download pieces in random order, which does a 327 reasonably good job of keeping them from having a strict subset or 328 superset of the pieces of any of their peers.</p> 329 <p>Choking is done for several reasons. TCP congestion control behaves 330 very poorly when sending over many connections at once. Also, choking 331 lets each peer use a tit-for-tat-ish algorithm to ensure that they get 332 a consistent download rate.</p> 333 <p>The choking algorithm described below is the currently deployed 334 one. It is very important that all new algorithms work well both in a 335 network consisting entirely of themselves and in a network consisting 336 mostly of this one.</p> 337 <p>There are several criteria a good choking algorithm should meet. It 338 should cap the number of simultaneous uploads for good TCP 339 performance. It should avoid choking and unchoking quickly, known as 340 'fibrillation'. It should reciprocate to peers who let it 341 download. Finally, it should try out unused connections once in a 342 while to find out if they might be better than the currently used 343 ones, known as optimistic unchoking.</p> 344 <p>The currently deployed choking algorithm avoids fibrillation by 345 only changing who's choked once every ten seconds. It does 346 reciprocation and number of uploads capping by unchoking the four 347 peers which it has the best download rates from and are 348 interested. Peers which have a better upload rate but aren't 349 interested get unchoked and if they become interested the worst 350 uploader gets choked. If a downloader has a complete file, it uses its 351 upload rate rather than its download rate to decide who to 352 unchoke.</p> 353 <p>For optimistic unchoking, at any one time there is a single peer 354 which is unchoked regardless of it's upload rate (if interested, it 355 counts as one of the four allowed downloaders.) Which peer is 356 optimistically unchoked rotates every 30 seconds. To give them a 357 decent chance of getting a complete piece to upload, new connections 358 are three times as likely to start as the current optimistic unchoke 359 as anywhere else in the rotation.</p> 360 </div> 361 362 363 </div> 364 <div id="footer"> 365 <hr/> 366 <p>Copyright 2006 BitTorrent.org</p> 367 </div> 368 369 </div> 370 </body> 371 </html>
Note: See TracChangeset
for help on using the changeset viewer.