
Revision 11165, 21.7 KB (checked in by arvid, 9 months ago)

refreshed html

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.5: http://docutils.sourceforge.net/" />
<title></title>
<link rel="stylesheet" href="../css/bep.css" type="text/css" />
</head>
<body>
<div class="document">

<div id="upper" class="clear">
<div id="wrap">
<div id="header">
<h1><a href="../index.html">BitTorrent<span>.org</span></a></h1>
</div>
<div id="nav">
<ul>
<li><a href="../index.html">Home</a></li>
<li><a href="../introduction.html">For Users</a></li>
<li><a href="bep_0000.html"><span>For Developers</span></a></li>
<!-- <li><a href="./blog">Blog</a></li> -->
<li><a href="http://forum.bittorrent.org">Forums</a></li>
<li><a href="../donate.html">Donate!</a></li>
</ul>
</div> <!-- nav -->
<!-- ### Begin Content ### -->
<div id="second">
<table class="rfc2822 docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">BEP:</th><td class="field-body">3</td>
</tr>
<tr class="field"><th class="field-name">Title:</th><td class="field-body">The BitTorrent Protocol Specification</td>
</tr>
<tr class="field"><th class="field-name">Version:</th><td class="field-body">11031</td>
</tr>
<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0003.rst">2008-02-28 16:43:58 -0800 (Thu, 28 Feb 2008)</a></td>
</tr>
<tr class="field"><th class="field-name">Author:</th><td class="field-body">Bram Cohen &lt;bram&#32;&#97;t&#32;bittorrent.com&gt;</td>
</tr>
<tr class="field"><th class="field-name">Status:</th><td class="field-body">Final</td>
</tr>
<tr class="field"><th class="field-name">Type:</th><td class="field-body">Standard</td>
</tr>
<tr class="field"><th class="field-name">Created:</th><td class="field-body">10-Jan-2008</td>
</tr>
<tr class="field"><th class="field-name">Post-History:</th><td class="field-body">24-Jun-2009, clarified the encoding of strings in torrent files</td>
</tr>
</tbody>
</table>
<hr />
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference" href="#a-bittorrent-file-distribution-consists-of-these-entities" id="id2">A BitTorrent file distribution consists of these entities:</a></li>
<li><a class="reference" href="#to-start-serving-a-host-goes-through-the-following-steps" id="id3">To start serving, a host goes through the following steps:</a></li>
<li><a class="reference" href="#to-start-downloading-a-user-does-the-following" id="id4">To start downloading, a user does the following:</a></li>
<li><a class="reference" href="#the-connectivity-is-as-follows" id="id5">The connectivity is as follows:</a></li>
<li><a class="reference" href="#metainfo-files-are-bencoded-dictionaries-with-the-following-keys" id="id6">Metainfo files are bencoded dictionaries with the following keys:</a></li>
<li><a class="reference" href="#tracker-get-requests-have-the-following-keys" id="id7">Tracker GET requests have the following keys:</a></li>
<li><a class="reference" href="#all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type" id="id8">All non-keepalive messages start with a single byte which gives their type.</a></li>
<li><a class="reference" href="#the-possible-values-are" id="id9">The possible values are:</a></li>
<li><a class="reference" href="#copyright" id="id10">Copyright</a></li>
</ul>
</div>
<p>BitTorrent is a protocol for distributing files. It identifies content
by URL and is designed to integrate seamlessly with the web. Its
advantage over plain HTTP is that when multiple downloads of the same
file happen concurrently, the downloaders upload to each other, making
it possible for the file source to support very large numbers of
downloaders with only a modest increase in its load.</p>
<div class="section" id="a-bittorrent-file-distribution-consists-of-these-entities">
<h1>A BitTorrent file distribution consists of these entities:</h1>
<ul class="simple">
<li>An ordinary web server</li>
<li>A static 'metainfo' file</li>
<li>A BitTorrent tracker</li>
<li>An 'original' downloader</li>
<li>The end user web browsers</li>
<li>The end user downloaders</li>
</ul>
<p>There are ideally many end users for a single file.</p>
</div>
<div class="section" id="to-start-serving-a-host-goes-through-the-following-steps">
<h1>To start serving, a host goes through the following steps:</h1>
<ol class="arabic simple">
<li>Start running a tracker (or, more likely, have one running already).</li>
<li>Start running an ordinary web server, such as Apache, or have one already.</li>
<li>Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).</li>
<li>Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.</li>
<li>Put the metainfo file on the web server.</li>
<li>Link to the metainfo (.torrent) file from some other web page.</li>
<li>Start a downloader which already has the complete file (the 'origin').</li>
</ol>
</div>
<div class="section" id="to-start-downloading-a-user-does-the-following">
<h1>To start downloading, a user does the following:</h1>
<ol class="arabic simple">
<li>Install BitTorrent (or have done so already).</li>
<li>Surf the web.</li>
<li>Click on a link to a .torrent file.</li>
<li>Select where to save the file locally, or select a partial download to resume.</li>
<li>Wait for download to complete.</li>
<li>Tell downloader to exit (it keeps uploading until this happens).</li>
</ol>
</div>
<div class="section" id="the-connectivity-is-as-follows">
<h1>The connectivity is as follows:</h1>
<p>The list below describes bencoding, the format used for metainfo files and tracker responses:</p>
<ul class="simple">
<li>Strings are prefixed with their length in base ten, followed by a colon and the string. For example <tt class="docutils literal"><span class="pre">4:spam</span></tt> corresponds to 'spam'.</li>
<li>Integers are represented by an 'i' followed by the number in base 10
followed by an 'e'. For example <tt class="docutils literal"><span class="pre">i3e</span></tt> corresponds to 3 and
<tt class="docutils literal"><span class="pre">i-3e</span></tt> corresponds to -3. Integers have no size
limitation. <tt class="docutils literal"><span class="pre">i-0e</span></tt> is invalid. All encodings with a leading
zero, such as <tt class="docutils literal"><span class="pre">i03e</span></tt>, are invalid, other than
<tt class="docutils literal"><span class="pre">i0e</span></tt>, which of course corresponds to 0.</li>
<li>Lists are encoded as an 'l' followed by their elements (also
bencoded) followed by an 'e'. For example <tt class="docutils literal"><span class="pre">l4:spam4:eggse</span></tt>
corresponds to ['spam', 'eggs'].</li>
<li>Dictionaries are encoded as a 'd' followed by a list of alternating
keys and their corresponding values followed by an 'e'. For example,
<tt class="docutils literal"><span class="pre">d3:cow3:moo4:spam4:eggse</span></tt> corresponds to {'cow': 'moo',
'spam': 'eggs'} and <tt class="docutils literal"><span class="pre">d4:spaml1:a1:bee</span></tt> corresponds to
{'spam': ['a', 'b']}. Keys must be strings and appear in sorted order
(sorted as raw strings, not alphanumerics).</li>
</ul>
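<p>The encoding rules above can be sketched as a small recursive encoder. This is a minimal illustration, not a reference implementation: the function name <tt class="docutils literal"><span class="pre">bencode</span></tt> and the choice to accept Python text strings (encoded as UTF-8) are assumptions made for the example.</p>

```python
def bencode(value):
    """Encode ints, strings, lists, and dicts using the rules listed above."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to avoid surprises.
        raise TypeError("booleans cannot be bencoded")
    if isinstance(value, int):
        # str() never emits leading zeros or "-0", so the output is valid.
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is stored as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be strings")
            items.append((key, item))
        # Keys are sorted as raw byte strings, not alphanumerically.
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value).__name__)
```

<p>For instance, <tt class="docutils literal"><span class="pre">bencode({'spam': ['a', 'b']})</span></tt> yields the bytes <tt class="docutils literal"><span class="pre">d4:spaml1:a1:bee</span></tt>, matching the dictionary example in the list.</p>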
</div>
<div class="section" id="metainfo-files-are-bencoded-dictionaries-with-the-following-keys">
<h1>Metainfo files are bencoded dictionaries with the following keys:</h1>
<dl class="docutils">
<dt>announce</dt>
<dd>The URL of the tracker.</dd>
<dt>info</dt>
<dd><p class="first">This maps to a dictionary, with keys described below.</p>
<p>The <tt class="docutils literal"><span class="pre">name</span></tt> key maps to a UTF-8 encoded string which is the
suggested name to save the file (or directory) as. It is purely advisory.</p>
<p><tt class="docutils literal"><span class="pre">piece</span> <span class="pre">length</span></tt> maps to the number of bytes in each piece
the file is split into. For the purposes of transfer, files are
split into fixed-size pieces which are all the same length except for
possibly the last one which may be truncated. <tt class="docutils literal"><span class="pre">piece</span>
<span class="pre">length</span></tt> is almost always a power of two, most commonly 2<sup>18</sup> =
256 K (BitTorrent prior to version 3.2 uses 2<sup>20</sup> = 1 M as
default).</p>
<p><tt class="docutils literal"><span class="pre">pieces</span></tt> maps to a string whose length is a multiple of
20. It is to be subdivided into strings of length 20, each of which is
the SHA1 hash of the piece at the corresponding index.</p>
<p>There is also a key <tt class="docutils literal"><span class="pre">length</span></tt> or a key <tt class="docutils literal"><span class="pre">files</span></tt>,
but not both or neither. If <tt class="docutils literal"><span class="pre">length</span></tt> is present then the
download represents a single file, otherwise it represents a set of
files which go in a directory structure.</p>
<p>In the single file case, <tt class="docutils literal"><span class="pre">length</span></tt> maps to the length of
the file in bytes.</p>
<p>For the purposes of the other keys, the multi-file case is treated as
only having a single file by concatenating the files in the order they
appear in the files list. The files list is the value
<tt class="docutils literal"><span class="pre">files</span></tt> maps to, and is a list of dictionaries containing
the following keys:</p>
<p><tt class="docutils literal"><span class="pre">length</span></tt> - The length of the file, in bytes.</p>
<p><tt class="docutils literal"><span class="pre">path</span></tt> - A list of UTF-8 encoded strings corresponding to subdirectory
names, the last of which is the actual file name (a zero length list
is an error case).</p>
<p>In the single file case, the name key is the name of a file; in the
multiple file case, it's the name of a directory.</p>
<p class="last">All strings in a .torrent file that contain text must be UTF-8
encoded.</p>
</dd>
</dl>
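<p>As an illustration of how the <tt class="docutils literal"><span class="pre">pieces</span></tt> string relates to <tt class="docutils literal"><span class="pre">piece</span> <span class="pre">length</span></tt>, the sketch below hashes in-memory data piece by piece and splits the result back into 20-byte digests. The helper names <tt class="docutils literal"><span class="pre">piece_hashes</span></tt> and <tt class="docutils literal"><span class="pre">split_pieces</span></tt> are hypothetical, and a real tool would presumably read the file in chunks rather than hold it in memory.</p>

```python
import hashlib


def piece_hashes(data, piece_length):
    """Concatenate the SHA1 digest of every piece; the last piece may be short."""
    return b"".join(
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    )


def split_pieces(pieces):
    """Split a 'pieces' string into one 20-byte digest per piece index."""
    if len(pieces) % 20 != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def piece_is_valid(pieces, index, piece_data):
    """Check a downloaded piece against the digest stored at its index."""
    return split_pieces(pieces)[index] == hashlib.sha1(piece_data).digest()
```

<p>In the multi-file case the same functions would apply to the concatenation of the files in the order of the files list.</p>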
</div>
<div class="section" id="tracker-get-requests-have-the-following-keys">
<h1>Tracker GET requests have the following keys:</h1>
<dl class="docutils">
<dt>info_hash</dt>
<dd>The 20-byte SHA1 hash of the bencoded form of the info value from the
metainfo file. Note that this is a substring of the metainfo
file. This value will almost certainly have to be escaped.</dd>
<dt>peer_id</dt>
<dd>A string of length 20 which this downloader uses as its id. Each
downloader generates its own id at random at the start of a new
download. This value will also almost certainly have to be escaped.</dd>
<dt>ip</dt>
<dd>An optional parameter giving the IP (or DNS name) which this peer is
at. Generally used for the origin if it's on the same machine as the
tracker.</dd>
<dt>port</dt>
<dd>The port number this peer is listening on. Common behavior is for a
downloader to try to listen on port 6881 and if that port is taken try
6882, then 6883, etc. and give up after 6889.</dd>
<dt>uploaded</dt>
<dd>The total amount uploaded so far, encoded in base ten ASCII.</dd>
<dt>downloaded</dt>
<dd>The total amount downloaded so far, encoded in base ten ASCII.</dd>
<dt>left</dt>
<dd>The number of bytes this peer still has to download, encoded in
base ten ASCII. Note that this can't be computed from downloaded and
the file length since it might be a resume, and there's a chance that
some of the downloaded data failed an integrity check and had to be
re-downloaded.</dd>
<dt>event</dt>
<dd>This is an optional key which maps to <tt class="docutils literal"><span class="pre">started</span></tt>,
<tt class="docutils literal"><span class="pre">completed</span></tt>, or <tt class="docutils literal"><span class="pre">stopped</span></tt> (or
<tt class="docutils literal"><span class="pre">empty</span></tt>, which is the same as not being present). If not
present, this is one of the announcements done at regular
intervals. An announcement using <tt class="docutils literal"><span class="pre">started</span></tt> is sent when a
download first begins, and one using <tt class="docutils literal"><span class="pre">completed</span></tt> is sent
when the download is complete. No <tt class="docutils literal"><span class="pre">completed</span></tt> is sent if
the file was complete when started. Downloaders send an announcement
using <tt class="docutils literal"><span class="pre">stopped</span></tt> when they cease downloading.</dd>
</dl>
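<p>A request carrying these keys might be assembled as in the following sketch. The function name <tt class="docutils literal"><span class="pre">announce_url</span></tt> and the tracker address in the usage note are made up for illustration; the escaping relies on <tt class="docutils literal"><span class="pre">urllib.parse.urlencode</span></tt> with <tt class="docutils literal"><span class="pre">quote</span></tt>, which percent-escapes the raw bytes of <tt class="docutils literal"><span class="pre">info_hash</span></tt> and <tt class="docutils literal"><span class="pre">peer_id</span></tt>.</p>

```python
from urllib.parse import quote, urlencode


def announce_url(announce, info_hash, peer_id, port,
                 uploaded, downloaded, left, event=None, ip=None):
    """Build a tracker GET URL from the keys described above."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes long")
    params = {
        "info_hash": info_hash,    # raw bytes, escaped by urlencode
        "peer_id": peer_id,        # raw bytes, escaped by urlencode
        "port": port,
        "uploaded": uploaded,      # integers are rendered in base ten ASCII
        "downloaded": downloaded,
        "left": left,
    }
    if ip is not None:
        params["ip"] = ip
    if event:                      # omitted for regular-interval announcements
        params["event"] = event
    separator = "&" if "?" in announce else "?"
    return announce + separator + urlencode(params, quote_via=quote)
```

<p>For example, <tt class="docutils literal"><span class="pre">announce_url("http://tracker.example/announce", info_hash, peer_id, 6881, 0, 0, 1024, event="started")</span></tt> would produce the first announcement of a new download.</p>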
<p>Tracker responses are bencoded dictionaries. If a tracker response
has a key <tt class="docutils literal"><span class="pre">failure</span> <span class="pre">reason</span></tt>, then that maps to a human
readable string which explains why the query failed, and no other keys
are required. Otherwise, it must have two keys: <tt class="docutils literal"><span class="pre">interval</span></tt>,
which maps to the number of seconds the downloader should wait between
regular rerequests, and <tt class="docutils literal"><span class="pre">peers</span></tt>. <tt class="docutils literal"><span class="pre">peers</span></tt> maps to
a list of dictionaries corresponding to <tt class="docutils literal"><span class="pre">peers</span></tt>, each of
which contains the keys <tt class="docutils literal"><span class="pre">peer</span> <span class="pre">id</span></tt>, <tt class="docutils literal"><span class="pre">ip</span></tt>, and
<tt class="docutils literal"><span class="pre">port</span></tt>, which map to the peer's self-selected ID, IP
address or DNS name as a string, and port number, respectively. Note
that downloaders may rerequest at nonscheduled times if an event
happens or they need more peers.</p>
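<p>To make the response format concrete, here is a rough sketch of a bencoding decoder together with a function that extracts the interval and peer list. Both names (<tt class="docutils literal"><span class="pre">bdecode</span></tt>, <tt class="docutils literal"><span class="pre">parse_tracker_response</span></tt>) are assumptions for this example, and the decoder is deliberately lenient: it does not reject leading zeros or unsorted keys.</p>

```python
def bdecode(data, pos=0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    # Otherwise it must be a string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def parse_tracker_response(body):
    """Return (interval, [(ip, port, peer_id), ...]) or raise on failure."""
    response, _ = bdecode(body)
    if b"failure reason" in response:
        raise RuntimeError(response[b"failure reason"].decode("utf-8", "replace"))
    peers = [
        (peer[b"ip"].decode("ascii"), peer[b"port"], peer[b"peer id"])
        for peer in response[b"peers"]
    ]
    return response[b"interval"], peers
```

<p>Dictionary keys come back as byte strings, which is why the lookups use <tt class="docutils literal"><span class="pre">b"peer id"</span></tt> rather than a text string.</p>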
<p>If you want to make any extensions to metainfo files or tracker
queries, please coordinate with Bram Cohen to make sure that all
extensions are done compatibly.</p>
<p>BitTorrent's peer protocol operates over TCP. It performs efficiently
without setting any socket options.</p>
<p>Peer connections are symmetrical. Messages sent in both directions
look the same, and data can flow in either direction.</p>
<p>The peer protocol refers to pieces of the file by index as
described in the metainfo file, starting at zero. When a peer finishes
downloading a piece and checks that the hash matches, it announces
that it has that piece to all of its peers.</p>
<p>Connections contain two bits of state on either end: choked or not,
and interested or not. Choking is a notification that no data will be
sent until unchoking happens. The reasoning and common techniques
behind choking are explained later in this document.</p>
<p>Data transfer takes place whenever one side is interested and the
other side is not choking. Interest state must be kept up to date at
all times - whenever a downloader doesn't have something they
currently would ask a peer for if unchoked, they must express lack of
interest, despite being choked. Implementing this properly is tricky,
but makes it possible for downloaders to know which peers will start
downloading immediately if unchoked.</p>
<p>Connections start out choked and not interested.</p>
<p>When data is being transferred, downloaders should keep several
piece requests queued up at once in order to get good TCP performance
(this is called 'pipelining'). On the other side, requests which can't
be written out to the TCP buffer immediately should be queued up in
memory rather than kept in an application-level network buffer, so
they can all be thrown out when a choke happens.</p>
<p>The peer wire protocol consists of a handshake followed by a
never-ending stream of length-prefixed messages. The handshake starts
with character nineteen (decimal) followed by the string 'BitTorrent
protocol'. The leading character is a length prefix, put there in the
hope that other new protocols may do the same and thus be trivially
distinguishable from each other.</p>
<p>All later integers sent in the protocol are encoded as four bytes
big-endian.</p>
<p>After the fixed headers come eight reserved bytes, which are all
zero in all current implementations. If you wish to extend the
protocol using these bytes, please coordinate with Bram Cohen to make
sure all extensions are done compatibly.</p>
<p>Next comes the 20-byte SHA1 hash of the bencoded form of the info
value from the metainfo file. (This is the same value which is
announced as <tt class="docutils literal"><span class="pre">info_hash</span></tt> to the tracker, only here it's raw
instead of quoted.) If both sides don't send the same value, they
sever the connection. The one possible exception is if a downloader
wants to do multiple downloads over a single port, they may wait for
incoming connections to give a download hash first, and respond with
the same one if it's in their list.</p>
<p>After the download hash comes the 20-byte peer id which is reported
in tracker requests and contained in peer lists in tracker
responses. If the receiving side's peer id doesn't match the one the
initiating side expects, it severs the connection.</p>
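<p>Put together, the handshake described above is 68 bytes long: one length byte, the 19-byte protocol string, eight reserved bytes, the 20-byte hash, and the 20-byte peer id. The following sketch builds and parses it; the function names are hypothetical and error handling is kept minimal.</p>

```python
PROTOCOL_STRING = b"BitTorrent protocol"   # 19 bytes
HANDSHAKE_LENGTH = 1 + len(PROTOCOL_STRING) + 8 + 20 + 20   # 68 bytes


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Assemble the handshake: length prefix, protocol string, reserved, hash, id."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash/peer_id must be 20 bytes, reserved 8 bytes")
    return bytes([len(PROTOCOL_STRING)]) + PROTOCOL_STRING + reserved + info_hash + peer_id


def parse_handshake(data):
    """Return (reserved, info_hash, peer_id) from a complete 68-byte handshake."""
    if len(data) != HANDSHAKE_LENGTH:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PROTOCOL_STRING) or data[1:20] != PROTOCOL_STRING:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]
```

<p>A receiver would compare the parsed hash with the download it is serving and sever the connection on a mismatch, as described above.</p>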
<p>That's it for handshaking. Next comes an alternating stream of
length prefixes and messages. Messages of length zero are keepalives,
and ignored. Keepalives are generally sent once every two minutes, but
note that timeouts can be done much more quickly when data is
expected.</p>
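<p>One way to read such a stream is to buffer incoming bytes and peel off complete messages as they become available. The sketch below assumes the four-byte big-endian length prefix stated earlier and treats the first byte of each non-empty message as its type; the function name <tt class="docutils literal"><span class="pre">split_messages</span></tt> is an assumption for this example.</p>

```python
import struct


def split_messages(buffer):
    """Split buffered bytes into complete messages.

    Returns (messages, remainder), where each message is a
    (type_byte, payload) tuple and remainder holds any incomplete tail.
    Zero-length messages are keepalives and are skipped.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break                      # message body not fully received yet
        body = buffer[offset + 4:offset + 4 + length]
        offset += 4 + length
        if length == 0:
            continue                   # keepalive: ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[offset:]
```

<p>The caller would prepend the returned remainder to the next chunk read from the socket before calling the function again.</p>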
</div>
<div class="section" id="all-non-keepalive-messages-start-with-a-single-byte-which-gives-their-type">
<h1>All non-keepalive messages start with a single byte which gives their type.</h1>
</div>
<div class="section" id="the-possible-values-are">
<h1>The possible values are:</h1>
<ul class="simple">
<li>0 - choke</li>
<li>1 - unchoke</li>
<li>2 - interested</li>
<li>3 - not interested</li>
<li>4 - have</li>
<li>5 - bitfield</li>
<li>6 - request</li>
<li>7 - piece</li>
<li>8 - cancel</li>
</ul>
<p>'choke', 'unchoke', 'interested', and 'not interested' have no payload.</p>
<p>'bitfield' is only ever sent as the first message. Its payload is a
bitfield with each index that downloader has sent set to one and the
rest set to zero. Downloaders which don't have anything yet may skip
the 'bitfield' message. The first byte of the bitfield corresponds to
indices 0-7 from high bit to low bit, respectively. The next one
8-15, etc. Spare bits at the end are set to zero.</p>
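<p>The bit layout of the 'bitfield' payload can be illustrated with a pair of helpers. These are a sketch only; the names <tt class="docutils literal"><span class="pre">encode_bitfield</span></tt> and <tt class="docutils literal"><span class="pre">decode_bitfield</span></tt> and the use of a Python set of piece indices are assumptions.</p>

```python
def encode_bitfield(have, num_pieces):
    """Pack a set of piece indices into bytes, high bit first."""
    field = bytearray((num_pieces + 7) // 8)   # spare bits stay zero
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError("piece index out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(payload, num_pieces):
    """Return the set of piece indices whose bit is set in the payload."""
    if len(payload) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    return {
        index
        for index in range(num_pieces)
        if payload[index // 8] & (0x80 >> (index % 8))
    }
```

<p>With ten pieces, having indices 0, 7, and 8 gives the two bytes 0x81 0x80: index 0 is the high bit of the first byte, index 7 its low bit, and index 8 the high bit of the second byte.</p>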
<p>The 'have' message's payload is a single number, the index which
that downloader just completed and checked the hash of.</p>
<p>'request' messages contain an index, begin, and length. The last
two are byte offsets. Length is generally a power of two unless it
gets truncated by the end of the file. All current implementations use
2<sup>15</sup>, and close connections which request an amount greater than
2<sup>17</sup>.</p>
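<p>As a sketch of how a piece might be broken into 'request' messages, the code below emits one request per 2<sup>15</sup>-byte block, with the final block truncated. It assumes the framing given earlier (a four-byte big-endian length prefix, then the type byte 6, then three four-byte integers); the function name <tt class="docutils literal"><span class="pre">request_messages</span></tt> is hypothetical.</p>

```python
import struct

BLOCK_SIZE = 2 ** 15      # request length used by current implementations
REQUEST = 6               # message type for 'request'


def request_messages(index, piece_size):
    """Return the wire-format 'request' messages covering one piece."""
    messages = []
    for begin in range(0, piece_size, BLOCK_SIZE):
        length = min(BLOCK_SIZE, piece_size - begin)   # last block may be short
        # Length prefix is 13: one type byte plus three 4-byte integers.
        messages.append(struct.pack(">IBIII", 13, REQUEST, index, begin, length))
    return messages
```

<p>A 'cancel' message has the same payload, so the same packing with type byte 8 would apply.</p>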
<p>'cancel' messages have the same payload as request messages. They
are generally only sent towards the end of a download, during what's
called 'endgame mode'. When a download is almost complete, there's a
tendency for the last few pieces to all be downloaded off a single
hosed modem line, taking a very long time. To make sure the last few
pieces come in quickly, once requests for all pieces a given
downloader doesn't have yet are currently pending, it sends requests
for everything to everyone it's downloading from. To keep this from
becoming horribly inefficient, it sends cancels to everyone else every
time a piece arrives.</p>
<p>'piece' messages contain an index, begin, and piece. Note that they
are correlated with request messages implicitly. It's possible for an
unexpected piece to arrive if choke and unchoke messages are sent in
quick succession and/or transfer is going very slowly.</p>
<p>Downloaders generally download pieces in random order, which does a
reasonably good job of keeping them from having a strict subset or
superset of the pieces of any of their peers.</p>
<p>Choking is done for several reasons. TCP congestion control behaves
very poorly when sending over many connections at once. Also, choking
lets each peer use a tit-for-tat-ish algorithm to ensure that they get
a consistent download rate.</p>
<p>The choking algorithm described below is the currently deployed
one. It is very important that all new algorithms work well both in a
network consisting entirely of themselves and in a network consisting
mostly of this one.</p>
<p>There are several criteria a good choking algorithm should meet. It
should cap the number of simultaneous uploads for good TCP
performance. It should avoid choking and unchoking quickly, known as
'fibrillation'. It should reciprocate to peers who let it
download. Finally, it should try out unused connections once in a
while to find out if they might be better than the currently used
ones, known as optimistic unchoking.</p>
<p>The currently deployed choking algorithm avoids fibrillation by
only changing who's choked once every ten seconds. It does
reciprocation and number of uploads capping by unchoking the four
interested peers from which it has the best download rates. Peers
which have a better upload rate but aren't interested get unchoked,
and if they become interested the worst uploader gets choked. If a
downloader has a complete file, it uses its upload rate rather than
its download rate to decide who to unchoke.</p>
<p>For optimistic unchoking, at any one time there is a single peer
which is unchoked regardless of its upload rate (if interested, it
counts as one of the four allowed downloaders). Which peer is
optimistically unchoked rotates every 30 seconds. To give them a
decent chance of getting a complete piece to upload, new connections
are three times as likely to start as the current optimistic unchoke
as anywhere else in the rotation.</p>
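<p>The selection step of the choking algorithm might look roughly like the sketch below, which would be rerun once every ten seconds. It is one possible reading of the description above, not the deployed code: the function name, the <tt class="docutils literal"><span class="pre">(rate, interested)</span></tt> tuple layout, and the way the optimistic unchoke is passed in are all assumptions, and the 30-second rotation itself is left out.</p>

```python
def choose_unchoked(peers, optimistic=None, max_downloaders=4):
    """Pick which peers to unchoke.

    peers maps a peer id to (rate, interested), where rate is the download
    rate from that peer (or the upload rate to it once the file is complete).
    Returns the set of peer ids that should be unchoked.
    """
    unchoked = set()
    slots = max_downloaders
    if optimistic in peers:
        unchoked.add(optimistic)
        if peers[optimistic][1]:
            slots -= 1          # an interested optimistic unchoke uses a slot
    ranked = sorted(
        (peer for peer in peers if peer != optimistic),
        key=lambda peer: peers[peer][0],
        reverse=True,
    )
    for peer in ranked:
        if slots == 0:
            break
        # Faster peers that are not interested are unchoked too, so that
        # they can start downloading at once if they become interested.
        unchoked.add(peer)
        if peers[peer][1]:
            slots -= 1
    return unchoked
```

<p>Because the loop stops as soon as the interested slots are filled, a fast uninterested peer that later becomes interested displaces the slowest unchoked downloader at the next run.</p>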
</div>
<div class="section" id="copyright">
<h1>Copyright</h1>
<p>This document has been placed in the public domain.</p>
<!-- Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: -->
</div>


</div>
        <div id="footer">
<hr/>
</div>

</div>
</body>
</html>