root / dotorg / trunk / html / beps / bep_0019.html

Revision 11040, 17.7 kB (checked in by dave, 8 weeks ago)

All HTML docs now reference the new SVN server running at bittorrent.org

Line 
1<?xml version="1.0" encoding="utf-8" ?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4<head>
5<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
6<meta name="generator" content="Docutils 0.5: http://docutils.sourceforge.net/" />
7<title></title>
8<link rel="stylesheet" href="../css/bep.css" type="text/css" />
9</head>
10<body>
11<div class="document">
12
13<div id="upper" class="clear">
14<div id="wrap">
15<div id="header">
16<h1><a href="../index.html">BitTorrent<span>.org</span></a></h1>
17</div>
18<div id="nav">
19<ul>
20<li><a href="../index.html">Home</a></li>
21<li><a href="../introduction.html">For Users</a></li>
22<li><a href="bep_0000.html"><span>For Developers</span></a></li>
23<!-- <li><a href="./blog">Blog</a></li> -->
24<li><a href="http://forum.bittorrent.org"> Forums </li>
25<li><a href="../donate.html">Donate!</a></li>
26</ul>
27</div> <!-- nav -->
28<!-- ### Begin Content ### -->
29<div id="second">
30
31
32
33<table class="rfc2822 docutils field-list" frame="void" rules="none">
34<col class="field-name" />
35<col class="field-body" />
36<tbody valign="top">
37<tr class="field"><th class="field-name">BEP:</th><td class="field-body">19</td>
38</tr>
39<tr class="field"><th class="field-name">Title:</th><td class="field-body">WebSeed - HTTP/FTP Seeding (GetRight style)</td>
40</tr>
41<tr class="field"><th class="field-name">Version:</th><td class="field-body">11016</td>
42</tr>
43<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference external" href="http://svn.bittorrent.org/trac/browser/dotorg/trunk/html/beps/bep_0019.rst">2008-02-25 17:05:04 -0800 (Mon, 25 Feb 2008)</a></td>
44</tr>
45<tr class="field"><th class="field-name">Author:</th><td class="field-body">Michael Burford &lt;michael &#64;at&#64; getright.com&gt;</td>
46</tr>
47<tr class="field"><th class="field-name">Status:</th><td class="field-body">Draft</td>
48</tr>
49<tr class="field"><th class="field-name">Type:</th><td class="field-body">Standards Track</td>
50</tr>
51<tr class="field"><th class="field-name">Content-Type:</th><td class="field-body"><a class="reference external" href="http://www.bittorrent.org/beps/bep-0012">text/x-rst</a></td>
52</tr>
53<tr class="field"><th class="field-name">Created:</th><td class="field-body">21-Feb-2008</td>
54</tr>
55<tr class="field"><th class="field-name">Post-History:</th><td class="field-body"></td>
56</tr>
57</tbody>
58</table>
59<hr />
60<div class="contents topic" id="contents">
61<p class="topic-title first">Contents</p>
62<ul class="simple">
63<li><a class="reference internal" href="#abstract" id="id2">Abstract</a></li>
64<li><a class="reference internal" href="#rationale" id="id3">Rationale</a></li>
65<li><a class="reference internal" href="#advantages" id="id4">Advantages</a></li>
66<li><a class="reference internal" href="#metadata-extension" id="id5">Metadata Extension</a><ul>
67<li><a class="reference internal" href="#multi-file-torrents" id="id6">Multi-File Torrents</a></li>
68</ul>
69</li>
70<li><a class="reference internal" href="#client-implementation-overview" id="id7">Client Implementation Overview</a></li>
71<li><a class="reference internal" href="#client-implementation-notes" id="id8">Client Implementation Notes</a><ul>
72<li><a class="reference internal" href="#definition-of-gaps" id="id9">Definition of Gaps</a></li>
73<li><a class="reference internal" href="#piece-selection" id="id10">Piece Selection</a></li>
74<li><a class="reference internal" href="#changes-to-piece-selection" id="id11">Changes to Piece Selection</a><ul>
75<li><a class="reference internal" href="#pretty-rare-with-biggest-gap" id="id12">Pretty Rare With Biggest Gap</a></li>
76<li><a class="reference internal" href="#fill-in-the-gaps" id="id13">Fill In The Gaps</a></li>
77</ul>
78</li>
79<li><a class="reference internal" href="#http-and-ftp-optimizations" id="id14">HTTP and FTP Optimizations</a></li>
80</ul>
81</li>
82<li><a class="reference internal" href="#id1" id="id15">Multi-File Torrents</a></li>
83<li><a class="reference internal" href="#another-possible-client-implementation" id="id16">Another Possible Client Implementation</a></li>
84<li><a class="reference internal" href="#a-not-recommended-client-implementation" id="id17">A Not Recommended Client Implementation</a></li>
85<li><a class="reference internal" href="#more-on-other-protocols" id="id18">More On Other Protocols</a></li>
86<li><a class="reference internal" href="#acknowledgements" id="id19">Acknowledgements</a></li>
87<li><a class="reference internal" href="#references" id="id20">References</a></li>
88<li><a class="reference internal" href="#copyright" id="id21">Copyright</a></li>
89</ul>
90</div>
91<div class="section" id="abstract">
92<h1>Abstract</h1>
93<p>Using HTTP or FTP servers as seeds for BitTorrent downloads.</p>
94</div>
95<div class="section" id="rationale">
96<h1>Rationale</h1>
97<p>Many websites that list a BitTorrent download also provide a
98HTTP or FTP URL for the same file.  The files are identical.
99A WebSeeding BitTorrent client can download from either source,
100putting all the parts together into one complete file.</p>
101<p>The HTTP or FTP server acts as a permanently unchoked seed.</p>
102</div>
103<div class="section" id="advantages">
104<h1>Advantages</h1>
105<p>There is always an unchoked seed where anybody can get started.</p>
106<p>It does not break or change anything for clients that do not
107recognize the addition to the metadata.</p>
108<p>Clients that do not support HTTP/FTP Seeding would still get
109the benefit from peers sharing pieces originally from an HTTP/FTP
110server.</p>
111<p>The metadata does not have to be changed.  Download Manager tools
112(such as GetRight) commonly can add multiple HTTP/FTP mirror URLs
113for a file, usually by the user action of clicking multiple links
114on a web page and recognizing the same file name.</p>
115<p>Everybody is quite familiar with HTTP/FTP servers and their
116protocols.</p>
117<p>It has already been implemented in GetRight, the Mainline client,
118uTorrent, Azureus, libTorrent, and likely others.</p>
119<p>Changes are only needed in BitTorrent clients.  No changes to
120trackers, HTTP, or FTP servers are needed.  No scripts are
121needed on the HTTP or FTP server.</p>
122<p>As this is already supported in many very common clients (notably
123the Mainline client itself) seeding for a BitTorrent download
124could be done entirely with a company or individual's existing
125HTTP/FTP servers.</p>
126</div>
127<div class="section" id="metadata-extension">
128<h1>Metadata Extension</h1>
129<p>In the main area of the metadata file and not part of the &quot;info&quot;
130section, will be a new key, &quot;url-list&quot;.  This key will refer to
131a one or more URLs, and will contain a list of web addresses where
132torrent data can be retrieved.  This key may be safely ignored
133if the client is not capable of using it.</p>
134<dl class="docutils">
135<dt>For example (on multiple lines for readability):</dt>
136<dd>d
1378:announce27:http://tracker.com/announce
1388:url-list26:http://mirror.com/file.exe
1394:info...</dd>
140</dl>
141<p>If the &quot;url-list&quot; URL ends in a slash, &quot;/&quot; the client must add
142the &quot;name&quot; from the torrent to make the full URL.  This allows
143.torrent generators to treat this field same for single file and
144multi-file torrents.</p>
145<div class="section" id="multi-file-torrents">
146<h2>Multi-File Torrents</h2>
147<p>BitTorrent clients normally use the &quot;name&quot; from the torrent info
148section to make a folder, then use the &quot;path/file&quot; items from the
149info section within that folder.  For the case of Multi-File
150torrents, the &quot;url-list&quot; must be a root folder where a client
151could add the same &quot;name&quot; and &quot;path/file&quot; to create the URL for
152the request.</p>
153<dl class="docutils">
154<dt>For example:</dt>
155<dd><p class="first">...
1568:url-list22:http://mirror.com/pub/
1574:infod5:filesld6:lengthi949e4:pathl10:Readme.txte
158e4:name7:michael</p>
159<p class="last">A client would use all that to build a url:
160<a class="reference external" href="http://mirror.com/pub/michael/Readme.txt">http://mirror.com/pub/michael/Readme.txt</a></p>
161</dd>
162</dl>
163</div>
164</div>
165<div class="section" id="client-implementation-overview">
166<h1>Client Implementation Overview</h1>
167<p>A client should ignore any protocols it does not know from the url-list.
168A client may choose to implement HTTP but not FTP, or vice-versa.</p>
169<p>HTTP/FTP are &quot;streaming&quot; type protocols, and do not have BitTorrent's
170concept of blocks.  For HTTP you can use byte-ranges to resume
171anywhere or download specific ranges you specify, but with FTP you
172can only say where to start the download.</p>
173<p>I wanted to have many sequential blocks (&quot;gaps&quot;) in the data
174downloaded from BitTorrent peers so a HTTP/FTP connection would
175have big spaces to fill in.  You could use the byte-ranges with
176HTTP to request individual blocks--but each request will show
177in the server's logs, and somebody is going to think it is a
178denial of service attack on them if their shows 100's of connections
179in their log from the same IP.</p>
180<p>In GetRight's implementation, I made a couple changes to the usual
181&quot;rarest first&quot; piece-selection method to better allow &quot;gaps&quot; to develop
182between pieces.  That way there are longer spaces in the file for HTTP
183and FTP threads to fill.  They can start at the beginning of a gap and
184download until they get to the end.</p>
185</div>
186<div class="section" id="client-implementation-notes">
187<h1>Client Implementation Notes</h1>
188<p>Changes to the standard BitTorrent algorithms to optimize the ability
189of HTTP and FTP connections to fill in many sequential blocks in the file.</p>
190<div class="section" id="definition-of-gaps">
191<h2>Definition of Gaps</h2>
192<p>Gaps are spaces of multiple pieces in a row that the client does not have.</p>
193<p>Given a bitfield &quot;YYnnnnYnnY&quot; where Y is pieces it has and n are ones it
194does not have, there are two &quot;gaps&quot;, one of 4 pieces, and another of 2.</p>
195</div>
196<div class="section" id="piece-selection">
197<h2>Piece Selection</h2>
198<p>The major change is to the Piece Selection.</p>
199<p>If everything else is more-or-less equivalent, it is better to pick a
200piece to do from the gap of 2 when requesting a piece from a BitTorrent peer.</p>
201<p>In any gap, it is best to fill in from the end (ie, the highest piece number
202first).</p>
203<p>So given bitfield &quot;YYnnnnYnnY&quot; if the rareness of all the n pieces is similar
204it is better to select one of the # pieces &quot;YYnnnnY##Y&quot; and the best
205would be to select piece $ &quot;YYnnnnYn$Y&quot;</p>
206<p>These maximize the contiguous pieces for HTTP/FTP connections.  This allows
207a HTTP/FTP connection to start at the beginning of a large gap and download
208the most data before it gets to a piece the client already has.</p>
209</div>
210<div class="section" id="changes-to-piece-selection">
211<h2>Changes to Piece Selection</h2>
212<p>Change the &quot;rarest first&quot; piece selection to a &quot;pretty rare with biggest
213distance from another completed piece&quot;.</p>
214<div class="section" id="pretty-rare-with-biggest-gap">
215<h3>Pretty Rare With Biggest Gap</h3>
216<p>When scanning for the rarest piece, if the distance from another completed
217piece is less than that for the current rarest piece, it must be &quot;rare-X&quot;.
218Or if the distance is greater than the current piece's, it can be rare+X
219to be picked as the next piece. (For no better reason than it seemed to make
220sense at the time and scale, X is the square root of the number of peers
221minus 1.)</p>
222<p>So if 3 peers had the current rarest piece, the normal algorithm would
223pick a piece where 2 peers had it.  The changed algorithm would require
224that only 1 peer has the piece if that piece's distance from a complete
225piece was less than the gap for the current rarest piece.</p>
226<p>If the gap is bigger and the piece is the same &quot;rareness&quot; or the usual
227&quot;rare-1&quot; that piece is selected. (So if the gap is bigger, a piece with
228either 2 or 3 peers would be chosen.)</p>
229<p>So given &quot;YYnnn1Yn2Y&quot;, unless 1 is a lot more rare than 2, it's better to
230pick piece 2.</p>
231<p>Pseudo-Code logic:</p>
232<pre class="literal-block">
233X = sqrt(Peers) - 1;
234Gap = 0;
235CurGap = 0;
236CurRarest = MaxPieces+1;
237for (i=0; i&lt;MaxPieces; i++) {
238    if (IDoNotHavePiece(i)) {
239        Gap++;
240        if (PeerHasPiece(i)) {
241            PieceRareness = NumberOfPeersWithThePiece();
242            if (PieceRareness&lt;(CurRarest-X) ||
243                (PieceRareness&lt;=(CurRarest+X) &amp;&amp; Gap&gt;CurGap)) {
244                CurRarest = PieceRareness;
245                CurGap = Gap;
246                NextPiece = i;
247            }
248        }
249    } else {
250        Gap = 0;
251    }
252}
253</pre>
254</div>
255<div class="section" id="fill-in-the-gaps">
256<h3>Fill In The Gaps</h3>
257<p>If a file is more than 50% complete, it uses a different method for piece
258selection randomly.  (With over 50%, you should have a large number of
259pieces that other peers will want to download.)</p>
260<p>Every few pieces (in GetRight it is randomly 1 in 10), it will pick the
261piece with the smallest gap from a complete piece, ignoring all rareness.
262For the bitfield &quot;YYnnnnYnnY&quot; it would select piece # &quot;YYnnnnYn#Y&quot;.  This
263helps fill in small gaps.</p>
264<p>Clients can choose whether to do this step or not, and if implementing
265could use another percent of file completion.</p>
266<p>Pseudo-Code logic:</p>
267<pre class="literal-block">
268Gap = 0;
269Piece = -1;
270CurGap = MaxPieces+1;
271for (i=0; i&lt;MaxPieces; i++) {
272    if (IDoNotHavePiece(i)) {
273        Gap++;
274        if (PeerHasPiece(i)) {
275            Piece = i;
276        }
277    } else {----
278        if (Gap&lt;CurGap &amp;&amp; Gap&gt;0 &amp;&amp; Piece!=-1) {
279            CurGap = Gap;
280            NextPiece = Piece;
281        }
282        Gap = 0;
283        Piece = -1;
284    }
285}
286</pre>
287</div>
288</div>
289<div class="section" id="http-and-ftp-optimizations">
290<h2>HTTP and FTP Optimizations</h2>
291<p>No changes are needed to the HTTP/FTP protocols or servers.</p>
292<p>If the client knows the HTTP/FTP download is part of a BitTorrent download,
293when the very first connection is made it is better to start the HTTP/FTP
294download somewhere randomly in the file.  This way it is more likely the
295first HTTP pieces it gets will be useful for sharing to the BitTorrent peers.</p>
296<p>If a BitTorrent download is already progressing when starting a HTTP/FTP
297connection, the HTTP/FTP should start at the beginning of the biggest gap.
298Given a bitfield &quot;YYnnnnYnnY&quot; it should start at #: &quot;YY#nnnYnnY&quot;</p>
299<p>If it successfully downloads a piece from a HTTP/FTP server, but the SHA
300checksum does not match, the connection must be closed and that URL should
301be discarded.</p>
302<p>A client does not need to discard a HTTP or FTP server URL if it gets
303a &quot;busy&quot; reply.</p>
304</div>
305</div>
306<div class="section" id="id1">
307<h1>Multi-File Torrents</h1>
308<p>Additional selection algorithms will be needed when using HTTP/FTP servers
309for a multi-file torrent.</p>
310<p>A client may select BitTorrent pieces to optimize so entire large files can
311be downloaded from the HTTP/FTP servers.</p>
312<p>For torrents containing small files, several HTTP/FTP transfers may be needed
313for one Piece.  In this case, it may make more sense to do those using
314BitTorrent.  For example, if there were 100 1KB files, assuming even the worst
315case of 32KB Pieces, it would take 100 HTTP/FTP transfers to do the files,
316but only 4 BitTorrent piece requests.</p>
317<p>Giving BitTorrent piece selection a higher priority for smaller files,
318and HTTP/FTP a higher priority for larger files would work well.</p>
319</div>
320<div class="section" id="another-possible-client-implementation">
321<h1>Another Possible Client Implementation</h1>
322<p>If a client only supported HTTP and not FTP, it could take advantage of
323HTTP's byte-range requests, but request more than one piece at a time.</p>
324<p>Blocks of pieces could treated as a single set and a single byte-range
325request to the HTTP server.  This would reduce the number of HTTP connections,
326and might work well for a client.</p>
327<p>Pieces could be treated as blocks of 10, 50, or 100. If I had done this way,
328I might have chosen &quot;Pieces Per Block&quot; to be MaxPieces/20.  So requesting
329about 5% of the file at a time.</p>
330</div>
331<div class="section" id="a-not-recommended-client-implementation">
332<h1>A Not Recommended Client Implementation</h1>
333<p>A client could simply use HTTP byte-ranges to request individual pieces.</p>
334<p>Some server administrators would not like this, because there will be 100s
335or 1000s of requests in their logs for a single file.  Some may even think
336it is a denial of service attack on them.</p>
337</div>
338<div class="section" id="more-on-other-protocols">
339<h1>More On Other Protocols</h1>
340<p>HTTP and FTP are listed in this BEP as the seeding protocols,
341but a client could use <strong>any</strong> other protocol that allows downloading
342data.  HTTPS, FTPS, or SFTP are obvious, but not likely supported
343by many clients (GetRight can do HTTPS and FTPS, but not SFTP).</p>
344<p>I am sure RTSP and MMS are possible as well.  Potentially even
345Usenet's NNTP protocol could be used.</p>
346<p>Other protocols may have additional issues, such as not allowing
347a download to start anywhere other than the beginning of the file.</p>
348<p>A client may choose to implement only one of the HTTP or FTP protocols
349but not both.</p>
350</div>
351<div class="section" id="acknowledgements">
352<h1>Acknowledgements</h1>
353<p>Thanks to Arvid Norberg (libtorrent.sf.net author) for helping clarify
354the Multi-File torrent parts.  And of course to Bram Cohen for creating
355this whole BitTorrent protocol.</p>
356</div>
357<div class="section" id="references">
358<h1>References</h1>
359<!-- Original WebSeeding document by Michael Burford
360(http://www.getright.com/seedtorrent.html) -->
361</div>
362<div class="section" id="copyright">
363<h1>Copyright</h1>
364<p>This document has been placed in the public domain.</p>
365<!-- Local Variables:
366mode: indented-text
367indent-tabs-mode: nil
368sentence-end-double-space: t
369fill-column: 70
370coding: utf-8
371End: -->
372</div>
373
374
375</div>
376        <div id="footer">
377<hr/>
378</div>
379
380</div>
381</body>
382</html>
Note: See TracBrowser for help on using the browser.