root/dotorg/trunk/html/beps/bep_0019.rst

Revision 11016, 11.9 KB (checked in by dave, 21 months ago)

Add BEP 19 to the BEP index. Update BEP 19 to conform to restructured text syntax. Rebuild BEP index and BEP 19.

Line 
1BEP: 19
2Title: WebSeed - HTTP/FTP Seeding (GetRight style)
3Version: $Revision$
4Last-Modified: $Date$
5Author:  Michael Burford <michael @at@ getright.com>
6Status:  Draft
7Type:    Standards Track
8Content-Type: text/x-rst
9Created: 21-Feb-2008
10Post-History:
11
12
13Abstract
14========
15
16Using HTTP or FTP servers as seeds for BitTorrent downloads.
17
18
19Rationale
20=========
21
22Many websites that list a BitTorrent download also provide a
23HTTP or FTP URL for the same file.  The files are identical.
24A WebSeeding BitTorrent client can download from either source,
25putting all the parts together into one complete file.
26
27The HTTP or FTP server acts as a permanently unchoked seed. 
28
29
30Advantages
31==========
32
33There is always an unchoked seed where anybody can get started.
34
35It does not break or change anything for clients that do not
36recognize the addition to the metadata. 
37
38Clients that do not support HTTP/FTP Seeding would still get
39the benefit from peers sharing pieces originally from an HTTP/FTP
40server.
41
42The metadata does not have to be changed.  Download Manager tools
43(such as GetRight) commonly can add multiple HTTP/FTP mirror URLs
44for a file, usually by the user action of clicking multiple links
45on a web page and recognizing the same file name. 
46
47Everybody is quite familiar with HTTP/FTP servers and their
48protocols. 
49
50It has already been implemented in GetRight, the Mainline client,
51uTorrent, Azureus, libTorrent, and likely others.
52
53Changes are only needed in BitTorrent clients.  No changes to
54trackers, HTTP, or FTP servers are needed.  No scripts are
55needed on the HTTP or FTP server.
56
57As this is already supported in many very common clients (notably
58the Mainline client itself) seeding for a BitTorrent download
59could be done entirely with a company or individual's existing
60HTTP/FTP servers.
61
62
63Metadata Extension
64==================
65
66In the main area of the metadata file and not part of the "info"
67section, will be a new key, "url-list".  This key will refer to
68a one or more URLs, and will contain a list of web addresses where
69torrent data can be retrieved.  This key may be safely ignored
70if the client is not capable of using it.
71
72For example (on multiple lines for readability):
73   d
74   8:announce27:http://tracker.com/announce
75   8:url-list26:http://mirror.com/file.exe
76   4:info...
77
78If the "url-list" URL ends in a slash, "/" the client must add
79the "name" from the torrent to make the full URL.  This allows
80.torrent generators to treat this field same for single file and
81multi-file torrents.
82
83
84Multi-File Torrents
85-------------------
86
87BitTorrent clients normally use the "name" from the torrent info
88section to make a folder, then use the "path/file" items from the
89info section within that folder.  For the case of Multi-File
90torrents, the "url-list" must be a root folder where a client
91could add the same "name" and "path/file" to create the URL for
92the request.
93
94For example:
95   ...
96   8:url-list22:http://mirror.com/pub/
97   4:infod5:filesld6:lengthi949e4:pathl10:Readme.txte
98   e4:name7:michael
99
100   A client would use all that to build a url:
101   http://mirror.com/pub/michael/Readme.txt
102
103
104Client Implementation Overview
105==============================
106
107A client should ignore any protocols it does not know from the url-list. 
108A client may choose to implement HTTP but not FTP, or vice-versa.
109
110HTTP/FTP are "streaming" type protocols, and do not have BitTorrent's
111concept of blocks.  For HTTP you can use byte-ranges to resume
112anywhere or download specific ranges you specify, but with FTP you
113can only say where to start the download. 
114
115I wanted to have many sequential blocks ("gaps") in the data
116downloaded from BitTorrent peers so a HTTP/FTP connection would
117have big spaces to fill in.  You could use the byte-ranges with
118HTTP to request individual blocks--but each request will show
119in the server's logs, and somebody is going to think it is a
120denial of service attack on them if their shows 100's of connections
121in their log from the same IP. 
122
123In GetRight's implementation, I made a couple changes to the usual
124"rarest first" piece-selection method to better allow "gaps" to develop
125between pieces.  That way there are longer spaces in the file for HTTP
126and FTP threads to fill.  They can start at the beginning of a gap and
127download until they get to the end.
128
129
130Client Implementation Notes
131===========================
132
133Changes to the standard BitTorrent algorithms to optimize the ability
134of HTTP and FTP connections to fill in many sequential blocks in the file.
135
136
137Definition of Gaps
138------------------
139
140Gaps are spaces of multiple pieces in a row that the client does not have.
141
142Given a bitfield "YYnnnnYnnY" where Y is pieces it has and n are ones it
143does not have, there are two "gaps", one of 4 pieces, and another of 2.
144
145
146Piece Selection
147---------------
148
149The major change is to the Piece Selection.
150
151If everything else is more-or-less equivalent, it is better to pick a
152piece to do from the gap of 2 when requesting a piece from a BitTorrent peer.
153
154In any gap, it is best to fill in from the end (ie, the highest piece number
155first). 
156
157So given bitfield "YYnnnnYnnY" if the rareness of all the n pieces is similar
158it is better to select one of the # pieces "YYnnnnY##Y" and the best
159would be to select piece $ "YYnnnnYn$Y"
160
161These maximize the contiguous pieces for HTTP/FTP connections.  This allows
162a HTTP/FTP connection to start at the beginning of a large gap and download
163the most data before it gets to a piece the client already has.
164
165
166Changes to Piece Selection
167--------------------------
168
169Change the "rarest first" piece selection to a "pretty rare with biggest
170distance from another completed piece".
171
172
173Pretty Rare With Biggest Gap
174''''''''''''''''''''''''''''
175
176When scanning for the rarest piece, if the distance from another completed
177piece is less than that for the current rarest piece, it must be "rare-X".
178Or if the distance is greater than the current piece's, it can be rare+X
179to be picked as the next piece. (For no better reason than it seemed to make
180sense at the time and scale, X is the square root of the number of peers
181minus 1.) 
182
183So if 3 peers had the current rarest piece, the normal algorithm would
184pick a piece where 2 peers had it.  The changed algorithm would require
185that only 1 peer has the piece if that piece's distance from a complete
186piece was less than the gap for the current rarest piece.
187
188If the gap is bigger and the piece is the same "rareness" or the usual
189"rare-1" that piece is selected. (So if the gap is bigger, a piece with
190either 2 or 3 peers would be chosen.)
191
192So given "YYnnn1Yn2Y", unless 1 is a lot more rare than 2, it's better to
193pick piece 2.
194
195Pseudo-Code logic:
196
197::
198
199   X = sqrt(Peers) - 1;
200   Gap = 0;
201   CurGap = 0;
202   CurRarest = MaxPieces+1;
203   for (i=0; i<MaxPieces; i++) {
204       if (IDoNotHavePiece(i)) {
205           Gap++;
206           if (PeerHasPiece(i)) {
207               PieceRareness = NumberOfPeersWithThePiece();
208               if (PieceRareness<(CurRarest-X) ||
209                   (PieceRareness<=(CurRarest+X) && Gap>CurGap)) {
210                   CurRarest = PieceRareness;
211                   CurGap = Gap;
212                   NextPiece = i;
213               }
214           }
215       } else {
216           Gap = 0;
217       }
218   }
219
220
221Fill In The Gaps
222''''''''''''''''
223
224If a file is more than 50% complete, it uses a different method for piece
225selection randomly.  (With over 50%, you should have a large number of
226pieces that other peers will want to download.)
227
228Every few pieces (in GetRight it is randomly 1 in 10), it will pick the
229piece with the smallest gap from a complete piece, ignoring all rareness. 
230For the bitfield "YYnnnnYnnY" it would select piece # "YYnnnnYn#Y".  This
231helps fill in small gaps.
232
233Clients can choose whether to do this step or not, and if implementing
234could use another percent of file completion.
235
236Pseudo-Code logic:
237
238::
239
240   Gap = 0;
241   Piece = -1;
242   CurGap = MaxPieces+1;
243   for (i=0; i<MaxPieces; i++) {
244       if (IDoNotHavePiece(i)) {
245           Gap++;
246           if (PeerHasPiece(i)) {
247               Piece = i;
248           }
249       } else {----
250           if (Gap<CurGap && Gap>0 && Piece!=-1) {
251               CurGap = Gap;
252               NextPiece = Piece;
253           }
254           Gap = 0;
255           Piece = -1;
256       }
257   }
258
259
260HTTP and FTP Optimizations
261--------------------------
262
263No changes are needed to the HTTP/FTP protocols or servers.
264
265If the client knows the HTTP/FTP download is part of a BitTorrent download,
266when the very first connection is made it is better to start the HTTP/FTP
267download somewhere randomly in the file.  This way it is more likely the
268first HTTP pieces it gets will be useful for sharing to the BitTorrent peers.
269
270If a BitTorrent download is already progressing when starting a HTTP/FTP
271connection, the HTTP/FTP should start at the beginning of the biggest gap.
272Given a bitfield "YYnnnnYnnY" it should start at #: "YY#nnnYnnY"
273
274If it successfully downloads a piece from a HTTP/FTP server, but the SHA
275checksum does not match, the connection must be closed and that URL should
276be discarded.
277
278A client does not need to discard a HTTP or FTP server URL if it gets
279a "busy" reply.
280
281
282Multi-File Torrents
283===================
284
285Additional selection algorithms will be needed when using HTTP/FTP servers
286for a multi-file torrent.
287
288A client may select BitTorrent pieces to optimize so entire large files can
289be downloaded from the HTTP/FTP servers.
290
291For torrents containing small files, several HTTP/FTP transfers may be needed
292for one Piece.  In this case, it may make more sense to do those using
293BitTorrent.  For example, if there were 100 1KB files, assuming even the worst
294case of 32KB Pieces, it would take 100 HTTP/FTP transfers to do the files,
295but only 4 BitTorrent piece requests. 
296
297Giving BitTorrent piece selection a higher priority for smaller files,
298and HTTP/FTP a higher priority for larger files would work well.
299
300
301Another Possible Client Implementation
302======================================
303
304If a client only supported HTTP and not FTP, it could take advantage of
305HTTP's byte-range requests, but request more than one piece at a time. 
306
307Blocks of pieces could treated as a single set and a single byte-range
308request to the HTTP server.  This would reduce the number of HTTP connections,
309and might work well for a client. 
310
311Pieces could be treated as blocks of 10, 50, or 100. If I had done this way,
312I might have chosen "Pieces Per Block" to be MaxPieces/20.  So requesting
313about 5% of the file at a time.
314
315
316A Not Recommended Client Implementation
317=======================================
318
319A client could simply use HTTP byte-ranges to request individual pieces.
320
321Some server administrators would not like this, because there will be 100s
322or 1000s of requests in their logs for a single file.  Some may even think
323it is a denial of service attack on them.
324
325
326More On Other Protocols
327=======================
328
329HTTP and FTP are listed in this BEP as the seeding protocols,
330but a client could use **any** other protocol that allows downloading
331data.  HTTPS, FTPS, or SFTP are obvious, but not likely supported
332by many clients (GetRight can do HTTPS and FTPS, but not SFTP). 
333
334I am sure RTSP and MMS are possible as well.  Potentially even
335Usenet's NNTP protocol could be used. 
336
337Other protocols may have additional issues, such as not allowing
338a download to start anywhere other than the beginning of the file.
339
340A client may choose to implement only one of the HTTP or FTP protocols
341but not both. 
342
343
344Acknowledgements
345================
346
347Thanks to Arvid Norberg (libtorrent.sf.net author) for helping clarify
348the Multi-File torrent parts.  And of course to Bram Cohen for creating
349this whole BitTorrent protocol.
350
351
352References
353==========
354
355.. Original WebSeeding document by Michael Burford
356  (http://www.getright.com/seedtorrent.html)
357
358
359Copyright
360=========
361
362This document has been placed in the public domain.
363
364
365
366..
367  Local Variables:
368  mode: indented-text
369  indent-tabs-mode: nil
370  sentence-end-double-space: t
371  fill-column: 70
372  coding: utf-8
373  End:
374
Note: See TracBrowser for help on using the browser.