| 1 | BEP: 6 |
|---|
| 2 | Title: Fast Extension |
|---|
| 3 | Version: $Revision$ |
|---|
| 4 | Last-Modified: $Date$ |
|---|
| 5 | Author: David Harrison <dave@bittorrent.com>, Bram Cohen <bram@bittorrent.com> |
|---|
| 6 | Status: Draft |
|---|
| 7 | Type: Standards Track |
|---|
| 8 | Created: 01-Feb-2008 |
|---|
| 9 | Post-History: |
|---|
| 10 | |
|---|
| 11 | The Fast Extension packages several extensions: *Have None/Have All*, |
|---|
| 12 | *Reject Requests*, *Suggestions* and *Allowed Fast.* |
|---|
| 13 | These are enabled by setting the third least significant bit of the |
|---|
| 14 | last reserved byte in the BitTorrent handshake: |
|---|
| 15 | |
|---|
| 16 | :: |
|---|
| 17 | |
|---|
| 18 | reserved[7] |= 0x04 |
|---|
| 19 | |
|---|
| 20 | The extension is enabled only if both ends of the connection set this bit. |
|---|
| 21 | |
|---|
| 22 | The following proposed messages adhere to the syntax of messages found |
|---|
| 23 | in v1.0 of the BitTorrent protocol. All integers are four bytes |
|---|
| 24 | big-endian. All messages start with an integer message length. All |
|---|
| 25 | messages but the Keep-Alive follow the message length with a single |
|---|
| 26 | byte opcode and zero or more opcode-dependant arguments. |
|---|
| 27 | |
|---|
| 28 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL |
|---|
| 29 | NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and |
|---|
| 30 | "OPTIONAL" in this document are to be interpreted as described in |
|---|
| 31 | IETF `RFC 2119`_. |
|---|
| 32 | |
|---|
| 33 | Modifications to Semantics of Existing Messages |
|---|
| 34 | =============================================== |
|---|
| 35 | |
|---|
| 36 | The Fast Extension modifies the semantics of the |
|---|
| 37 | *Request*, *Choke*, *Unchoke*, and *Cancel* |
|---|
| 38 | messages, and adds a *Reject Request.* Now, every request |
|---|
| 39 | is guaranteed to result in EXACTLY ONE response |
|---|
| 40 | which is either the corresponding reject or corresponding piece |
|---|
| 41 | message. Even when a request is cancelled, the peer receiving |
|---|
| 42 | the cancel should respond with either the corresponding reject or |
|---|
| 43 | the corresponding piece: requests that are being processed are |
|---|
| 44 | allowed to complete. |
|---|
| 45 | |
|---|
| 46 | Choke no longer implicitly rejects all pending requests, |
|---|
| 47 | thus eliminating some race conditions which could cause pieces |
|---|
| 48 | to be needlessly requested multiple times. |
|---|
| 49 | |
|---|
| 50 | Additionally, if a peer receives a piece that was never requested, |
|---|
| 51 | the peer MUST close the connection. |
|---|
| 52 | |
|---|
| 53 | |
|---|
| 54 | Have All/Have None |
|---|
| 55 | ================== |
|---|
| 56 | |
|---|
| 57 | :: |
|---|
| 58 | |
|---|
| 59 | *Have All*: <len=0x0001> <op=0x0E> |
|---|
| 60 | |
|---|
| 61 | :: |
|---|
| 62 | |
|---|
| 63 | *Have None*: <len=0x0001><op=0x0F> |
|---|
| 64 | |
|---|
| 65 | *Have All* and *Have None* specify that the message sender |
|---|
| 66 | has all or none of the pieces respectively. When present, *Have All* |
|---|
| 67 | or *Have None* replace the *Have Bitfield.* Exactly one of *Have All*, |
|---|
| 68 | *Have None*, or *Have Bitfield* MUST appear and only immediately after |
|---|
| 69 | the handshake. The reason for these messages is to save bandwidth. Also |
|---|
| 70 | slightly to remove the idiosyncrasy of sending no message when a peer |
|---|
| 71 | has no pieces. |
|---|
| 72 | |
|---|
| 73 | When the fast extension is disabled, if a peer receives *Have All* or |
|---|
| 74 | *Have None* then the peer MUST close the connection. |
|---|
| 75 | |
|---|
| 76 | |
|---|
| 77 | Suggest Piece |
|---|
| 78 | ============= |
|---|
| 79 | |
|---|
| 80 | :: |
|---|
| 81 | |
|---|
| 82 | *Suggest Piece*: <len=0x0005><op=0x0D><index> |
|---|
| 83 | |
|---|
| 84 | *Suggest Piece* is an advisory message meaning "you might like to |
|---|
| 85 | download this piece." The intended usage is for 'super-seeding' |
|---|
| 86 | without throughput reduction, to avoid redundant downloads, and so that |
|---|
| 87 | a seed which is disk I/O bound can upload continguous or identical |
|---|
| 88 | pieces to avoid excessive disk seeks. In all cases, the seed SHOULD |
|---|
| 89 | operate to maintain a roughly equal number of copies of each piece in |
|---|
| 90 | the network. A peer MAY send more than one *suggest piece* message at |
|---|
| 91 | any given time. A peer receiving multiple *suggest piece* messages |
|---|
| 92 | MAY interpret this as meaning that all of the suggested pieces |
|---|
| 93 | are equally appropriate. |
|---|
| 94 | |
|---|
| 95 | When the fast extension is disabled, if a peer receives a |
|---|
| 96 | *Suggest Piece* message, the peer MUST close the connection. |
|---|
| 97 | |
|---|
| 98 | |
|---|
| 99 | Reject Request |
|---|
| 100 | ============== |
|---|
| 101 | |
|---|
| 102 | :: |
|---|
| 103 | |
|---|
| 104 | *Reject Request*: <len=0x000D><op=0x10><index><begin><offset> |
|---|
| 105 | |
|---|
| 106 | *Reject Request* notifies a requesting peer that its request will not be satisfied. |
|---|
| 107 | |
|---|
| 108 | If the fast extension is disabled and a peer receives a reject |
|---|
| 109 | request then the peer MUST close the connection. |
|---|
| 110 | |
|---|
| 111 | When the fast extension is enabled: |
|---|
| 112 | |
|---|
| 113 | - If a peer receives a reject for a request that was never sent then |
|---|
| 114 | the peer SHOULD close the connection. |
|---|
| 115 | |
|---|
| 116 | - If a peer sends a choke, it MUST reject all requests from the peer |
|---|
| 117 | to whom the choke was sent except it SHOULD NOT reject requests for |
|---|
| 118 | pieces that are in the *allowed fast set.* A peer SHOULD choke first |
|---|
| 119 | and then reject requests so that the peer receiving the choke does not |
|---|
| 120 | re-request the pieces. |
|---|
| 121 | |
|---|
| 122 | - If a peer receives a request from a peer its choking, the peer |
|---|
| 123 | receiving the request SHOULD send a reject unless the piece is in the |
|---|
| 124 | *allowed fast set.* |
|---|
| 125 | |
|---|
| 126 | - If a peer receives an excessive number of requests from a peer it is |
|---|
| 127 | choking, the peer receiving the requests MAY close the connection |
|---|
| 128 | rather than reject the request. However, consider that it can take |
|---|
| 129 | several seconds for buffers to drain and messages to propagate once a |
|---|
| 130 | peer is choked. |
|---|
| 131 | |
|---|
| 132 | Allowed Fast |
|---|
| 133 | ============ |
|---|
| 134 | |
|---|
| 135 | :: |
|---|
| 136 | |
|---|
| 137 | * Allowed Fast: <len=0x0005><op=0x11><index>* |
|---|
| 138 | |
|---|
| 139 | With the BitTorrent protocol specified in `BEP 0002`_, new peers take |
|---|
| 140 | several minutes to ramp up before they can effectively engage in |
|---|
| 141 | BitTorrent's tit-for-tat. The reason is simple: starting peers have |
|---|
| 142 | few pieces to trade. |
|---|
| 143 | |
|---|
| 144 | *Allowed Fast* is an advisory message which means "if you ask for this |
|---|
| 145 | piece, I'll give it to you even if you're choked." *Allowed Fast* thus |
|---|
| 146 | shortens the awkward stage during which the peer obtains occasional |
|---|
| 147 | optimistic unchokes but cannot sufficiently reciprocate to remain |
|---|
| 148 | unchoked. |
|---|
| 149 | |
|---|
| 150 | The pieces that can be downloaded when choked constitute a peer's |
|---|
| 151 | *allowed fast set.* The set is generated using a canonical algorithm |
|---|
| 152 | that produces piece indices unique to the message receiver so that if |
|---|
| 153 | two peers offer *k* pieces fast it will be the same *k*, and if one |
|---|
| 154 | offers *k+1* it will be the same *k* plus one more. *k* should be |
|---|
| 155 | small enough to avoid abuse, but large enough to ramp up |
|---|
| 156 | tit-for-tat. We currently set *k* to 10, but peers are free to change |
|---|
| 157 | this number, e.g., to suit load. |
|---|
| 158 | |
|---|
| 159 | The message sender MAY list pieces that the message sender does not |
|---|
| 160 | have. The receiver MUST NOT interpret an Allowed Fast message as |
|---|
| 161 | meaning that the message sender has the piece. This allows peers to |
|---|
| 162 | generate and communicate allowed fast sets at the beginning of a |
|---|
| 163 | connection. However, a peer MAY send Allowed Fast messages at any |
|---|
| 164 | time. |
|---|
| 165 | |
|---|
| 166 | A peer SHOULD send Allowed Fast messages to any starting peer unless |
|---|
| 167 | the local peer lacks sufficient resources. A peer MAY reject requests |
|---|
| 168 | for already Allowed Fast pieces if the local peer lacks sufficient |
|---|
| 169 | resources, if the requested piece has already been sent to the |
|---|
| 170 | requesting peer, or if the requesting peer is not a starting peer. Our |
|---|
| 171 | current implementation rejects requests for Allowed Fast messages |
|---|
| 172 | whenever the requesting peer has more than * k * pieces. |
|---|
| 173 | |
|---|
| 174 | When the fast extension is disabled, if a peer recieves an Allowed |
|---|
| 175 | Fast message then the peer MUST close the connection. |
|---|
| 176 | |
|---|
| 177 | Allowed Fast Set Generation |
|---|
| 178 | =========================== |
|---|
| 179 | |
|---|
| 180 | The canonical algorithm for computing a peer *P'*s *allowed fast set* |
|---|
| 181 | follows. All integers in this pseudocode are four bytes represented |
|---|
| 182 | in network (big-endian) byte order. *[a:b]* denotes the sequence of |
|---|
| 183 | consecutive bytes from *a* to *b* excluding *b*, i.e., *(a, a+1, |
|---|
| 184 | a+2,..., b-1)*. *x[a:b]* denotes a subsequence of elements in an array |
|---|
| 185 | *x* starting from index *a* to but not including index *b*. |
|---|
| 186 | |
|---|
| 187 | Let *ip* denote *P'*s IPv4 address. We currently have no |
|---|
| 188 | provisions for IPv6. If a peer is behind a Network Address Translator |
|---|
| 189 | (NAT) then *ip* should be the externally facing IP address of the |
|---|
| 190 | NAT. Since the node sending the *Allowed Fast* messages computes |
|---|
| 191 | the set, the correct *ip* is usually the *ip* address on the other |
|---|
| 192 | end of the connection. The host computing the set MAY use the *ip* |
|---|
| 193 | address on the other end of the connection regardless |
|---|
| 194 | |
|---|
| 195 | Let *sz* denote the number of pieces in the torrent. |
|---|
| 196 | |
|---|
| 197 | Let *a* denote the allowed fast set. |
|---|
| 198 | |
|---|
| 199 | Let *k* denote the final number of pieces in the allowed fast set. |
|---|
| 200 | |
|---|
| 201 | |
|---|
| 202 | :: |
|---|
| 203 | |
|---|
| 204 | x = 0xFFFFFF00 & ip (1) |
|---|
| 205 | x.append(infohash) (2) |
|---|
| 206 | while |a| < k: |
|---|
| 207 | x = SHA1(x) (3) |
|---|
| 208 | for i in [0:5] and |a| < k: (4) |
|---|
| 209 | j = i*4 (5) |
|---|
| 210 | y = x[j:j+4] (6) |
|---|
| 211 | index = y % sz (7) |
|---|
| 212 | if index not in a: (8) |
|---|
| 213 | add index to a (9) |
|---|
| 214 | |
|---|
| 215 | Step (1) selects the most significant octets in peer *P'*s |
|---|
| 216 | ip address. We do this to prevent a user that obtains more than one |
|---|
| 217 | IP address on the same network from obtaining more than one |
|---|
| 218 | *allowed fast set.* Use of three bytes is heuristic and |
|---|
| 219 | historical. |
|---|
| 220 | |
|---|
| 221 | Step (3) generates a 20-byte random number on each call. By |
|---|
| 222 | performing a SHA-1 hash on the previous iteration's hash, we can |
|---|
| 223 | generate an arbitrarily long pseudorandom sequence. |
|---|
| 224 | |
|---|
| 225 | Steps (4) through (9) partition the 20-byte hash into piece indices |
|---|
| 226 | and add them to the allowed fast set. |
|---|
| 227 | |
|---|
| 228 | Example Implementation |
|---|
| 229 | ====================== |
|---|
| 230 | |
|---|
| 231 | The following C++ implementation was provided by CacheLogic: |
|---|
| 232 | |
|---|
| 233 | |
|---|
| 234 | :: |
|---|
| 235 | |
|---|
| 236 | void generate_fast_set( |
|---|
| 237 | uint32 k, // number of pieces in set |
|---|
| 238 | uint32 sz, // number of pieces in torrent |
|---|
| 239 | const char infohash[20], // infohash of torrent |
|---|
| 240 | uint32 ip, // in host byte order, ie localhost is 0x7f000001 |
|---|
| 241 | std:: |
|---|
| 242 | vector<uint32> &a) // generated set of piece indices |
|---|
| 243 | { |
|---|
| 244 | a.clear(); |
|---|
| 245 | std::string x; |
|---|
| 246 | char buf[4]; |
|---|
| 247 | *(uint32*)buf = htonl(ip & 0xffffff00); |
|---|
| 248 | x.assign(buf, 4); |
|---|
| 249 | x.append(infohash, 20); // (3) |
|---|
| 250 | while (a.size()<k) { |
|---|
| 251 | x = SHA1(x); // (4) |
|---|
| 252 | for ( int i=0 ; i<5 && a.size()<k; i++ ) { // (5) |
|---|
| 253 | int j = i*4; // (6) |
|---|
| 254 | uint32 y = ntohl(*(uint32*)(x.data()+j)); // (7) |
|---|
| 255 | uint32 index = y % sz; // (8) |
|---|
| 256 | if (std::find(a.begin(), a.end(), index)==a.end()) { // (9) |
|---|
| 257 | a.push_back(index); // (10) |
|---|
| 258 | } |
|---|
| 259 | } |
|---|
| 260 | } |
|---|
| 261 | } |
|---|
| 262 | |
|---|
| 263 | Example results generated by this function: |
|---|
| 264 | |
|---|
| 265 | |
|---|
| 266 | :: |
|---|
| 267 | |
|---|
| 268 | 7 piece allowed fast set for torrent with 1313 pieces and hex infohash |
|---|
| 269 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa for node with IP 80.4.4.200: |
|---|
| 270 | 1059,431,808,1217,287,376,1188 |
|---|
| 271 | 9 piece allowed fast set for torrent with 1313 pieces and hex infohash |
|---|
| 272 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa for node with IP 80.4.4.200: |
|---|
| 273 | 1059,431,808,1217,287,376,1188,353,508 |
|---|
| 274 | |
|---|
| 275 | .. _`RFC 2119`: http://www.ietf.org/rfc/rfc2119.txt |
|---|
| 276 | .. _`BEP 0002`: http://www.bittorrent.org/beps/bep_0002.html |
|---|
| 277 | |
|---|
| 278 | |
|---|
| 279 | |
|---|
| 280 | |
|---|
| 281 | .. |
|---|
| 282 | Local Variables: |
|---|
| 283 | mode: indented-text |
|---|
| 284 | indent-tabs-mode: nil |
|---|
| 285 | sentence-end-double-space: t |
|---|
| 286 | fill-column: 70 |
|---|
| 287 | coding: utf-8 |
|---|
| 288 | End: |
|---|