| 1 | BEP: 30 |
|---|
| 2 | Title: Merkle hash torrent extension |
|---|
| 3 | Version: $Revision$ |
|---|
| 4 | Last-Modified: $Date$ |
|---|
| 5 | Author: Arno Bakker <arno@cs.vu.nl> |
|---|
| 6 | Status: Draft |
|---|
| 7 | Type: Standards Track |
|---|
| 8 | Requires: 10 |
|---|
| 9 | Content-Type: text/x-rst |
|---|
| 10 | Created: 11-Mar-2009 |
|---|
| 11 | Post-History: |
|---|
| 12 | |
|---|
| 13 | Abstract |
|---|
| 14 | ======== |
|---|
| 15 | |
|---|
| 16 | BitTorrent requires a torrent file containing a cryptographic digest of |
|---|
| 17 | every piece of the content to allow the verification of pieces during the |
|---|
| 18 | download. Large torrent files put a strain on the Web servers distributing |
|---|
| 19 | them, and cannot be directly included in RSS feeds or gossiped around. |
|---|
| 20 | |
|---|
| 21 | A related problem is the use of large piece sizes. To keep the size of a |
|---|
| 22 | torrent file small (as to not overload the Web servers) the number of hashes |
|---|
| 23 | for a content file is being kept small. For large files this implies that the |
|---|
| 24 | piece size over which digests are calculated must go up (up to 2MB pieces are |
|---|
| 25 | used). The large piece sizes affect the ability of peers to barter pieces. |
|---|
| 26 | Only when a piece has been completely received and verified using the digest |
|---|
| 27 | may it be traded with other peers. This means that it may be some time |
|---|
| 28 | before a node starts bartering with others. |
|---|
| 29 | |
|---|
| 30 | Our solution to these two problems is to replace the list of digests with a |
|---|
| 31 | single Merkle hash [1]_. A Merkle hash can be used to verify the integrity |
|---|
| 32 | of the total content file as well as the individual blocks via a hierarchical |
|---|
| 33 | scheme. It works by constructing a hash tree of the content and using just |
|---|
| 34 | the root hash as data integrity protection. The simple root hash value also |
|---|
| 35 | allows for smaller piece sizes to be used. A common form of hash trees is the |
|---|
| 36 | Merkle hash tree, hence the name. |
|---|
| 37 | |
|---|
| 38 | |
|---|
| 39 | Simple Merkle Hashes |
|---|
| 40 | ==================== |
|---|
| 41 | |
|---|
| 42 | We propose a minimalistic design that does not affect the existing BitTorrent |
|---|
| 43 | protocol and clients very much. The design is backwards compatible in the |
|---|
| 44 | sense that clients supporting the Simple Merkle Hash extension can still be |
|---|
| 45 | made to process regular torrent files easily. |
|---|
| 46 | |
|---|
| 47 | >From the content we construct a hash tree as follows. Given a piece size, |
|---|
| 48 | we calculate the hashes of all the pieces in the set of content files. Next, |
|---|
| 49 | we create a binary tree of sufficient height. Sufficient height means that the |
|---|
| 50 | lowest level in the tree has enough nodes to hold all piece hashes in the set. |
|---|
| 51 | We place all piece hashes in the tree, starting at the left-most leaf, see |
|---|
| 52 | figure. The remaining leaves in the tree are assigned a *filler hash value* of |
|---|
| 53 | 0 (see Discussion). Finally, we calculate the hash values of the higher levels |
|---|
| 54 | in the tree, by concatenating the hash values of the two children (again left |
|---|
| 55 | to right) and computing the hash of that aggregate. This process ends in a |
|---|
| 56 | hash value for the root node, which we call the *root hash*. The hashing |
|---|
| 57 | algorithm used is SHA1, as in normal torrents. |
|---|
| 58 | |
|---|
| 59 | The root hash along with the total size of the content-file set and the piece |
|---|
| 60 | size are now the only information in the system that needs to come from a |
|---|
| 61 | trusted source. A client that has only the root hash of a file set can check |
|---|
| 62 | any piece as follows (see figure). It first calculates the hash of the piece |
|---|
| 63 | it received. Along with this piece it should have received the hashes of the |
|---|
| 64 | piece's sibling and of its *uncles*, that is the sibling Y of its parent X, |
|---|
| 65 | and the uncle of that Y until the root is reached (uncles are marked with \* |
|---|
| 66 | in the figure). Using this information the client recalculates the root hash |
|---|
| 67 | of the tree, and compares it to the root hash it received from the trusted |
|---|
| 68 | source. |
|---|
| 69 | |
|---|
| 70 | :: |
|---|
| 71 | |
|---|
| 72 | 0* = root hash |
|---|
| 73 | / \ |
|---|
| 74 | / \ |
|---|
| 75 | / \ |
|---|
| 76 | / \ |
|---|
| 77 | / \ |
|---|
| 78 | 1* 2 |
|---|
| 79 | / \ / \ |
|---|
| 80 | / \ / \ |
|---|
| 81 | / \ / \ |
|---|
| 82 | / \ / \ |
|---|
| 83 | / \ / \ |
|---|
| 84 | 3 4 5 6* = uncle |
|---|
| 85 | / \ / \ / \ / \ |
|---|
| 86 | / \ / \ / \ / \ |
|---|
| 87 | / \ / \ / \ / \ |
|---|
| 88 | 7 8 9 10 11 12* 13 14 |
|---|
| 89 | / \ / \ / \ / \ / \ / \ / \ / \ |
|---|
| 90 | 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
|---|
| 91 | |
|---|
| 92 | P0 P1 P2 P3 P4 P5 P6 P7 P8* P9* P10 P11 P12 X X X |
|---|
| 93 | = piece index = = = filler hash |
|---|
| 94 | p s |
|---|
| 95 | i i |
|---|
| 96 | e b |
|---|
| 97 | c l |
|---|
| 98 | e i |
|---|
| 99 | n |
|---|
| 100 | g |
|---|
| 101 | |
|---|
| 102 | |
|---|
| 103 | Inclusion in BitTorrent |
|---|
| 104 | ======================= |
|---|
| 105 | |
|---|
| 106 | The original publisher of the content-file set creates a so-called *Merkle |
|---|
| 107 | torrent* which is a torrent file that contains a ``root hash`` key in its |
|---|
| 108 | ``info`` part instead of a ``pieces`` key, see BEP 3 [#BEP-3]_. |
|---|
| 109 | |
|---|
| 110 | When a seeder starts it uses the information in the Merkle torrent and the |
|---|
| 111 | file set to reconstruct the hash tree and registers itself with the tracker |
|---|
| 112 | using the hash value of the ``info`` part of the Merkle torrent, as usual |
|---|
| 113 | (see Discussion). |
|---|
| 114 | |
|---|
| 115 | A BitTorrent client that supports the Simple Merkle Hash extension must also |
|---|
| 116 | support the Extension protocol (BEP 10) [#BEP-10]_. In particular, it must add |
|---|
| 117 | a ``Tr_hashpiece`` message name in the ``m`` field of the Extension |
|---|
| 118 | protocol's handshake message. Such a client must not send ``piece`` messages |
|---|
| 119 | but must use Extension protocol messages with type ``Tr_hashpiece`` to send |
|---|
| 120 | pieces. |
|---|
| 121 | |
|---|
| 122 | A ``Tr_hashpiece`` message consists of an index, begin, hashlist and piece. |
|---|
| 123 | The hashlist consists of the piece's own hash, the piece's sibling hash, and |
|---|
| 124 | the uncles of the piece up until and including the root hash (see above and |
|---|
| 125 | Discussion). In particular, the hashlist is a list of 2-element lists. The |
|---|
| 126 | first element denotes the node offset in the tree, the second element is the |
|---|
| 127 | hash value. The node offset is the number of the node when numbered in a |
|---|
| 128 | breadth-first fashion (i.e., going left to right starting at the top). |
|---|
| 129 | |
|---|
| 130 | Only the ``Tr_hashpiece`` message with begin field equal to 0 must contain a |
|---|
| 131 | filled hashlist, for all other begin values the hashlist must be empty. In |
|---|
| 132 | other words, the message containing the first subpiece should have a filled |
|---|
| 133 | hashlist, subsequent subpieces should not. |
|---|
| 134 | |
|---|
| 135 | Formally, a ``Tr_hashpiece`` message has the following payload: |
|---|
| 136 | |
|---|
| 137 | 1. 4-byte index |
|---|
| 138 | 2. 4-byte begin |
|---|
| 139 | 3. 4-byte length of bencoded hashlist |
|---|
| 140 | 4. the bencoded hashlist |
|---|
| 141 | 5. the subpiece data |
|---|
| 142 | |
|---|
| 143 | Upon receipt of a ``Tr_hashpiece`` message, the receiver recomputes the root |
|---|
| 144 | hash using the hashlist and compares it to the root hash in the Merkle |
|---|
| 145 | torrent. If they match, all the hash values are saved in the receiver's own |
|---|
| 146 | hash tree, such that they can be passed on to others when the piece is |
|---|
| 147 | downloaded from this receiver. When all subpieces have come in, the piece is |
|---|
| 148 | checked using the hash from the hash tree. |
|---|
| 149 | |
|---|
| 150 | |
|---|
| 151 | Discussion |
|---|
| 152 | ========== |
|---|
| 153 | |
|---|
| 154 | We chose a binary tree for simplicity. Trees with larger degrees are also |
|---|
| 155 | possible. However, the number of hashes that need to be sent with each |
|---|
| 156 | piece is already small at about 2log of the file-set size. |
|---|
| 157 | |
|---|
| 158 | Using the hash of the ``info`` part for registering at the tracker means |
|---|
| 159 | that for a given content-file set, the swarm that use a conventional torrent |
|---|
| 160 | file and the swarm that uses a Merkle torrent will be disjunct. The infohash |
|---|
| 161 | value is different, hence the swarms are known under different identifiers at |
|---|
| 162 | the trackers. |
|---|
| 163 | |
|---|
| 164 | In theory we can create one swarm. In that swarm, new clients could serve |
|---|
| 165 | pieces to old clients. For the new clients to benefit from the old clients, |
|---|
| 166 | however, we need to add some way for the new to obtain the hashes required to |
|---|
| 167 | check a piece. Designing a fool proof solution for this problem is not |
|---|
| 168 | trivial. |
|---|
| 169 | |
|---|
| 170 | Because we let the initial seeders recalculate the hash tree, this |
|---|
| 171 | extension is incompatible with the proposed HTTP Seeding extensions in |
|---|
| 172 | BEP 17 [#BEP-17]_ and 19 [#BEP-19]_ . |
|---|
| 173 | |
|---|
| 174 | Including the root hash in a ``Tr_hashpiece`` message allows a quick sanity |
|---|
| 175 | check. |
|---|
| 176 | |
|---|
| 177 | This extension paves the way for BitTorrent URLs. The only information |
|---|
| 178 | required for a client to commence sharing are the root hash, the total size, |
|---|
| 179 | the piece size, and a source of peer addresses (tracker, DHT). |
|---|
| 180 | |
|---|
| 181 | |
|---|
| 182 | Acknowledgements |
|---|
| 183 | ================ |
|---|
| 184 | |
|---|
| 185 | Development of this extension was supported by funding from: |
|---|
| 186 | |
|---|
| 187 | * BSIK Freeband Communication I-Share project (Dutch Ministry of Economic |
|---|
| 188 | Affairs) |
|---|
| 189 | * The European Community's Seventh Framework Programme in the P2P-Next |
|---|
| 190 | project under grant agreement no 216217. |
|---|
| 191 | |
|---|
| 192 | Thanks to Olaf van der Spek and Johan Pouwelse for ideas and suggestions. |
|---|
| 193 | |
|---|
| 194 | |
|---|
| 195 | References |
|---|
| 196 | ========== |
|---|
| 197 | |
|---|
| 198 | .. [1] MERKLE, R. A Digital Signature Based on a Conventional Encryption |
|---|
| 199 | Function. In Proceedings CRYPTOâ87 (Santa Barbara, CA, USA, Aug. 1987), |
|---|
| 200 | C. Pomerance, Ed., no. 293 in Lecture Notes in Computer Science, |
|---|
| 201 | Springer-Verlag, pp. 369â378. |
|---|
| 202 | |
|---|
| 203 | .. [#BEP-3] BEP_0003. The BitTorrent Protocol Specification, Cohen |
|---|
| 204 | (http://www.bittorrent.org/beps/bep_0003.html) |
|---|
| 205 | |
|---|
| 206 | .. [#BEP-10] BEP_0010. Extension Protocol, Norberg, Strigeus, Hazel |
|---|
| 207 | (http://www.bittorrent.org/beps/bep_0010.html) |
|---|
| 208 | |
|---|
| 209 | .. [#BEP-17] BEP_0017. HTTP Seeding, Hoffman |
|---|
| 210 | (http://www.bittorrent.org/beps/bep_0017.html) |
|---|
| 211 | |
|---|
| 212 | .. [#BEP-19] BEP_0019. WebSeed - HTTP/FTP Seeding (GetRight style), Burford |
|---|
| 213 | (http://www.bittorrent.org/beps/bep_0019.html) |
|---|
| 214 | |
|---|
| 215 | |
|---|
| 216 | Copyright |
|---|
| 217 | ========= |
|---|
| 218 | |
|---|
| 219 | This document has been placed in the public domain. |
|---|
| 220 | |
|---|
| 221 | |
|---|
| 222 | |
|---|
| 223 | .. |
|---|
| 224 | Local Variables: |
|---|
| 225 | mode: indented-text |
|---|
| 226 | indent-tabs-mode: nil |
|---|
| 227 | sentence-end-double-space: t |
|---|
| 228 | fill-column: 70 |
|---|
| 229 | coding: utf-8 |
|---|
| 230 | End: |
|---|
| 231 | |
|---|