An Introduction to Peer-to-Peer Systems
The traditional client-server model works well for most use cases, but it falls short when a large number of files, or very large files, need to be transferred. The shortcoming is inherent: a single server (or a small number of servers) with limited bandwidth has to serve content to a large number of clients, so the transfer rate each client sees drops sharply. Consider one scenario where 1,000 clients request a 300 MB file from a server, and another where 1 million clients request the same file. In the second scenario, the bandwidth available to each client will be far lower.
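To put rough numbers on the two scenarios, here is a minimal back-of-the-envelope sketch. The 10 Gbit/s server uplink is purely an assumption for illustration (the scenarios above don't specify one), as is the idea that the uplink is split evenly among all clients; the function names are hypothetical.

```python
def per_client_bandwidth_mbps(server_uplink_mbps: float, clients: int) -> float:
    """Even share of the server's uplink per client, in megabits per second."""
    return server_uplink_mbps / clients


def download_time_seconds(file_size_mb: float, bandwidth_mbps: float) -> float:
    """Time to move file_size_mb megabytes at bandwidth_mbps megabits per second."""
    return (file_size_mb * 8) / bandwidth_mbps


SERVER_UPLINK_MBPS = 10_000  # assumption: a 10 Gbit/s uplink, not a figure from the text
FILE_SIZE_MB = 300           # the 300 MB file from the scenario

for clients in (1_000, 1_000_000):
    share = per_client_bandwidth_mbps(SERVER_UPLINK_MBPS, clients)
    seconds = download_time_seconds(FILE_SIZE_MB, share)
    print(f"{clients:>9,} clients: {share:.2f} Mbit/s each, "
          f"{seconds / 60:.0f} minutes per download")
```

Under these assumptions, each of the 1,000 clients gets about 10 Mbit/s and finishes in roughly 4 minutes, while each of the 1 million clients gets about 0.01 Mbit/s and needs close to 3 days.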
Peer-to-peer systems try to remove this limitation of client-server systems, and they are particularly useful for transferring large files to a large number of clients. As the name suggests, there is no central server that serves the data, unlike in the client-server architecture. Instead, the data is transferred between the peers themselves. We’ll go through the terminology later, but in short, a peer acts as a client while it receives the data, and once it has the data it can act as a server for other peers. There is also a subtype known as a hybrid peer-to-peer system, in which a central server serves metadata (not the actual data).
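To see why letting peers re-share the data helps, here is a deliberately simplified sketch. It assumes time is divided into rounds, that every node holding the file can upload one full copy to one new node per round, and that all nodes have the same upload speed. Real systems such as BitTorrent exchange small pieces rather than whole files, so treat this as an illustration of the growth pattern, not a model of any actual protocol. Both function names are hypothetical.

```python
def rounds_client_server(clients: int, server_slots: int = 1) -> int:
    """Rounds needed when only the server uploads, serving server_slots clients per round."""
    return -(-clients // server_slots)  # ceiling division


def rounds_peer_to_peer(clients: int) -> int:
    """Rounds needed when every node that holds the file uploads to one new node per round."""
    holders = 1  # the original source of the file
    rounds = 0
    while holders < clients + 1:  # stop once the source and all clients hold the file
        holders *= 2              # each holder hands one copy to a node that lacks it
        rounds += 1
    return rounds


for clients in (1_000, 1_000_000):
    print(f"{clients:>9,} clients: "
          f"client-server {rounds_client_server(clients):,} rounds, "
          f"peer-to-peer {rounds_peer_to_peer(clients)} rounds")
```

In this toy model the number of holders doubles every round, so 1,000 clients are served in 10 rounds and 1 million clients in 20, whereas the single server needs one round per client.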
Peer-to-peer architecture found its primary use in file sharing. It became particularly popular in the late 1990s through music-sharing platforms like Napster, along with illegal sharing of movies, books, software, and so on. It has also been used for crowdsourcing open-source content, where writers and readers, or publishers and subscribers, can participate efficiently and effectively. In the software engineering world, it is used to distribute update patches among thousands of servers, where the servers themselves are the peers.
Now for the terminology involved in peer-to-peer systems. You might have heard these terms if you are a seasoned user of torrents.
- Node: A node is an individual peer in the system. It can act as a server that serves the data or as a client that consumes it.
- Tracker: The centralized server that keeps track of nodes is known as the tracker. It stores the metadata required to facilitate the file transfer.
- Seeder: A seeder is a node that already has the file/data and is willing to share it with other nodes.
- Leecher: A leecher is a node that needs the file/data and connects to seeders to retrieve it.
- .torrent File: A torrent file is a special file format that stores information about the files and folders to be distributed using a file-sharing protocol. The file is small (usually a few KB) and has the extension “.torrent”. Torrent files don’t contain the file data itself; they carry details such as file names, file sizes, the tracker list, etc.
- Torrent Client: The client application that each user or peer needs in order to decode and use the .torrent file and become part of the network.
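The roles above can be tied together with a toy in-memory tracker. This is only a sketch under stated assumptions: the `Tracker` class, its method names, the torrent id string, and the node addresses are all made up for illustration. Real BitTorrent trackers are network services that clients query over HTTP or UDP; here everything lives in one process to keep the idea visible.

```python
from collections import defaultdict


class Tracker:
    """Toy tracker: records which nodes are in which swarm, never the file data itself."""

    def __init__(self) -> None:
        # torrent id -> {node address -> True if the node has the complete file}
        self._swarms = defaultdict(dict)

    def announce(self, torrent_id: str, node: str, has_complete_file: bool) -> list[str]:
        """Register or update a node and return the other nodes in the same swarm."""
        swarm = self._swarms[torrent_id]
        swarm[node] = has_complete_file
        return [other for other in swarm if other != node]

    def seeders(self, torrent_id: str) -> list[str]:
        """Nodes that already have the whole file."""
        swarm = self._swarms.get(torrent_id, {})
        return sorted(node for node, done in swarm.items() if done)

    def leechers(self, torrent_id: str) -> list[str]:
        """Nodes that are still downloading."""
        swarm = self._swarms.get(torrent_id, {})
        return sorted(node for node, done in swarm.items() if not done)


tracker = Tracker()
# Hypothetical addresses; the first node already has the file, so it is a seeder.
tracker.announce("example-torrent", "10.0.0.1:6881", has_complete_file=True)
# The second node joins as a leecher and learns who else is in the swarm.
peers = tracker.announce("example-torrent", "10.0.0.2:6881", has_complete_file=False)
print("peers to contact:", peers)
# After finishing the download, the leecher re-announces and becomes a seeder.
tracker.announce("example-torrent", "10.0.0.2:6881", has_complete_file=True)
print("seeders:", tracker.seeders("example-torrent"))
```

Note that the tracker only hands out addresses. The file data itself would travel directly between the two nodes, which is what makes this a hybrid peer-to-peer arrangement.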
Once a torrent has been created, the creator can share one of two things: the .torrent file or a magnet link, which carries a hash of the torrent. A magnet link is a simple way to identify the torrent on the BitTorrent network without having to deal with a .torrent file. The hash is unique to that specific torrent, so although the link is just a string of characters, it’s just as good as having the file. Magnet links and .torrent files are often listed on torrent indexes, which are sites built specifically for sharing torrents. You can also share torrent information over email, text, etc. Since magnet links and .torrent files are just instructions that tell a BitTorrent client how to get the data, sharing them is quick and easy. Neither is of much use on its own, though; it has to be opened with a client program.
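To show what "just a string of characters" means in practice, here is a small sketch that pulls the commonly used fields out of a magnet link: `xt` (the torrent hash, prefixed with `urn:btih:`), `dn` (a display name), and `tr` (tracker URLs). The `parse_magnet` helper is hypothetical, and the hash, file name, and tracker address in the example are made up.

```python
from urllib.parse import parse_qs


def parse_magnet(uri: str) -> dict:
    """Extract the torrent hash, display name, and tracker URLs from a magnet link."""
    prefix = "magnet:?"
    if not uri.startswith(prefix):
        raise ValueError("not a magnet link")
    # parse_qs splits the key=value pairs and decodes percent-encoded values.
    params = parse_qs(uri[len(prefix):])
    info_hash = None
    for topic in params.get("xt", []):
        if topic.startswith("urn:btih:"):
            info_hash = topic[len("urn:btih:"):]
    if info_hash is None:
        raise ValueError("magnet link has no BitTorrent hash (xt=urn:btih:...)")
    return {
        "info_hash": info_hash,
        "name": params.get("dn", [None])[0],  # optional display name
        "trackers": params.get("tr", []),     # zero or more tracker URLs
    }


# Made-up example link; none of these values refer to a real torrent.
example = (
    "magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567"
    "&dn=example-file.iso"
    "&tr=http%3A%2F%2Ftracker.example.com%2Fannounce"
)
print(parse_magnet(example))
```

A real client would go on to use the hash to find peers and fetch the rest of the torrent's metadata; this sketch stops at reading the link.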