How does peer-to-peer work over the internet?

Your confusion stems from some incorrect assumptions.

But surely, the only device that knows this routing mechanism is Router-C itself! Not even Computer-A nor Computer-B will know about it, right?

What, why‽ Then why was the router configured to forward those ports to those IPs? You have to set up the P2P client to use a specific port and then set up the router to correspond.

but how is Router-D to know to send packets through port 1000, and not say port 1001?

Because you configure the P2P client to use a specific port (standard or non-standard for that protocol).

The only solution I can think of is for Router-D to send the packet to Router-C through all ports, such that it gets forwarded to Computer-A, but is there a better solution?

It is much simpler than that. When the client makes a connection to a peer, it specifies which port it wants to use, so the peer sends the data on that port.

Hmm, but Bittorrent doesn't change the router's behavior right? Since some routing mechanism could have been dynamic as demonstrated in superuser.com/a/187190/78897, how is Computer-A able to know about it?

The client doesn’t directly affect the router, but the router can be configured/intelligent enough to adapt to the client’s behavior. You can enable UPnP in both the router and client to automatically configure the connection and most routers have stateful inspection abilities as part of their port-forwarding mechanism.

Taken together, this means that a connection can be made dynamically on a random port, and the router can keep track of what is happening instead of viewing everything as random, meaningless connections. That way, it can forward a connection as necessary because, for example, it is a response to another connection that was just made.
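To make the idea of "keeping track" concrete, here is a minimal sketch of the kind of table a stateful router might maintain. It is a toy model, not how any particular router is implemented: the class `StatefulNat`, the public address 1.1.1.1, the port numbers and the starting value 50000 are all assumptions made up for illustration.

```python
from typing import Dict, Optional, Tuple

# Key: (remote_ip, remote_port, public_port) -> value: (lan_ip, lan_port)
Flow = Tuple[str, int, int]


class StatefulNat:
    """Toy model of a router that remembers outbound connections."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.flows: Dict[Flow, Tuple[str, int]] = {}
        self.next_port = 50000  # arbitrary first public port (assumption)

    def outbound(self, lan_ip: str, lan_port: int,
                 remote_ip: str, remote_port: int) -> int:
        """Record an outgoing connection and return the public port used."""
        public_port = self.next_port
        self.next_port += 1
        self.flows[(remote_ip, remote_port, public_port)] = (lan_ip, lan_port)
        return public_port

    def inbound(self, remote_ip: str, remote_port: int,
                public_port: int) -> Optional[Tuple[str, int]]:
        """Return the LAN host for a reply, or None for unsolicited traffic."""
        return self.flows.get((remote_ip, remote_port, public_port))


nat = StatefulNat("1.1.1.1")
# A LAN machine opens a connection to a peer at 2.2.2.2.
port = nat.outbound("192.168.1.2", 6881, "2.2.2.2", 51413)
# The peer's reply matches the remembered flow and is forwarded.
reply_target = nat.inbound("2.2.2.2", 51413, port)
# A packet from an unrelated host matches nothing and is dropped.
stranger = nat.inbound("3.3.3.3", 51413, port)
```

The reply from 2.2.2.2 is forwarded only because the router saw the matching outgoing connection first; the packet from the unrelated host finds no entry.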

The problem comes when you have multiple systems using the same program. Having multiple systems connected to the same router, sharing the same IP and using dynamic ports quickly becomes unmanageable and even with stateful inspection, it is difficult if not impossible to get it to work correctly. In that case, static ports (default or otherwise) will need to be used.


If you use a program like SmartSniff or TCPView to monitor your connections, you will notice that the P2P connections will usually have the port you configured (or the default for the client) as the destination for incoming connections and either the default or a custom/random port for the source, and vice versa for outgoing connections.


Your question touches the heart of the Internet and the very definition of routing. In your example, Router D sends data to Computer A based on one of two premises:

  • It's been told to send data to Computer A.
  • It's already processed data from Computer A.

Your scenario seems to assume the first option - Router D wants to send to Computer A. But how does it get there? It does so through the use of routing tables which are shared by routers amongst each other.

Router C regularly sends updates to all routers it knows about - including Router D - saying that it "knows" the "192.168.*" network. (In reality, this wouldn't happen because that network isn't routed - it's considered private. But ignore that.) So, Router D already knows that Router C knows that network.

So when data is destined for Computer A, it's addressed by network first. So, Router D asks, "I need to find the 192.168.* network. Do I know it? Nope. Do I know someone else who does? Yes. Router C does. How do I get to Router C? Through my 2.2.2.2 interface."

Router D then sends the data to Router C. Router C gets it and says, "Oh, I have data from Router D but it's for the 192.168 network. Do I know that network? Yes, through my 192.168.1.1 interface." And then it forwards it.
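The lookup Router D performs can be sketched in a few lines. This is only an illustration of the "match the destination against known networks, prefer the most specific one" idea, using Python's standard `ipaddress` module. The table contents, the default route labelled "upstream" and the function name `lookup` are assumptions for the example, and it follows the simplification above of pretending the private 192.168.* network is advertised.

```python
import ipaddress
from typing import List, Optional, Tuple

# Each entry pairs a destination network with a description of the next hop.
Route = Tuple[ipaddress.IPv4Network, str]

# Hypothetical routing table for Router D.
routes_d: List[Route] = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),
    (ipaddress.ip_network("192.168.0.0/16"), "via Router C on interface 2.2.2.2"),
]


def lookup(table: List[Route], destination: str) -> Optional[str]:
    """Return the next hop for the most specific network containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


to_computer_a = lookup(routes_d, "192.168.2.2")
to_elsewhere = lookup(routes_d, "8.8.8.8")
```

Note that no port number appears anywhere in the lookup: the decision is made on the destination IP address alone.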

There's some other work to be done to resolve IP and MAC addressing, but I'm covering routing, per se, not ARP and local networking.

You'll notice your first assumption - the remote router must know the routing mechanism - does not come into play here. Router D does not care if Router C is using EIGRP, RIP, RIPv2, OSPF, or whatever. All it cares about is that it got an update. (Of course, how it got an update is important to ensure the two stay in sync. But again, that's a different issue.)

Your second assumption - that port number is a factor in routing - is also incorrect. Routers (generally) don't need port information to make routing decisions. (That has changed slightly due to some newer network technologies, mainly in firewalls and proxies, but the broader principle still applies to "true" routers.)

Continuing with your example, Router C will forward data on port 1000 (per your scenario) because it's possible there is a service on Computer A expecting data on that specific port. But it only knows to do so because Router D sent it on port 1000. And Router D only sends it on that port because the originator of the data sent it to Router D on that port.

I don't understand your inclusion of BitTorrent or P2P programs as reflective of the question you ask. The same explanations would apply. Routers can also be configured with port triggering, which associates a particular device (or IP) with a particular port, such that when traffic comes in on port 1234, the router knows to send the data to Device ABCD. This is usually associated with an outgoing TCP port, i.e. if I send traffic on port 7890, the router knows incoming traffic will be on port 1234 and sends it to me.

But port triggering is not associated with (remote) routing decisions - instead it relates to the internal MAC/IP table the router uses for the LAN.
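As a rough sketch of the port triggering behaviour described above, using the same port numbers (7890 outgoing, 1234 incoming): the class `TriggerRouter` and its method names are made up for the example, and a real router would also expire the opening after a timeout, which is left out here.

```python
from typing import Optional


class TriggerRouter:
    """Toy port-triggering rule: outgoing trigger_port opens incoming open_port."""

    def __init__(self, trigger_port: int, open_port: int) -> None:
        self.trigger_port = trigger_port
        self.open_port = open_port
        self.opened_for: Optional[str] = None  # LAN IP that set off the trigger

    def outbound(self, lan_ip: str, dest_port: int) -> None:
        """Watch outgoing traffic; remember who sent to the trigger port."""
        if dest_port == self.trigger_port:
            self.opened_for = lan_ip

    def inbound(self, dest_port: int) -> Optional[str]:
        """Return the LAN IP that should receive incoming traffic, if any."""
        if dest_port == self.open_port:
            return self.opened_for
        return None


router = TriggerRouter(trigger_port=7890, open_port=1234)
before = router.inbound(1234)          # nothing has triggered the rule yet
router.outbound("192.168.2.2", 7890)   # Computer A sends on the trigger port
after = router.inbound(1234)           # incoming 1234 now goes to Computer A
other = router.inbound(1235)           # other ports are unaffected
```

Everything here happens inside the local router; the remote side never learns about the rule, which is why it plays no part in remote routing decisions.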

Update/edit: To further answer and elaborate after your comment. Router D knows Computer A only by its IP address (192.168.2.2). But Router C knows Computer A by its IP address and by its MAC address. The MAC (Media Access Control) address is a unique (usually...) 48-bit identifier that is defined by international standard. Every device connected to a LAN (wired or wireless) is supposed to have a unique MAC address.

The router (Router C) associates the IP address and MAC address together in a table (its ARP table, which maps IP addresses to MAC addresses). So when traffic comes into Router C, and the router realizes the destination is "local" to it, it looks up the destination IP in that table. The router then literally changes the frame addressing information.

It reconstructs (rewrites) the Layer 2 destination information to have the destination MAC address of Computer A but keeps the IP address information (Layer 3) to be the same.

If the router does NOT know the MAC address, or does not have an IP-MAC relationship in its table, it sends something called an ARP (Address Resolution Protocol) request to ask, "HEY, everyone on this network. Who has this IP address?" The device that owns the address responds with its MAC address, and the router adds the pair to its IP-MAC table.
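The lookup-then-ask behaviour can be sketched as below. This is a simplified model, not real ARP code: the MAC addresses, the second host at 192.168.2.3, the `LAN_HOSTS` dictionary standing in for the broadcast, and all class and method names are assumptions made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical LAN: which MAC address answers for which IP address.
LAN_HOSTS: Dict[str, str] = {
    "192.168.2.2": "aa:bb:cc:00:00:02",  # Computer A (made-up MAC)
    "192.168.2.3": "aa:bb:cc:00:00:03",  # another host (made-up)
}


class RouterC:
    def __init__(self) -> None:
        self.ip_to_mac: Dict[str, str] = {}  # the router's IP-MAC table
        self.arp_requests = 0                # how many broadcasts were needed

    def arp(self, ip: str) -> Optional[str]:
        """Stand-in for broadcasting 'who has this IP?' on the LAN."""
        self.arp_requests += 1
        return LAN_HOSTS.get(ip)

    def resolve(self, ip: str) -> Optional[str]:
        """Use the table if possible; otherwise ask the LAN and remember."""
        if ip not in self.ip_to_mac:
            mac = self.arp(ip)
            if mac is None:
                return None
            self.ip_to_mac[ip] = mac
        return self.ip_to_mac[ip]

    def deliver(self, packet: dict) -> Optional[dict]:
        """Wrap the unchanged IP packet in a frame addressed to the right MAC."""
        mac = self.resolve(packet["dst_ip"])
        if mac is None:
            return None
        return {"dst_mac": mac, "packet": packet}


router = RouterC()
packet = {"src_ip": "2.2.2.2", "dst_ip": "192.168.2.2", "dst_port": 1000}
frame1 = router.deliver(packet)  # table is empty, so one ARP request is sent
frame2 = router.deliver(packet)  # answered from the table, no new request
```

Only the Layer 2 destination changes; the IP addresses and port inside the packet are passed along untouched.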


Port Triggering. How does a web server send a webpage to you after you've requested it? Because you've requested it. When you request it, the router knows to expect a reply and when it gets it, it forwards it to the appropriate PC. Some programs are written to trigger an opening in anticipation of a signal from a specific PC, even if one isn't really on its way.

Some models have a central server used for basic communication. For example:

  • Client1 signs in with Server for 2-way communications.
  • Client2 signs in for the same thing.

Server now knows all files that Client1 and Client2 have.

  • Client2 says "I want file X from Client1" to Server.
  • Server tells Client1 "Client2 wants file X."
  • Client1 sends a garbage piece of data to Client2's public IP, setting off the Port Triggering so that Client1's router opens up the port for a reply from Client2.
  • Client2 sends its initial signal to Client1's public IP.

Client1 just fooled the router into opening up that port for Client2.
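The exchange above can be modelled roughly as follows. This is a sketch under simplifying assumptions: each router is reduced to "accept incoming traffic only from addresses I have already sent to", ports are ignored, and the class names, method names and the documentation-range addresses 203.0.113.1 and 198.51.100.2 are all made up for the example.

```python
from typing import Dict, List, Set


class Nat:
    """Toy router: lets traffic in only from addresses it has sent to."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.expected: Set[str] = set()

    def send_to(self, remote_ip: str) -> None:
        """An outgoing packet makes the router expect a reply from remote_ip."""
        self.expected.add(remote_ip)

    def accepts_from(self, remote_ip: str) -> bool:
        return remote_ip in self.expected


class Server:
    """Central index: remembers which public IP offers which file."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def sign_in(self, public_ip: str, files: List[str]) -> None:
        for name in files:
            self.files[name] = public_ip

    def who_has(self, name: str) -> str:
        return self.files[name]


nat1 = Nat("203.0.113.1")   # Client1's router (made-up address)
nat2 = Nat("198.51.100.2")  # Client2's router (made-up address)
server = Server()
server.sign_in(nat1.public_ip, ["X"])  # Client1 signs in and lists file X
server.sign_in(nat2.public_ip, [])     # Client2 signs in with nothing to share

owner_ip = server.who_has("X")                      # Client2 asks for file X
blocked_before = nat1.accepts_from(nat2.public_ip)  # unsolicited: refused
nat1.send_to(nat2.public_ip)                        # Client1's garbage packet
allowed_after = nat1.accepts_from(nat2.public_ip)   # now treated as a reply
```

The garbage packet itself may never be delivered; its only job is to leave an entry in Client1's router.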

In some cases, such as BitTorrent or the original Napster (iirc), you have to forward a port on your router for it to work optimally.

As far as other clients knowing which port to connect to initially, it's because your client told the swarm or server which port you use. BitTorrent frequently uses a tracker and that keeps track of which ports are used by which clients.
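A tracker's bookkeeping can be pictured as a simple registry of (IP, port) pairs per torrent. The sketch below is an in-memory illustration only and does not implement the actual tracker protocol; the `Tracker` class, the `announce` signature, the torrent name and the addresses are assumptions for the example.

```python
from typing import Dict, List, Set, Tuple

Peer = Tuple[str, int]  # (public IP, listening port)


class Tracker:
    """Toy tracker: remembers which peer listens on which port."""

    def __init__(self) -> None:
        self.swarms: Dict[str, Set[Peer]] = {}

    def announce(self, torrent_id: str, ip: str, port: int) -> List[Peer]:
        """Register a peer and return the other peers already in the swarm."""
        swarm = self.swarms.setdefault(torrent_id, set())
        others = sorted(swarm - {(ip, port)})
        swarm.add((ip, port))
        return others


tracker = Tracker()
# The first client tells the tracker which port it listens on.
first = tracker.announce("some-torrent", "203.0.113.1", 6881)
# The second client learns the first client's IP and port from the tracker.
second = tracker.announce("some-torrent", "198.51.100.2", 51413)
```

This is why the port has to be reachable from outside: other peers connect to exactly the port your client reported.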