UDP Peering Source Port Problem #961
Description
Bug description
With a UDP neighbor, IRI appears to establish a "session" only when the source port of the neighbor's packets matches the port the neighbor was registered with.
Steps To Reproduce
Please see the two examples:
Example of a working UDP peering
Suppose our address is udp://80.61.xxx.xxx:14600
and our neighbor's is:
udp://185.10.xxx.xxx:14600
After adding the neighbor (and the neighbor adding us), we can run tcpdump. The packets originating from the neighbor have source port 14600:
22:54:13.734570 IP 185.10.xxx.xxx.14600 > 80.61.xxx.xxx.14600: UDP, length 1650
In this situation IRI establishes a "session" and the neighbors are communicating.
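For readers without tcpdump at hand, the source port of an incoming datagram can also be read directly from a socket. The following is a minimal sketch, not part of IRI; the helper names `read_source` and `loopback_demo` are made up for illustration, and the demo only sends a packet over loopback, where no address translation takes place.

```python
import socket


def read_source(sock):
    """Block until one datagram arrives and return its (source host, source port)."""
    _data, (src_host, src_port) = sock.recvfrom(4096)
    return src_host, src_port


def loopback_demo():
    """Send one datagram over loopback and return (port sent from, port seen)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
        receiver.settimeout(2.0)          # avoid hanging if nothing arrives
        sender.bind(("127.0.0.1", 0))     # fixed source port for this socket
        sender.sendto(b"ping", receiver.getsockname())
        _host, seen_port = read_source(receiver)
        return sender.getsockname()[1], seen_port


if __name__ == "__main__":
    sent_from, seen = loopback_demo()
    print(f"sent from port {sent_from}, receiver saw port {seen}")
```

Without translation in the path, both ports are the same. Binding the receiver to the node's UDP port on a real host (with IRI stopped) would show whatever source port a remote neighbor's packets arrive with.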
Example of a non-working UDP peering
If our neighbor's packets traverse a router that performs PAT (port address translation), we no longer see source port 14600 but some randomly assigned port:
22:54:13.734570 IP 185.10.xxx.xxx.30485 > 80.61.xxx.xxx.14600: UDP, length 1650
In this case, there's no "session" created and no communication between the neighbors.
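The difference between the two examples can be sketched as a lookup of the packet's source address against the registered neighbors. This is an illustration of the observed behaviour only, not IRI's actual code; `Neighbor`, `match_strict`, `match_host_only`, and the placeholder address 192.0.2.10 are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class Neighbor(NamedTuple):
    host: str
    port: int


# Placeholder neighbor, registered as udp://192.0.2.10:14600 (documentation address).
REGISTERED = [Neighbor("192.0.2.10", 14600)]


def match_strict(neighbors: List[Neighbor], src_host: str, src_port: int) -> Optional[Neighbor]:
    """Accept a packet only if both source host and source port match a neighbor."""
    for n in neighbors:
        if n.host == src_host and n.port == src_port:
            return n
    return None


def match_host_only(neighbors: List[Neighbor], src_host: str, src_port: int) -> Optional[Neighbor]:
    """Accept a packet if the source host matches; the source port is ignored."""
    for n in neighbors:
        if n.host == src_host:
            return n
    return None


if __name__ == "__main__":
    # Untranslated source port: both strategies find the neighbor.
    print(match_strict(REGISTERED, "192.0.2.10", 14600))
    # Translated source port: the strict match finds nothing.
    print(match_strict(REGISTERED, "192.0.2.10", 30485))
    print(match_host_only(REGISTERED, "192.0.2.10", 30485))
```

Matching on the host alone tolerates a translated source port, at the cost of no longer distinguishing several nodes behind the same address.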
The case with Docker
The problem described above is most noticeable when running IRI in Docker (not on the host network!).
When using the default Docker network or a user-defined network, masquerade is enabled by default. This means that the source IP of the container (normally on a private network such as 172.17.0.0/16, the default bridge) is translated to the IP of the host. The source port is translated as well, hence the problem.
Disabling masquerade on the network doesn't help, because the source IP then becomes that of the container rather than that of the host.
Spec
On what hardware is the node running?
- Node can be baremetal or VPS.
OS:
- Ubuntu, Debian or CentOS, all the same.
IRI version:
- latest (1.5.3) and earlier.
Expected behaviour
When the UDP source port is translated, IRI should still allow the neighbors to communicate (as is the case with TCP).
Actual behaviour
When the source port is translated, there is no communication between the neighbors.
Errors
No errors, just no communication.
Activity
cyclux commented on Aug 30, 2018
Thanks for this detailed issue report! I can confirm all the points mentioned; it's also easy to reproduce. Probably many of the UDP neighboring issues people have experienced are linked to this. The topology would also be healthier if there were not so many "incompatible" neighbors (especially considering Nelson).
jakubcech commented on Aug 30, 2018
Hey, thanks for reporting this!
legacycode commented on Sep 18, 2018
I have the same problem running IRI in a Docker container with a managed network. At the moment I am running on the host network.
legacycode commented on Sep 19, 2018
Found: moby/moby#15127 (comment)
cyclux commented on Sep 20, 2018
Seems related, yes; however, in the case of IRI it's primarily the source port that causes the trouble.
Is there any progress or insight on this yet?
nuriel77 commented on Sep 21, 2018
@legacycode I am not 100% sure this is related.
jakubcech commented on Nov 15, 2018
We decided we can remove both the UDP and TCP port checks from the peering logic. If anyone wants to take this on and open a new PR, they are more than welcome. We will get to this eventually, but right now local snapshots have priority. :)
Original PR with the change: https://github.com/iotaledger/iri/pull/206/files But there's also a DNS refresher thread spawned in Node.java that checks the port:
https://github.com/iotaledger/iri/blob/08c2cb9a0ff268001c77916c8d123956ddcbd889/src/main/java/com/iota/iri/network/Node.java#L122 and possibly more :)
Please also see the Contributing guidelines
jakubcech commented on Nov 15, 2018
To add to this, adding neighbors with a port should still be possible. Otherwise we'd have to redo validations, addNeighbour calls and so on. The port just won't be a requirement anymore and won't be relied on in the peering logic. If clients want to redo their dependencies to use port-less addresses, they can, but they don't have to.
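One way to picture "a port is allowed but not relied on" is a neighbor address whose port is optional and is left out of the comparison. The sketch below is only an illustration of that idea, not IRI code; `NeighborAddress`, `parse_neighbor`, `same_peer`, and the address 192.0.2.10 are hypothetical names and values chosen for the example.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class NeighborAddress(NamedTuple):
    scheme: str
    host: str
    port: Optional[int]  # None when the address was given without a port


def parse_neighbor(uri: str) -> NeighborAddress:
    """Parse udp://host[:port] or tcp://host[:port]; the port may be omitted."""
    parts = urlsplit(uri)
    if parts.scheme not in ("udp", "tcp") or parts.hostname is None:
        raise ValueError(f"not a valid neighbor URI: {uri!r}")
    return NeighborAddress(parts.scheme, parts.hostname, parts.port)


def same_peer(a: NeighborAddress, b: NeighborAddress) -> bool:
    """Compare two addresses on scheme and host only; the port is ignored."""
    return a.scheme == b.scheme and a.host == b.host


if __name__ == "__main__":
    with_port = parse_neighbor("udp://192.0.2.10:14600")
    without_port = parse_neighbor("udp://192.0.2.10")
    print(with_port, without_port, same_peer(with_port, without_port))
```

Existing configurations that include a port keep parsing as before, while port-less addresses become valid too.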
^ thoughts?