
The thing with the multiple hops

Posted: Thu Feb 01, 2024 3:37 pm
by lgillis
I have a whimsical question about hops. For those who have no idea but would like to join in, hops are the things that separate us from each other for supposed security reasons. This works according to the wisdom that if I get into mischief with I2P, someone else whose IP is tangible is liable. The principle is described in detail on Echelon's I2P homepage, or at least it was at the time.

By default, we have 3 hops on each side and in each direction. So everyone hides behind a number of other participants. So far so familiar. How does the opponent recognize when the file sharer Han Solo uses only 1 or no hops instead of 3?

Actually a question that should be easy to answer, right?

Re: The thing with the multiple hops

Posted: Thu Feb 01, 2024 7:32 pm
by cumlord
i think it partly has to do with the number of nodes that would need to be compromised in the network to associate such traffic with an ip, as could be done with a sybil attack

with 1 hop, fewer nodes need to be compromised to correlate traffic with the target ip. that hop would know the sender and receiver ip on that tunnel. so the chances of this happening are greater than with 3 hops.

otherwise the attacker would need to have control of both ends of the tunnel, so would need to control a higher % of tunnels to increase the probability of a successful attack.

Re: The thing with the multiple hops

Posted: Sat Feb 03, 2024 12:38 am
by anikey
Tl;dr at the end.
lgillis wrote: Thu Feb 01, 2024 3:37 pm How does the opponent recognize when the file sharer [...] uses only 1 or no hops instead of 3?

Actually a question that should be easy to answer, right?
Well, actually, according to the documentation, they aren't supposed to learn the length of the tunnel.
And i quote (from here: http://i2p-projekt.i2p/en/docs/tunnels/implementation): (emphasis mine)
The tunnel's creator selects exactly which peers will participate in the tunnel, and provides each with the necessary configuration data. They may have any number of hops. It is the intent to make it hard for either participants or third parties to determine the length of a tunnel, or even for colluding participants to determine whether they are a part of the same tunnel at all (barring the situation where colluding peers are next to each other in the tunnel).
Re: zero-length tunnels

But an attacker could theoretically determine whether one has a zero-length tunnel by observing the end of the tunnel (which is the only thing visible to an external observer). If it changes, it's probably not zero-length. If it doesn't change, there's a chance that the tunnel is zero-length (but they can't be properly sure that it is -- this is called 'plausible deniability'). This can also be used, to some extent, for a performance boost: mix in some zero-length tunnels among longer tunnels. But that would only be suitable if you need really basic anonymity - it is not equivalent to having longer tunnels.

Another thing is that outbound and inbound tunnels are a bit different. Inb. tunnels are listed in a leaseset, and they are clearly visible for the other end (the other end would in this case be the sender of packets), as can be demonstrated by the java i2p webconsole, which if memory serves me right, lists all the inb.gateways of a destination somewhere in its 'netdb search' section.
Outbound tunnels, however, would be a bit more difficult to look at, since they might not be listed anywhere. According to my understanding of I2P network (not talking about the existing implementations), you don't even have to have a defined 'pool' of outbound tunnels, since the tunnels (for the purpose they are serving) don't need to be recorded anywhere (besides of course the owner of those tunnels), but i don't know how easy it would be to detect the router hash of the outb. endpoint of a message (i haven't researched the protocols extensively enough to know that).

Variable tunnel build messages (5 hops and more)

There is one caveat mentioned deep down in the docs, though. It only really matters for people who use tunnels with more than 4 participants (i think there shouldn't be many of them, i even think that most people wouldn't go beyond 3 hops), but here is what it says (source: http://i2p-projekt.i2p/spec/tunnel-creation):
... The newer Variable Tunnel Build Message ([VTBM]) contains 1 to 8 records. The originator may trade off the size of the message with the desired amount of tunnel length obfuscation.

In the current network, most tunnels are 2 or 3 hops long. The current implementation uses a 5-record VTBM to build tunnels of 4 hops or less, and the 8-record TBM for longer tunnels.
So as i understand it, it means that tunnel participants (if they assume you're running a standard i2p router, i.e. the 'current implementation') would be able to understand that the tunnel they're participating in fits one of these two cases:
a) it is 4 hops or less,
b) it is 5 hops or more,
and they can't really be sure because they can't be sure that the tunnel creator is running standard, unmodified I2P.
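
To make the two cases concrete, here is a minimal sketch of the guess a tunnel participant could make from the number of records in the build message. The function name `infer_hop_range` and its return shape are hypothetical and exist only for illustration. The sketch assumes the originator behaves like the 'current implementation' quoted above, and as noted, a modified router could break that assumption.

```python
def infer_hop_range(record_count):
    """Guess a tunnel's hop range from the build message record count.

    Returns (min_hops, max_hops, looks_standard).
    Assumption: a standard router sends 5 records for tunnels of
    4 hops or less and 8 records for longer tunnels, as quoted above.
    """
    if not 1 <= record_count <= 8:
        raise ValueError("build messages carry 1 to 8 records")
    if record_count == 5:
        # case a) 4 hops or less
        return (1, 4, True)
    if record_count == 8:
        # case b) 5 hops or more (8 records is the upper bound)
        return (5, 8, True)
    # Any other count is not what the quoted implementation sends,
    # so the only safe bound is "no more hops than records".
    return (1, record_count, False)


for records in (5, 8, 3):
    print(records, infer_hop_range(records))
```

The third element is only a hint: a participant cannot verify that the creator runs unmodified software, so even the "standard" guesses stay guesses.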

However, as i said earlier, many people surely wouldn't use that many hops (right??) and as such this should not be as big of a deal as it may seem.

In conclusion (and as a tl;dr) i would like to say that the network is supposed to hide the tunnel length, and it does exactly that in most cases.

Re: The thing with the multiple hops

Posted: Sat Feb 03, 2024 4:09 pm
by lgillis
I like your explanations. (Let's wait and see if there are any misunderstandings about the explanation that outgoing and incoming tunnels should be treated a little differently. ;-) )

I quickly drew a picture to roughly illustrate the tunnel length thing, the basics of slowness so to speak:

Link: I2P Tunnel and Hops.webp, Type: image/webp, Size: 99.13 KB

Re: The thing with the multiple hops

Posted: Sat Feb 03, 2024 6:42 pm
by anikey
Don't forget that one can have multiple inbound and outbound tunnels, and the packets can (at least theoretically) be spread across all of them, to combine throughput and speed of multiple tunnels.
I'm not sure whether they do that in practice or not. That probably depends on the implementation.

Re: The thing with the multiple hops

Posted: Mon Feb 05, 2024 10:05 am
by lgillis
Yes, but although the participants can set the number of tunnels they own, they have no influence on their use. It is therefore not possible to determine whether and when which tunnels are used and which data packet is sent through which tunnel. The path of the data is therefore not traceable. The hops used in the tunnel (the other I2P nodes, not their quantity) are also determined by the software. As a result, the same hops are often found in the various tunnels, both in the inbound and outbound tunnels, which does not necessarily promote personal confidentiality. Perhaps this will change with the number of I2P participants?

If you now want to test the throughput of the respective router implementations, as cumlord suggested, the 1-in/out tunnel configuration with 0 hops would be the basis. As soon as several hops per tunnel are used, the average performance of the peer network is tested. (The graphic above should also illustrate this.)

We can therefore assume that average file sharers, i.e. participants who download and then switch off their BitTorrent client or are only active on their free weekends, do not expose themselves to any major insecurity with a one-hop tunnel. On the other hand, clients that are in continuous use should consider various points of attack, precisely because of the time available to the attackers.

By the way, I think in 2022 "Sybil detectors" were installed in the routers, among other things. But I can't say whether and to what extent this helps with file sharing.

The next time you dear people out there want to download single or multidigit gigabyte torrents, you might want to take this information into account. Don't forget to restart the bt-client.

Re: The thing with the multiple hops

Posted: Mon Feb 05, 2024 11:08 pm
by anikey
lgillis wrote: Mon Feb 05, 2024 10:05 am The hops used in the tunnel (the other I2P nodes, not their quantity) are also determined by the software. As a result, the same hops are often found in the various tunnels, both in the inbound and outbound tunnels, which does not necessarily promote personal confidentiality. Perhaps this will change with the number of I2P participants?
Participants of tunnels are indeed determined by the i2p router. More details here: http://i2p-projekt.i2p/en/docs/how/peer-selection.
Quoting from there:
To reduce the susceptibility to some attacks, and increase performance, peers for building client tunnels are chosen randomly from the smallest group, which is the "fast" group. There is no bias toward selecting peers that were previously participants in a tunnel for the same client.
By the way, there is a defined order for tunnel participants:
Peers are ordered within tunnels to deal with the predecessor attack (2008 update). More information is on the tunnel page.
lgillis wrote: Mon Feb 05, 2024 10:05 am By the way, I think in 2022 "Sybil detectors" were installed in the routers, among other things. But I can't say whether and to what extent this helps with file sharing.
I only know about sybil detection in java i2p. Unfortunately i can't find info about it on the i2p project site.
You can explore it somewhere in webconsole.

Re: The thing with the multiple hops

Posted: Tue Feb 06, 2024 9:43 am
by cumlord
lots of good info here, no idea if or how the number of tunnels could be discovered. Had me wondering about what the probabilities were for some form of predecessor/sybil attack leading to de-anonymization, since both ends of the tunnel are the most important for this even if the attacker has no idea how many hops there are

tl;dr: Fewer hops offer more protection than i thought, and for some users, like the average weekend torrent downloader as lgillis said, it's probably enough. It's safer to run things as long as possible to prevent peer order needing to reload. There is generally a much lower probability of these kinds of attacks de-anonymizing than i thought, even if there are thousands of compromised routers online doing surveillance (i think, anyway, i'm making lots of assumptions here)

used this formula to find the probability of an attacker having nodes at both ends of a tunnel, like "-C-X-X-X-X-C-". not 100% sure if it's right but here it is:


p of tunnel order = (compromised routers / total routers)^hops in path
tunnel combinations = 2^(hops_total - 2)
corrected p = (tunnel combinations * p of tunnel order) / tunnel combinations
Assumptions:
  • 45k i2p routers
  • router order is random chance of all available routers (although it's not in reality-i have no idea how to account for that)
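
For anyone who wants to play with the numbers, here is a minimal Python sketch that transcribes the formula exactly as posted. The function names are made up for illustration, and the sketch assumes "hops in path" and "hops_total" mean the same count, which the post does not state explicitly. It makes no claim that the formula itself is the right model.

```python
def p_tunnel_order(compromised, total, hops_in_path):
    """(compromised routers / total routers) ^ hops in path."""
    return (compromised / total) ** hops_in_path


def tunnel_combinations(hops_total):
    """2 ^ (hops_total - 2): each hop between the two ends is C or X."""
    return 2 ** (hops_total - 2)


def corrected_p(compromised, total, hops_total):
    """Corrected p as posted.

    Assumption: hops_in_path == hops_total.
    Note: as written, the combination count is multiplied in and then
    divided out again, so the result equals p_tunnel_order.
    """
    p = p_tunnel_order(compromised, total, hops_total)
    combos = tunnel_combinations(hops_total)
    return (combos * p) / combos


TOTAL_ROUTERS = 45_000  # assumption taken from the list above

for compromised in (450, 2_250, 4_500):
    p = corrected_p(compromised, TOTAL_ROUTERS, hops_total=6)
    print(f"{compromised} compromised routers: p = {p:.3e}")
```

Changing `hops_total` shows how strongly the exponent dominates the result under the random-selection assumption.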
http://o7jgnp7bubzdn7mxfqmghn3lzsjtpgkb ... uters2.png
Image
Top:
if both parties have 3 hop tunnels, what is p (in %) of this happening compared to # of compromised routers?

Middle:
probability (in %) of bad tunnel with varying number of hops used and # of compromised routers. i assumed there would be a reason why 3 is the default (7 hops), does not seem to change probability much until heavy burden of compromised routers. lower hop tunnels offer more protection than i thought

Bottom:
assume users restart the service weekly, with varying amounts of "low-grade" network surveillance occurring, which would be unsurprising in my opinion even as java router makes this kind of attack more expensive. I intended to get a more realistic view of the probability of this kind of attack being used successfully over time as data is gathered by an attacker to de-anonymize.

For up to 10% network compromise with those assumptions, the p for 1 yr appears to be between 0-1%, which seems manageable for most use cases and could theoretically be lowered.

As part of defense against predecessor attack i2p keeps the peer order for each pool, which is only reset on restart or tunnel pool start. So as long as things aren't restarting all the time, peer order is stable, so long-term probability of attacker succeeding this way is low
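
The effect of restarts over time can be sketched in a few lines of Python. This is only an illustration under an added assumption that is not from the calculation above: each new peer order is treated as an independent draw with the same per-order probability, so the chance of at least one bad order after n new orders is 1 - (1 - p)^n. The function name and the example numbers are hypothetical.

```python
def p_over_time(p_per_order, new_orders):
    """Probability of at least one bad peer order in `new_orders` draws.

    Assumption: every new peer order is an independent draw with the
    same probability p_per_order of being compromised.
    """
    if not 0.0 <= p_per_order <= 1.0:
        raise ValueError("p_per_order must be between 0 and 1")
    return 1.0 - (1.0 - p_per_order) ** new_orders


# Example inputs (illustrative only): 10% of routers compromised and
# a per-order probability of 0.1 ** 6, compared for two restart habits.
p_single = 0.1 ** 6
for label, orders_per_year in (("weekly restart", 52), ("daily restart", 365)):
    yearly = p_over_time(p_single, orders_per_year)
    print(f"{label}: {yearly * 100:.4f} % per year")
```

The fewer new peer orders per year, the smaller the exponent, which is why letting services run for a long time keeps the long-term probability low.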

Takeaways:
Seems like to minimize this sort of attack potential you'd want to limit the number of times a new peer order is chosen:
  • let services run as long as possible
  • limit number of routers
  • make sure routers are stable for long periods
  • have as much cover traffic as possible to create false positives for the attacker
  • more hops past the default 3 will not offer much more protection, unless the network is under a very large scale attack

Re: The thing with the multiple hops

Posted: Wed Feb 07, 2024 7:37 am
by lgillis
Whoa, difficult math for computer scientists … 8-)
anikey wrote: Mon Feb 05, 2024 11:08 pm I only know about sybil detection in java i2p. Unfortunately i can't find info about it on the i2p project site.
You can explore it somewhere in webconsole.
I can't find anything about Sybil or other scenarios in the I2Pd source code or documentation. From the discussion "i2pd specific attack?" it seems to me that no software solution has been found.

Literature link: Alachkar and Gaastra, Blockchain-based Sybil Attack Mitigation: A Case Study of the I2P Network (2018).pdf
Type: application/pdf, Size: 469.05 KB
(The authors try to explain how a Sybil attack works.)

Re: The thing with the multiple hops

Posted: Sat Feb 10, 2024 12:56 am
by anikey
Actually, there is an anonymity network (lokinet) different from I2P, that uses a blockchain and a crypto coin for preventing sybil attacks.
It's more like tor in the sense of there being two levels of nodes (clients and 'service nodes' like tor relays)
But instead of using centralized directory servers they put a blockchain that makes all 'service node' operators stake an amount of coins so that their node gets included in the network participation (i.e. routes traffic).

Their point is that if an attacker wants to control lots of hops, in their network the attacker would need to get a lot of coins, too, to stake them, because otherwise they wouldn't be able to make lots of nodes.

My problem with this kind of network (whether tor or lokinet) is that it has an explicit "bottleneck" in the form of a second level of participants (tor relays/lokinet service nodes) and if there are suddenly many users increasing, that bottleneck will get choked and the network would slow down. (They want to push freaking video calls over that lokinet!) I2P should scale because everyone routes traffic for everyone.

Oh, and "cover traffic" or something. (and one that is also useful for others).