4.4 Traffic-Correlation Attacks & Global Adversaries
Traffic-correlation attacks are among the most serious and widely studied threats to low-latency anonymity networks like Tor.
They do not rely on breaking encryption. Instead, they exploit timing and volume relationships between traffic entering and leaving the network.
This chapter explains:
what traffic correlation is
what a “global adversary” means in research
what has been demonstrated in practice
why perfect defenses are impossible
how systems reduce (but cannot eliminate) risk
A. What Is a Traffic-Correlation Attack?
At a high level, a traffic-correlation attack attempts to link:
who is sending traffic
with where that traffic ultimately goes
by comparing observable patterns such as:
packet timing
packet counts
burst shapes
flow durations
If two traffic streams look statistically similar over time, an adversary may infer they are related.
Key point:
Traffic can remain fully encrypted and still be vulnerable to correlation, because the attack uses metadata rather than content.
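To make "statistically similar over time" concrete, here is a minimal sketch on synthetic data. It bins packet timestamps into one-second counts and compares two flows with a Pearson correlation. The flow generator, the 0.1 s delay, the jitter, and the bin width are all illustrative assumptions, not measurements of any real network.

```python
import random

def bin_counts(timestamps, bin_width, duration):
    """Count packets per fixed-width time bin."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def bursty_flow(rng, duration, n_bursts=20, burst_size=15):
    """Synthetic flow (assumption): packets clustered around random burst times."""
    times = []
    for _ in range(n_bursts):
        start = rng.uniform(0, duration - 1)
        times.extend(start + rng.uniform(0, 0.5) for _ in range(burst_size))
    return sorted(times)

rng = random.Random(1)
DURATION = 60.0

entry = bursty_flow(rng, DURATION)
# The same flow seen at a second vantage point: fixed delay plus small jitter.
exit_same = [t + 0.1 + rng.uniform(0, 0.02) for t in entry]
# An independent flow with the same overall shape.
unrelated = bursty_flow(rng, DURATION)

a = bin_counts(entry, 1.0, DURATION)
score_related = pearson(a, bin_counts(exit_same, 1.0, DURATION))
score_unrelated = pearson(a, bin_counts(unrelated, 1.0, DURATION))
print(f"related: {score_related:.2f}  unrelated: {score_unrelated:.2f}")
```

In this toy setting the related pair scores far higher than the unrelated pair even though no payload is ever inspected. Real traffic is much noisier, so the gap is smaller in practice.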
B. The “Global Adversary” Concept
In anonymity research, a global adversary is a theoretical attacker who can:
observe large portions of the internet
monitor traffic at multiple points simultaneously
store traffic for long periods
perform large-scale statistical analysis
Examples used in papers include:
nation-state intelligence agencies
large ISPs cooperating
internet backbone observers
This is a worst-case threat model, not an everyday attacker.
C. Why Low-Latency Anonymity Is Vulnerable
Tor and similar systems are low-latency by design.
They aim to support interactive web use.
Low latency means:
packets are forwarded quickly
timing relationships are preserved
delays are minimized
Unfortunately, preserved timing enables correlation.
High-latency systems (mixnets) resist correlation better but sacrifice usability.
This is a fundamental trade-off.
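The trade-off can be illustrated with a small simulation, offered as a sketch under stated assumptions. One synthetic flow is forwarded two ways: with a fixed 0.1 s delay (a stand-in for low-latency relaying) and with an independent random delay per packet averaging 5 s (a stand-in for mix-style delaying). The delay values and the flow shape are hypothetical, and a real mixnet also mixes many users' messages, which this single-flow model ignores.

```python
import random

def bin_counts(timestamps, bin_width, n_bins):
    """Count packets per fixed-width time bin."""
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

rng = random.Random(7)

# Hypothetical input flow: 20 bursts of 15 packets within 60 seconds.
sent = []
for _ in range(20):
    start = rng.uniform(0, 59)
    sent.extend(start + rng.uniform(0, 0.5) for _ in range(15))

# Low-latency forwarding: every packet gets the same small delay.
low_latency_out = [t + 0.1 for t in sent]
# Mix-style forwarding: each packet gets its own exponential delay (mean 5 s).
mixed_out = [t + rng.expovariate(1 / 5.0) for t in sent]

N_BINS = 90  # one-second bins, with room for delayed packets
a = bin_counts(sent, 1.0, N_BINS)
corr_low_latency = pearson(a, bin_counts(low_latency_out, 1.0, N_BINS))
corr_mixed = pearson(a, bin_counts(mixed_out, 1.0, N_BINS))
print(f"low-latency: {corr_low_latency:.2f}  mixed: {corr_mixed:.2f}")
```

The random delays smear each burst across many bins, so the fine-grained timing signal weakens. The price is that the average packet now arrives seconds later, which is what makes interactive use painful.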
D. How Traffic Correlation Works (Conceptually)
Without going into mechanics, correlation relies on:
1. Observation at Entry
traffic leaving a user toward the network
2. Observation at Exit
traffic leaving the network toward a destination
3. Statistical Matching
comparing timing and volume patterns
ruling out unrelated flows
narrowing candidates over time
The longer the observation period, the stronger the signal.
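The effect of observation length can be sketched with a toy ranking experiment. One entry flow is compared against 51 candidate exit flows, exactly one of which is a noisy copy of it, and the true match is ranked using only the first few seconds versus the full window. All flows are synthetic per-second packet counts; the noise range, the number of decoys, and the window sizes are arbitrary assumptions.

```python
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
SECONDS = 120
N_DECOYS = 50

# Per-second packet counts seen on the entry side (synthetic).
entry = [rng.randint(0, 10) for _ in range(SECONDS)]
# The true exit-side flow: the same counts disturbed by noise.
true_exit = [max(0, c + rng.randint(-3, 3)) for c in entry]
# Unrelated flows with the same overall distribution.
decoys = [[rng.randint(0, 10) for _ in range(SECONDS)] for _ in range(N_DECOYS)]
candidates = [true_exit] + decoys  # index 0 is the true match

def rank_of_true_match(window):
    """Rank (1 = best) of the true match using only the first `window` seconds."""
    scores = [pearson(entry[:window], c[:window]) for c in candidates]
    true_score = scores[0]
    return 1 + sum(1 for s in scores[1:] if s > true_score)

ranks = {w: rank_of_true_match(w) for w in (5, 30, 120)}
print(ranks)
```

With very short windows, unrelated flows can score as well as the true match by chance. As the window grows, chance similarity averages out and the true match separates from the decoys, which is why long-lived flows carry more risk.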
E. Key Research Results
1. Murdoch & Zieliński (2007)
Showed that:
an observer at a major internet exchange point can see both ends of many paths
correlation remains possible even when only sampled traffic records are available
This demonstrated feasibility under strong assumptions.
2. Johnson et al. (2013)
Demonstrated:
correlation attacks improve with time
partial visibility still provides advantage
guard node design reduces exposure
This paper formalized many modern threat models.
3. Later Measurement Studies
Subsequent work showed:
real-world noise complicates attacks
accuracy is probabilistic, not absolute
scale and coordination are required
Research consistently emphasizes difficulty, not inevitability.
F. What Traffic-Correlation Attacks Do Not Do
Important clarifications:
They do not instantly reveal identities
They do not work reliably on short sessions
They do not bypass Tor’s cryptography
They do not guarantee correctness
Results are:
statistical
confidence-based
error-prone
This is why claims like “Tor is broken” are inaccurate.
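A short base-rate calculation shows why results are error-prone even when a classifier looks accurate. The sketch below applies Bayes' rule to a flagged match; the true-positive rate, false-positive rate, and candidate count are hypothetical numbers chosen for illustration only.

```python
def posterior_true_match(tpr, fpr, n_candidates):
    """Probability that a flagged pair is the real match (Bayes' rule).

    tpr: chance the real pair is flagged
    fpr: chance an unrelated pair is flagged
    n_candidates: number of equally likely candidate flows (exactly one is real)
    """
    prior = 1 / n_candidates
    return (tpr * prior) / (tpr * prior + fpr * (1 - prior))

# Hypothetical numbers: 95% detection, 0.1% false positives, 100,000 candidates.
p = posterior_true_match(tpr=0.95, fpr=0.001, n_candidates=100_000)
print(f"probability a flagged match is correct: {p:.4f}")
```

Under these assumptions fewer than 1 in 100 flagged matches is correct, because false positives from the large pool of unrelated flows outnumber the single true match. An adversary needs either a far lower false-positive rate or additional evidence to narrow the candidate set.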
G. Defenses Used in Practice
While no perfect defense exists, systems reduce risk through:
1. Entry Guards
Limit the number of relays that see user entry traffic, reducing exposure.
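The benefit of entry guards can be seen in a simplified probability model, sketched here under strong assumptions: an adversary controls a fixed fraction f of entry capacity, and each entry choice is independent. The value of f and the circuit count are hypothetical.

```python
def p_exposed_without_guards(f, circuits):
    """Chance that at least one of `circuits` fresh entry choices is adversarial."""
    return 1 - (1 - f) ** circuits

def p_exposed_with_guard(f, guard_rotations=1):
    """Chance that at least one guard chosen over time is adversarial."""
    return 1 - (1 - f) ** guard_rotations

f = 0.05  # hypothetical adversarial share of entry capacity
no_guard = p_exposed_without_guards(f, circuits=1000)
one_guard = p_exposed_with_guard(f)
print(f"new entry per circuit: {no_guard:.4f}  single guard: {one_guard:.4f}")
```

Picking a new entry relay for every circuit makes eventual exposure almost certain, while a persistent guard caps the risk near f for as long as that guard is kept. The cost is that an unlucky user with a bad guard is exposed on every circuit.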
2. Circuit Rotation
Short-lived circuits reduce long-term correlation.
3. Network Diversity
Geographic and administrative diversity complicates observation.
4. Padding Research
Adding cover traffic reduces signal quality (at performance cost).
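The cost side of padding can be sketched with an idealized constant-rate shaper: every time slot emits exactly the same number of packets, real ones when available and dummies otherwise. This is a textbook simplification and not a description of the padding any deployed system uses; the traffic trace and the rate are made up.

```python
def constant_rate_shaper(real_counts, rate):
    """Emit exactly `rate` packets per slot, filling gaps with dummy packets.

    Returns (observed_counts, dummy_packets, leftover_backlog).
    Real packets above the rate are queued, which adds delay.
    """
    backlog = 0
    dummies = 0
    observed = []
    for count in real_counts:
        backlog += count
        real_sent = min(backlog, rate)
        backlog -= real_sent
        dummies += rate - real_sent
        observed.append(rate)  # an observer sees the same volume in every slot
    return observed, dummies, backlog

# Hypothetical bursty trace: packets per one-second slot.
real = [0, 0, 12, 3, 0, 0, 0, 9, 1, 0]
observed, dummies, backlog = constant_rate_shaper(real, rate=4)
overhead = dummies / sum(real)
print(observed, dummies, backlog, f"overhead={overhead:.0%}")
```

The observed series is flat, so it carries no burst shape to match against. In exchange, this trace pays 15 dummy packets for 25 real ones and delays the bursts while they drain from the queue, which is the performance cost the text refers to.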
These measures raise the cost, not eliminate the threat.
H. Why Global Adversaries Are Hard to Model
In reality:
no single entity sees the entire internet
data collection is fragmented
legal, technical, and economic barriers exist
Research therefore uses upper-bound models to test resilience.
This ensures systems are designed conservatively.
I. Hidden Services and Traffic Correlation
Hidden services are exposed differently than clients:
no exit relay is used
rendezvous points are involved
both sides are inside the network
This reduces some risks but does not remove:
timing leakage
long-term correlation potential
Hence the emphasis on:
blinded keys
descriptor rotation
minimizing uptime patterns
J. Why This Is an Unsolved Problem
Traffic correlation remains unsolved because:
Timing is essential for usability
Reducing noise (delay, jitter, padding) improves user experience but sharpens the timing signal
Global observation is theoretically possible
Perfect cover traffic is impractical
This is why anonymity research continues exploring:
mixnets
batching
delay tolerance
hybrid architectures
K. Core Lessons from Traffic-Correlation Research
Anonymity is probabilistic, not absolute
Time strengthens adversaries
Usability and anonymity conflict
Raising cost is the realistic goal
Threat models must assume the worst
These lessons guide modern darknet engineering.