4.4 Traffic-Correlation Attacks & Global Adversaries

Traffic-correlation attacks are among the most serious and widely studied threats to low-latency anonymity networks like Tor.
They do not rely on breaking encryption. Instead, they exploit timing and volume relationships between traffic entering and leaving the network.

This chapter explains:

  • what traffic correlation is

  • what a “global adversary” means in research

  • what has been demonstrated in practice

  • why perfect defenses are impossible

  • how systems reduce (but cannot eliminate) risk


A. What Is a Traffic-Correlation Attack?

At a high level, a traffic-correlation attack attempts to link who is sending traffic with where that traffic ultimately goes,

by comparing observable patterns such as:

  • packet timing

  • packet counts

  • burst shapes

  • flow durations

If two traffic streams look statistically similar over time, an adversary may infer they are related.

Key point:
The content can remain fully encrypted and still be vulnerable; correlation needs only metadata (timing and volume), not plaintext.


B. The “Global Adversary” Concept

In anonymity research, a global adversary is a theoretical attacker who can:

  • observe large portions of the internet

  • monitor traffic at multiple points simultaneously

  • store traffic for long periods

  • perform large-scale statistical analysis

Examples used in papers include:

  • nation-state intelligence agencies

  • large ISPs cooperating

  • internet backbone observers

This is a worst-case threat model, not an everyday attacker.


C. Why Low-Latency Anonymity Is Vulnerable

Tor and similar systems are low-latency by design: they forward traffic quickly enough to support interactive web use.

Low latency means:

  • packets are forwarded quickly

  • timing relationships are preserved

  • delays are minimized

Unfortunately, preserved timing enables correlation.

High-latency systems (mixnets) resist correlation better but sacrifice usability.

This is a fundamental trade-off.


D. How Traffic Correlation Works (Conceptually)

Without going into mechanics, correlation relies on:

  1. Observation at Entry

    • traffic leaving a user toward the network

  2. Observation at Exit

    • traffic leaving the network toward a destination

  3. Statistical Matching

    • comparing timing and volume patterns

    • ruling out unrelated flows

    • narrowing candidates over time

The longer the observation period, the stronger the signal.
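The matching step above can be sketched with a toy example: bin each flow's packet timestamps into per-second counts and compare the count series with a correlation coefficient. The flow shapes, bin size, and latency shift below are invented for illustration; real attacks use far more robust statistics.

```python
import random

def binned_counts(timestamps, bin_size=1.0, n_bins=60):
    """Bin packet timestamps into per-interval packet counts."""
    counts = [0] * n_bins
    for t in timestamps:
        i = int(t / bin_size)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

def pearson(x, y):
    """Plain Pearson correlation coefficient of two count series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

random.seed(7)
# Entry-side flow: bursts of packets at a few invented instants.
entry = [random.uniform(b, b + 0.4)
         for b in (2, 2, 5, 5, 5, 9, 20, 21, 40) for _ in range(3)]
# Exit-side flow: the same bursts, shifted by ~0.3 s of network latency.
exit_match = [t + 0.3 for t in entry]
# An unrelated flow with its own independent timing.
exit_other = [random.uniform(0, 60) for _ in range(len(entry))]

r_match = pearson(binned_counts(entry), binned_counts(exit_match))
r_other = pearson(binned_counts(entry), binned_counts(exit_other))
print(f"related flow:   r = {r_match:.2f}")
print(f"unrelated flow: r = {r_other:.2f}")
```

Even through encryption and relaying, the related pair scores far higher than the unrelated one, which is exactly the signal a correlating observer exploits.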


E. Key Research Results

1. Murdoch & Zieliński (2007), “Sampled Traffic Analysis by Internet-Exchange-Level Adversaries”

Showed that:

  • timing variations can leak information

  • congestion effects can amplify signals

This demonstrated feasibility under strong assumptions.


2. Johnson et al. (2013), “Users Get Routed”

Demonstrated:

  • correlation attacks improve with time

  • partial visibility still provides advantage

  • guard node design reduces exposure

This paper formalized many modern threat models.


3. Later Measurement Studies

Subsequent work showed:

  • real-world noise complicates attacks

  • accuracy is probabilistic, not absolute

  • scale and coordination are required

Research consistently emphasizes difficulty, not inevitability.


F. What Traffic-Correlation Attacks Do Not Do

Important clarifications:

  • They do not instantly reveal identities

  • They do not work reliably on short sessions

  • They do not bypass Tor’s cryptography

  • They do not guarantee correctness

Results are:

  • statistical

  • confidence-based

  • error-prone

This is why claims like “Tor is broken” are inaccurate.


G. Defenses Used in Practice

While no perfect defense exists, systems reduce risk through:

1. Entry Guards

Each client pins a small, long-lived set of entry relays, limiting how many relays can ever observe its entry traffic.

2. Circuit Rotation

Short-lived circuits reduce long-term correlation.

3. Network Diversity

Geographic and administrative diversity complicates observation.

4. Padding Research

Adding cover traffic reduces signal quality (at performance cost).

These measures raise the attacker’s cost; they do not eliminate the threat.
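The padding trade-off in point 4 can be made concrete: topping every interval up to a constant rate with dummy packets flattens the timing signal entirely, but at a real bandwidth cost. The per-second counts and target rate below are invented for illustration.

```python
import statistics

# Per-second packet counts of a bursty flow (invented numbers).
bursty = [12, 0, 0, 9, 14, 0, 1, 0, 11, 0]

# Constant-rate padding: each second, add dummy packets until the interval
# carries exactly RATE packets (assumes real traffic never exceeds RATE).
RATE = 15
padded = [max(c, RATE) for c in bursty]

print("variance before padding:", statistics.pvariance(bursty))
print("variance after padding: ", statistics.pvariance(padded))
print(f"bandwidth overhead: {sum(padded) / sum(bursty):.1f}x")
```

After padding the count series has zero variance, so there is no burst shape left to correlate, but the link carries roughly three times the real traffic volume, which is why full cover traffic remains impractical at network scale.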


H. Why Global Adversaries Are Hard to Model

In reality:

  • no single entity sees the entire internet

  • data collection is fragmented

  • legal, technical, and economic barriers exist

Research therefore uses upper-bound models to test resilience.

This ensures systems are designed conservatively.


I. Hidden Services and Traffic Correlation

Hidden services are exposed differently than clients:

  • no exit relay is used

  • rendezvous points are involved

  • both sides are inside the network

This reduces some risks but does not remove:

  • timing leakage

  • long-term correlation potential

Hence the emphasis on:

  • blinded keys

  • descriptor rotation

  • minimizing uptime patterns


J. Why This Is an Unsolved Problem

Traffic correlation remains unsolved because:

  1. Timing is essential for usability

  2. Reducing noise (jitter, delay) improves user experience but sharpens the timing signal

  3. Global observation is theoretically possible

  4. Perfect cover traffic is impractical

This is why anonymity research continues exploring:

  • mixnets

  • batching

  • delay tolerance

  • hybrid architectures
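The batching idea can be sketched with a toy threshold mix: it buffers incoming messages and releases them only as a shuffled batch, so arrival order (the timing signal) is destroyed at the cost of latency. The class and threshold here are illustrative, not any deployed design.

```python
import random

class ThresholdMix:
    """Toy threshold mix: buffer messages and flush them as one shuffled
    batch once `threshold` have arrived, hiding their arrival order."""
    def __init__(self, threshold, seed=None):
        self.threshold = threshold
        self.pool = []
        self.rng = random.Random(seed)

    def submit(self, msg):
        """Queue a message; return the flushed batch, or None while buffering."""
        self.pool.append(msg)
        if len(self.pool) >= self.threshold:
            batch, self.pool = self.pool, []
            self.rng.shuffle(batch)
            return batch
        return None

mix = ThresholdMix(threshold=4, seed=42)
out = None
for m in ["a", "b", "c", "d"]:
    out = mix.submit(m) or out   # first three submits just buffer
print("flushed batch:", out)
```

A message submitted first may leave last, so an observer at the mix's output learns nothing from timing alone; the price is that every message waits for the batch to fill, which is the latency cost low-latency systems like Tor refuse to pay.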


K. Core Lessons from Traffic-Correlation Research

  1. Anonymity is probabilistic, not absolute

  2. Time strengthens adversaries

  3. Usability and anonymity conflict

  4. Raising cost is the realistic goal

  5. Threat models must assume the worst

These lessons guide modern darknet engineering.

