Cointegration vs Correlation

Why co-movement isn't equilibrium, and what changes when you switch from one to the other

The naive question

Two assets move together. The first instinct, learned early in any data course, is to compute the correlation. A high number — above 0.8, say — feels like proof of a stable relationship, and it is where most analyses stop.

The trouble starts when those two series are non-stationary. In 1974, Granger and Newbold ran a now-famous experiment: they generated pairs of independent random walks, regressed one on the other, and found significant coefficients with t-statistics above five most of the time. The $R^2$ values were routinely in the 0.7 to 0.9 range. The relationships did not exist. The regressions were spurious.
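
A minimal sketch of the experiment in Python (numpy and statsmodels; the sample length, trial count, and seed are illustrative choices, not Granger and Newbold's):

```python
# Regress one independent random walk on another, many times, and count
# how often the slope looks "significant" by the usual |t| > 2 rule.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, trials = 250, 1_000
significant = 0

for _ in range(trials):
    # Two independent I(1) series: cumulative sums of white noise.
    x = np.cumsum(rng.standard_normal(n))
    y = np.cumsum(rng.standard_normal(n))
    res = sm.OLS(y, sm.add_constant(x)).fit()
    if abs(res.tvalues[1]) > 2:
        significant += 1

# For independent random walks this fraction is large, not the nominal 5%.
print(f"spurious 'significant' slopes: {significant / trials:.0%}")
```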

The lesson is uncomfortable. The same arithmetic that detects a real economic link also fabricates one out of pure noise — whenever the inputs trend or wander. Correlation, applied to levels of integrated series, lies confidently. To get past the lie, we need to think about a different property entirely: not whether two series move together, but whether their difference is stable.

That property is cointegration.

Stationarity, the forgotten prerequisite

A weakly stationary process has a constant mean, a constant variance, and an autocovariance that depends only on lag, not on absolute time. Most textbook estimators — t-tests, OLS, the central limit theorem — quietly assume stationarity.

A series is integrated of order $d$, written $I(d)$, if it must be differenced $d$ times to become stationary. Returns of well-behaved assets are typically $I(0)$. The price levels themselves are usually $I(1)$: their first differences are stationary, but the levels are not.

Random walks are the textbook $I(1)$ example:

$$x_t = x_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d.}(0, \sigma^2)$$

The variance of $x_t$ grows with $t$, so the unconditional moments are not constant. The Augmented Dickey-Fuller (ADF) test formalises this. It estimates

$$\Delta x_t = \alpha + \beta t + \rho\, x_{t-1} + \sum_{i=1}^{p} \gamma_i\, \Delta x_{t-i} + \varepsilon_t$$

with the null hypothesis $\rho = 0$ — the series has a unit root, no mean-reversion, and is $I(1)$. Rejecting the null in favour of the one-sided alternative $\rho < 0$ says the series mean-reverts, at a rate set by $\rho$.
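
A sketch of the test with statsmodels on a stand-in $I(1)$ series; `regression="ct"` includes the constant and trend terms $\alpha$ and $\beta t$ from the regression above:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Stand-in price series: a pure random walk, so the test should NOT reject.
prices = np.cumsum(np.random.default_rng(1).standard_normal(500))

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(prices, regression="ct")
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
print("unit root not rejected: treat as I(1)" if pvalue > 0.05
      else "unit root rejected: looks stationary")
```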

Without this preliminary test, every downstream computation on the levels is suspect.

Key point: If a series has a unit root, its sample correlations with anything else are not what they seem. The first question before any regression on price levels is whether those levels are stationary.

Correlation: what it captures, and what it misses

The Pearson correlation between two series $x$ and $y$ is

$$\rho_{xy} = \frac{\operatorname{Cov}(x, y)}{\sqrt{\operatorname{Var}(x)\,\operatorname{Var}(y)}}$$

When $x$ and $y$ are both stationary, this number is interpretable: it captures the linear co-movement of the values. Most quants, however, compute correlations on returns, not levels — and that is the right thing to do, because returns of liquid assets tend to be stationary. Daily return correlation answers the question "when one moves up by 1%, by how much does the other tend to move?"

What correlation does not tell you is whether the levels stay tied together. Two return series can be 95% correlated and still have their cumulative levels drift arbitrarily far apart. There is no equilibrium being measured. There is only a co-movement of step sizes.
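
A small simulation makes the point concrete; the common-factor construction and parameters are assumptions for the demo:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
common = rng.standard_normal(n)                        # shared factor in the returns
ret_a = common + 0.2 * rng.standard_normal(n)
ret_b = common + 0.2 * rng.standard_normal(n) + 0.01   # tiny independent drift

# Returns are ~0.96 correlated, yet the levels separate roughly linearly in n.
print(f"return correlation: {np.corrcoef(ret_a, ret_b)[0, 1]:.2f}")
levels_a, levels_b = np.cumsum(ret_a), np.cumsum(ret_b)
print(f"final level gap: {levels_b[-1] - levels_a[-1]:.1f}")
```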

Two failure modes follow from this:

  1. High return correlation, divergent levels. Two assets respond to the same broad market factor and have correlated returns, but their long-run paths separate without bound. Hedging one with a fixed ratio of the other will lose money over time. The hedge ratio is unstable.

  2. Cointegrated levels, modest return correlation. Two assets share an equilibrium spread, but their daily moves are noisy enough that return correlation looks unimpressive. A naive screen by correlation will miss them, even though they are the only pair where a stationary spread exists.

Correlation is the right tool for portfolio variance and for linear factor analysis. It is the wrong tool for hedging or for arbitrage.

Cointegration: the long-run equilibrium

Engle and Granger's 1987 definition is precise. Two series $x_t$ and $y_t$, both $I(1)$, are cointegrated if there exists a constant $\beta$ such that

$$y_t - \beta\, x_t = e_t \quad \text{is} \quad I(0).$$

In words: a particular linear combination of two non-stationary series is stationary. The vector $(1, -\beta)$ is the cointegrating vector; the series $e_t$ is the spread.

The geometric picture matters. Each level wanders far from any starting point — that is what $I(1)$ means. But the gap between $y_t$ and $\beta\, x_t$ does not wander. It oscillates around a mean and revisits it on a finite timescale. Whatever common force drives the two levels to wander does so in lockstep, leaving the spread untouched.

Cointegration is therefore a statement about the long-run, not about today. Two perfectly cointegrated series can have low daily-return correlation — the random component of their daily moves can be uncorrelated — and still share a strict long-run anchor. Conversely, two highly correlated returns series can have no cointegration at all if no stable hedge ratio $\beta$ produces a stationary spread.

Key point: Correlation is a property of changes. Cointegration is a property of levels. They answer different questions and one does not imply the other.

Testing for cointegration

Engle-Granger two-step

The original procedure is direct:

  1. Estimate $\beta$ by ordinary least squares on the levels: $y_t = \alpha + \beta\, x_t + e_t$
  2. Apply an ADF test to the residuals $\hat{e}_t$.

If the residuals reject the unit-root null, the pair is cointegrated. Two practical caveats:

  1. The ordinary ADF critical values do not apply in step 2, because the residuals come from an estimated regression; the more conservative Engle-Granger (MacKinnon) critical values must be used instead.
  2. The procedure is not symmetric: in finite samples, regressing $y$ on $x$ can give a different verdict than regressing $x$ on $y$, even though cointegration itself is a property of the pair.
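
A minimal sketch of both steps on a synthetic cointegrated pair; statsmodels' `coint` bundles the two steps with the corrected critical values from the first caveat:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(3)
x = np.cumsum(rng.standard_normal(1_000))   # I(1) driver
y = 1.5 * x + rng.standard_normal(1_000)    # cointegrated with beta = 1.5

# Step 1: OLS on the levels estimates the hedge ratio.
ols = sm.OLS(y, sm.add_constant(x)).fit()
beta = ols.params[1]

# Step 2: ADF on the residuals. Because beta was estimated, the raw ADF
# p-value is too liberal; `coint` applies the corrected critical values.
raw_adf_p = adfuller(ols.resid, regression="c")[1]
t_stat, eg_p, _ = coint(y, x)

print(f"beta = {beta:.2f}, raw ADF p = {raw_adf_p:.4f}, Engle-Granger p = {eg_p:.4f}")
```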

Phillips-Ouliaris

The Phillips-Ouliaris test relaxes the assumption that the errors of the cointegrating regression are i.i.d. It uses non-parametric corrections for serial correlation and is generally preferred when the data are noisy at high frequency — common in crypto and intraday equities.

Johansen

The Johansen procedure is the multivariate generalisation. It works on a vector autoregression (VAR) of the levels and identifies the rank of the cointegrating space — that is, how many independent stationary linear combinations exist among $k$ series. The two test statistics are the trace test and the maximum-eigenvalue test, both built from the eigenvalues of a reduced-rank regression of the system.

Johansen pairs naturally with the Vector Error Correction Model (VECM):

$$\Delta y_t = \alpha\,(\beta' y_{t-1} - \mu) + \sum_{i=1}^{p-1} \Gamma_i\, \Delta y_{t-i} + \varepsilon_t$$

The matrix $\beta$ holds the cointegrating vectors; the matrix $\alpha$ holds the speeds of adjustment back to equilibrium. VECM is the right framework whenever there are more than two series and one wants to model the system jointly rather than pair by pair.
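
A sketch with statsmodels' `coint_johansen` on three synthetic series that share a single common trend, so the cointegrating rank should come out as 2:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(4)
driver = np.cumsum(rng.standard_normal(1_000))   # one common I(1) trend
levels = np.column_stack([
    1.0 * driver + rng.standard_normal(1_000),
    2.0 * driver + rng.standard_normal(1_000),
    0.5 * driver + rng.standard_normal(1_000),
])

# det_order=0: constant term; k_ar_diff=1: one lagged difference in the VAR.
result = coint_johansen(levels, det_order=0, k_ar_diff=1)
for r, (trace, cv95) in enumerate(zip(result.lr1, result.cvt[:, 1])):
    verdict = "reject" if trace > cv95 else "fail to reject"
    print(f"H0: rank <= {r}: trace = {trace:.1f}, 5% cv = {cv95:.1f} -> {verdict}")
```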

From statistics to strategy

The spread as an AR(1)

Cointegration on its own is a statistical fact. To trade it, the spread must mean-revert quickly enough to fit a holding period. Model the spread as an AR(1):

$$e_t = (1 + \lambda)\, e_{t-1} + \varepsilon_t, \qquad \lambda < 0.$$

Negative $\lambda$ pulls the spread back to zero. The expected value $h$ steps ahead is

$$\mathbb{E}[\, e_{t+h} \mid e_t \,] = (1 + \lambda)^h\, e_t.$$

The half-life — the horizon over which the expected spread shrinks to half its current value — solves $(1 + \lambda)^h = 1/2$:

$$h_{1/2} = -\frac{\ln 2}{\ln(1 + \lambda)}.$$

Half-life is the single most important filter in pair selection. A pair with a 600-day half-life cointegrates, statistically, but is useless for strategies on a swing horizon. A pair with a 30-minute half-life mean-reverts inside the bid-ask spread and is useless after costs. The usable band depends on the asset class and execution stack — for liquid crypto perpetuals, half-lives of a few hours to a few days tend to map well onto realistic holding periods.
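
A sketch of the estimate: regress the spread's change on its lagged level to recover $\lambda$, then apply the half-life formula. The synthetic spread (true $\lambda = -0.05$) is an illustrative choice:

```python
import numpy as np
import statsmodels.api as sm

def half_life(spread: np.ndarray) -> float:
    # Delta e_t = lambda * e_{t-1} + noise, so the OLS slope estimates lambda.
    delta = np.diff(spread)
    lam = sm.OLS(delta, sm.add_constant(spread[:-1])).fit().params[1]
    return -np.log(2) / np.log(1 + lam)   # h solving (1 + lambda)^h = 1/2

rng = np.random.default_rng(5)
e = np.zeros(2_000)
for t in range(1, len(e)):
    e[t] = 0.95 * e[t - 1] + rng.standard_normal()   # AR(1) with lambda = -0.05

print(f"estimated half-life: {half_life(e):.1f} steps")   # theory: ~13.5
```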

Pairs trading, briefly

The strategy is well-known: when the spread $e_t$ is far from its mean — say, more than two standard deviations — sell the rich leg and buy the cheap leg in the ratio $1 : \beta$. Close when the spread mean-reverts. The math is mostly bookkeeping; the difficulty is operational.
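
The bookkeeping, sketched below; the thresholds and the full-sample mean and standard deviation are simplifying assumptions that a live system would replace with rolling, point-in-time estimates:

```python
import numpy as np

def pair_signal(spread: np.ndarray, entry_z: float = 2.0, exit_z: float = 0.5) -> np.ndarray:
    """Position in the spread: +1 long the spread, -1 short, 0 flat."""
    z = (spread - spread.mean()) / spread.std()   # full-sample stats: lookahead, demo only
    pos = np.zeros(len(z))
    for t in range(1, len(z)):
        if pos[t - 1] == 0:
            if z[t] > entry_z:
                pos[t] = -1       # spread rich: sell y, buy beta * x
            elif z[t] < -entry_z:
                pos[t] = 1        # spread cheap: buy y, sell beta * x
        else:
            pos[t] = 0 if abs(z[t]) < exit_z else pos[t - 1]   # close near the mean
    return pos
```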

The dominant friction is slippage on the basket leg. Two assets in a hedge ratio of 1.34 must be traded simultaneously at the right relative price; missing one fill or lagging the other by seconds can convert an expected reversion into a realised loss. The expected reversion magnitude — driven by the spread's standard deviation and how far from the mean the entry occurs — must comfortably exceed the round-trip cost on both legs. Pairs that look beautiful in-sample frequently fail the cost test out-of-sample for exactly this reason.

Warning

A cointegration p-value of 0.001 is not a profit margin. Many statistically clean pairs are economically dead because round-trip slippage exceeds the spread's typical excursion.

Pitfalls in practice

Multiple testing

Scanning 1,000 candidate pairs at a significance level of $\alpha = 0.05$ produces around fifty false positives in expectation, even when no pair is genuinely cointegrated. Cointegration tests are no exception. Two corrections are standard:

  1. Bonferroni: test each pair at $\alpha / N$ rather than $\alpha$. It controls the family-wise error rate; simple, but conservative enough to cost real power on large universes.
  2. Benjamini-Hochberg: control the false discovery rate instead, accepting that a known fraction of the surviving pairs are false positives. Less conservative, and usually the better fit for a screening pipeline.

Without correction, a screen will find dozens of "cointegrated" pairs every week that are nothing of the sort.
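
Both corrections are one call in statsmodels. The uniform p-values below stand in for a screen over pure noise; roughly fifty survive uncorrected, nearly none survive either correction:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

pvals = np.random.default_rng(6).uniform(size=1_000)   # stand-in: no true pairs

reject_bonf, _, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
reject_bh, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

print(f"uncorrected 'hits': {(pvals < 0.05).sum()}")
print(f"Bonferroni survivors: {reject_bonf.sum()}, BH survivors: {reject_bh.sum()}")
```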

Structural breaks

Cointegration is a long-run property estimated on a finite window. A regime change — a protocol upgrade, a delisting, a new product, a macro shock — can break the equilibrium relationship without warning. The Gregory-Hansen test extends Engle-Granger with a single break point; Bai-Perron handles multiple breaks. For more flexibility, regime-switching models in the tradition of Hamilton (1989) treat the cointegrating relationship itself as state-dependent.

Hedge-ratio instability

The OLS estimate of $\beta$ from a fixed window is not a constant of nature. Re-estimating on a rolling window typically shows $\beta$ drifting on a meaningful timescale. A trading system that assumes a static hedge ratio is hostage to that drift. Walk-forward re-estimation, with periodic re-validation against a held-out window, is the minimum hygiene.
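
A sketch of the diagnosis with statsmodels' `RollingOLS`; the synthetic pair drifts by construction, and the 250-bar window is an illustrative choice:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.rolling import RollingOLS

rng = np.random.default_rng(7)
x = np.cumsum(rng.standard_normal(2_000))
y = 1.5 * x + np.cumsum(0.05 * rng.standard_normal(2_000))   # I(1) error => drifting beta

betas = RollingOLS(y, sm.add_constant(x), window=250).fit().params[:, 1]
print(f"rolling beta range: {np.nanmin(betas):.2f} to {np.nanmax(betas):.2f}")
```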

Survivorship bias

Pair-screening over a universe that excludes delisted assets — common in crypto, where coins disappear regularly — overstates the prevalence of stable cointegration. Any conclusion drawn from a survivorship-clean universe will not generalise to live trading.

When to use what

Use case                                                   | Correlation | Cointegration
Portfolio variance, risk budgeting                         | yes         | no
Linear factor exposure                                     | yes         | no
Long-run hedging of one asset by another                   | partial     | yes
Statistical arbitrage between related assets               | no          | yes
Rebalancing rules that assume mean-reversion of the spread | no          | yes
Spread-option pricing, convergence trades                  | no          | yes

The two are complementary. Correlation answers how much do these move together day to day. Cointegration answers do these stay tied together over months. A serious quant pipeline computes both and treats them as separate tools.

Summary

The contrast, condensed into one table.

Property                   | Correlation                             | Cointegration
Operates on                | Returns / changes                       | Levels
Stationarity assumption    | Inputs assumed $I(0)$                   | Inputs assumed $I(1)$, residual must be $I(0)$
Time horizon               | Short-run, instantaneous                | Long-run, equilibrium
Symmetric                  | Yes                                     | No (depends on hedge ratio $\beta$)
Standard test              | Pearson, Spearman                       | Engle-Granger, Phillips-Ouliaris, Johansen
Mean-reversion implication | None                                    | Spread is mean-reverting
Tradable as edge           | No (it is exposure to a common factor)  | Yes (the spread reverts)

References

Granger, C. W. J., and P. Newbold (1974). "Spurious Regressions in Econometrics." Journal of Econometrics 2(2), 111-120.

Engle, R. F., and C. W. J. Granger (1987). "Co-integration and Error Correction: Representation, Estimation, and Testing." Econometrica 55(2), 251-276.

Hamilton, J. D. (1989). "A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle." Econometrica 57(2), 357-384.

Phillips, P. C. B., and S. Ouliaris (1990). "Asymptotic Properties of Residual Based Tests for Cointegration." Econometrica 58(1), 165-193.

Johansen, S. (1991). "Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models." Econometrica 59(6), 1551-1580.

Gregory, A. W., and B. E. Hansen (1996). "Residual-Based Tests for Cointegration in Models with Regime Shifts." Journal of Econometrics 70(1), 99-126.

Bai, J., and P. Perron (1998). "Estimating and Testing Linear Models with Multiple Structural Changes." Econometrica 66(1), 47-78.
