Autocorrelation

Autocorrelation measures how strongly a series is correlated with its own past values, and therefore how much its history says about its future. The ACF plot is the first tool to reach for when choosing a forecasting model.

Ordinary correlation measures the relationship between two different variables. Autocorrelation measures the relationship between a variable and itself at a different point in time. It asks: if the value today is higher than usual, how much does that tell me about the value tomorrow? Next week? Next year?

The autocorrelation function (ACF)

For a time series x, the autocorrelation at lag k is the Pearson correlation between x(t) and x(t−k):

ACF(k) = Corr(x(t), x(t−k))

At lag 0, the correlation is always 1.0 — a series is perfectly correlated with itself. At lag 1, it measures whether knowing today's value helps predict tomorrow's. At lag 12 (for monthly data), it measures whether the value from the same month last year is informative.
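To make the definition concrete, here is a minimal sketch of the usual sample estimator, assuming NumPy is available. The function name sample_acf and the toy series are illustrative, not part of any library. Note one simplification that standard implementations share: they use the overall mean and the full-sample sum of squares for every lag, so the result differs slightly from a Pearson correlation computed on the lagged pairs alone.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    # Lag-k sum of products divided by the lag-0 sum, using the overall mean.
    return np.array([
        np.sum(centered[k:] * centered[:n - k]) / denom
        for k in range(max_lag + 1)
    ])

# Hypothetical toy series: a period-4 pattern repeated five times.
series = np.tile([1.0, 2.0, 3.0, 2.0], 5)
acf_values = sample_acf(series, max_lag=4)
print(np.round(acf_values, 3))
```

For this toy series the lag-4 value comes out at 0.8 rather than 1.0, even though the pattern repeats exactly every four steps: only 16 of the 20 observations have a partner four steps back, while the denominator sums over all 20.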

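The ACF plot shows ACF(k) for k = 0, 1, 2, …, K, where K is the largest lag plotted. Bars extending outside the confidence bands (typically about ±2/√n, where n is the number of observations) are statistically significant at roughly the 5% level. Under pure white noise each bar still has about a 1-in-20 chance of crossing the band, so an isolated crossing at an arbitrary lag is not strong evidence on its own.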
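The significance check can be done without a plot. The sketch below assumes NumPy, simulates a hypothetical AR(1) series with coefficient 0.8 (an arbitrary choice for illustration), and lists the lags whose sample autocorrelation falls outside the approximate ±2/√n band.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF for lags 0..max_lag using the overall mean (illustrative helper)."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    sums = [np.sum(c[k:] * c[:len(c) - k]) for k in range(max_lag + 1)]
    return np.array(sums) / np.sum(c ** 2)

rng = np.random.default_rng(0)
n = 400

# Hypothetical AR(1) series: x(t) = 0.8 * x(t-1) + noise(t)
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + noise[t]

max_lag = 20
acf_values = sample_acf(x, max_lag)

# Approximate 95% band under the white-noise assumption; 2 / sqrt(400) = 0.1.
band = 2 / np.sqrt(n)
significant_lags = [k for k in range(1, max_lag + 1) if abs(acf_values[k]) > band]

print(f"band = ±{band:.3f}")
print("significant lags:", significant_lags)
```

Lag 0 is skipped in the check because it is always 1.0 and carries no information.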

Reading ACF patterns

Slow, gradual decay across many lags indicates a trend. A trending series has high correlation at lag 1 (this month resembles last month) and still-significant correlation at lag 2 and beyond. The correlation decays slowly because the trend carries information across long windows. (Fast exponential decay, by contrast, is typical of a stationary autoregressive series.)

Spikes at regular intervals (lag 12, 24, 36 for monthly data; lag 7, 14 for daily data) indicate seasonality. The series is more similar to itself one period ago than to observations at non-seasonal lags.

Rapid decay to zero (significant only at lag 1, or not at all) indicates a stationary series with little memory — each observation is mostly noise relative to the previous one.
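The three patterns can be reproduced on synthetic data. This is a sketch under stated assumptions: NumPy is available, the series are invented for illustration (a linear trend, a period-12 sine wave, and white noise, each 120 points long), and sample_acf is an illustrative helper rather than a library function.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF for lags 0..max_lag using the overall mean (illustrative helper)."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    sums = [np.sum(c[k:] * c[:len(c) - k]) for k in range(max_lag + 1)]
    return np.array(sums) / np.sum(c ** 2)

rng = np.random.default_rng(1)
t = np.arange(120)  # ten years of hypothetical monthly observations

synthetic = {
    "trend": 0.5 * t + rng.normal(size=t.size),
    "seasonal": 10 * np.sin(2 * np.pi * t / 12) + rng.normal(size=t.size),
    "noise": rng.normal(size=t.size),
}

lags = [1, 6, 12, 24]
acf_by_name = {name: sample_acf(series, 24) for name, series in synthetic.items()}

print("lags:", lags)
for name, acf_values in acf_by_name.items():
    print(f"{name:>8}:", np.round(acf_values[lags], 2))
```

The trend series should stay clearly positive across all four lags, the seasonal series should swing negative at lag 6 and back to strongly positive at lags 12 and 24, and the noise series should hover near zero throughout.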

ACF and ARIMA parameter selection

The ARIMA family of models has three main parameters: p (autoregressive order), d (differencing order), and q (moving average order). The ACF, along with the partial ACF (PACF), provides the traditional way to choose these:

If the ACF decays slowly: difference the series (increment d) until the decay becomes rapid.
A sharp cut-off in the ACF at lag q, with significant spikes only up to that lag, suggests a moving-average term of order q.
The PACF (not covered here) is the complementary diagnostic for the autoregressive order p.
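The first two rules can be checked on simulated data. The sketch below assumes NumPy and uses two invented series: a random walk with drift, whose ACF decays slowly until it is differenced once, and an MA(1) process with coefficient 0.8, whose theoretical ACF is 0.8 / (1 + 0.8²) ≈ 0.49 at lag 1 and zero afterwards. The helper sample_acf is illustrative.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF for lags 0..max_lag using the overall mean (illustrative helper)."""
    c = np.asarray(x, dtype=float)
    c = c - c.mean()
    sums = [np.sum(c[k:] * c[:len(c) - k]) for k in range(max_lag + 1)]
    return np.array(sums) / np.sum(c ** 2)

rng = np.random.default_rng(2)

# Rule 1: a random walk with drift has a slowly decaying ACF;
# one round of differencing (d = 1) leaves something close to white noise.
walk = np.cumsum(0.1 + rng.normal(size=500))
acf_walk = sample_acf(walk, 10)
acf_diff = sample_acf(np.diff(walk), 10)

# Rule 2: an MA(1) process, x(t) = e(t) + 0.8 * e(t-1),
# shows one clear spike at lag 1 and a cut-off after it (q = 1).
e = rng.normal(size=2001)
ma1 = e[1:] + 0.8 * e[:-1]
acf_ma1 = sample_acf(ma1, 10)

print("random walk, lags 1-5:  ", np.round(acf_walk[1:6], 2))
print("differenced, lags 1-5:  ", np.round(acf_diff[1:6], 2))
print("MA(1), lags 1-5:        ", np.round(acf_ma1[1:6], 2))
```

On real data the cut-off is rarely this clean, which is one reason the PACF and information criteria are usually consulted alongside the ACF.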

In practice, automatic selection tools such as pmdarima.auto_arima handle the parameter search, but understanding what the ACF tells you is still necessary to diagnose a poorly fitting model.

statsmodels.graphics.tsaplots.plot_acf produces the ACF plot. In a non-graphical environment, statsmodels.tsa.stattools.acf returns the values as a numpy array — useful for printing or programmatic inspection.

Where to go next

Next: simple forecasting — implementing three forecasters (naïve, moving average, exponential smoothing) and measuring each with MAE to see how much the added complexity is worth.
