I have two time series and I want to find the lag at which the correlation between them is greatest. The underlying problem is to describe and model the relationship between the two series.

In signal processing, cross-correlation is a measure of the similarity of two series as a function of the lag of one relative to the other. It is also known as a sliding dot product or sliding inner product.

For discrete functions, the cross-correlation is defined as:
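
A sketch of the standard definition for two discrete sequences, written with the assumed symbols *f* and *g* (the overline denotes complex conjugation and can be dropped for real-valued series such as the ones used here):

```latex
(f \star g)[n] \;=\; \sum_{m=-\infty}^{\infty} \overline{f[m]}\, g[m+n]
```

Here *n* plays the role of the lag: each value of *n* slides *g* against *f* by *n* positions before the products are summed.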

In the relationship between two time series (*y*_{t} and *x*_{t}), the series *y*_{t} may be related to past lags of the *x*-series. The **sample cross correlation function (CCF)** is helpful for identifying lags of the *x*-variable that might be useful predictors of *y*_{t}.

In R, the sample **CCF** is defined as the set of sample correlations between *x*_{t+h} and *y*_{t} for *h* = 0, ±1, ±2, ±3, and so on.

A negative value for *h* is a correlation between the *x*-variable at a time before *t* and the *y*-variable at time *t*. For instance, consider *h* = −2. The CCF value would give the correlation between *x*_{t-2} and *y*_{t}.
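
To make the sign convention concrete, here is a minimal sketch on simulated data. The names `x_demo`, `y_demo` and the helper `manual_ccf` are made up for this illustration and are not part of R. The helper follows my understanding of how the sample estimate is formed (means and divisor *n* taken over the full series), so treat it as an approximation of what `ccf` does rather than its actual implementation. Because `y_demo` at time *t* is built from `x_demo` at time *t* − 2, the peak should appear at *h* = −2.

```r
set.seed(42)
n <- 200
x_demo <- rnorm(n)
# y_demo at time t copies x_demo at time t - 2, plus a little noise
y_demo <- c(0, 0, x_demo[1:(n - 2)]) + rnorm(n, sd = 0.1)

# Hypothetical helper (not part of R): sample correlation between
# a[t + h] and b[t], with means and divisor n taken over the full series
manual_ccf <- function(a, b, h) {
  n  <- length(a)
  ac <- a - mean(a)
  bc <- b - mean(b)
  t_idx <- seq(max(1, 1 - h), min(n, n - h))  # keep t and t + h inside 1..n
  (sum(ac[t_idx + h] * bc[t_idx]) / n) / sqrt(mean(ac^2) * mean(bc^2))
}

r <- ccf(x_demo, y_demo, lag.max = 5, plot = FALSE)
peak_lag <- r$lag[which.max(r$acf)]  # lag with the largest sample correlation
manual_at_peak <- manual_ccf(x_demo, y_demo, peak_lag)

print(peak_lag)
print(c(ccf = r$acf[r$lag == peak_lag], manual = manual_at_peak))
```

If the two printed correlations agree, the helper and `ccf` are using the same convention, namely the correlation between *x*_{t+h} and *y*_{t}.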

For example, let’s start with the first series, y1:

x <- seq(0,2*pi,pi/100)
length(x)
# [1] 201
y1 <- sin(x)
plot(x,y1,type="l", col = "green")

Adding series y2, with a shift of pi/2:

y2 <- sin(x+pi/2)
lines(x,y2,type="l",col="red")

Applying the **cross correlation function (CCF)**:

cv <- ccf(x = y1, y = y2, lag.max = 100, type = "correlation", plot = TRUE)

The maximum correlation occurs at a positive lag, that is, at a positive shift of the y1 series:

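# Collect the correlations and their lags from the ccf object
# (built inside data.frame so no objects named cor/lag mask base functions)
res <- data.frame(cor = cv$acf[, , 1], lag = cv$lag[, , 1])
# Lag at which the correlation is largest
res_max <- res$lag[which.max(res$cor)]
res_max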
# [1] 44

This means that the maximum sample correlation between series y1 and series y2 is found between y1_{t+44} and y2_{t}.
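
One way to sanity-check this result is to shift y1 by hand and compute the ordinary Pearson correlation over the overlapping part only. A phase shift of pi/2 at a step of pi/100 corresponds to 50 samples, so an exact alignment would be expected at lag 50. As far as I understand, the sample CCF divides by the full series length, which damps the estimate at larger lags and would explain why its peak lands a little earlier, at 44. The sketch below is self-contained, and `overlap_cor` is a hypothetical helper written for this check, not an R function.

```r
x  <- seq(0, 2 * pi, pi / 100)
y1 <- sin(x)
y2 <- sin(x + pi / 2)

# Hypothetical helper: Pearson correlation between a[t + h] and b[t],
# computed only over the samples where both exist (h >= 0)
overlap_cor <- function(a, b, h) {
  n <- length(a)
  cor(a[(1 + h):n], b[1:(n - h)])
}

c44 <- overlap_cor(y1, y2, 44)  # lag reported by ccf above
c50 <- overlap_cor(y1, y2, 50)  # lag implied by the pi/2 phase shift

print(c(lag44 = c44, lag50 = c50))
```

At lag 50 the two overlapping segments coincide, so this correlation should be 1 up to floating-point error, while the value at lag 44 is slightly lower.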
