Friday, November 15, 2013

Cointegration Trading with Log Prices vs. Prices

In my recent book, I highlighted a difference between cointegration (pair) trading of price spreads and log price spreads. Suppose the price spread hA*yA-hB*yB of two stocks A and B is stationary. We should just keep the number of shares of stocks A and B fixed, in the ratio hA:hB, and short this spread when it is much higher than average, and long this spread when it is much lower. On the other hand, for a stationary log price spread hA*log(yA)-hB*log(yB), we need to keep the market values of stocks A and B fixed, in the ratio hA:hB, which means that at the end of every bar, we need to rebalance the shares of A and B due to price changes.
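
To make the difference in position sizing concrete, here is a minimal sketch. The function names, prices, and capital figures are hypothetical and chosen only for illustration. It assumes we are long the spread, i.e. long A and short B.

```python
def fixed_share_position(h_a, h_b, units=1.0):
    """Price-spread pair: hold SHARES in the ratio hA:hB.

    The share counts do not depend on current prices, so no
    rebalancing is needed while the position is open.
    """
    return units * h_a, -units * h_b


def fixed_value_position(h_a, h_b, price_a, price_b, capital_per_unit=1.0):
    """Log-price-spread pair: hold MARKET VALUES in the ratio hA:hB.

    The share counts depend on current prices, so they must be
    recomputed at the end of every bar.
    """
    shares_a = capital_per_unit * h_a / price_a
    shares_b = -capital_per_unit * h_b / price_b
    return shares_a, shares_b


# Hypothetical hedge ratios and two consecutive bars of prices.
h_a, h_b = 1.0, 2.0
bar_1 = (50.0, 20.0)
bar_2 = (55.0, 19.0)

shares_fixed = fixed_share_position(h_a, h_b)
shares_bar_1 = fixed_value_position(h_a, h_b, *bar_1)
shares_bar_2 = fixed_value_position(h_a, h_b, *bar_2)

print("price spread, shares (any bar):", shares_fixed)
print("log spread, shares at bar 1:   ", shares_bar_1)
print("log spread, shares at bar 2:   ", shares_bar_2)
```

In the log-price case the share counts change from bar 1 to bar 2 even though the target market values have not, which is the rebalancing described above.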

For most cointegrating pairs that I have studied, both the price spreads and the log price spreads are stationary, so it doesn't matter which one we use for our trading strategy. However, for an unusual pair whose log price spread is stationary but whose price spread is not (hat tip: Adam G. for drawing my attention to one such example), the implication is quite significant. A stationary price spread means that price differences are mean-reverting, whereas a stationary log price spread means that return differences are mean-reverting. For example, if stock A typically grows 2 times as fast as B, but has been growing 2.5 times as fast recently, we can expect the growth rate differential to decrease going forward. We would still short A and long B, but we would exit this position when the growth rates of A vs B return to a 2:1 ratio, and not when the price spread of A vs B returns to a historical mean. In fact, the price spread of A vs B should continue to increase over the long term.

This much is easy to understand. But thanks to a reader, Ferenc F., who referred me to a paper by Fernholz and Maguire, I realize there is a simple mathematical relationship that must hold between stocks A and B in order for their log prices to cointegrate.

Let us start with a formula derived by these authors for the change in log market value P of a portfolio of 2 stocks: d(logP) = hA*d(log(yA))+hB*d(log(yB))+gamma*dt.

The gamma in this equation is

gamma=1/2*(hA*varA + hB*varB), where varA is the variance of stock A minus the variance of the portfolio market value, and ditto for varB.

Note that this formula holds for a portfolio of any two stocks, not just when they are cointegrating. But if they are in fact cointegrating, and if hA and hB are the weights which create the stationary portfolio P, we know that d(logP) cannot have a non-zero long term drift term represented by gamma*dt. So gamma must be zero. Now in order for gamma to be zero, the covariance of the two stocks must be positive (no surprise here) and equal to the average of the variances of the two stocks. I invite the reader to verify this conclusion by expressing the variance of the portfolio market value in terms of the variances of the individual stocks and their covariance, and also to extend it to a portfolio with N stocks. This cointegration test for log prices is certainly simpler than the usual CADF or Johansen tests! (The price to pay for this simplicity? We must assume normal distributions of returns.)
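
For readers who want to check the condition numerically, here is a small sketch. It assumes that hA and hB are portfolio weights summing to one (the post does not state this explicitly), and the function name and input numbers are hypothetical. Under that assumption, substituting the portfolio variance into the expression for gamma gives gamma = 1/2*hA*hB*(varianceA + varianceB - 2*covariance), which vanishes when the covariance equals the average of the two variances.

```python
def excess_growth_rate(h_a, var_a, var_b, cov_ab):
    """gamma = 1/2 * (hA*varA + hB*varB), with varA and varB measured
    relative to the portfolio variance as in the text.

    Assumption: the weights sum to one, so hB = 1 - hA.
    """
    h_b = 1.0 - h_a
    # Variance of the portfolio in terms of the stock variances and covariance.
    var_p = h_a**2 * var_a + 2.0 * h_a * h_b * cov_ab + h_b**2 * var_b
    return 0.5 * (h_a * (var_a - var_p) + h_b * (var_b - var_p))


def closed_form(h_a, var_a, var_b, cov_ab):
    """The simplified expression 1/2 * hA * hB * (varA + varB - 2*cov)."""
    return 0.5 * h_a * (1.0 - h_a) * (var_a + var_b - 2.0 * cov_ab)


# Generic case: covariance differs from the average variance, gamma is non-zero.
gamma_generic = excess_growth_rate(0.4, 0.04, 0.09, 0.03)

# Special case: covariance equals the average of the variances, gamma is zero.
gamma_special = excess_growth_rate(0.4, 0.04, 0.04, 0.04)

print(gamma_generic, closed_form(0.4, 0.04, 0.09, 0.03))
print(gamma_special)
```

Note that varianceA + varianceB - 2*covariance is the variance of the difference between the two stocks' returns, so under this assumption the condition is the same as requiring that difference to have zero variance.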

===

My online Quantitative Momentum Strategies workshop will be offered on December 2-4. Please visit epchan.com/my-workshops for registration details.

28 comments:

Anonymous said...

Ernie,

You wrote,

"For example, if stock A typically grows 2 times as fast as B, but has been growing 2.5 times as fast recently, we can expect the growth rate differential to decrease going forward. We would still short A and long B, but we would exit this position when the growth rates of A vs B return to a 2:1 ratio, and not when the price spread of A vs B returns to a historical mean. In fact, the price spread of A vs B should continue to increase over the long term."

I believe this explanation is incorrect (no offense!).

Let's assume A and B have fixed long-term growth rates, but that each has an instantaneous growth rate that fluctuates randomly around the mean. If stock A typically grows twice as fast as stock B, then the log(A) price series will grow, on average, at twice the linear rate of the log(B) price series. So, viewed in log space, A and B will both tend to rise linearly with some fluctuations around the best fit line, but log(A)'s best fit line will have twice the slope of log(B)'s. Therefore, log(A) will diverge from log(B) over time, and therefore they cannot be cointegrated.

In order for log(A) and log(B) to be cointegrated, A and B must have the same long-term growth rate. Consider the example of two classes of common stock for a single company that trade on different exchanges in different countries. After accounting for forex effects, these two stocks must grow at the same rate since they fundamentally represent the same company. Therefore their log-price series will grow at the same rate and their log prices will be cointegrated. But over a very long period of time both price series should grow exponentially, so their raw price series will diverge, because even a small percentage difference between the two will correspond to a large absolute difference compared to their initial values.

- aagold (Adam G.)

Ernie Chan said...

Adam,

For 2 stocks with growth rates in the ratio of 2:1, we merely have to keep the ratio of their market values to be 1:2 so that their positions will have the same long-term growth rate.

As I wrote, if log prices are cointegrating, we need to constantly rebalance these positions so that their market values are always in 1:2 ratio.

Ernie

Anonymous said...

Ernie,

Let's separate the discussion of a trading strategy from the discussion of cointegration definition. Let's see if we can agree on the following definitions.

1) If stocks A and B are cointegrated in raw price space with hedge ratio h, then the difference A - h*B will fluctuate randomly around 0 with no drift.

2) If stocks A and B are cointegrated in log price space with hedge ratio h, then the difference log(A) - log(h*B) will fluctuate randomly around 0 with no drift.

My claim is that any two stocks A & B which satisfy definition #2 must have the same long-term growth rate. Consider this example: A=exp(alpha0*t) and B=exp(alpha1*t). The difference in their logs is (alpha0 - alpha1)*t, which has no drift only when alpha0 = alpha1 (i.e., same growth rate).

- Adam

Ernie Chan said...

Adam,
I believe your definition of cointegration of log prices is incorrect.

The spread in this case is defined as log(A)-h*log(B), not log(A)-log(h*B) as you wrote.

Ernie

Anonymous said...

Ernie,

Ok, I see your point. Defined this way, stocks with different growth rates can be cointegrated in log space. The hedge ratio h compensates for the different growth rates.

However, I think the most interesting real life examples where log prices are cointegrated, but raw prices are not, occur when h=1 (i.e., same growth rate). At least that's the case for any real-world examples I can think of.

Regards,
Adam
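
The point settled in the exchange above can be illustrated with a deterministic sketch that extends Adam's earlier example A=exp(alpha0*t), B=exp(alpha1*t). The growth rates below are hypothetical, and real prices would of course add noise around these trends.

```python
import math


def log_spread(price_a, price_b, h):
    """Log price spread log(A) - h*log(B), as defined in Ernie's reply."""
    return math.log(price_a) - h * math.log(price_b)


# Hypothetical growth rates: A grows twice as fast as B.
alpha_a, alpha_b = 0.10, 0.05
h = alpha_a / alpha_b  # the hedge ratio that offsets the growth differential

times = range(0, 11)
prices_a = [math.exp(alpha_a * t) for t in times]
prices_b = [math.exp(alpha_b * t) for t in times]

# With h = 2 the spread has no drift; with h = 1 it grows linearly in t.
spread_hedged = [log_spread(a, b, h) for a, b in zip(prices_a, prices_b)]
spread_unit = [log_spread(a, b, 1.0) for a, b in zip(prices_a, prices_b)]

print("h = 2:", [round(s, 6) for s in spread_hedged])
print("h = 1:", [round(s, 6) for s in spread_unit])
```

Since exp(log(A) - h*log(B)) equals A/B^h, a log spread that returns to its mean is the same event as the ratio A/B^h returning to its mean.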

Anonymous said...

Ernie,

Sorry if I'm beating this to death, but in the example you cited with A and B the hedge ratio would be 2. So the log spread would be log(A) - 2*log(B).

You wrote we would exit this position when the growth rate of A reverts to 2 times the growth rate of B, but I think this is incorrect. We should exit the position when the *ratio* A/B^2 reverts to its historical mean (which is equivalent to the log spread returning to its historical mean).

The analogous statement for raw prices is, we would exit the position when the *difference* A - 2*B reverts to its historical mean.

- Adam

Ernie Chan said...

Adam,
You are right that I was being imprecise when I said entry/exit signals should be based on differential "growth rates". By growth rates, I don't mean the instantaneous growth rates d(log(P))/dt, but the average growth rate log(P)/t, where t is the time since some distant past at the beginning of our backtest period. Since t is the same for both stocks, differences in average growth rates are essentially the same as differences in log prices.
Ernie

Anonymous said...

Ernie,

On the topic of Kelly leverage: I'm following the example in your book (Quantitative Trading, p. 99), and for SPY I calculate a leverage of 21.08 using the last 252 returns. Is that also what you get?

Ernie Chan said...

Anon,
Yes, that's about what I get too.
Ernie

Anonymous said...

Anon,

I certainly hope you're not planning on investing on SPY with a leverage factor of 21.08! You know that would be absolutely insane, right? Even half-Kelly at 10.5 would be insane.

It might be ok to estimate future variance using the past 252 daily returns, but it's certainly not correct to estimate future expected returns that way.

I use a half-Kelly model to determine how much stock market exposure I should have with my real-life portfolio, and right now it's saying I should be 70% in the US stock market and 30% in cash. My estimate of the market's future daily mean return is 6.05% per year, annualized standard deviation is 15.4%, and risk-free interest rate is 2.71%.

- Adam
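
The 70% figure in the comment above can be reproduced with a short sketch, assuming the formula intended is the usual Kelly leverage f = (m - r)/s^2, where m is the expected annual return, r the risk-free rate, and s the annualized standard deviation. The function name is hypothetical; the inputs are the estimates quoted in the comment.

```python
def kelly_leverage(mean_return, risk_free_rate, volatility):
    """Kelly leverage f = (m - r) / s^2 with annualized inputs."""
    return (mean_return - risk_free_rate) / volatility**2


# Estimates quoted in the comment: 6.05% return, 2.71% risk-free, 15.4% volatility.
full_kelly = kelly_leverage(0.0605, 0.0271, 0.154)
half_kelly = 0.5 * full_kelly

print(f"full Kelly leverage: {full_kelly:.3f}")
print(f"half Kelly leverage: {half_kelly:.3f}")
```

Half-Kelly comes out at roughly 0.70, consistent with holding about 70% in the stock market and 30% in cash.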

Anonymous said...

Hi Ernie,

It seems Yahoo Finance has started to provide real-time tick data (no delay).
Have you heard any good news or comments about that?

Ernie Chan said...

Hi Anon,
Indeed Yahoo Finance now offers real-time data, but only from Nasdaq. So if a stock like IBM is primarily traded on NYSE, the Nasdaq price may be slightly different.
Ernie

Anonymous said...

Hi Ernie,

Is it OK to use IB's real-time data stream for pairs trading in US markets?

Anonymous said...

Hi Ernie
After reading your books, I found a pair of futures contracts traded on the Shanghai Futures Exchange that looks great for extracting roll returns. The question is: if one is in backwardation and the other is in contango, how do you determine the hedge ratio between them? Are we supposed to balance their spot return fluctuations? Right now I optimize the backtest Sharpe ratio to determine the ratio. Any advice?
Ruan Xun

Ernie Chan said...

Hi Anon,
I find the real-time feed of IB too noisy for pair trading stocks. I recommend IQFeed or even Yahoo Realtime instead.
Ernie

Ernie Chan said...

Hi Ruan,
You can use linear regression on their prices (or log prices) to determine the optimal hedge ratio.

You will always be long the one in backwardation, and short the one in contango if you want to extract roll return.

Ernie
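
As an illustration of the regression approach suggested above, here is a minimal sketch using ordinary least squares written in plain Python. The function name and the price series are hypothetical. Regressing the prices of contract A on those of contract B gives a slope h that serves as the hedge ratio for the spread A - h*B; to work with log prices, pass the logs of both series instead.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical prices for two individual futures contracts.
prices_b = [10.0, 11.0, 12.0, 13.0, 14.0]
prices_a = [21.1, 22.9, 25.0, 27.1, 28.9]

hedge_ratio, intercept = ols_slope_intercept(prices_b, prices_a)
spread = [a - hedge_ratio * b for a, b in zip(prices_a, prices_b)]

print(f"hedge ratio: {hedge_ratio:.4f}")
print("spread:", [round(s, 4) for s in spread])
```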

Anonymous said...

Hi Ernie
So should I use the continuous futures prices of these 2 contracts, or should I use the 2 underlying spot prices instead?


Ernie Chan said...

Ruan,
You should not use continuous futures, nor spot prices.
You should be using individual futures contracts to test for roll returns.
Ernie

Anonymous said...

Ernie,

I tested the price spread A - h*B that you discussed in your book for stationarity, and I find that for a lot of pairs the result keeps flipping between true and false if I retest the pair every month (with varying lookback windows). Even a stable pair like GLD-GDX that you mentioned isn't very stationary from month to month. How should one interpret this result?

Ernie Chan said...

Anon,
Stationarity tests should involve at least 1 year of daily data. Are you saying that even with 1 year of lookback, the test statistic changes greatly from month to month?
Ernie

Anonymous said...

Yes. Using a 1-year lookback and testing for stationarity every month, I notice that stationarity flips from true to false and vice versa for the majority of the pairs.
Is this something you notice too?

Ernie Chan said...

Anon,
Usually a stationarity test is more stable than that. Perhaps increasing your lookback to 3 years would help. If not, it simply indicates those pairs are not really stationary.
Ernie

Paul said...

Hi Ernie

Do you use fundamental data such as PE, ROE, etc.? Any good service/API recommendation? Thanks

Paul

Ernie Chan said...

Hi Paul,
I don't currently use fundamental data. But I believe you can scrape the Yahoo Finance website for such info. IB's API also provides such data.
Ernie

Anonymous said...

Have you tried Quantum Mechanics trading?!
http://arxiv.org/abs/1307.6727

Ernie Chan said...

Hi Anon,
Thanks for the article. I find this article lacking in empirical support.
Ernie

Anonymous said...

Hi Ernie

I just read the paper of
"Optimal Pairs Trading: A Stochastic Control Approach"

http://www.nt.ntnu.no/users/skoge/prost/proceedings/acc08/data/papers/0479.pdf

As I'm not familiar with the Ornstein–Uhlenbeck process and its application to pairs trading, I would like to seek your opinion.

Isn't it true that the OU process can model the spread and its mean-reverting behaviour dynamically in continuous time, whereas the cointegration approach cannot? The weakness of the OU process, however, is that it does not tell us the weight of each stock in the pair. Thus, we have to use the stochastic control approach to get these weights, but then we have to set a final time T at which we close the position. The cointegration approach, by contrast, explicitly gives the weight of each stock in the pair.

Ernie Chan said...

Hi Anon,
Indeed the OU process does not tell you the optimal hedge ratio. It is a model of one mean-reverting time series, not the cointegration of several series. The only use for an OU model for me is to extract the halflife of mean reversion from the regression coefficient.
Ernie
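
For reference, the half-life calculation mentioned above can be sketched as follows. This assumes the common discretization of the OU model in which the change y(t) - y(t-1) is regressed on y(t-1), with the half-life approximated by -ln(2)/lambda, where lambda is the regression coefficient. The function name and the test series are hypothetical.

```python
import math


def ou_halflife(series):
    """Estimate the half-life of mean reversion of a single spread series.

    Fits dy(t) = lam * y(t-1) + mu by ordinary least squares and
    returns -ln(2) / lam. Returns infinity when lam >= 0, i.e. when
    the fit shows no mean reversion.
    """
    lagged = series[:-1]
    delta = [curr - prev for prev, curr in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(delta) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    lam = sxy / sxx
    if lam >= 0:
        return math.inf
    return -math.log(2) / lam


# Hypothetical noiseless spread that decays 10% per bar toward zero.
spread = [0.9**t for t in range(50)]
halflife = ou_halflife(spread)

print(f"estimated half-life: {halflife:.2f} bars")
```

The input is one spread series that has already been formed with some hedge ratio, which matches the remark above that the OU model describes a single mean-reverting series rather than the cointegration of several.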