### A simulation-based Bayes' procedure for robust prediction of pairs trading strategies

Lukasz T. Gatarek, Lennart F. Hoogerheide and Herman K. Van Dijk

February 8, 2011

We propose a new simulation method to estimate the cointegration model with nonnormal disturbances in a nonparametric Bayesian framework, in order to present a robust prediction of some alternative trading strategies. We apply the theory of Dirichlet processes to estimate the distribution of disturbances in the form of an infinite mixture

Introduction

The motivation for statistical arbitrage techniques has its roots in works that argue for the predictability of stock prices and the existence of long-term relations in stock markets. This literature challenges the stylized fact in financial economics that stock prices should be described by independent random walk processes, which would automatically imply no predictability in stock prices. The key references in this area are [Lo and MacKinlay, 1988], [Lo, 1991], [Lo and MacKinlay, 1992] and [Guidolin, 2009]. Based on these empirical investigations, trading strategies might be formed to exploit the inefficiencies of the stock markets.

[Khadani, 2007] consider a specific strategy, first proposed by [Lehmann, 1990] and [Lo and MacKinlay, 1990], that can be analyzed directly using individual U.S. equity returns. Given a collection of securities, they consider a long/short market-neutral equity strategy consisting of an equal dollar amount of long and short positions, where at each rebalancing interval, the long positions consist of losers (underperforming stocks, relative

to come back to equilibrium, which is zero. [Gatev et al., 2006] show the performance of this arbitrage rule over a period of 40 years and find strong empirical evidence in favour

The crucial step in building the pairs trading strategy is the local estimation of both the current and the expected spread. In the framework of cointegration analysis, the spread is modeled as the local deviation from the long-term equilibrium among the time series. The current spread between the assets is therefore computed as the product of the cointegrating vector and the current stock prices, while the expected spread is estimated as the product of the cointegrating vector and the predicted stock prices. The spread prediction rests on the assumption of a sound cointegration relation between the pair of assets. To sum up, the pairs trading technique is based on the assumption that the linear combination of prices (scaled by the cointegrating vector) reverts to zero, and a trading rule can be constructed to exploit the expected temporary deviations.
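The two spread computations described above can be illustrated with a minimal Python sketch. The cointegrating vector, the prices, the forecast and the `signal` rule are all made-up assumptions for illustration; they are not taken from the paper, and the paper's actual trading rules may differ.

```python
import numpy as np

def spread(beta, prices):
    """Spread as the product of the cointegrating vector and a price vector."""
    return float(np.dot(beta, prices))

# Hypothetical cointegrating vector for one pair (normalised on the first asset).
beta = np.array([1.0, -0.5])

current_prices = np.array([10.0, 19.0])    # made-up observed prices
predicted_prices = np.array([10.1, 19.8])  # made-up model forecast

current_spread = spread(beta, current_prices)     # 10.0 - 0.5 * 19.0
expected_spread = spread(beta, predicted_prices)  # 10.1 - 0.5 * 19.8

def signal(current, expected):
    """Illustrative rule (an assumption): trade only if the spread is
    expected to move toward zero; -1 = short the spread, +1 = long, 0 = none."""
    if abs(expected) >= abs(current):
        return 0  # no expected reversion toward equilibrium
    return -1 if current > 0 else 1

print(current_spread, expected_spread, signal(current_spread, expected_spread))
```

Here the spread is positive and forecast to shrink, so the illustrative rule shorts the spread, that is, it sells the first asset and buys the second in the proportions given by `beta`.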

The problems concerning the implementation of this technique may have two main

The paper is organized as follows.

1 Preliminaries

In order to test the profitability of a pairs trading strategy, we need to identify long-term relations in stock prices. We therefore apply the cointegration model; see [Juselius, 2006]. The distributions of stock market returns are typically nonnormal, so the t-distribution and other fat-tailed distributions are usually applied. In the case of pairs trading, we try to identify cointegrating relations in a huge universe of assets, and it might be incorrect to assume a common distribution of returns across different stocks. We propose a general algorithm to estimate the cointegration model in a Bayesian way under nonnormality. The outline of such an algorithm is as follows:

4. Standardize the residuals and construct the artificial time series $y_t$ according to

$$y_t = \hat{y}_t + \frac{\varepsilon_{t,j} - \mu_{t,j}}{\sqrt{V_{t,j}}}.$$

5. Go to Step 1 using the artificial time series.
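The standardization in Step 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `y_hat` is read as the fitted values of the cointegration model, and `mu` and `var` as the mean and variance of the mixture component assigned to each residual. The function name and all numbers are hypothetical and do not come from the paper.

```python
import numpy as np

def artificial_series(y_hat, resid, mu, var):
    """Add standardized residuals to the fitted values, elementwise over t:
    y_t = y_hat_t + (eps_t - mu_t) / sqrt(V_t)."""
    y_hat, resid, mu, var = (np.asarray(a, dtype=float)
                             for a in (y_hat, resid, mu, var))
    if np.any(var <= 0):
        raise ValueError("component variances must be positive")
    return y_hat + (resid - mu) / np.sqrt(var)

# Made-up inputs for three time points.
y_hat = np.array([1.0, 2.0, 3.0])   # assumed fitted values
resid = np.array([0.5, -1.0, 2.0])  # residuals from the fitted model
mu = np.array([0.0, 0.0, 1.0])      # assumed component means per observation
var = np.array([0.25, 4.0, 1.0])    # assumed component variances per observation

y_art = artificial_series(y_hat, resid, mu, var)
print(y_art)
```

After this transformation the disturbances of the artificial series have, conditional on the component assignments, zero mean and unit variance, which is what allows the next pass of the algorithm to treat them as standard Normal.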

The challenge in constructing such an algorithm lies in finding an accurate method to select the number of components in the mixture of Normal distributions. The components of this mixture differ in each repetition of the algorithm, so a flexible method for estimating the mixture is needed.

We propose to model this distribution as a Dirichlet Process Mixture (DPM), a mixture with a countably infinite number of components. Owing to this property, the technique is more flexible than a finite-order mixture model, which specifies the number of components ex ante. For a general introduction to modelling via Dirichlet processes refer to