In the first article on successful backtesting we discussed statistical and behavioural biases that affect our backtest performance. We also discussed software packages for backtesting, including Excel, MATLAB, Python, R and C++. In this article we will consider how to incorporate transaction costs, as well as certain decisions that need to be made when creating a backtest engine, such as order types and frequency of data.
One of the most prevalent beginner mistakes when implementing trading models is to neglect (or grossly underestimate) the effects of transaction costs on a strategy. Though it is often assumed that transaction costs only reflect broker commissions, there are in fact many other ways that costs can be accrued on a trading model. The three main types of cost that must be considered are commissions and fees, slippage, and market impact together with the bid-ask spread.
The most direct transaction costs incurred by an algorithmic trading strategy are commissions and fees. All strategies require some form of access to an exchange, either directly or through a brokerage intermediary (“the broker”). These services incur an incremental cost with each trade, known as commission.
Brokers generally provide many services, although quantitative algorithms only really make use of the exchange infrastructure. Hence brokerage commissions are often small on a per-trade basis. Brokers also charge fees, which are costs incurred to clear and settle trades. In addition, there are taxes imposed by regional or national governments. For instance, in the UK there is a stamp duty to pay on equities transactions. Since commissions, fees and taxes are generally fixed, they are relatively straightforward to implement in a backtest engine (see below).
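As a minimal sketch of how these fixed costs might be combined for a single trade, consider the function below. The name `fixed_trade_cost`, its parameters and all of the default rates are illustrative assumptions, not the schedule of any particular broker or tax authority.

```python
def fixed_trade_cost(quantity, price, commission=1.0, fee_per_unit=0.005, tax_rate=0.0):
    """Return the commission, fees and tax for one trade, in account currency.

    All default values are hypothetical placeholders.
    """
    notional = abs(quantity) * price      # traded value, regardless of side
    fees = abs(quantity) * fee_per_unit   # clearing/settlement fee per unit traded
    tax = notional * tax_rate             # transaction tax as a fraction of notional
    return commission + fees + tax


# 100 units at 50.0 with an illustrative 0.5% transaction tax
example_cost = fixed_trade_cost(quantity=100, price=50.0, tax_rate=0.005)
```

In a backtest, a function of this kind would be called once per simulated fill and its result subtracted from the trade's profit and loss.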
Slippage is the difference in price achieved between the time when a trading system decides to transact and the time when a transaction is actually carried out at an exchange. Slippage is a considerable component of transaction costs and can make the difference between a very profitable strategy and one that performs poorly. Slippage is a function of the underlying asset volatility, the latency between the trading system and the exchange and the type of strategy being carried out.
An instrument with higher volatility is more likely to be moving and so prices between signal and execution can differ substantially. Latency is defined as the time difference between signal generation and point of execution. Higher frequency strategies are more sensitive to latency issues and improvements of milliseconds on this latency can make all the difference towards profitability. The type of strategy is also important. Momentum systems suffer more from slippage on average because they are trying to purchase instruments that are already moving in the forecast direction. The opposite is true for mean-reverting strategies, as at the time of the trade the price is moving in the direction opposing the trade.
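One hedged way to quantify the slippage described above is to compare the price at the moment of decision with the price actually achieved, signed so that a positive number is always a cost. The function names and the `side` convention (+1 for a buy, -1 for a sell) are assumptions made for this sketch.

```python
def slippage_per_unit(decision_price, fill_price, side):
    """Slippage per unit traded; positive means the fill was worse than the decision price.

    side: +1 for a buy, -1 for a sell (a convention assumed for this sketch).
    """
    return side * (fill_price - decision_price)


def slippage_bps(decision_price, fill_price, side):
    """The same quantity expressed in basis points of the decision price."""
    return 1e4 * slippage_per_unit(decision_price, fill_price, side) / decision_price
```

A buy decided at 100.00 and filled at 100.05 therefore shows roughly 5 basis points of slippage, as does a sell decided at 100.00 and filled at 99.95.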
Market impact is the cost incurred to traders due to the supply/demand dynamics of the exchange (and asset) through which they are trying to trade. A large order on a relatively illiquid asset is likely to move the market substantially as the trade will need to access a large component of the current supply. To counter this, large block trades are broken down into smaller “chunks” which are transacted periodically, as and when new liquidity arrives at the exchange. On the opposite end, for highly liquid instruments such as the S&P500 E-Mini index futures contract, low volume trades are unlikely to adjust the “current price” in any great amount.
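The idea of breaking a block trade into smaller chunks can be sketched very simply. The function `slice_order` and its fixed maximum child size are assumptions for illustration only; real execution algorithms typically size and time each chunk according to the liquidity available.

```python
def slice_order(total_quantity, max_child_size):
    """Split a parent order into child orders no larger than max_child_size."""
    full_chunks, remainder = divmod(total_quantity, max_child_size)
    children = [max_child_size] * full_chunks
    if remainder:
        children.append(remainder)  # final, smaller chunk
    return children


child_orders = slice_order(1050, 250)
```

Here `child_orders` is a list of four chunks of 250 units followed by one of 50, which a backtest could then release over successive bars.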
More illiquid assets are characterised by a larger spread, which is the difference between the current bid and ask prices on the limit order book. This spread is an additional transaction cost associated with any trade. Spread is a very important component of the total transaction cost - as evidenced by the myriad of UK spread-betting firms whose advertising campaigns express the “tightness” of their spreads for heavily traded instruments.
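As a rough sketch of how the spread enters a backtest, a common simplifying assumption is that each trade pays half of the quoted spread relative to the mid price. The helper names below are hypothetical and the half-spread convention is an assumption rather than a universal rule.

```python
def quoted_spread(bid, ask):
    """Difference between the best ask and the best bid."""
    return ask - bid


def mid_price(bid, ask):
    """Midpoint between the best bid and the best ask."""
    return 0.5 * (bid + ask)


def half_spread_cost(bid, ask, quantity):
    """Cost of crossing from the mid price to the far touch for one trade."""
    return abs(quantity) * 0.5 * quoted_spread(bid, ask)
```

With a bid of 99.98 and an ask of 100.02, trading 100 units costs approximately 2.0 under this assumption, and a round trip (entry plus exit) costs the full spread.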
In order to successfully model the above costs in a backtesting system, various degrees of complex transaction models have been introduced. They range from simple flat modelling through to a non-linear quadratic approximation. Here we will outline the advantages and disadvantages of each model:
Flat transaction costs are the simplest form of transaction cost modelling. They assume a fixed cost associated with each trade. Thus they best represent the concept of brokerage commissions and fees. They are not very accurate for modelling more complex behaviour such as slippage or market impact. In fact, they do not consider asset volatility or liquidity at all. Their main benefit is that they are computationally straightforward to implement. However, they are likely to significantly under- or overestimate transaction costs depending upon the strategy being employed. Thus they are rarely used in practice.
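A flat model is simple enough to sketch in a few lines. The function name and the trade figures below are invented purely for illustration; they show how a small gross edge can turn negative once a fixed cost is charged on every trade.

```python
def net_pnl_flat(gross_pnl_per_trade, cost_per_trade):
    """Subtract the same fixed cost from the gross PnL of every trade."""
    return [pnl - cost_per_trade for pnl in gross_pnl_per_trade]


gross = [12.0, -8.0, 5.0, -3.0, 4.0]           # hypothetical gross PnL per trade
net = net_pnl_flat(gross, cost_per_trade=2.5)  # hypothetical flat cost

total_gross = sum(gross)  # 10.0
total_net = sum(net)      # -2.5
```

Note that the cost is the same whether the trade is for ten units or ten thousand, which is exactly why the model ignores liquidity and volatility.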
More advanced transaction cost models start with linear models, continue with piece-wise linear models and conclude with quadratic models. They lie on a spectrum of least to most accurate, albeit with least to greatest implementation effort. Since slippage and market impact are inherently non-linear phenomena, quadratic functions are the most accurate at modelling these dynamics. Quadratic transaction cost models are much harder to implement and can take far longer to compute than simpler flat or linear models, but they are often used in practice.
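The three functional forms can be sketched as functions of trade size. The coefficients, tier boundaries and function names below are hypothetical; in practice they would need to be estimated for each instrument, and the text does not prescribe a particular parameterisation.

```python
def linear_cost(quantity, rate):
    """Cost grows in direct proportion to trade size."""
    return rate * abs(quantity)


def piecewise_linear_cost(quantity, breakpoints, rates):
    """Cost with a different marginal rate in each size tier.

    breakpoints: ascending tier boundaries, e.g. [1000, 5000]
    rates: one marginal rate per tier (len(breakpoints) + 1 values)
    """
    remaining = abs(quantity)
    lower = 0.0
    cost = 0.0
    for upper, rate in zip(list(breakpoints) + [float("inf")], rates):
        tranche = min(remaining, upper - lower)  # size falling inside this tier
        if tranche <= 0:
            break
        cost += tranche * rate
        remaining -= tranche
        lower = upper
    return cost


def quadratic_cost(quantity, linear_rate, impact_coeff):
    """Linear term plus a term that grows with the square of trade size."""
    return linear_rate * abs(quantity) + impact_coeff * quantity ** 2
```

For a 6,000 unit trade, `piecewise_linear_cost(6000, [1000, 5000], [0.01, 0.02, 0.04])` charges the first 1,000 units at 0.01, the next 4,000 at 0.02 and the final 1,000 at 0.04, so larger trades pay a progressively higher marginal cost.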
Algorithmic traders also attempt to use the actual historical transaction costs of their strategies as inputs to their current transaction models, in order to make them more accurate. This is a tricky business and often verges on the complicated areas of modelling volatility, slippage and market impact. However, if the trading strategy is transacting large volumes over short time periods, then accurate estimates of the incurred transaction costs can have a significant effect on the strategy's bottom line, and so it is worth investing the effort in researching these models.
While transaction costs are a very important aspect of successful backtesting implementations, there are many other issues that can affect strategy performance.
One choice that an algorithmic trader must make is how and when to make use of the different exchange orders available. This choice usually falls into the realm of the execution system, but we will consider it here as it can greatly affect strategy backtest performance. We will consider the two main types of order: market orders and limit orders.
A market order executes a trade immediately, irrespective of available prices. Thus large trades executed as market orders will often get a mixture of prices as each subsequent limit order on the opposing side is filled. Market orders are considered aggressive orders since they will almost certainly be filled, albeit with a potentially unknown cost.
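To see why a large market order receives a mixture of prices, the following sketch walks a market buy through the ask side of a limit order book. The book snapshot and the function name `fill_market_buy` are hypothetical, and the sketch assumes the book does not change while the order is being filled.

```python
def fill_market_buy(asks, quantity):
    """Fill a market buy against resting asks.

    asks: list of (price, size) tuples sorted by ascending price.
    Returns (filled_quantity, average_fill_price); the average is None if nothing fills.
    """
    remaining = quantity
    total_paid = 0.0
    for price, size in asks:
        take = min(remaining, size)  # consume as much of this level as needed
        total_paid += take * price
        remaining -= take
        if remaining == 0:
            break
    filled = quantity - remaining
    average_price = total_paid / filled if filled else None
    return filled, average_price


asks = [(100.0, 200), (100.5, 300), (101.0, 500)]  # hypothetical book snapshot
filled, average_price = fill_market_buy(asks, 600)
```

The 600 unit order consumes the first two levels entirely and part of the third, so its average price lies well above the best ask of 100.0 that a naive backtest might assume.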
Limit orders provide a mechanism for the strategy to determine the worst price at which the trade will get executed, with the caveat that the trade may be only partially filled or not filled at all. Limit orders are considered passive orders since they are often unfilled, but when they are filled the price is guaranteed to be no worse than the limit. An individual exchange’s collection of limit orders is known as the limit order book, which is essentially a queue of buy and sell orders at certain sizes and prices.
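When only bar data is available, a backtest has to guess whether a limit order would have been filled. The sketch below applies one deliberately conservative rule for a buy limit order: it fills only if the bar opens at or below the limit, or trades strictly through it. Both the rule and the function name are assumptions for illustration; queue position and partial fills are ignored.

```python
def simulate_limit_buy(limit_price, bar_open, bar_low):
    """Return the assumed fill price of a resting buy limit order, or None if unfilled."""
    if bar_open <= limit_price:
        # Market opened at or below the limit: assume a fill at the open.
        return bar_open
    if bar_low < limit_price:
        # Price traded strictly through the limit during the bar.
        return limit_price
    # Price merely touched, or never reached, the limit: assume no fill.
    return None
```

Treating a bar whose low only touches the limit as unfilled is a pessimistic choice, but it helps avoid the backtest outperforming live trading.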
When backtesting, it is essential to model the effects of using market or limit orders correctly. For high-frequency strategies in particular, backtests can significantly outperform live trading if the effects of market impact and the limit order book are not modelled accurately.
There are particular issues related to backtesting strategies when making use of daily data in the form of Open-High-Low-Close (OHLC) figures, especially for equities. Note that this is precisely the form of data given out by Yahoo Finance, which is a very common source of data for retail algorithmic traders!
Cheap or free datasets, while suffering from survivorship bias (which we have already discussed in Part I), are also often composite price feeds from multiple exchanges. This means that the extreme points (i.e. the open, close, high and low) of the data are very susceptible to “outlying” values due to small orders at regional exchanges. Further, these values are also sometimes more likely to be tick-errors that have yet to be removed from the dataset.
This means that if your trading strategy makes extensive use of any of the OHLC points specifically, backtest performance can differ from live performance as orders might be routed to different exchanges depending upon your broker and your available access to liquidity. The only way to resolve these problems is to make use of higher frequency data or obtain data directly from an individual exchange itself, rather than a cheaper composite feed.
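Where a composite feed is all that is available, one crude screen is to flag bars whose high or low sits unusually far from the open-close range, so that they can be checked by hand. This does not resolve the underlying problem. The function name, the dictionary layout of each bar and the 5% threshold are all assumptions made for this sketch.

```python
def flag_suspect_bars(bars, max_wick=0.05):
    """Return indices of bars whose high or low lies far outside the open-close range.

    bars: list of dicts with 'open', 'high', 'low' and 'close' keys.
    max_wick: largest tolerated excursion, as a fraction of the nearer of open/close.
    """
    suspects = []
    for index, bar in enumerate(bars):
        body_top = max(bar["open"], bar["close"])
        body_bottom = min(bar["open"], bar["close"])
        upper_wick = bar["high"] / body_top - 1.0
        lower_wick = 1.0 - bar["low"] / body_bottom
        if upper_wick > max_wick or lower_wick > max_wick:
            suspects.append(index)
    return suspects


bars = [
    {"open": 100.0, "high": 101.0, "low": 99.5, "close": 100.5},
    {"open": 100.5, "high": 120.0, "low": 100.0, "close": 101.0},  # suspicious high
]
suspect_indices = flag_suspect_bars(bars)
```

Only the second bar is flagged here. A genuine volatile session can trip the same rule, so flagged bars should be inspected rather than discarded automatically.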
In the next couple of articles we will consider performance measurement of the backtest, as well as a real example of a backtesting algorithm, with many of the above effects included.