Backtesting trading strategies… guh, where do we start?
Can we cut this blog short and all agree backtesting trading strategies should never be done? Not even when you are bored and there is nothing on Netflix.
Backtesting is the general method for seeing how well a strategy or model would have done ex-post. It assesses the viability of a trading strategy by simulating how it would have played out on historical data. If the backtest looks good, traders and analysts may gain the confidence to employ the strategy going forward.
Except… You should never employ it going forward.
The appeal of backtesting trading strategies is in its simplicity. Just load a bunch of data, apply a couple of rules and press “test”. Voilà… results. If the results look bad, do it again with expanded rules and parameters. Still not seeing what you want to see? Keep doing this until you get a picture that looks good with strong results and there you have it… a trading strategy with rules and results and such.
On top of this, trading platforms make it so easy to perform these tests with endless amounts of data and parameters and technical analysis. Once testing is done, they make it equally easy to apply those rules to your live account.
Except, think about this for a minute… what did you actually test?
Last week, someone presented us with a backtested strategy covering the last 21 years. Only 113 trades were placed. Of those, only 3 were losses. (What did we say about win ratio?) And the account appreciated from some nominal beginning balance to over $1.2M.
Awesome, right? A 97% win ratio over 21 years in all environments with amazing net results.
But how many times was this test done before the results were produced? What kind of range was applied to the parameters which then allowed the program to test all possible scenarios to pick the best result?
What was presented was likely the output of hundreds of thousands or even millions of possible outcomes, where perhaps only a handful of those outcomes showed worthwhile results. This one happened to be the best. The computer found the perfect (or closest to it) scenario.
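To see why "the best of many" is so seductive, here is a minimal sketch of a parameter sweep. Everything in it is made up for illustration: the "market" is pure random noise, and the rule (go long for a day when the average of the last `lookback` daily returns is above `threshold`) is a hypothetical stand-in, not the strategy we were shown. The point is only that, given enough combinations, some parameter set will look good on the data it was picked from.

```python
import random
import statistics

def random_walk_returns(n_days, seed):
    """Daily returns drawn from pure noise: there is no edge to find here."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]

def strategy_return(returns, lookback, threshold):
    """Hypothetical rule: hold for one day whenever the mean of the previous
    `lookback` returns exceeds `threshold`; otherwise stay flat."""
    growth = 1.0
    for t in range(lookback, len(returns)):
        signal = sum(returns[t - lookback:t]) / lookback
        if signal > threshold:
            growth *= 1.0 + returns[t]
    return growth - 1.0

def sweep(returns, lookbacks, thresholds):
    """Backtest every parameter combination on the same data."""
    return {
        (lb, th): strategy_return(returns, lb, th)
        for lb in lookbacks
        for th in thresholds
    }

history = random_walk_returns(1000, seed=1)   # the data the optimizer sees
future = random_walk_returns(1000, seed=2)    # data it never saw

lookbacks = range(2, 22)
thresholds = [i * 0.0005 for i in range(-5, 6)]

in_sample = sweep(history, lookbacks, thresholds)
best_params = max(in_sample, key=in_sample.get)
best_in = in_sample[best_params]
median_in = statistics.median(in_sample.values())
best_out = strategy_return(future, *best_params)

print(f"combinations tested:      {len(in_sample)}")
print(f"median in-sample return:  {median_in:.1%}")
print(f"best in-sample return:    {best_in:.1%} with params {best_params}")
print(f"same params on new data:  {best_out:.1%}")
```

The "best" line is what ends up in the presentation. The median line and the new-data line are what usually get left out.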
From the beginning of SPY (the S&P 500 ETF) in 1993 to 2009, had you simply bought the S&P on the close and sold it again the next day on the open (so effectively doing nothing but holding overnight), you would have outperformed the market by a margin of nearly 6 to 1 (302% return vs. 53% return). This period includes strong bull markets, strong bear markets and a lot of nonsense in between. Had you performed this backtest anywhere between 2004 and 2010, it would have looked like the perfect strategy for all environments, right?
Now walk it forward. From 2009 until today, that exact same strategy produced a loss of 65% vs. a market gain of 402%. Net-net, over the whole period since 1993, the strategy produced a gain of 38% vs. the market at 670%.
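If the net-net numbers look odd next to the per-period numbers, remember that returns compound, they don't add. A quick sketch of the arithmetic, using only the rounded percentages quoted above (the `compound` helper is just for illustration):

```python
def compound(*period_returns):
    """Chain period returns together: (1 + r1) * (1 + r2) * ... - 1."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0

# Overnight strategy: +302% (1993 to 2009), then -65% (2009 onward)
strategy_total = compound(3.02, -0.65)

# Buy and hold: +53% (1993 to 2009), then +402% (2009 onward)
market_total = compound(0.53, 4.02)

print(f"strategy: {strategy_total:.1%}")  # about 41% from the rounded inputs
print(f"market:   {market_total:.1%}")    # about 668% from the rounded inputs
```

Working from the rounded figures gives roughly 41% and 668%, which lines up with the 38% and 670% above; we assume the small gap comes from rounding in the quoted per-period numbers. Either way, a 65% loss wipes out nearly all of a 302% gain.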
That is a very rudimentary example over a very long period of time. But the net conclusion remains the same. Backtesting is misleading even when applied to numerous different environments, conditions and extremes. It’s the trading equivalent of not being able to see the forest for the trees.
No, no, no, no, no. Just don’t. When looking at a black hole, observing it from a different angle is never going to make that black hole look like a bright shiny object. Monte Carlo analysis allows you to cherry-pick trades out of an already steaming pile of ……
Blind testing and walk forward analysis.
Don’t allow yourself, or the computer running the test, to know the forward-looking data before making a decision. Force your test to commit to a decision before seeing how that decision would have performed.
Backtesting absorbs the sum of data included and tries to find the best outcome. It’s curve fitting. Walk forward analysis makes a decision, scores it and walks forward to the next point, makes another decision and walks forward again, etc. Walk forward analysis is doing the exact opposite of finding a best fit. It’s trying to find the only path available.
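The decide, score, step loop described above can be sketched in a few lines. This is a toy under stated assumptions, not our actual programs: the data is random noise, the rule (go long when the last `lookback` returns sum to a positive number) is hypothetical, and the expanding window that re-picks the lookback at every step is one simple way to do it among many. What matters is that each decision is made from `returns[:t]` only and is locked in before `returns[t]` is scored.

```python
import random

def score_rule(past, lookback):
    """How the hypothetical rule would have done on already-seen data only."""
    total = 0.0
    for t in range(lookback, len(past)):
        if sum(past[t - lookback:t]) > 0:
            total += past[t]
    return total

def walk_forward(returns, lookbacks, warmup):
    """Decide using data strictly before t, record the decision, then score it."""
    decisions = []
    equity = 1.0
    for t in range(warmup, len(returns)):
        past = returns[:t]                        # nothing from t onward
        best = max(lookbacks, key=lambda lb: score_rule(past, lb))
        position = 1 if sum(past[-best:]) > 0 else 0
        decisions.append((t, best, position))     # decision is locked in here
        equity *= 1.0 + position * returns[t]     # only now is the outcome seen
    return decisions, equity - 1.0

rng = random.Random(7)
returns = [rng.gauss(0.0, 0.01) for _ in range(300)]
decisions, total = walk_forward(returns, lookbacks=(2, 5, 10), warmup=50)

# Sanity check for look-ahead: rewrite the "future" from day 150 onward.
# Every decision made before day 150 must come out exactly the same.
tampered = returns[:150] + [-3 * r for r in returns[150:]]
tampered_decisions, _ = walk_forward(tampered, lookbacks=(2, 5, 10), warmup=50)
print(decisions[:100] == tampered_decisions[:100])  # prints True
```

The tamper check at the end is a handy habit for any test harness: if changing data the test hasn't reached yet changes decisions it has already made, something is peeking.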
For example, we don’t trade Lumber futures, but we can absorb all the available data from that market and run it through our programs to perform a reliable test and thus build a reliable strategy going forward. Our programs will start at the beginning of the data and look for a pattern out of a very small sample, then trade that pattern. As more data is absorbed, new patterns are found and that strategy progresses. At no time does our program know anything about the market it’s looking at or the long-term performance of the loaded data. Our programs cannot make a decision based on forward-looking data… only on what has already happened at any given point.
Speaking of Lumber futures, has anyone else seen this move from 760 to 480 over the last month?
What if we did conduct backtests? With pattern recognition software, give our computers enough time and they will find a backtested scenario in which we would never have been wrong… ever. It would be beautiful, extremely profitable and… fiction.
Want some help coming up with a strategy?
To simply learn more about our process, check out our “The Science” page.