Backtesting vs. Data Mining

Jason Zweig at the WSJ wrote this piece about how data mining works. Here’s the summary from one of his test portfolios:

If you had invested $10,000 in this portfolio two decades ago, you would have $105,971 now. Meanwhile, the losers who stuck with the stodgy S&P 500 ended up with only $58,347.
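To put those ending balances on a per-year footing, here is a quick sketch that converts them to annualized returns. It assumes "two decades" means exactly 20 years and that both figures grew from the same $10,000; the function name `cagr` is my own.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate implied by a start and end balance."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Figures from the quoted summary; the 20-year span is an assumption.
portfolio = cagr(10_000, 105_971, 20)   # data-mined test portfolio
sp500 = cagr(10_000, 58_347, 20)        # S&P 500 over the same span

print(f"test portfolio: {portfolio:.1%} per year")
print(f"S&P 500:        {sp500:.1%} per year")
```

Under those assumptions the gap is roughly 12.5% a year versus roughly 9.2% a year.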

Craig Lazzara at S&P analyzed the test portfolios and concluded that the outperformance was caused by equal weighting:

The most important thing about the Journal and the Vanguard portfolios is not their (somewhat similar) stock selection processes.  The most important thing is that after they determine what stocks they want to own, the portfolio construction process weights each stock equally.

I’m not sure I agree with that conclusion. If you pick any group of stocks that exists today and compare it to the index over the last 20 years, survivorship bias will pretty much guarantee you a better return: every stock you can pick today is one that made it through those 20 years, while the index had to carry the ones that didn't. I think equal weighting has an impact, but survivorship bias has to be a key to the success of those portfolios.
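Here is a rough simulation of the survivorship effect. It is only a sketch: the return model (normal draws with an 8% mean and 35% volatility), the delisting threshold, and all the names are assumptions I made up for illustration, not anything taken from the Journal or S&P analyses.

```python
import random

def simulate(n_stocks=1000, years=20, mean=0.08, vol=0.35,
             delist_below=0.2, seed=42):
    """Return (all_stocks_avg, survivors_avg, n_survivors) for $1 per stock."""
    rng = random.Random(seed)
    final_values, survived = [], []
    for _ in range(n_stocks):
        value, alive = 1.0, True
        for _ in range(years):
            # Hypothetical return model: normal draws, floored at -95%.
            value *= 1.0 + max(-0.95, rng.gauss(mean, vol))
            if value < delist_below:
                alive = False   # delisted; the investor keeps what is left
                break
        final_values.append(value)
        survived.append(alive)
    survivors = [v for v, ok in zip(final_values, survived) if ok]
    all_avg = sum(final_values) / len(final_values)
    surv_avg = sum(survivors) / len(survivors)
    return all_avg, surv_avg, len(survivors)

all_avg, surv_avg, n_surv = simulate()
print(f"{n_surv} of 1000 stocks survived")
print(f"growth of $1, every stock that existed at the start: {all_avg:.2f}")
print(f"growth of $1, only the stocks still around today:    {surv_avg:.2f}")
```

Every stock here is drawn from the same distribution, so no stock picking is involved. The survivors-only average still comes out higher, simply because the delisted names are left out of it.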

Speaking of data mining, I saw a presentation by Jeremy Schwartz of WisdomTree two weeks ago. The speech included a rationale for hedging foreign currencies (one of their newest funds is in this category). The main points are included in this Seeking Alpha article. But to me, the argument was completely undercut by one chart (marked not for public viewing, sorry). It was a graph with ups and downs, split into periods at the dates when stretches of dollar strength and dollar weakness began. The point was to show that hedging improves returns. But only 5 periods were marked off from 1969 to the present. Essentially, 5 data points. To me, that’s data mining. I’m not saying they are wrong. I just remain unconvinced. On the topic of hedging currency, the counterargument comes from Larry Swedroe at ETF.com. Here’s the gist of his argument, and I would like to see the WisdomTree response:

Despite the currency risk, and despite the MSCI EAFE having lower returns (10.0 percent versus 10.4 percent) as well as higher volatility (22.4 percent versus 17.6 percent), the portfolio that included the allocation to the MSCI EAFE had slightly higher returns and slightly lower volatility.

The reason is that the annual correlation of the MSCI EAFE to the S&P 500 was just 0.66, and while the correlation of the S&P 500 to five-year Treasurys was just 0.03, the correlation of the MSCI EAFE to five-year Treasurys was actually a negative -0.16, providing greater diversification benefits. This example demonstrates why you should never make the mistake of considering assets in isolation.
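The volatility half of that claim can be checked with the standard two-asset formula. This is a sketch under stated assumptions: the quote does not give the portfolio weights or the Treasury volatility, so it looks only at the equity slice, and the S&P 500 / EAFE splits below are hypothetical.

```python
import math

def two_asset_vol(w_a, vol_a, vol_b, corr):
    """Volatility of a portfolio holding w_a in asset A and 1 - w_a in asset B."""
    w_b = 1.0 - w_a
    variance = ((w_a * vol_a) ** 2 + (w_b * vol_b) ** 2
                + 2.0 * w_a * w_b * corr * vol_a * vol_b)
    return math.sqrt(variance)

# Volatilities (percent) and correlation as given in the quote.
SP500_VOL, EAFE_VOL, CORR = 17.6, 22.4, 0.66

for w_sp in (1.0, 0.9, 0.8, 0.7):   # hypothetical S&P 500 weights
    vol = two_asset_vol(w_sp, SP500_VOL, EAFE_VOL, CORR)
    print(f"{w_sp:.0%} S&P 500 / {1 - w_sp:.0%} EAFE: volatility {vol:.2f}%")
```

With a 0.66 correlation, each of these blends comes out slightly below the S&P 500's 17.6 percent on its own, even though EAFE alone is the more volatile asset.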

I don’t think backtesting is necessarily a bad thing, or misleading. In fact, if you have some great investing idea, is there really any other way to see if it would have worked in the past? For due diligence, you just need to make sure the backtesting is done correctly. It must be run as if it were real time: at each point in the test, use only the universe of stocks that was actually available at that time, and weighting data as it stood at that time.
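To make the "as if it were real time" rule concrete, here is a toy comparison of a point-in-time backtest against one that uses today's universe for every past year. All tickers, returns, and names are made up for illustration; a real test would need historical index membership data.

```python
# Toy data, entirely hypothetical. "DEAD" is in the index at the start and
# goes to zero; "NEW" only joins the index in year 2, after a big run-up.
UNIVERSE_BY_YEAR = {
    0: ["AAA", "BBB", "DEAD"],
    1: ["AAA", "BBB", "DEAD"],
    2: ["AAA", "BBB", "NEW"],
}
# Return earned during each year, per ticker (None = not trading that year).
RETURNS_BY_YEAR = {
    0: {"AAA": 0.10, "BBB": 0.05, "DEAD": -0.50, "NEW": None},
    1: {"AAA": 0.08, "BBB": 0.02, "DEAD": -1.00, "NEW": 0.90},
    2: {"AAA": 0.06, "BBB": 0.04, "DEAD": None, "NEW": 0.20},
}

def backtest(pick_universe):
    """Equal-weight the chosen universe each year and compound $1."""
    value = 1.0
    for year in sorted(RETURNS_BY_YEAR):
        tickers = [t for t in pick_universe(year)
                   if RETURNS_BY_YEAR[year][t] is not None]
        avg = sum(RETURNS_BY_YEAR[year][t] for t in tickers) / len(tickers)
        value *= 1.0 + avg
    return value

latest = max(UNIVERSE_BY_YEAR)
# Correct: each year uses the membership list as it stood that year.
point_in_time = backtest(lambda year: UNIVERSE_BY_YEAR[year])
# Flawed: every year uses today's membership list (look-ahead).
look_ahead = backtest(lambda year: UNIVERSE_BY_YEAR[latest])

print(f"point-in-time universe: $1 grows to {point_in_time:.2f}")
print(f"today's universe:       $1 grows to {look_ahead:.2f}")
```

In this toy setup the point-in-time test loses money (about 0.68 per dollar) while the look-ahead version appears to make money (about 1.58), and the only difference is which membership list was used.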
