In a world where prices follow a purely random lognormal process, such a framework for analysis would be redundant. By construction, one could not outperform the market's risk-adjusted return in the long run except by pure luck. The alpha of such models would be zero (or negative, once transaction costs are counted). Model development would be as fruitful as attempting to make money flipping fair coins. Therefore, all developers of trading models explicitly or implicitly believe that markets are not purely random and that some part of their behavior is predictable. This is an assumption that should inspire some humility. The challenge modelers face in trying to discover patterns that repeat themselves is daunting.

No model building method can assure success. However, the lack of a proper scientific methodology will almost certainly lead to failure. There are many hurdles model builders need to overcome. In MSR's experience, the "data mining" bias is one of the most difficult problems to solve. At its most basic level, the data mining bias is a form of self-deception that "discovers" in historical simulations correlations that are spurious, i.e., fundamentally random in nature. This is the primary reason most models fail "out of sample" in real trading. As obvious as this may seem as a general statement, in practice the elimination of the data mining bias is a very complex and detailed process.

There is an unlimited number of ways to combine historical data into formulas and regressions that fit history perfectly but lack any predictive value. The challenge for model builders is to distinguish between that which may be predictive and that which is not. Professor David Leinweber of Caltech created one of the best examples of data mining bias in a paper best known for its satirical "butter in Bangladesh" method of predicting stock market prices. Leinweber demonstrated how easy it is to find a meaningless correlation if one scours enough data and uses enough polynomials.

Leinweber literally regressed thousands of data series from 140 countries against the price of the S&P 500 over a 10-year period. He "discovered" that butter production in Bangladesh "explained" 75% of the return in the stock market. When he combined butter in Bangladesh with US cheese production and the sheep population in both countries he created an almost perfect fit (an R-squared of .99).

This may seem obviously absurd, but Leinweber's point is that if, instead of butter in Bangladesh, one had a model predicting stock prices using GDP and interest rates with an R-squared of .70, it might not seem so ridiculous. A data miner can create meaningless, non-predictive models using "sensible" data just as easily as with "butter in Bangladesh".
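The mechanism behind this kind of result can be illustrated with pure noise. The following is a minimal sketch, not a reconstruction of Leinweber's study: the function names, the 10 observations, and the 2,000 candidate series are all illustrative assumptions. It generates a random "target" series, regresses it on each of many unrelated random series, and compares the best single-variable R-squared found with the typical (median) one.

```python
import random


def r_squared(x, y):
    """R-squared of a one-variable least-squares fit of y on x.

    With an intercept, this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def best_spurious_fit(n_obs=10, n_candidates=2000, seed=0):
    """Search unrelated random series for the best fit to a random target.

    All series are independent Gaussian noise, so any fit is spurious.
    The sizes are hypothetical, chosen only for illustration.
    Returns (best R-squared, median R-squared) across the candidates.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    scores = []
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        scores.append(r_squared(candidate, target))
    scores.sort()
    return scores[-1], scores[len(scores) // 2]


if __name__ == "__main__":
    best, median = best_spurious_fit()
    print(f"best R-squared over all candidates: {best:.2f}")
    print(f"median R-squared of a candidate:    {median:.2f}")
```

Although no candidate has any relationship to the target, the best of the 2,000 should fit far better than a typical one. The winner looks like a discovery only because the losers go unreported.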

What does MSR do to try to avoid this pitfall? One cannot avoid using historical data to "mine" for statistically significant patterns, nor should one want to. We have only one history, as multifaceted as it is. It is also unlikely that one's first attempt at a hypothesis will yield the results one desires. It is inevitable that one will use the same data multiple times in the search for a successful predictive hypothesis. In statistics this is often referred to as the multiple comparisons problem. However, if one applies hypothesis testing and other techniques to models without taking into account the number of different variables or parameters that were tested, one is almost certain to fall victim to the data mining bias. One has to account for the number of tests done on the data to arrive at meaningful statistical inferences. It is extremely difficult to build successful models without using methods which "discount" these effects. In doing so, one improves the odds that the output of one's models will not be fallacious.
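One standard way to "discount" for the number of tests is the Bonferroni correction, sketched below. It is shown only as a textbook illustration of the multiple comparisons problem; the text above does not say which methods MSR uses, and the function names and p-values are hypothetical. The family-wise error rate formula assumes the tests are independent, which is rarely exactly true of trading model variants.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent
    tests, each run at significance level alpha, when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha (the Bonferroni correction)."""
    return alpha / n_tests


def survives_bonferroni(p_values, n_tests, alpha=0.05):
    """Flag which p-values remain significant after correction.

    n_tests is the total number of hypotheses tried, including every
    discarded variant, not just the ones whose p-values are passed in.
    """
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    tried = 100  # hypothetical number of model variants tested
    print(f"chance of a false 'discovery': {familywise_error_rate(0.05, tried):.3f}")
    print(f"corrected per-test threshold:  {bonferroni_threshold(0.05, tried):.4f}")
    # Hypothetical p-values of the three best-looking variants.
    print(survives_bonferroni([0.0001, 0.01, 0.04], n_tests=tried))
```

Under these assumptions, a researcher who tries 100 variants at the conventional 5% level is nearly certain to find at least one "significant" result by chance. A result that clears the uncorrected threshold may no longer clear it once the full count of attempts is included. Bonferroni is conservative; it is used here because it is the simplest correction to state, not because it is the best choice in every case.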

The above model building prescription is neither straightforward nor mechanical, and in practice it is very difficult. Judgment is required at every step. "Researcher bias" (i.e., the tendency of researchers to interpret data, or make judgments, in favor of their desired conclusion) is a risk for MSR, as it is for all financial model builders. However, we try to keep this risk at the forefront of our thinking and methodology in order to minimize its likelihood.