A broad segment of the industry invests based on established factors such as value, momentum, and low-risk. In this post, we share the key results from our study of out-of-sample factors over a sizable and economically important sample period. Using the longest sample period to date — 1866 to the 2020s — we dispel concerns about the data mining and performance decay of equity factors. We find that equity factors are robust out-of-sample and have been an ever-present phenomenon in financial markets for more than 150 years.
Data Mining Concerns are Real
Why did we conduct this study? First, more research on factor premiums is needed, especially using out-of-sample data. Most practitioner studies on equity factors use samples that date back to the 1980s or 1990s, covering about 40 to 50 years. From a statistical perspective, this is not a substantial amount of data. In addition, these years have been unique, marked by few recessions, the longest expansion and bull market in history, and, until 2021, minimal inflationary episodes. Academic studies on equity factors often use longer samples, typically starting in 1963 and drawing on the US Center for Research in Security Prices (CRSP) database from the University of Chicago. But imagine if we could double that sample length using a comprehensive dataset of stock prices. After all, stock markets were essential to economic growth and innovation financing long before the 20th century.
Second, academics have discovered hundreds of factors, often referred to as the “factor zoo.” Recent academic research suggests many of these factors may result from data dredging, that is, statistical flukes caused by extensive testing by both academics and industry researchers. A single test typically uses a 95% confidence level, implying that about one in every 20 tests of a factor with no true premium will still “discover” one. The problem compounds when multiple tests are conducted, which is critical given that millions of tests have been performed in financial markets. This is a serious concern for investors, as factor investing has become mainstream globally. Imagine if the factors driving hundreds of billions of dollars in investments were the result of statistical noise, and therefore unlikely to deliver returns in the future.
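The arithmetic behind this concern can be sketched in a few lines. The snippet below is a minimal illustration and not part of the study: it assumes that none of the tested factors has a true premium and that the tests are independent. Real factor tests are correlated, so the figures are only indicative. The function names are hypothetical.

```python
def expected_false_discoveries(n_tests: int, significance: float = 0.05) -> float:
    """Expected number of spurious 'discoveries' when no tested factor is real."""
    return n_tests * significance


def prob_any_false_discovery(n_tests: int, significance: float = 0.05) -> float:
    """Chance of at least one spurious 'discovery', assuming independent tests."""
    return 1.0 - (1.0 - significance) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 20, 100):
        print(
            f"{n_tests:>4} tests: "
            f"expected false discoveries = {expected_false_discoveries(n_tests):.1f}, "
            f"P(at least one) = {prob_any_false_discovery(n_tests):.1%}"
        )
```

Under these assumptions, 20 tests produce one false discovery on average, and the chance of at least one is about 64%.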
Figure 1 illustrates one of the motives behind our study. It shows the test statistics for portfolios of size, value, momentum, and low-risk factors over the in-sample and out-of-sample periods within the CRSP era (post-1926). Consistent with earlier studies, most factors exhibit significance during the in-sample period. However, results look materially different over subsequent out-of-sample periods, with several factors losing their significance at traditional confidence levels. As discussed in the literature, this decline in the performance of equity factors can be attributed to multiple causes, including limited data samples. Regardless, it underscores the need for independent out-of-sample tests on equity factors in a sufficiently sizable sample. In our research paper, we tackle this challenge by extending the CRSP dataset with 61 years of previously untouched data and testing equity factors out-of-sample on that extension.
Figure 1.
Source: Global Financial Data, Kenneth French website, Erasmus University Rotterdam
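To see why limited data alone can cost a factor its significance, consider the t-statistic of an average return, which grows with the square root of the number of observations. The sketch below is illustrative only: the 0.3% monthly premium and 3% monthly volatility are made-up inputs, not estimates from the paper, and the function name is hypothetical.

```python
import math


def t_stat_of_mean(mean_return: float, volatility: float, n_periods: int) -> float:
    """t-statistic of an average return: the mean divided by its standard error."""
    standard_error = volatility / math.sqrt(n_periods)
    return mean_return / standard_error


if __name__ == "__main__":
    # Hypothetical factor: 0.3% average monthly premium, 3% monthly volatility.
    for years in (10, 40, 160):
        t_stat = t_stat_of_mean(0.003, 0.03, n_periods=12 * years)
        print(f"{years:>3} years of monthly data: t-statistic = {t_stat:.2f}")
```

With these assumed inputs, the same premium has a t-statistic below 2 over 10 years of data but above 4 over 160 years, which is why sample length matters so much for out-of-sample tests.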
Stock Markets in the 19th Century
Before diving into the key results, let’s outline the US stock market in the 19th century. In our paper, we collect information on all major stocks listed on the US exchanges between 1866 and 1926 (the start date of the CRSP dataset). This period was characterized by strong economic growth and rapid industrial development, which laid the foundation for the United States to become the world’s leading economic power. Stock markets played a pivotal role in economic growth and innovation financing, with market capitalizations growing more than 50-fold in 60 years, in line with US nominal GDP growth over the same period.
In many ways, 19th- and 20th-century markets were similar. Equities could be easily bought or sold across exchanges via dealer firms, traded via derivatives and options, purchased on margin, and shorted, and the era had its well-known short sellers. Major 19th-century technological innovations such as the telegraph (1844), the transatlantic cable (1866), the ticker tape (1867), local telephone lines (1878), and direct phone links via cables facilitated a liquid and active secondary market for stocks, with substantial brokerage and market-making activity, quick arbitrage between prices, and fast price responses to information. Price quotations were known instantly from coast to coast and even across the Atlantic. Much like today, investors had access to a wide range of reputable information sources, while a sizable industry of financial analysts provided market assessments and investment advice.
Further, trading costs in the 19th century were not very different from 20th-century costs. Market information and academic studies put transaction costs on higher-volume stocks and well-arbitraged NYSE stocks at around 0.50%, and such stocks traded at the minimum tick of 1/8th during both centuries. In the decade prior to World War I, the median quoted spread at the NYSE was 86 basis points, and a quarter of trades took place with spreads of less than 36 basis points. Moreover, share turnover on NYSE stocks was higher between 1900 and 1926 than in 2000. Overall, US stock markets have been a lively and economically important venue for trading since the 19th century, providing an important and reliable out-of-sample testing ground for factor premiums.
The Pre-CRSP Equity Dataset
Constructing this dataset was a major effort. Our sample includes stock returns and characteristics for all major stocks since 1866. Why 1866? It is the start date of the Commercial and Financial Chronicle, a key source also used by the CRSP database. You may wonder why CRSP starts in 1926. The exact reason remains a matter of speculation. The date seems arbitrary, though it does ensure the inclusion of some data from before the 1929 stock market crash.
In our paper, we hand-collected all market capitalizations, which are highly relevant to the study of factor premiums and stock prices. In addition, we hand-validated samples of price and dividend data obtained from Global Financial Data, a provider specializing in historical price data. Unlike CRSP, our data collection covers all major stocks traded across the key exchanges: not only the NYSE but also the NY Curb (which later became the American Stock Exchange, AMEX) and several regional exchanges. You can imagine the amount of work this took and the tremendous amount of research assistants’ time we used at Erasmus University Rotterdam. But it was worth the effort: the result is a high-quality dataset of US stock prices from 1866 to 1926, covering approximately 1,500 listed stocks.
Out-of-Sample Performance: Factors Are Eternal
So, how do the out-of-sample results from the 1866-1926 pre-CRSP period look? Before we discuss them, recall that this period has not been well studied before, which allows us to conduct a true out-of-sample test of equity factor premiums.
Figure 2 summarizes the key results from our research. It shows the alpha of the established equity factor premiums over the longest CRSP sample possible (in grey) and the pre-CRSP out-of-sample period (in black). Interestingly, the out-of-sample alphas for value, momentum, and low-risk factors are very similar to those observed in the CRSP sample. In fact, differences between the two samples are statistically insignificant. The 150+ years of evidence on factor premiums (the black bars) confirm this conclusion, showing attractive premiums that are both economically and statistically highly significant. Overall, the independent sample confirms the validity of key equity factor premiums such as value, momentum, and low-risk.
Figure 2.
Source: Global Financial Data, Kenneth French website, Erasmus University Rotterdam
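One simple way to compare a premium across two independent samples is a z-test on the difference between the two estimates. The sketch below assumes each alpha comes with a standard error and that the two samples do not overlap. This post does not spell out the exact test used in the paper, so treat it as a generic illustration. All input numbers and function names are hypothetical.

```python
import math


def difference_z_stat(alpha_a: float, se_a: float, alpha_b: float, se_b: float) -> float:
    """z-statistic for the gap between two independently estimated alphas."""
    return (alpha_a - alpha_b) / math.sqrt(se_a**2 + se_b**2)


def two_sided_p_value(z_stat: float) -> float:
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z_stat) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical annualized alphas (in %) and standard errors for one factor.
    z_stat = difference_z_stat(alpha_a=4.0, se_a=1.0, alpha_b=3.5, se_b=1.5)
    print(f"z = {z_stat:.2f}, p = {two_sided_p_value(z_stat):.2f}")
```

A p-value above 0.05 would mean the two alphas cannot be told apart at the conventional 95% confidence level.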
These findings allow for several strong conclusions. First and most importantly, factor premiums are an eternal feature in financial markets. They are not artifacts of researchers’ efforts or specific economic conditions but have existed since the inception of financial markets, persisting for more than 150 years. Second, factor premiums do not decay out-of-sample but tend to remain stable. Third, given their enduring nature, factor premiums offer significant investment opportunities. These results should give investors greater confidence in the robustness of factor premiums, reinforcing their utility in crafting effective investment strategies.