Pockets of Replicability (Post #6)

A series of short articles on finance research replicability. In this issue: look-ahead bias and the complexities of pricing corporate bonds.

Jun 02, 2026

In asset pricing, a stochastic discount factor (SDF) is a special random variable with the property that the price of any asset i can be obtained as the expected value of the asset’s payoff multiplied by the SDF:

\(p_i=\mathbb{E}\left[\tilde{m}\tilde{x}_i\right]\)

The existence of a strictly positive SDF is equivalent to the absence of arbitrage opportunities.
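
To make the pricing equation concrete, here is a minimal numerical sketch with three possible states of the world. The probabilities, SDF values, and payoffs are made-up numbers chosen purely for illustration, and the helper name `price` is hypothetical; nothing here comes from the papers discussed in this post.

```python
import numpy as np

# Hypothetical three-state economy (all numbers are illustrative assumptions).
prob = np.array([0.3, 0.4, 0.3])           # state probabilities, sum to 1
m = np.array([1.20, 0.95, 0.70])           # strictly positive SDF, high in the bad state
x_risky = np.array([80.0, 100.0, 130.0])   # payoff of a risky asset in each state
x_safe = np.ones(3)                        # risk-free asset: pays 1 in every state


def price(payoff, sdf=m, p=prob):
    """Price an asset as the expected value of SDF times payoff."""
    return float(np.sum(p * sdf * payoff))


p_risky = price(x_risky)
p_safe = price(x_safe)                     # equals E[m]
gross_rf = 1.0 / p_safe                    # gross risk-free return, 1 / E[m]
exp_ret_risky = float(np.sum(prob * x_risky)) / p_risky  # E[x] / p

print(f"price of risky asset : {p_risky:.4f}")
print(f"gross risk-free rate : {gross_rf:.4f}")
print(f"expected gross return: {exp_ret_risky:.4f}")
```

Because the risky asset pays least in the state where the SDF is highest, its price is below its discounted expected payoff, and its expected return exceeds the risk-free return.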

Let R_i denote the asset’s gross return and R_f the gross risk-free return. We can rearrange the pricing equation to show that risk premia are determined by covariation with the SDF:

\(\mathbb{E}[\tilde{R}_i]-R_f=-R_f\,\text{cov}(\tilde{m},\tilde{R}_i)\)
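
The rearrangement takes three short steps, assuming a risk-free asset exists. First, divide the pricing equation by the price to restate it in terms of the gross return:

\(1=\mathbb{E}[\tilde{m}\tilde{R}_i], \qquad \tilde{R}_i=\tilde{x}_i/p_i\)

Second, apply the same equation to the risk-free asset, whose return is constant, to pin down the mean of the SDF:

\(1=\mathbb{E}[\tilde{m}]R_f \;\Rightarrow\; \mathbb{E}[\tilde{m}]=1/R_f\)

Third, use the definition of covariance to split the expectation of the product:

\(1=\mathbb{E}[\tilde{m}]\,\mathbb{E}[\tilde{R}_i]+\text{cov}(\tilde{m},\tilde{R}_i)=\mathbb{E}[\tilde{R}_i]/R_f+\text{cov}(\tilde{m},\tilde{R}_i)\)

Multiplying through by R_f and rearranging gives the risk-premium expression. An asset that pays off poorly when the SDF is high has a negative covariance with the SDF and therefore earns a positive risk premium.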

Asset pricing theory is concerned with specifying a particular form for the SDF, implying that expected returns are determined by covariations with some set of variables or factors. A central question in asset pricing, therefore, is to determine which factors enter the SDF. Since these are the factors that help explain cross-sectional differences in expected returns, this has not only economic but also practical implications, as investors can use these factors to construct and hedge portfolios.

However, economic theory does not provide much guidance on the precise structure of the SDF, which means that we must rely on empirical studies to test which factors actually matter. This is a non-trivial task that involves many steps and can be affected by data quality/availability and several design choices.

The (Equity) Factor Zoo

In the CAPM, the SDF is an affine function of a single systematic risk factor: the market return. Therefore, under the CAPM, the expected return on any asset is entirely determined by its covariance with the market (i.e. its market beta). Extensive research on the stock market has shown that the CAPM is unable to explain the returns on many different types of portfolios. This list of “anomalies” eventually morphed into what we today call the “factor zoo”: a large collection that includes potential risk factors or, more generally, variables that may represent mispricings, trading frictions, or that may simply be the result of data mining. Post #3 of this series discussed some Bayesian approaches to select the “right” factors from the factor zoo, and Post #5 discussed the issues in the replication of the anomalies in the factor zoo.

Equity markets have been studied extensively, which is how the zoo grew to hundreds of factors. Corporate bonds, in contrast, have received far less attention, mainly because of data availability and complexity. High-quality bond data are much harder to obtain than equity data. A single company can have hundreds of bonds outstanding with very different characteristics (maturity, seniority, optionality, coupon structure, etc.). In addition, bonds are often far less liquid than stocks and trade mostly in over-the-counter markets.

Factors for Bond Returns

In this post, I’m going to discuss the paper “Common risk factors in the cross-section of corporate bond returns”, by Jennie Bai, Turan G. Bali, and Quan Wen. This paper was published in the Journal of Financial Economics in 2019, but subsequently retracted by the authors in 2023. The reason for the retraction was straightforward. Another group of authors (Alexander Dickerson, Philippe Mueller, and Cesare Robotti) tried to replicate the results of Bai et al. and discovered an issue of temporal misalignment. Bai et al. confirmed that the issue was indeed present, and that their paper’s results did not reproduce once the issue was corrected.

The Bai et al. paper had a strong and economically intuitive result. The authors had identified three factors that explained differences in returns of corporate bonds issued by similar companies: downside risk, credit risk, and liquidity risk. Downside risk, in particular, is an intuitively appealing explanation of bond returns, as bond investors do not participate in the upside in the same way as equity investors. Their preferred model appeared to explain returns of bond portfolios quite well.

The strong results of the paper, and the fact that Bai et al. made the factor data available, led to their four-factor model (the bond-market factor plus these three factors) quickly becoming a benchmark in other corporate bond papers.

The Replication

Dickerson, Mueller, and Robotti (2023) revisited the results of Bai et al. (2019). They found two main issues. The first, and most important, was a temporal misalignment. For most of the sample, the downside risk and credit risk factor returns reported for month t were actually the returns for month t + 1. In other words, the factors inadvertently incorporated information from the future. The liquidity-risk factor was also misaligned during the final two years of the sample, although in the opposite direction: its returns lagged by one month.
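
A one-month misdating of this kind is easy to miss by eye, but it shows up clearly when the reported series is compared with an independently constructed one. The sketch below is a toy diagnostic on simulated data, not the procedure used by Dickerson et al.; the function name `best_lag`, the simulated return parameters, and the sample length are all hypothetical.

```python
import numpy as np


def best_lag(reported, reference, max_lag=3):
    """Return the shift k that maximizes corr(reported[t], reference[t + k]).

    k = 0 means the two series are aligned. k = +1 means the value reported
    for month t is really the reference value for month t + 1 (look-ahead).
    Both series are assumed to have the same length.
    """
    reported = np.asarray(reported, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = len(reference)
    corrs = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = reported[: n - k], reference[k:]
        else:
            a, b = reported[-k:], reference[: n + k]
        corrs[k] = np.corrcoef(a, b)[0, 1]
    return max(corrs, key=corrs.get), corrs


rng = np.random.default_rng(0)
true_factor = rng.normal(0.004, 0.02, size=240)                  # simulated monthly returns
replication = true_factor + rng.normal(0.0, 0.002, size=240)    # correctly dated, slightly noisy

# Misdated series: the entry stored for month t holds month t + 1's return.
misdated = true_factor[1:]
lag, _ = best_lag(misdated, replication[:-1])
aligned_lag, _ = best_lag(true_factor, replication)

print("best lag for misdated series:", lag)
print("best lag for aligned series :", aligned_lag)
```

A positive best lag flags a series that leads the reference, which is the look-ahead pattern described above; a negative one flags a series that lags it.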

The second issue concerned the construction of the bond-market factor. Bai et al. truncated extreme returns in both tails of the distribution. This reduced the measured risk premium of the market factor and made the additional downside risk, credit risk, and liquidity risk factors appear stronger in multivariate tests.

Using multiple bond databases and correcting the misalignment issue, Dickerson et al. found that previously proposed corporate-bond factors generally did not add meaningful explanatory power beyond the value-weighted bond-market factor. In other words, the bond CAPM was difficult to outperform. The only marginal exception was traded liquidity.

Figure 1 from Dickerson, Mueller, and Robotti (2023)

This conclusion is striking because the original model appeared to work extremely well. Bai et al. had reported that their four-factor model explained much of the variation in the returns of corporate-bond portfolios, with predicted and realized returns clustering closely around the 45-degree line.

In a more recent paper, Dickerson, Robotti, and Rossetti (2026) broaden the exercise to a corporate-bond “factor zoo” of 108 signals. They argue that measurement error and look-ahead bias affect a wider segment of the literature, and that most reported bond factors do not retain statistically significant bond CAPM alphas after correction. The Bai et al. episode, therefore, is a reminder that corporate bond data are complex, and that empirical bond pricing studies can be sensitive to data construction, temporal alignment, liquidity measurement, and other methodological choices.
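
A bond CAPM alpha is the intercept from a time-series regression of a factor's returns on the bond-market factor's excess returns: if the intercept is indistinguishable from zero, the factor earns nothing beyond its market exposure. The sketch below estimates it by plain OLS on simulated data. The function name `capm_alpha` and all simulation parameters are hypothetical, and the classical standard errors are a simplification; published studies typically use more robust inference.

```python
import numpy as np


def capm_alpha(factor, market):
    """OLS regression of factor returns on market excess returns.

    Returns (alpha, beta, t_alpha) using classical (homoskedastic) standard
    errors, which is a simplifying assumption for this sketch.
    """
    y = np.asarray(factor, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(market, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    alpha, beta = coef
    return alpha, beta, alpha / np.sqrt(cov[0, 0])


rng = np.random.default_rng(1)
T = 300
market = rng.normal(0.003, 0.015, size=T)              # simulated bond-market excess returns
noise = rng.normal(0.0, 0.01, size=T)

spanned = 0.8 * market + noise                          # true alpha = 0
unspanned = 0.005 + 0.8 * market + noise                # true alpha = 0.5% per month

a0, b0, t0 = capm_alpha(spanned, market)
a1, b1, t1 = capm_alpha(unspanned, market)
print(f"spanned  : alpha={a0:.4f}, beta={b0:.2f}, t={t0:.2f}")
print(f"unspanned: alpha={a1:.4f}, beta={b1:.2f}, t={t1:.2f}")
```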

The Bayesians Join the Fray

In a paper recently published online in the JFE, Dickerson, Julliard, and Mueller (2026) use a Bayesian approach (similar to the ones discussed in Post #3, but adapted to handle multiple asset classes) to jointly price the cross-section of stock and bond returns. Their results show that equity and nontradable factors are sufficient to price corporate bonds once their Treasury term-structure risk is accounted for. Tradable bond factors become largely redundant for pricing the remaining credit component. However, bond factors, together with nontradable factors, remain necessary to price the Treasury component, which stock factors do not appear to capture.

Final Thoughts

The retraction of the Bai et al. paper, in my view, is an example of the system working as it should. The problem was identified because the factor data were publicly available. An independent group of authors found the issue and flagged it. As a result, our understanding of the subject improved.

An interesting point raised by both Dickerson, Mueller, and Robotti (2023) and Dickerson, Julliard, and Mueller (2026) is the large amount of model uncertainty present in empirical asset pricing studies. The latter paper states:

“Overall, we find that the true latent SDF is dense in the space of observable nontradable and tradable bond and stock factors. Importantly, this implies that all low dimensional observable factor models proposed to date are affected by severe misspecification and rejected by the data.”

In other words, substantial model uncertainty favors aggregation through Bayesian model averaging rather than reliance on a single sparse representation. This is similar to the conclusion reached by papers that apply Bayesian methods to the equity factor zoo. In Dickerson, Julliard, and Mueller (2026), the model space is gigantic: more than 18 quadrillion possible models. Even without the replication issue, a four-factor model such as the one proposed by Bai et al. should therefore be understood as one possible low-dimensional approximation of an unobservable SDF, rather than as its definitive representation.
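
For a sense of scale: if a model is defined by which of K candidate factors it includes, there are 2^K possible models. The count quoted above is consistent with K = 54, since 2^54 is roughly 1.8 × 10^16. Note that K = 54 is inferred here from the quoted figure and is not stated in this post; the evaluation speed used below is likewise an arbitrary assumption.

```python
def n_models(k):
    """Number of distinct factor subsets (models) built from k candidate factors."""
    return 2 ** k


K = 54  # assumption: inferred from the "18 quadrillion" figure, not stated in the post
total = n_models(K)

# Hypothetical speed: one million models evaluated per second.
seconds = total / 1e6
years = seconds / (365.25 * 24 * 3600)

print(f"{total:,} models")
print(f"about {years:.0f} years to enumerate at 1e6 models per second")
```

Exhaustive enumeration at that scale would take centuries, which is one reason methods that explore or average over the model space are attractive.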

References

Bai, Jennie, Turan G. Bali, and Quan Wen. “RETRACTED: Common risk factors in the cross-section of corporate bond returns.” Journal of Financial Economics 131, no. 3 (2019): 619-642.

Dickerson, Alexander, Christian Julliard, and Philippe Mueller. “The co-pricing factor zoo.” Journal of Financial Economics 182 (2026): 104295.

Dickerson, Alexander, Philippe Mueller, and Cesare Robotti. “Priced risk in corporate bonds.” Journal of Financial Economics 150, no. 2 (2023): 103707.

Dickerson, Alexander, Cesare Robotti, and Giulio Rossetti. “The Corporate Bond Factor Replication Crisis.” arXiv preprint arXiv:2604.07880 (2026).
