Core Functions
For more detailed documentation see the R package documentation (PDF). Note that function signatures and exposed functions are equivalent to the R version.
Bayesian Methods
BayesianFactorZoo.BayesianFM - Function
BayesianFM(f::Matrix{Float64}, R::Matrix{Float64}, sim_length::Int)
Bayesian Fama-MacBeth regression. Similar to BayesianSDF but estimates factors' risk premia rather than risk prices.
Arguments
- f: Matrix of factors with dimension $t \times k$, where $k$ is the number of factors and $t$ is the number of periods
- R: Matrix of test assets with dimension $t \times N$, where $t$ is the number of periods and $N$ is the number of test assets
- sim_length: Length of MCMCs
Details
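Unlike BayesianSDF, BayesianFM uses factor loadings, $\beta_f$, rather than covariance exposures, $C_f$, in the cross-sectional regression. After obtaining posterior draws of $\mu_Y$ and $\Sigma_Y$ (see BayesianSDF), it computes, for each draw, the OLS and GLS risk premia and the corresponding cross-sectional $R^2$ values listed under Returns.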
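As a sketch of the quantities computed for each posterior draw, the standard Fama-MacBeth cross-sectional estimators can be written as below. The notation is an assumption made for illustration: $\beta_f$ is taken to include a leading column of ones (for the constant term), $\mu_R$ and $\Sigma_R$ are the test-asset blocks of $\mu_Y$ and $\Sigma_Y$, and $\bar{\mu}_R$ is the cross-sectional average of $\mu_R$. The R package documentation remains the authoritative statement.

```latex
\begin{aligned}
\lambda_{ols} &= (\beta_f^\top \beta_f)^{-1} \beta_f^\top \mu_R, \\
\lambda_{gls} &= (\beta_f^\top \Sigma_R^{-1} \beta_f)^{-1} \beta_f^\top \Sigma_R^{-1} \mu_R, \\
R^2_{ols} &= 1 - \frac{(\mu_R - \beta_f \lambda_{ols})^\top (\mu_R - \beta_f \lambda_{ols})}
                      {(\mu_R - \bar{\mu}_R 1_N)^\top (\mu_R - \bar{\mu}_R 1_N)}, \\
R^2_{gls} &= 1 - \frac{(\mu_R - \beta_f \lambda_{gls})^\top \Sigma_R^{-1} (\mu_R - \beta_f \lambda_{gls})}
                      {(\mu_R - \bar{\mu}_R 1_N)^\top \Sigma_R^{-1} (\mu_R - \bar{\mu}_R 1_N)}.
\end{aligned}
```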
Returns
Returns a BayesianFMOutput struct containing:
- lambda_ols_path::Matrix{Float64}: Matrix of size sim_length × (k+1) containing OLS risk premia estimates. The first column is $\lambda_c$ for the constant term; the next k columns are factor risk premia.
- lambda_gls_path::Matrix{Float64}: Matrix of size sim_length × (k+1) containing GLS risk premia estimates.
- R2_ols_path::Vector{Float64}: Vector of length sim_length containing OLS $R^2$ draws.
- R2_gls_path::Vector{Float64}: Vector of length sim_length containing GLS $R^2$ draws.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
- sim_length::Int: Number of MCMC iterations performed
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Examples
using BayesianFactorZoo, Statistics # Statistics provides mean
# Run Bayesian FM regression with 10,000 iterations
results = BayesianFM(f, R, 10_000)
# Access results
ols_risk_premia = mean(results.lambda_ols_path, dims=1) # Mean OLS risk premia
gls_r2 = mean(results.R2_gls_path) # Mean GLS R²
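Beyond posterior means, the draws can be summarized with standard deviations and credible intervals. The following is a minimal sketch that uses only the Statistics standard library; `summarize_draws` is a hypothetical helper (not part of BayesianFactorZoo), and the matrix of draws is simulated as a stand-in for `results.lambda_ols_path`.

```julia
using Statistics, Random

# Hypothetical helper: column-wise posterior summaries of a
# sim_length × (k+1) matrix of draws.
function summarize_draws(path::AbstractMatrix; level::Float64=0.95)
    tail = (1 - level) / 2
    # Equal-tailed credible interval for each column
    ci = [quantile(col, [tail, 1 - tail]) for col in eachcol(path)]
    return (mean  = vec(mean(path, dims=1)),
            sd    = vec(std(path, dims=1)),
            lower = first.(ci),
            upper = last.(ci))
end

# Simulated stand-in for results.lambda_ols_path (k = 3 factors plus constant)
Random.seed!(42)
lambda_path = 0.1 .* randn(2_000, 4) .+ [0.0 0.5 0.2 -0.1]

summary = summarize_draws(lambda_path)
```

With real output, the same helper would be applied as `summarize_draws(results.lambda_ols_path)`.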
BayesianFactorZoo.BayesianSDF - Function
BayesianSDF(f::Matrix{Float64}, R::Matrix{Float64}, sim_length::Int=10000;
intercept::Bool=true, type::String="OLS", prior::String="Flat",
psi0::Float64=5.0, d::Float64=0.5)
Bayesian estimation of Linear SDF (B-SDF).
Arguments
- f: Matrix of factors with dimension $t \times k$
- R: Matrix of test assets with dimension $t \times N$
- sim_length: Length of MCMCs
- intercept: Include intercept if true, default=true
- type: "OLS" or "GLS", default="OLS"
- prior: "Flat" or "Normal", default="Flat"
- psi0: Hyperparameter for normal prior, default=5
- d: Hyperparameter for normal prior, default=0.5
Returns
Returns a BayesianSDFOutput struct containing:
- lambda_path::Matrix{Float64}: Matrix of size sim_length × (k+1) if intercept=true, or sim_length × k if false. Contains posterior draws of risk prices.
- R2_path::Vector{Float64}: Vector of length sim_length containing $R^2$ draws.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
- sim_length::Int: Number of MCMC iterations performed
- prior::String: Prior specification used ("Flat" or "Normal")
- estimation_type::String: Estimation type used ("OLS" or "GLS")
Notes
- Input matrices f and R must have the same number of rows (time periods)
- Number of test assets (N) must be larger than number of factors (k) when including intercept
- Number of test assets (N) must be >= number of factors (k) when excluding intercept
- The function performs no pre-standardization of inputs
- Risk prices are estimated in the units of the input data (typically monthly returns)
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Examples
using BayesianFactorZoo, Statistics # Statistics provides mean
# Basic usage with default settings
results = BayesianSDF(f, R)
# Use GLS with normal prior
results_gls = BayesianSDF(f, R, 10_000;
type="GLS",
prior="Normal",
psi0=5.0,
d=0.5)
# Access results
risk_prices = mean(results.lambda_path, dims=1)
r2_values = mean(results.R2_path)
BayesianFactorZoo.continuous_ss_sdf - Function
continuous_ss_sdf(f::Matrix{Float64}, R::Matrix{Float64}, sim_length::Int;
psi0::Float64=1.0, r::Float64=0.001,
aw::Float64=1.0, bw::Float64=1.0,
type::String="OLS", intercept::Bool=true)
SDF model selection using continuous spike-and-slab prior.
Arguments
- f: Matrix of factors with dimension $t \times k$
- R: Matrix of test assets with dimension $t \times N$
- sim_length: Length of MCMCs
- psi0: Hyperparameter in prior distribution of risk prices
- r: Hyperparameter for spike component ($\ll 1$)
- aw, bw: Beta prior parameters for factor inclusion probability
- type: "OLS" or "GLS"
- intercept: Include intercept if true
Returns
Returns a ContinuousSSSDFOutput struct containing:
- gamma_path::Matrix{Float64}: Matrix of size sim_length × k containing posterior draws of factor inclusion indicators.
- lambda_path::Matrix{Float64}: Matrix of size sim_length × (k+1) if intercept=true, or sim_length × k if false. Contains posterior draws of risk prices.
- sdf_path::Matrix{Float64}: Matrix of size sim_length × t containing posterior draws of the SDF.
- bma_sdf::Vector{Float64}: Vector of length t containing the Bayesian Model Averaged SDF.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
- sim_length::Int: Number of MCMC iterations performed
Notes
- Input matrices f and R must have the same number of rows (time periods)
- The method automatically handles both traded and non-traded factors
- Prior parameters aw, bw control beliefs about model sparsity (the defaults aw = bw = 1 give a uniform prior on the factor inclusion probability)
- Parameter psi0 maps into prior beliefs about achievable Sharpe ratios
- The spike component r should be close to zero to effectively shrink irrelevant factors
- The resulting SDF is normalized to have mean 1
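To see how aw and bw translate into a prior belief about sparsity, recall that a Beta(aw, bw) distribution has mean aw / (aw + bw). The sketch below assumes, as the argument description suggests, that this Beta prior is placed on the per-factor inclusion probability; `prior_inclusion_mean` is a hypothetical helper name.

```julia
# Prior expected inclusion probability implied by a Beta(aw, bw) prior.
prior_inclusion_mean(aw::Real, bw::Real) = aw / (aw + bw)

# Defaults: uniform prior, each factor included with prior probability 1/2
p_default = prior_inclusion_mean(1.0, 1.0)

# aw = 1, bw = 9: roughly one factor in ten is expected to be included
p_sparse = prior_inclusion_mean(1.0, 9.0)

# Prior expected number of included factors for k candidate factors
k = 20
expected_factors = k * p_sparse
```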
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Examples
using BayesianFactorZoo, Statistics # Statistics provides mean
# Basic usage with default settings
results = continuous_ss_sdf(f, R, 10_000)
# Use GLS with modified priors for more aggressive selection
results_gls = continuous_ss_sdf(f, R, 10_000;
type="GLS",
psi0=0.5, # Tighter prior
aw=1.0,
bw=9.0) # Prior favoring sparsity
# Access results
inclusion_probs = mean(results.gamma_path, dims=1) # Factor inclusion probabilities
risk_prices = mean(results.lambda_path, dims=1) # Posterior mean risk prices
sdf = results.bma_sdf # Model averaged SDF
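The relationship between the draw matrices and their summaries can be sketched on simulated data. This is an illustration under stated assumptions, not the package's implementation: it assumes that posterior inclusion probabilities are column means of the 0/1 indicator draws and that a model-averaged SDF is the average of the SDF draws across iterations. The final rescaling shows one way to obtain a series with mean 1.

```julia
using Statistics, Random

Random.seed!(7)
sim_length, k, t = 500, 4, 60

# Simulated stand-ins for results.gamma_path and results.sdf_path
gamma_path = Float64.(rand(sim_length, k) .< [0.9 0.5 0.1 0.3])
sdf_path = 1 .+ 0.2 .* randn(sim_length, t)

# Posterior inclusion probability: share of draws in which each factor is included
inclusion_probs = vec(mean(gamma_path, dims=1))

# Average the SDF draws over iterations (assumed form of model averaging)
bma_sdf = vec(mean(sdf_path, dims=1))

# Rescale so that the time-series mean equals 1
bma_sdf_normalized = bma_sdf ./ mean(bma_sdf)
```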
BayesianFactorZoo.continuous_ss_sdf_v2 - Function
continuous_ss_sdf_v2(f1::Matrix{Float64}, f2::Matrix{Float64}, R::Matrix{Float64},
sim_length::Int; psi0::Float64=1.0, r::Float64=0.001,
aw::Float64=1.0, bw::Float64=1.0,
type::String="OLS", intercept::Bool=true)
SDF model selection with continuous spike-and-slab prior, treating tradable factors as test assets.
Arguments
- f1: Matrix of nontradable factors with dimension $t \times k_1$
- f2: Matrix of tradable factors with dimension $t \times k_2$
- R: Matrix of test assets with dimension $t \times N$ (should NOT contain f2)
- sim_length: Length of MCMCs
- psi0, r, aw, bw, type, intercept: Same as continuous_ss_sdf
Details
Same prior structure and posterior distributions as continuous_ss_sdf, but:
- Treats tradable factors f2 as test assets
- Total dimension of test assets becomes $N + k_2$
- Factor loadings computed on combined test asset set
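The dimension bookkeeping described above can be sketched as follows. The column ordering used here (nontradable factors first, tradable factors appended to the test assets) is an assumption for illustration; the package may arrange the columns differently internally.

```julia
using Random

Random.seed!(3)
t, k1, k2, N = 120, 2, 3, 10

f1 = randn(t, k1)   # nontradable factors
f2 = randn(t, k2)   # tradable factors
R  = randn(t, N)    # test assets, not containing f2

# All candidate factors: t × (k1 + k2)
factors = hcat(f1, f2)

# Tradable factors also act as test assets: t × (N + k2)
test_assets = hcat(R, f2)
```

Because f2 is appended here, passing an R that already contains those columns would duplicate them, which is why R should exclude f2.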
Returns
Returns a ContinuousSSSDFOutput struct containing:
- gamma_path::Matrix{Float64}: Matrix of size sim_length × k containing posterior draws of factor inclusion indicators, where $k = k_1 + k_2$ (total number of factors).
- lambda_path::Matrix{Float64}: Matrix of size sim_length × (k+1) if intercept=true, or sim_length × k if false. Contains posterior draws of risk prices.
- sdf_path::Matrix{Float64}: Matrix of size sim_length × t containing posterior draws of the SDF.
- bma_sdf::Vector{Float64}: Vector of length t containing the Bayesian Model Averaged SDF.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors ($k_1 + k_2$)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
- sim_length::Int: Number of MCMC iterations performed
Notes
- Input matrices f1, f2, and R must have the same number of rows (time periods)
- Test assets R should not include the tradable factors f2
- The factor selection combines both sparsity and density aspects through Bayesian Model Averaging
- Prior parameters aw, bw control beliefs about model sparsity
- Parameter psi0 maps into prior beliefs about achievable Sharpe ratios
- The spike component r should be close to zero to effectively shrink irrelevant factors
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Examples
using BayesianFactorZoo, Statistics # Statistics provides mean
# Basic usage with default settings
results = continuous_ss_sdf_v2(f1, f2, R, 10_000)
# Use GLS with custom priors
results_gls = continuous_ss_sdf_v2(f1, f2, R, 10_000;
type="GLS",
psi0=2.0,
aw=2.0,
bw=2.0)
# Access results
inclusion_probs = mean(results.gamma_path, dims=1) # Factor inclusion probabilities
risk_prices = mean(results.lambda_path, dims=1) # Risk price estimates
avg_sdf = results.bma_sdf # Model averaged SDF
BayesianFactorZoo.dirac_ss_sdf_pvalue - Function
dirac_ss_sdf_pvalue(f::Matrix{Float64}, R::Matrix{Float64}, sim_length::Int,
lambda0::Vector{Float64}; psi0::Float64=1.0,
max_k::Union{Int,Nothing}=nothing)
Hypothesis testing for risk prices using Dirac spike-and-slab prior.
Arguments
- f: Matrix of factors with dimension $t \times k$
- R: Matrix of test assets with dimension $t \times N$
- sim_length: Length of MCMCs
- lambda0: $k \times 1$ vector of null hypothesis values
- psi0: Hyperparameter in prior distribution
- max_k: Maximum number of factors in models (optional)
Returns
Returns a DiracSSSDFOutput struct containing:
- gamma_path::Matrix{Float64}: Matrix of size sim_length × k containing posterior draws of factor inclusion indicators.
- lambda_path::Matrix{Float64}: Matrix of size sim_length × (k+1) containing posterior draws of risk prices.
- model_probs::Matrix{Float64}: Matrix of size M × (k+1), where M is the number of possible models. The first k columns are model indices (0/1); the last column contains model probabilities.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
- sim_length::Int: Number of MCMC iterations performed
Notes
- Input matrices f and R must have the same number of rows (time periods)
- The method is particularly useful for testing specific hypotheses about risk prices
- Setting max_k allows for focused testing of sparse models
- The Dirac spike provides a more stringent test than the continuous spike-and-slab
- Bayesian p-values can be constructed by integrating 1-p(γ|data)
- Model probabilities are properly normalized across the considered model space
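One simple way to approximate model probabilities from indicator draws is to count how often each 0/1 configuration is visited. The sketch below illustrates that idea only: `model_frequencies` is a hypothetical helper, and the package's own model_probs field may be computed differently (for example, from analytical marginal likelihoods).

```julia
# Hypothetical helper: relative frequency of each visited model
# in a sim_length × k matrix of 0/1 inclusion draws.
function model_frequencies(gamma_path::AbstractMatrix)
    counts = Dict{Vector{Int},Int}()
    for row in eachrow(gamma_path)
        key = Int.(row)                      # copy the row as a Vector{Int}
        counts[key] = get(counts, key, 0) + 1
    end
    n = size(gamma_path, 1)
    models = collect(keys(counts))
    probs = [counts[m] / n for m in models]
    order = sortperm(probs, rev=true)        # most frequently visited model first
    return models[order], probs[order]
end

# Toy draws with k = 2 factors
gamma_demo = [1.0 0.0; 1.0 0.0; 0.0 1.0; 1.0 0.0]
models, probs = model_frequencies(gamma_demo)
```

By construction the frequencies sum to one over the visited models.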
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Examples
using BayesianFactorZoo, Statistics # Statistics provides mean
# Test if all risk prices are zero
lambda0 = zeros(size(f, 2))
results = dirac_ss_sdf_pvalue(f, R, 10_000, lambda0)
# Test specific values with max 3 factors (lambda0 needs one entry per factor; here k = 4)
lambda0_alt = [0.5, 0.3, -0.2, 0.1]
results_sparse = dirac_ss_sdf_pvalue(f, R, 10_000, lambda0_alt; max_k=3)
# Access results
inclusion_probs = mean(results.gamma_path, dims=1) # Factor inclusion probabilities
risk_prices = mean(results.lambda_path, dims=1) # Posterior mean risk prices
top_models = results.model_probs[sortperm(results.model_probs[:,end], rev=true)[1:10], :] # Top 10 models
Classical Methods
BayesianFactorZoo.SDF_gmm - Function
SDF_gmm(R::Matrix{Float64}, f::Matrix{Float64}, W::Matrix{Float64})
GMM estimation of factor risk prices under linear SDF framework.
Arguments
- R: Matrix of test assets with dimension $t \times N$
- f: Matrix of factors with dimension $t \times k$
- W: Weighting matrix for GMM estimation, dimension $(N+k) \times (N+k)$
Returns
Returns a SDFGMMOutput struct containing:
- lambda_gmm::Vector{Float64}: Vector of length k+1 containing risk price estimates (includes intercept).
- mu_f::Vector{Float64}: Vector of length k containing estimated factor means.
- Avar_hat::Matrix{Float64}: Matrix of size (2k+1) × (2k+1) containing asymptotic covariance matrix.
- R2_adj::Float64: Adjusted cross-sectional $R^2$.
- S_hat::Matrix{Float64}: Matrix of size (N+k) × (N+k) containing estimated spectral density matrix.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
Notes
- Input matrices R and f must have the same number of rows (time periods)
- The weighting matrix W should match dimensions (N+k) × (N+k)
- For tradable factors, weighting matrix should impose self-pricing restrictions
- Implementation assumes no serial correlation in moment conditions
- R² is adjusted for degrees of freedom
- Standard errors are derived under the assumption of correct specification
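For reference, the conventional degrees-of-freedom adjustment for a cross-sectional $R^2$ with N test assets, k factors, and an intercept is sketched below. Whether the package uses exactly this formula is an assumption; `adjusted_r2` is a hypothetical helper.

```julia
# Conventional adjusted R² for a cross-sectional regression of N assets
# on k factor exposures plus an intercept (requires N > k + 1).
function adjusted_r2(r2::Real, N::Integer, k::Integer)
    N > k + 1 || throw(ArgumentError("need N > k + 1 (N = $N, k = $k)"))
    return 1 - (1 - r2) * (N - 1) / (N - 1 - k)
end

# Example: unadjusted R² of 0.8 with 25 test assets and 3 factors
r2_adj = adjusted_r2(0.8, 25, 3)
```

The adjustment always lowers the $R^2$ when k > 0, and the penalty grows as k approaches N.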
References
Bryzgalova S, Huang J, Julliard C (2023). "Bayesian solutions for the factor zoo: We just ran two quadrillion models." Journal of Finance, 78(1), 487–557.
Hansen, Lars Peter (1982). "Large Sample Properties of Generalized Method of Moments Estimators." Econometrica, 50(4), 1029-1054.
Examples
using BayesianFactorZoo, LinearAlgebra # LinearAlgebra provides diag
# Construct OLS weighting matrix
W_ols = construct_weight_matrix(R, f, "OLS")
# Perform OLS estimation
results_ols = SDF_gmm(R, f, W_ols)
# Construct GLS weighting matrix
W_gls = construct_weight_matrix(R, f, "GLS")
# Perform GLS estimation
results_gls = SDF_gmm(R, f, W_gls)
# Access results
risk_prices = results_ols.lambda_gmm[2:end] # Factor risk prices (excluding intercept)
std_errors = sqrt.(diag(results_ols.Avar_hat)[2:end]) # Standard errors
r_squared = results_ols.R2_adj # Adjusted R²
See Also
- construct_weight_matrix: Function to construct appropriate OLS/GLS weighting matrices
- BayesianSDF: Bayesian alternative that is robust to weak factors
BayesianFactorZoo.TwoPassRegression - Function
TwoPassRegression(f::Matrix{Float64}, R::Matrix{Float64})
Classical Fama-MacBeth two-pass regression.
Arguments
- f: Matrix of factors with dimension $t \times k$
- R: Matrix of test assets with dimension $t \times N$
Returns
Returns a TwoPassRegressionOutput struct containing:
- lambda::Vector{Float64}: Vector of length k+1 containing OLS risk premia estimates (includes intercept).
- lambda_gls::Vector{Float64}: Vector of length k+1 containing GLS risk premia estimates.
- t_stat::Vector{Float64}: Vector of length k+1 containing OLS t-statistics.
- t_stat_gls::Vector{Float64}: Vector of length k+1 containing GLS t-statistics.
- R2_adj::Float64: OLS adjusted R².
- R2_adj_GLS::Float64: GLS adjusted R².
- alpha::Vector{Float64}: Vector of length N containing OLS pricing errors.
- t_alpha::Vector{Float64}: Vector of length N containing t-statistics for OLS pricing errors.
- beta::Matrix{Float64}: Matrix of size N × k containing factor loadings.
- cov_epsilon::Matrix{Float64}: Matrix of size N × N containing residual covariance.
- cov_lambda::Matrix{Float64}: Matrix of size (k+1) × (k+1) containing OLS covariance matrix of risk premia.
- cov_lambda_gls::Matrix{Float64}: Matrix of size (k+1) × (k+1) containing GLS covariance matrix of risk premia.
- R2_GLS::Float64: Unadjusted GLS R².
- cov_beta::Matrix{Float64}: Matrix of size (N(k+1)) × (N(k+1)) containing covariance matrix of beta estimates.
- Metadata fields accessible via dot notation:
- n_factors::Int: Number of factors (k)
- n_assets::Int: Number of test assets (N)
- n_observations::Int: Number of time periods (t)
Notes
- Input matrices f and R must have the same number of rows (time periods)
- The method is vulnerable to bias from weak and useless factors
- Standard errors account for the EIV problem but assume serial independence
- Both OLS and GLS estimates are computed with appropriate standard errors
- R² values are adjusted for degrees of freedom
- Includes corrections for using factors as test assets when applicable
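The point estimates of the two-pass procedure can be sketched in a few lines. This is a minimal illustration of the textbook OLS version on simulated data, not the package's implementation: it omits the GLS estimates, the Shanken standard errors, and all other outputs listed above, and `two_pass_ols` is a hypothetical name.

```julia
using LinearAlgebra, Statistics, Random

function two_pass_ols(f::Matrix{Float64}, R::Matrix{Float64})
    t, k = size(f)
    N = size(R, 2)
    size(R, 1) == t ||
        throw(DimensionMismatch("f and R must have the same number of rows"))

    # First pass: time-series regression of each asset on a constant and the factors
    X = hcat(ones(t), f)                 # t × (k+1)
    B = X \ R                            # (k+1) × N least-squares coefficients
    beta = permutedims(B[2:end, :])      # N × k factor loadings

    # Second pass: cross-sectional regression of mean returns on a constant and betas
    mu_R = vec(mean(R, dims=1))          # length N
    Z = hcat(ones(N), beta)              # N × (k+1)
    lambda = Z \ mu_R                    # intercept followed by k risk premia
    alpha = mu_R - Z * lambda            # pricing errors

    return (lambda = lambda, beta = beta, alpha = alpha)
end

# Simulated example: 240 periods, 2 factors, 15 test assets
Random.seed!(11)
f_sim = 0.5 .+ randn(240, 2)
b_sim = randn(15, 2)
R_sim = f_sim * b_sim' .+ 0.1 .* randn(240, 15)
fit = two_pass_ols(f_sim, R_sim)
```

Because the second pass includes an intercept, the OLS pricing errors sum to zero across assets.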
References
Fama, Eugene F., and James D. MacBeth, 1973, Risk, return, and equilibrium: Empirical tests, Journal of Political Economy 81, 607-636.
Shanken, Jay, 1992, On the estimation of beta-pricing models, Review of Financial Studies 5, 1-33.
Examples
using BayesianFactorZoo, LinearAlgebra # LinearAlgebra provides diag
# Perform two-pass regression
results = TwoPassRegression(f, R)
# Access OLS results
risk_premia = results.lambda[2:end] # Factor risk premia (excluding intercept)
t_stats = results.t_stat[2:end] # t-statistics
r2_ols = results.R2_adj # Adjusted R²
pricing_errors = results.alpha # Pricing errors
# Access GLS results
risk_premia_gls = results.lambda_gls[2:end]
t_stats_gls = results.t_stat_gls[2:end]
r2_gls = results.R2_adj_GLS
# First-pass results
betas = results.beta # Factor loadings
std_errors_beta = sqrt.(diag(results.cov_beta)) # Standard errors for betas
See Also
- BayesianFM: Bayesian version that is robust to weak factors
- SDF_gmm: GMM-based alternative estimation approach