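Fama-MacBeth regression is one of the most widely used methods in empirical asset pricing. Its main purpose is to test whether a factor has a systematic effect on asset returns. Unlike portfolio analysis, Fama-MacBeth regression can examine the effect of one factor on asset returns while controlling for several other factors at the same time. The test is whether the factor carries a statistically significant risk premium.

The procedure has two steps:
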
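1. Estimate each asset's risk exposure (beta). A time-series analysis of asset returns yields the level of risk each asset bears.

2. Estimate the time series of risk premia and test it. At each point in time, regress asset returns on the estimated betas in the cross-section to obtain the factor's risk premium for that period. Then average the risk premia over time and test whether the mean is significantly different from zero.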
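
The two steps can be written out as a short set of equations. The notation is introduced here for illustration and is not taken from the package: $r_{i,t}$ is the return of asset $i$ in period $t$, $f_t$ the factor realizations, $\beta_i$ the factor loadings, $\lambda_t$ the period-$t$ risk premia, and $s(\hat{\lambda}_t)$ the sample standard deviation of the period estimates. The t-statistic shown is the plain version without any autocorrelation adjustment.

```latex
% Step 1: time-series regression for each asset i
r_{i,t} = a_i + \beta_i^{\top} f_t + \varepsilon_{i,t}, \qquad t = 1,\dots,T

% Step 2: cross-sectional regression in each period t
r_{i,t} = \lambda_{0,t} + \hat{\beta}_i^{\top} \lambda_t + \eta_{i,t}, \qquad i = 1,\dots,N

% Time-series average of the period estimates and its t-statistic
\hat{\lambda} = \frac{1}{T} \sum_{t=1}^{T} \hat{\lambda}_t, \qquad
t(\hat{\lambda}) = \frac{\hat{\lambda}}{s(\hat{\lambda}_t) / \sqrt{T}}
```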

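This article implements the procedure following the book Empirical Asset Pricing by Bali et al. The Fama-MacBeth regression module is EAP.fama_macbeth, which is described in detail below. The package is available on GitHub:

GitHub: whyecofiliter/EAP (empirical asset pricing)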


This module implements the Fama-MacBeth regression (Fama and MacBeth, 1973).

Fama-Macbeth Regression follows two steps:

  1. Specify the model and run a cross-sectional regression in each period.

  2. Take the time-series average of the regression coefficients.

For more details, please read Empirical Asset Pricing: The Cross Section of Stock Returns. Bali, Engle, Murray, 2016.

class Fama_macbeth_regress():

Package Needed: numpy, statsmodels, scipy, prettytable.

def __init__(self, sample):

input :

sample : data for analysis. The structure of the sample :

The first column is the dependent variable (test portfolio return).

The second through second-to-last columns are the independent variables (factor loadings).

The last column is the time label.
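
To make the column layout concrete, here is a minimal sketch that builds a small simulated sample in this shape. The panel dimensions, the two factor loadings, and the coefficient values are all assumptions chosen for illustration; only the column order follows the description above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n_periods, n_assets = 5, 100    # hypothetical panel dimensions
n_rows = n_periods * n_assets

# time label for every row: 2016 repeated 100 times, then 2017, and so on
time_label = np.repeat(np.arange(2016, 2016 + n_periods), n_assets)

# two hypothetical factor loadings per observation
loadings = rng.normal(0.0, 1.0, size=(n_rows, 2))

# assumed coefficients, used only to generate illustrative returns
assumed_premia = np.array([-0.5, -1.0])
ret = loadings @ assumed_premia + rng.normal(0.0, 1.0, size=n_rows)

# column 0: return, middle columns: factor loadings, last column: time label
sample = np.column_stack([ret, loadings, time_label])
print(sample.shape)
```

An array built this way can be passed as the sample argument, since it follows the stated column order.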

def divide_by_time(self, sample):

This function groups the sample by time.

input :

sample : The data for analysis in the __init__ function.

output :

groups_by_time : The sample grouped by time.
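
The grouping step can be sketched with plain numpy. The helper name group_by_time and its return type (a dict keyed by time label) are hypothetical; the module's own groups_by_time structure may differ.

```python
import numpy as np

def group_by_time(sample):
    """Split a 2-D array into {time label: rows carrying that label}.

    Hypothetical helper: the last column is assumed to hold the time label,
    and it is dropped from the rows stored in each group.
    """
    labels = sample[:, -1]
    return {float(t): sample[labels == t, :-1] for t in np.unique(labels)}

# tiny example: two observations in 2001, one in 2002
sample = np.array([
    [0.10, 1.2, 2001],
    [0.05, 0.8, 2001],
    [-0.02, 1.1, 2002],
])
groups = group_by_time(sample)
print(groups[2001.0].shape)   # (2, 2)
```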

def cross_sectional_regress(self, add_constant=True):

This function conducts the first step of the Fama-MacBeth regression: it runs a cross-sectional regression for each period.

input :

add_constant : whether to include an intercept in the cross-sectional regressions.

output :

parameters : The regression coefficients (factor risk premia). Each row is one period (group) and each column is one regression variable.

tvalue : The t-values of the coefficients.

rsquare : The R-squared of each regression.

adjrsq : The adjusted R-squared of each regression.

n : The number of observations in each group.
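
As a rough illustration of this step, the sketch below runs one OLS regression per period with numpy.linalg.lstsq. It is not the module's implementation: the function name cross_sectional_ols is hypothetical, the package lists statsmodels as a dependency while this sketch uses numpy only, and per-period t-values and adjusted R-squared are left out for brevity.

```python
import numpy as np

def cross_sectional_ols(sample, add_constant=True):
    """Run one OLS regression per time label (hypothetical helper).

    Assumes column 0 is the return, the last column is the time label,
    and the columns in between are the regressors.
    Returns (labels, coefficients, r-squared, observations per period).
    """
    labels = np.unique(sample[:, -1])
    coefs, rsq, nobs = [], [], []
    for t in labels:
        block = sample[sample[:, -1] == t]
        y, X = block[:, 0], block[:, 1:-1]
        if add_constant:
            X = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        # centered R-squared with an intercept, uncentered without one
        tss = np.sum((y - y.mean()) ** 2) if add_constant else np.sum(y ** 2)
        coefs.append(beta)
        rsq.append(1.0 - resid @ resid / tss)
        nobs.append(len(y))
    return labels, np.array(coefs), np.array(rsq), np.array(nobs)

# simulated data: 3 periods with 200 observations each (assumed values)
rng = np.random.default_rng(1)
x = rng.normal(size=(600, 2))
y = x @ np.array([-0.5, -1.0]) + rng.normal(size=600)
t = np.repeat([2018, 2019, 2020], 200)
sample = np.column_stack([y, x, t])

labels, coefs, rsq, nobs = cross_sectional_ols(sample, add_constant=False)
print(coefs.round(3))
```

Each row of coefs corresponds to one period, matching the row and column convention described for parameters.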

def time_series_average(self, **kwargs):

This function conducts the second step of the Fama-MacBeth regression: it takes the time-series average of the cross-sectional regression coefficients.
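
A minimal sketch of the averaging step, assuming a plain t-test of the mean against zero. The helper below is hypothetical and takes no keyword arguments. Which options the module accepts through **kwargs is not documented here, and an autocorrelation-adjusted standard error (such as Newey-West, which is common in this literature) would give different t-values.

```python
import numpy as np

def time_series_average(coefs):
    """Average period-by-period coefficients and test the mean against zero.

    Hypothetical helper: rows are periods, columns are regression variables.
    The standard error is the simple sample standard deviation over sqrt(T).
    """
    coefs = np.asarray(coefs, dtype=float)
    n_periods = coefs.shape[0]
    mean = coefs.mean(axis=0)
    std_err = coefs.std(axis=0, ddof=1) / np.sqrt(n_periods)
    return mean, mean / std_err

# four made-up periods of coefficient estimates for two variables
coefs = np.array([
    [-0.50, -1.00],
    [-0.52, -0.98],
    [-0.48, -1.02],
    [-0.51, -1.01],
])
mean, tvalue = time_series_average(coefs)
print(mean, tvalue)
```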

def fit(self, **kwargs):

This function fits the model by running the time_series_average function.


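import numpy as np
from fama_macbeth import Fama_macbeth_regress

# construct sample: 20 years (2001-2020) with 3000 observations per year
year = np.ones((3000, 1), dtype=int) * 2020
for i in range(19):
    year = np.append(year, (2019 - i) * np.ones((3000, 1), dtype=int))
# NOTE: the lines that build `character` and `sample` are missing from the source;
# the next three lines are an assumed reconstruction consistent with the printed output
character = np.random.normal(0, 1, (2, 20 * 3000))
ret = np.array([-0.5, -1]).dot(character) + np.random.normal(0, 1, 20 * 3000)
sample = np.array([ret, character[0], character[1], year]).T
# print('Character:',character)
# print('Sample:',sample)
# print(sample.shape)
model = Fama_macbeth_regress(sample)
result = model.fit(add_constant=False)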
para_average: [-0.501 -1.003]
tvalue: [-111.857 -202.247]
R: 0.5579113793318332
ADJ_R: 0.5576164569698131
sample number N: 3000.0

def summary_by_time(self):

This function summarizes the cross-sectional regression result for each period.

Package needed: prettytable.


# continue the previous code
model.summary_by_time()
|  Year  |      Param      | R Square | Adj R Square | Sample Number |
| 2001.0 | [-0.499 -0.990] |   0.53   |     0.53     |      3000     |
| 2002.0 | [-0.524 -0.987] |   0.56   |     0.56     |      3000     |
| 2003.0 | [-0.544 -1.015] |   0.58   |     0.58     |      3000     |
| 2004.0 | [-0.474 -0.948] |   0.53   |     0.53     |      3000     |
| 2005.0 | [-0.502 -1.007] |   0.57   |     0.57     |      3000     |
| 2006.0 | [-0.497 -0.981] |   0.55   |     0.55     |      3000     |
| 2007.0 | [-0.526 -1.020] |   0.57   |     0.57     |      3000     |
| 2008.0 | [-0.476 -1.024] |   0.56   |     0.56     |      3000     |
| 2009.0 | [-0.533 -1.011] |   0.57   |     0.57     |      3000     |
| 2010.0 | [-0.493 -1.029] |   0.57   |     0.57     |      3000     |
| 2011.0 | [-0.504 -0.975] |   0.55   |     0.55     |      3000     |
| 2012.0 | [-0.508 -1.002] |   0.56   |     0.56     |      3000     |
| 2013.0 | [-0.474 -1.015] |   0.56   |     0.56     |      3000     |
| 2014.0 | [-0.503 -0.998] |   0.55   |     0.55     |      3000     |
| 2015.0 | [-0.485 -1.034] |   0.55   |     0.55     |      3000     |
| 2016.0 | [-0.514 -1.005] |   0.57   |     0.57     |      3000     |
| 2017.0 | [-0.498 -1.016] |   0.58   |     0.58     |      3000     |
| 2018.0 | [-0.475 -0.994] |   0.55   |     0.55     |      3000     |
| 2019.0 | [-0.487 -0.974] |   0.54   |     0.54     |      3000     |
| 2020.0 | [-0.511 -1.031] |   0.56   |     0.56     |      3000     |

def summary(self, charactername=None):

This function summarizes the final result.

input :

charactername : The names of the factors in the cross-sectional regression model.


# continue the previous code
model.summary()
|      Param      |     Param Tvalue    | Average R | Average adj R | Average n |
| [-0.501 -1.003] | [-111.857 -202.247] |   0.558   |     0.558     |   3000.0  |


