【量化笔记】配对交易
配对交易的步骤
1. 如何挑选进行配对的股票
2. 挑选好股票对以后,如何制定交易策略,开仓点如何设计
3. 开仓是,两只股票如何进行多空仓对比
股票对的选择
1. 行业内匹配
2. 产业链配对
3. 财务管理配对
最小距离法
配对交易需要对股票价格进行标准化处理。假设Pti(t=0,1,2,...,T)P_t^i(t=0,1,2,...,T)Pti(t=0,1,2,...,T)表示股票i在第t天的价格,那么股票i在第t天的单期收益率可以表达为:
rti=Pti−Pt−1iPt−1i,t=1,2,3,...,Tr_t^i=\frac{P_t^i-P_{t-1}^{i}}{P_{t-1}^{i}},t=1,2,3,...,Trti=Pt−1iPti−Pt−1i,t=1,2,3,...,T
用p^ti\hat{p}_t^ip^ti 表示股票i在第t天的标准化价格,学界和业界认为p^ti\hat{p}_t^ip^ti可以有这t天内的累计收益率来计算:
p^ti=∏τ=1t(1+rτi)\hat{p}^i_t=\prod_{\tau=1}^t (1+r_\tau^i)p^ti=∏τ=1t(1+rτi)
假设有股票X,Y,则我们可以计算二者之间的标准化价差之平方和SSDX,YSSD_{X,Y}SSDX,Y
SSDX,Y=∑t=1T(p^tX−p^tY)2\\ SSD_{X,Y}=\sum_{t=1}^T (\hat{p}_t^X-\hat{p}_t^Y)^2SSDX,Y=∑t=1T(p^tX−p^tY)2
下面使用python计算SSD
import pandas as pd
sh=pd.read_csv('sh50p.csv',index_col='Trddt')
sh.index=pd.to_datetime(sh.index)
sh.head()
600000 | 600010 | 600015 | 600016 | 600018 | 600028 | 600030 | 600036 | 600048 | 600050 | ... | 601688 | 601766 | 601800 | 601818 | 601857 | 601901 | 601985 | 601988 | 601989 | 601998 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trddt | |||||||||||||||||||||
2010-01-04 | 9.997 | 2.260 | 6.541 | 4.627 | 4.775 | 8.445 | 17.927 | 13.174 | 6.443 | 6.703 | ... | - | 4.991 | - | - | 11.284 | - | - | 3.055 | 4.591 | 6.498 |
2010-01-05 | 10.072 | 2.250 | 6.706 | 4.708 | 4.783 | 8.494 | 18.804 | 13.188 | 6.243 | 6.937 | ... | - | 4.982 | - | - | 11.499 | - | - | 3.090 | 4.573 | 6.578 |
2010-01-06 | 9.874 | 2.255 | 6.461 | 4.615 | 4.733 | 8.311 | 18.586 | 12.913 | 6.237 | 6.787 | ... | - | 4.947 | - | - | 11.342 | - | - | 3.055 | 4.627 | 6.393 |
2010-01-07 | 9.652 | 2.201 | 6.328 | 4.493 | 4.602 | 8.091 | 18.133 | 12.578 | 6.240 | 6.590 | ... | - | 4.875 | - | - | 11.267 | - | - | 2.998 | 4.573 | 6.176 |
2010-01-08 | 9.761 | 2.211 | 6.370 | 4.539 | 4.618 | 8.005 | 18.483 | 12.578 | 6.322 | 6.674 | ... | - | 4.902 | - | - | 11.135 | - | - | 3.012 | 4.525 | 6.232 |
5 rows × 50 columns
# 定义配对形成期
formStart='2014-01-01'
formEnd='2015-01-01'
# 形成期数据
shform=sh[formStart:formEnd]
shform.head(2)
600000 | 600010 | 600015 | 600016 | 600018 | 600028 | 600030 | 600036 | 600048 | 600050 | ... | 601688 | 601766 | 601800 | 601818 | 601857 | 601901 | 601985 | 601988 | 601989 | 601998 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trddt | |||||||||||||||||||||
2014-01-02 | 8.307 | 2.360 | 6.371 | 6.183 | 5.031 | 4.117 | 12.288 | 9.719 | 5.156 | 3.146 | ... | 8.534 | 4.869 | 3.781 | 2.383 | 7.245 | 5.91 | - | 2.333 | 5.617 | 3.634 |
2014-01-03 | 8.138 | 2.594 | 6.187 | 6.087 | 4.906 | 4.043 | 11.937 | 9.520 | 5.112 | 3.088 | ... | 8.293 | 4.791 | 3.725 | 2.356 | 7.217 | 5.91 | - | 2.289 | 5.517 | 3.578 |
2 rows × 50 columns
# 提取中国银行 601988 的收盘价数据
PAf=shform['601988']
# 提取浦发银行 600000 的收盘价数据
PBf = shform['600000']
#合并两只股票
pairf=pd.concat([PAf,PBf],axis=1)
len(pairf)
245
import numpy as np
# 构造标准化价格之差平方累计SSD函数
def SSD(priceX,priceY):if priceX is None or priceY is None:print('缺少价格序列')returnX = (priceX-priceX.shift(1))/priceX.shift(1)[1:]returnY = (priceY-priceY.shift(1))/priceY.shift(1)[1:]standardX=(returnX+1).cumprod()standardY=(returnY+1).cumprod()SSD = np.sum((standardX-standardY)**2)return SSD
dis=SSD(PAf,PBf)
dis
0.47481704588389073
SSD_rst=[]
for i in shform.columns:for j in shform.columns:if i==j:continue;A=shform[i]B=shform[j]try:num=SSD(A,B)except:continue# print(i,j,num)try:lst=[i,j,num]except:continueSSD_rst.append(lst)
SSDdf=pd.DataFrame(SSD_rst)
SSDdf.head()
0 | 1 | 2 | |
---|---|---|---|
0 | 600000 | 600010 | 7.366730 |
1 | 600000 | 600015 | 0.668806 |
2 | 600000 | 600016 | 2.563524 |
3 | 600000 | 600018 | 7.823916 |
4 | 600000 | 600028 | 3.976679 |
SSDdf.sort_values(by=2)
0 | 1 | 2 | |
---|---|---|---|
102 | 600015 | 601166 | 0.245662 |
977 | 601166 | 600015 | 0.245662 |
1435 | 601857 | 601398 | 0.329987 |
1244 | 601398 | 601857 | 0.329987 |
387 | 600050 | 601988 | 0.422743 |
1452 | 601988 | 600050 | 0.422743 |
984 | 601166 | 600050 | 0.446014 |
375 | 600050 | 601166 | 0.446014 |
1443 | 601988 | 600000 | 0.474817 |
36 | 600000 | 601988 | 0.474817 |
300 | 600036 | 601318 | 0.485505 |
1099 | 601318 | 600036 | 0.485505 |
1445 | 601988 | 600015 | 0.519690 |
114 | 600015 | 601988 | 0.519690 |
982 | 601166 | 600036 | 0.533362 |
297 | 600036 | 601166 | 0.533362 |
353 | 600050 | 600015 | 0.545751 |
86 | 600015 | 600050 | 0.545751 |
1011 | 601166 | 601988 | 0.546044 |
1468 | 601988 | 601166 | 0.546044 |
1002 | 601166 | 601318 | 0.568370 |
1117 | 601318 | 601166 | 0.568370 |
303 | 600036 | 601398 | 0.653760 |
1216 | 601398 | 600036 | 0.653760 |
948 | 601088 | 600111 | 0.656164 |
491 | 600111 | 601088 | 0.656164 |
78 | 600015 | 600000 | 0.668806 |
1 | 600000 | 600015 | 0.668806 |
381 | 600050 | 601398 | 0.704233 |
1218 | 601398 | 600050 | 0.704233 |
... | ... | ... | ... |
479 | 600111 | 600109 | 64.595128 |
440 | 600109 | 600111 | 64.595128 |
1434 | 601857 | 601390 | 65.941405 |
1205 | 601390 | 601857 | 65.941405 |
947 | 601088 | 600109 | 67.396051 |
452 | 600109 | 601088 | 67.396051 |
1186 | 601390 | 600585 | 71.932721 |
653 | 600585 | 601390 | 71.932721 |
1204 | 601390 | 601766 | 72.876730 |
1395 | 601766 | 601390 | 72.876730 |
1174 | 601390 | 600018 | 73.037316 |
185 | 600018 | 601390 | 73.037316 |
689 | 600637 | 601186 | 75.475072 |
1070 | 601186 | 600637 | 75.475072 |
674 | 600637 | 600109 | 78.702164 |
445 | 600109 | 600637 | 78.702164 |
965 | 601088 | 601390 | 79.636906 |
1194 | 601390 | 601088 | 79.636906 |
1182 | 601390 | 600111 | 81.322417 |
497 | 600111 | 601390 | 81.322417 |
809 | 600887 | 601390 | 82.309679 |
1190 | 601390 | 600887 | 82.309679 |
1067 | 601186 | 600518 | 87.971590 |
572 | 600518 | 601186 | 87.971590 |
442 | 600109 | 600518 | 96.995893 |
557 | 600518 | 600109 | 96.995893 |
692 | 600637 | 601390 | 97.687461 |
1187 | 601390 | 600637 | 97.687461 |
1184 | 601390 | 600518 | 111.693514 |
575 | 600518 | 601390 | 111.693514 |
1560 rows × 3 columns
协整模型
协整模型指如果X股票的对数价格是非平稳时间序列,且对数价格的差分序列是平稳的,责成X股票的对数价格是一阶单整序列。
又log(PtX)−log(Pt−1X)=log(PtXPt−1X)=log(1+rtX)=rtXlog(P_t^X)-log(P_{t-1}^X)=log(\frac{P_t^X}{P_{t-1}^X}) \\=log(1+r_t^X) \\~=r_t^Xlog(PtX)−log(Pt−1X)=log(Pt−1XPtX)=log(1+rtX) =rtX
下面计算中国银行和浦发银行是否是一阶单整序列
from arch.unitroot import ADF
import numpy as np
PAf.head()
Trddt
2014-01-02 2.333
2014-01-03 2.289
2014-01-06 2.262
2014-01-07 2.253
2014-01-08 2.244
Name: 601988, dtype: float64
PAflog=np.log(PAf)
adfA=ADF(PAflog)
print(adfA.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic 3.409
P-value 1.000
Lags 12
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
统计量较大,我们接受原假设,原序列存在单位根,所以中国银行的收盘价的对数序列是非平稳序列
#对数差分
retA=PAflog.diff()[1:]
retA.head()
Trddt
2014-01-03 -0.019040
2014-01-06 -0.011866
2014-01-07 -0.003987
2014-01-08 -0.004003
2014-01-09 -0.004019
Name: 601988, dtype: float64
adfretA=ADF(retA)
print(adfretA.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic -4.571
P-value 0.000
Lags 11
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
统计量较小,所以我们拒绝原假设,对数差分序列没有单位根,所以中国银行的对数差分序列是平稳序列
由此说明中国银行的对数价格序列是一阶单整序列
# 对浦发银行进行检验
PBflog=np.log(PBf)
adfB=ADF(PBflog)
print(adfB.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic 2.392
P-value 0.999
Lags 12
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
统计量较大,我们接受原假设,原序列存在单位根,所以浦发银行的收盘价的对数序列是非平稳序列
#对数差分序列
retB = PBflog.diff()[1:]
adfretB=ADF(retB)
print(adfretB.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic -3.888
P-value 0.002
Lags 11
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
统计量较小,所以拒绝原假设,不存在单位根,所以浦发银行的对数差分序列是平稳序列
由此说明浦发银行的对数序列是一阶单整序列
#绘制对数时序图
import matplotlib.pyplot as plt
PAflog.plot(label='ZGYH',style='--')
PBflog.plot(label='PFYH',style='-')
plt.legend(loc='upper left')
plt.title("中国银行和浦发银行的对数价格时序图")
Text(0.5, 1.0, '中国银行和浦发银行的对数价格时序图')
retA.plot(label='ZGYH')
retB.plot(label='PFYH')
plt.legend(loc='lower left')
<matplotlib.legend.Legend at 0x1c238d0eb8>
检验两个对数序列的协整性
方式是对两个序列进行线性拟合,对残差进行检验,如果残差序列是平稳的,说明两个对数序列具有协整性
import statsmodels.api as sm
model=sm.OLS(PBflog,sm.add_constant(PAflog))
/Users/yaochenli/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py:2389: FutureWarning: Method .ptp is deprecated and will be removed in a future version. Use numpy.ptp instead.return ptp(axis=axis, out=out, **kwargs)
results=model.fit()
print(results.summary())
OLS Regression Results
==============================================================================
Dep. Variable: 600000 R-squared: 0.949
Model: OLS Adj. R-squared: 0.949
Method: Least Squares F-statistic: 4560.
Date: Fri, 23 Aug 2019 Prob (F-statistic): 1.83e-159
Time: 17:40:05 Log-Likelihood: 509.57
No. Observations: 245 AIC: -1015.
Df Residuals: 243 BIC: -1008.
Df Model: 1
Covariance Type: nonrobust
==============================================================================coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 1.2269 0.015 83.071 0.000 1.198 1.256
601988 1.0641 0.016 67.531 0.000 1.033 1.095
==============================================================================
Omnibus: 19.538 Durbin-Watson: 0.161
Prob(Omnibus): 0.000 Jarque-Bera (JB): 13.245
Skew: 0.444 Prob(JB): 0.00133
Kurtosis: 2.286 Cond. No. 15.2
==============================================================================Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
结果比较显著
对残差进行平稳性检验
#提取回归截距
alpha=results.params[0]
beta =results.params[1]
spread=PBflog-beta*PAflog-alpha
spread.head()
Trddt
2014-01-02 -0.011214
2014-01-03 -0.011507
2014-01-06 0.006511
2014-01-07 0.005361
2014-01-08 0.016112
dtype: float64
spread.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x1c23a44470>
# 价差序列单位根检验
# 因为残差的均值是0,所以trend设为nc
adfSpread=ADF(spread,trend='nc')
print(adfSpread.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic -3.199
P-value 0.001
Lags 0
-------------------------------------Trend: No Trend
Critical Values: -2.57 (1%), -1.94 (5%), -1.62 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
统计量较小,所以拒绝原假设,认为残差是平稳的
配对交易策略的制定
最小距离法
计算形成期内标准化的价格序列差P^tX−P^tY\hat{P}^X_t-\hat{P}^Y_tP^tX−P^tY的平均值μ\muμ和标准差σ\sigmaσ,设定交易信号出发点为μ+−2σ\mu+-2\sigmaμ+−2σ交易期为6个月。由于浦发银行和中国银行为银行业股票,价格比较稳定,因此我们设定交易期内价差超过μ+−1.2σ\mu+-1.2\sigmaμ+−1.2σ时触发交易信号。
#最短距离法交易策略
#中国银行的标准化价格
standardA=(1+retA).cumprod()
#浦发银行的标准化价格
standardB=(1+retB).cumprod()
#标准价差
SSD_pair=standardB-standardA
SSD_pair.head()
Trddt
2014-01-03 -0.001514
2014-01-06 0.015407
2014-01-07 0.013962
2014-01-08 0.024184
2014-01-09 0.037629
dtype: float64
meanSSD_pair = np.mean(SSD_pair)
sdSSD_pair=np.std(SSD_pair)
thresholdUP=meanSSD_pair+1.2*sdSSD_pair
thresholdDOWN=meanSSD_pair-1.2*sdSSD_pair
SSD_pair.plot()
plt.axhline(y=meanSSD_pair,color='black')
plt.axhline(y=thresholdUP,color='blue')
plt.axhline(y=thresholdDOWN,color='red')
<matplotlib.lines.Line2D at 0x1c24027860>
在交易期当价差上穿阀值的时候反向开仓,回归平均线左右的时候平仓,当价差下穿阀值线的时候正向开仓,在价差回归平均线左右的时候平仓。
#交易期
tradStart='2015-01-01'
tradEnd='2015-06-30'
PAt=sh.loc[tradStart:tradEnd,'601988']
PBt=sh.loc[tradStart:tradEnd,'600000']
def spreadCal(x,y):retx=(x-x.shift(1))/x.shift(1)[1:]rety=(y-y.shift(1))/y.shift(1)[1:]standardX=(1+retx).cumprod()standardY=(1+rety).cumprod()spread=standardY-standardXreturn spread
TradSpread=spreadCal(PAt,PBt).dropna()
TradSpread.describe()
count 118.000000
mean 0.001064
std 0.054323
min -0.127050
25% -0.028249
50% 0.005682
75% 0.041375
max 0.100249
dtype: float64
TradSpread.plot()
plt.axhline(y=meanSSD_pair,color='black')
plt.axhline(y=thresholdUP,color='blue')
plt.axhline(y=thresholdDOWN,color='red')
<matplotlib.lines.Line2D at 0x1c240f9a58>
协整模型
构建PairTrading类
import re
import pandas as pd
import numpy as np
from arch.unitroot import ADF
import statsmodels.api as sm
class PairTrading:#计算两个股票之间的距离def SSD(self,priceX,priceY):if priceX is None or priceY is None:print('缺少价格序列')returnreturnX=(priceX-priceX.shift(1))/priceX.shift(1)[1:]returnY=(priceY-priceY.shift(1))/priceY.shift(1)[1:]standardX=(1+returnX).cumprod()standardY=(1+returnY).cumprod()SSD=np.sum((standardY-standardX)**2)return SSD#计算标准价差序列def SSDSpread(self,priceX,priceY):if priceX is None or priceY is None:print('缺少价格徐磊')returnreturnX=(priceX-priceX.shift(1))/priceX.shift(1)[1:]returnY=(priceY-priceY.shift(1))/priceY.shift(1)[1:]standardX=(1+returnX).cumprod()standardY=(1+returnY).cumprod()spread=standardY-standardXreturn spread#判断是否是协整序列,并返回协整序列的线性回归参数def cointegration(self,priceX,priceY):if priceX is None or priceY is None:print('缺少价格序列.')priceX=np.log(priceX)priceY=np.log(priceY)results=sm.OLS(priceY,sm.add_constant(priceX)).fit()resid=results.residadfSpread=ADF(resid)if adfSpread.pvalue>=0.05:print('''交易价格不具有协整关系.P-value of ADF test: %fCoefficients of regression:Intercept: %fBeta: %f''' % (adfSpread.pvalue, results.params[0], results.params[1]))return(None)else:print('''交易价格具有协整关系.P-value of ADF test: %fCoefficients of regression:Intercept: %fBeta: %f''' % (adfSpread.pvalue, results.params[0], results.params[1]))return(results.params[0], results.params[1])#计算协整序列差def CointegrationSpread(self,priceX,priceY,formPeriod,tradePeriod):if priceX is None or priceY is None:print('缺少价格序列.')if not (re.fullmatch('\d{4}-\d{2}-\d{2}:\d{4}-\d{2}-\d{2}',formPeriod)or re.fullmatch('\d{4}-\d{2}-\d{2}:\d{4}-\d{2}-\d{2}',tradePeriod)):print('形成期或交易期格式错误.')formX=priceX[formPeriod.split(':')[0]:formPeriod.split(':')[1]]formY=priceY[formPeriod.split(':')[0]:formPeriod.split(':')[1]]coefficients=self.cointegration(formX,formY)if coefficients is None:print('未形成协整关系,无法配对.')else:spread=(np.log(priceY[tradePeriod.split(':')[0]:tradePeriod.split(':')[1]])-coefficients[0]-coefficients[1]*np.log(priceX[tradePeriod.split(':')[0]:tradePeriod.split(':')[1]]))return(spread)#计算边界def calBound(self,priceX,priceY,method,formPeriod,width=1.5):if not (re.fullmatch('\d{4}-\d{2}-\d{2}:\d{4}-\d{2}-\d{2}',formPeriod)or re.fullmatch('\d{4}-\d{2}-\d{2}:\d{4}-\d{2}-\d{2}',tradePeriod)):print('形成期或交易期格式错误.')if method=='SSD':spread=self.SSDSpread(priceX[formPeriod.split(':')[0]:formPeriod.split(':')[1]],priceY[formPeriod.split(':')[0]:formPeriod.split(':')[1]]) mu=np.mean(spread)sd=np.std(spread)UpperBound=mu+width*sdLowerBound=mu-width*sdreturn(UpperBound,LowerBound)elif method=='Cointegration':spread=self.CointegrationSpread(priceX,priceY,formPeriod,formPeriod)mu=np.mean(spread)sd=np.std(spread)UpperBound=mu+width*sdLowerBound=mu-width*sdreturn(UpperBound,LowerBound)else:print('不存在该方法. 请选择"SSD"或是"Cointegration".')
formPeriod='2014-01-01:2015-01-01'
tradePeriod='2015-01-01:2015-06-30'
priceA=sh['601988']
priceB=sh['600000']
priceAf=priceA[formPeriod.split(':')[0]:formPeriod.split(':')[1]]
priceBf=priceB[formPeriod.split(':')[0]:formPeriod.split(':')[1]]
priceAt=priceA[tradePeriod.split(':')[0]:tradePeriod.split(':')[1]]
priceBt=priceB[tradePeriod.split(':')[0]:tradePeriod.split(':')[1]]
pt=PairTrading()
SSD=pt.SSD(priceAf,priceBf)
SSD
0.47481704588389073
SSDspread=pt.SSDSpread(priceAf,priceBf)
SSDspread.describe()
SSDspread.head()
Trddt
2014-01-02 NaN
2014-01-03 -0.001484
2014-01-06 0.015385
2014-01-07 0.013946
2014-01-08 0.024184
dtype: float64
coefficients=pt.cointegration(priceAf,priceBf)
coefficients
交易价格具有协整关系.P-value of ADF test: 0.020415Coefficients of regression:Intercept: 1.226852Beta: 1.064103(1.2268515742404387, 1.0641034525888144)
CoSpreadF=pt.CointegrationSpread(priceA,priceB,formPeriod,formPeriod)
CoSpreadF.head()
交易价格具有协整关系.P-value of ADF test: 0.020415Coefficients of regression:Intercept: 1.226852Beta: 1.064103Trddt
2014-01-02 -0.011214
2014-01-03 -0.011507
2014-01-06 0.006511
2014-01-07 0.005361
2014-01-08 0.016112
dtype: float64
CoSpreadTr=pt.CointegrationSpread(priceA,priceB,formPeriod,tradePeriod)
CoSpreadTr.describe()
交易价格具有协整关系.P-value of ADF test: 0.020415Coefficients of regression:Intercept: 1.226852Beta: 1.064103count 119.000000
mean -0.037295
std 0.052204
min -0.163903
25% -0.063038
50% -0.033336
75% 0.000503
max 0.057989
dtype: float64
bound=pt.calBound(priceA,priceB,'Cointegration',formPeriod,width=1.2)
bound
交易价格具有协整关系.P-value of ADF test: 0.020415Coefficients of regression:Intercept: 1.226852Beta: 1.064103(0.03627938704534019, -0.03627938704533997)
#配对交易实测
#提取形成期数据
formStart='2014-01-01'
formEnd='2015-01-01'
PA=sh['601988']
PB=sh['600000']PAf=PA[formStart:formEnd]
PBf=PB[formStart:formEnd]
#形成期协整关系检验
#一阶单整检验
log_PAf=np.log(PAf)
adfA=ADF(log_PAf)
print(adfA.summary().as_text())
adfAd=ADF(log_PAf.diff()[1:])
print(adfAd.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic 3.409
P-value 1.000
Lags 12
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.Augmented Dickey-Fuller Results
=====================================
Test Statistic -4.571
P-value 0.000
Lags 11
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
log_PBf=np.log(PBf)
adfB=ADF(log_PBf)
print(adfB.summary().as_text())
adfBd=ADF(log_PBf.diff()[1:])
print(adfBd.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic 2.392
P-value 0.999
Lags 12
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.Augmented Dickey-Fuller Results
=====================================
Test Statistic -3.888
P-value 0.002
Lags 11
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
#协整关系检验
model=sm.OLS(log_PBf,sm.add_constant(log_PAf)).fit()
model.summary()
Dep. Variable: | 600000 | R-squared: | 0.949 |
---|---|---|---|
Model: | OLS | Adj. R-squared: | 0.949 |
Method: | Least Squares | F-statistic: | 4560. |
Date: | Fri, 23 Aug 2019 | Prob (F-statistic): | 1.83e-159 |
Time: | 17:43:03 | Log-Likelihood: | 509.57 |
No. Observations: | 245 | AIC: | -1015. |
Df Residuals: | 243 | BIC: | -1008. |
Df Model: | 1 | ||
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] | |
---|---|---|---|---|---|---|
const | 1.2269 | 0.015 | 83.071 | 0.000 | 1.198 | 1.256 |
601988 | 1.0641 | 0.016 | 67.531 | 0.000 | 1.033 | 1.095 |
Omnibus: | 19.538 | Durbin-Watson: | 0.161 |
---|---|---|---|
Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 13.245 |
Skew: | 0.444 | Prob(JB): | 0.00133 |
Kurtosis: | 2.286 | Cond. No. | 15.2 |
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
alpha=model.params[0]
alpha
1.2268515742404387
beta=model.params[1]
beta
1.0641034525888144
#残差单位根检验
spreadf=log_PBf-beta*log_PAf-alpha
adfSpread=ADF(spreadf)
print(adfSpread.summary().as_text())
Augmented Dickey-Fuller Results
=====================================
Test Statistic -3.193
P-value 0.020
Lags 0
-------------------------------------Trend: Constant
Critical Values: -3.46 (1%), -2.87 (5%), -2.57 (10%)
Null Hypothesis: The process contains a unit root.
Alternative Hypothesis: The process is weakly stationary.
mu=np.mean(spreadf)
sd=np.std(spreadf)
#设定交易期tradeStart='2015-01-01'
tradeEnd='2015-06-30'
PAt=PA[tradeStart:tradeEnd]
PBt=PB[tradeStart:tradeEnd]
CoSpreadT=np.log(PBt)-beta*np.log(PAt)-alpha
CoSpreadT.describe()
count 119.000000
mean -0.037295
std 0.052204
min -0.163903
25% -0.063038
50% -0.033336
75% 0.000503
max 0.057989
dtype: float64
CoSpreadT.plot()
plt.title('交易期价差序列(协整配对)')
plt.axhline(y=mu,color='black')
plt.axhline(y=mu+0.2*sd,color='blue',ls='-',lw=2)
plt.axhline(y=mu-0.2*sd,color='blue',ls='-',lw=2)
plt.axhline(y=mu+1.5*sd,color='green',ls='--',lw=2.5)
plt.axhline(y=mu-1.5*sd,color='green',ls='--',lw=2.5)
plt.axhline(y=mu+2.5*sd,color='red',ls='-.',lw=3)
plt.axhline(y=mu-2.5*sd,color='red',ls='-.',lw=3)
<matplotlib.lines.Line2D at 0x1c2431b470>
level=(float('-inf'),mu-2.5*sd,mu-1.5*sd,mu-0.2*sd,mu+0.2*sd,mu+1.5*sd,mu+2.5*sd,float('inf'))
prcLevel=pd.cut(CoSpreadT,level,labels=False)-3
prcLevel.head()
Trddt
2015-01-05 -1
2015-01-06 -2
2015-01-07 -3
2015-01-08 -2
2015-01-09 -3
dtype: int64
def TradeSig(prcLevel):n=len(prcLevel)signal=np.zeros(n)for i in range(1,n):#反向建仓if prcLevel[i-1]==1 and prcLevel[i]==2:signal[i]=-2#反向平仓elif prcLevel[i-1]==1 and prcLevel[i]==0:signal[i]=2#强制平仓elif prcLevel[i-1]==2 and prcLevel[i]==3:signal[i]=3#正向建仓elif prcLevel[i-1]==-1 and prcLevel[i]==-2:signal[i]=1#正向平仓elif prcLevel[i-1]==-1 and prcLevel[i]==0:signal[i]=-1#强制平仓elif prcLevel[i-1]==-2 and prcLevel[i]==-3:signal[i]=-3return(signal)
signal=TradeSig(prcLevel)
position=[signal[0]]
ns=len(signal)
for i in range(1,ns):position.append(position[-1])#正向建仓if signal[i]==1:position[i]=1#反向建仓elif signal[i]==-2:position[i]=-1#正向平仓elif signal[i]==-1 and position[i-1]==1:position[i]=0#反向平仓elif signal[i]==2 and position[i-1]==-1:position[i]=0#强制平仓elif signal[i]==3:position[i]=0elif signal[i]==-3:position[i]=0
position=pd.Series(position,index=CoSpreadT.index)
position.tail()
Trddt
2015-06-24 0.0
2015-06-25 0.0
2015-06-26 0.0
2015-06-29 0.0
2015-06-30 0.0
dtype: float64
def TradeSim(priceX,priceY,position):n=len(position)size=1000shareY=size*positionshareX=[(-beta)*shareY[0]*priceY[0]/priceX[0]]cash=[2000]for i in range(1,n):shareX.append(shareX[i-1])cash.append(cash[i-1])if position[i-1]==0 and position[i]==1:shareX[i]=(-beta)*shareY[i]*priceY[i]/priceX[i]cash[i]=cash[i-1]-(shareY[i]*priceY[i]+shareX[i]*priceX[i])elif position[i-1]==0 and position[i]==-1:shareX[i]=(-beta)*shareY[i]*priceY[i]/priceX[i]cash[i]=cash[i-1]-(shareY[i]*priceY[i]+shareX[i]*priceX[i])elif position[i-1]==1 and position[i]==0:shareX[i]=0cash[i]=cash[i-1]+(shareY[i-1]*priceY[i]+shareX[i-1]*priceX[i])elif position[i-1]==-1 and position[i]==0:shareX[i]=0cash[i]=cash[i-1]+(shareY[i-1]*priceY[i]+shareX[i-1]*priceX[i])cash = pd.Series(cash,index=position.index)shareY=pd.Series(shareY,index=position.index)shareX=pd.Series(shareX,index=position.index)asset=cash+shareY*priceY+shareX*priceXaccount=pd.DataFrame({'Position':position,'ShareY':shareY,'ShareX':shareX,'Cash':cash,'Asset':asset})return(account)
account=TradeSim(PAt,PBt,position)
account.tail()
Position | ShareY | ShareX | Cash | Asset | |
---|---|---|---|---|---|
Trddt | |||||
2015-06-24 | 0.0 | 0.0 | 0.0 | 5992.514 | 5992.514 |
2015-06-25 | 0.0 | 0.0 | 0.0 | 5992.514 | 5992.514 |
2015-06-26 | 0.0 | 0.0 | 0.0 | 5992.514 | 5992.514 |
2015-06-29 | 0.0 | 0.0 | 0.0 | 5992.514 | 5992.514 |
2015-06-30 | 0.0 | 0.0 | 0.0 | 5992.514 | 5992.514 |
account.iloc[:,[1,3,4]].plot(style=['--','-',':'])
plt.title('配对交易账户')
Text(0.5, 1.0, '配对交易账户')
【量化笔记】配对交易相关推荐
- 金融量化 — 配对交易策略 (Pair Trading)
1. 配对交易策略 1.1.引言 在量化投资领域,既然严格的无风险套利机会少.收益率微薄,实际的执行过程中也不能完全消除风险.那么如果有一种选择,能够稍微放松100%无风险的要求,比如允许有5%的风险 ...
- 量化投资实战(三)之配对交易策略---协整法
点赞.关注再看,养成良好习惯 Life is short, U need Python 初学量化投资实战,[快来点我吧] 配对交易策略实战-协整法 基本流程 配对组合 --> 计算价差 --&g ...
- 量化投资实战(二)之配对交易策略---最短距离法
点赞.关注再看,养成良好习惯 Life is short, U need Python 初学量化投资实战,[快来点我吧] 配对交易策略实战-最短距离法 基本流程 配对组合 --> 计算价差 -- ...
- 量化金融分析AQF(12):配对交易 Pair trading - 考虑时间序列平稳性、协整关系
目录 1. 数据准备 & 回测准备 2. 策略开发思路 3.产生交易信号 3. 计算策略年化收益并可视化 4.总结 上节说到,做2只股票配对交易,先判断2只股票的平稳性,不平稳就做一阶差分和协 ...
- 【量化笔记】OBV指标交易策略
本篇博客是量化笔记系列的最后一篇了,之后会停更一段时间量化内容,后面会写什么还待定 OVB主要计算累计成交量,将股价上涨时的成交量进行正累加,对股价下跌时的成交量进行负累加,计算公式是: O B V ...
- 量化交易陷阱和R语言改进股票配对交易策略分析中国股市投资组合
最近我们被客户要求撰写关于量化交易的研究报告,包括一些图形和统计输出. 计算能力的指数级增长,以及量化社区(日益增长的兴趣使量化基金成为投资者蜂拥而至的最热门领域. 量化交易陷阱和R语言改进股票配对交 ...
- 【量化策略系列】股票均值回归策略之一——配对交易策略(Pairs Trading)
本文持续更新中.最后更新时间:11/11/2019 文章目录 1. 往期文章回顾 2. 均值回归策略简介 3. 配对交易策略简介 4. 配对交易策略构建流程 5. 代码实现与回测结果 Python 代 ...
- 【量化】相关系数进行配对交易
根据统计数据,对价差进行买卖,而不去做股票本身趋势的预测,是否能做到旱涝保收呢.下面是利用股票对之间的相关系数来进行配对交易的研究. 1,首先想到利用统计套利,可能会想到两只股票的相关系数是否会让两只 ...
- 配对交易方法_COVID下的自适应配对交易,一种强化学习方法
配对交易方法 Abstract 抽象 This is one of the articles of A.I. Capital Management's Research Article Series, ...
最新文章
- ES6学习笔记之Promise
- 【设计模式】代理模式 ( 动态代理 )
- 查看linux内存存储空间不足,Linux 下判断Server 内存是否不足
- JavaScript——文档对象模型
- sdut 数字三角形问题
- abp dapper mysql_ABP框架—后台:引入Abp.Dapper(10)
- pdo query获取mysql单行结果_php代码连不上mysql的可能?看看这个也许能给你点启发...
- hadoop2.X集群安装与应用
- 使用cp命令拷贝目录下指定文件外的其他文件
- 家庭网络布线图与布线方案
- 微信小程序也可以实现定位打卡/签到打卡了(附源码)
- 802.11协议——初探
- 解决Chrome账户无法同步
- 多视图多行为对比学习推荐系统
- 2021 年年度最佳开源软件!
- 百度 android 市场,百度Q2报告:Android市场份额21.4% 同比增长890%
- 工控机CF卡槽无法使用的解决方案
- javascript案例30——continue、break
- 机械键盘上的ASF是什么意思?
- PHP cURL获取微信公众号access_token
热门文章
- android camera 废弃,Android相机android.hardware.Camera已弃用
- 电脑计算机的符号什么意思,计算机上面的符号代表什么意思 悬赏20
- html5抓鱼游戏,小班捉小鱼游戏教案
- 记录一下近期工作-Qt实现tcp协议接收数据
- 【scratch】class_6_植物大战僵尸(一)
- EAS单据F7设置默认值
- 【微软资源站】MSDN
- XAG拥抱区块链和分布式记账技术标准化时代
- 肺癌新易感位点的发现及多基因遗传评分在肺癌风险预测中的应用--基于中国超大型前瞻性队列研究
- dvd光驱在计算机内怎么找不到,电脑DVD光驱消失找不到怎么处理