Price Matching Between the HS300 Index and Its Constituent Stocks

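If you watch the closing level of the HS300 (CSI 300) index, you will find that the closing prices of most individual stocks do not move in step with it. The program below offers one way to find the constituents whose closing prices match the index's closing level relatively well. The main idea is to first determine the coefficient of the linear relationship between the index close and the stock close, then compute the residual of that linear relationship and run an ADF (Augmented Dickey-Fuller) test on it to judge which stocks move most like the HS300 index.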
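The two steps described above can be sketched on synthetic data using only the standard library. This is a simplified illustration, not the program's actual test: the helper names `ols_slope_through_origin` and `dickey_fuller_stat` are hypothetical, the data are randomly generated, and the statistic is a plain Dickey-Fuller t-statistic with no constant and no lagged differences, whereas `statsmodels`' `adfuller` adds a constant and lags by default and reports its own critical values. The threshold of -2.58 is the approximate textbook 1% value for this simplified variant only.

```python
import random


def ols_slope_through_origin(x, y):
    """Least-squares beta for the model y = beta * x (no intercept)."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / sxx


def dickey_fuller_stat(series):
    """t-statistic of rho in: e[t] - e[t-1] = rho * e[t-1] + u[t]."""
    lagged = series[:-1]
    diffs = [curr - prev for prev, curr in zip(series[:-1], series[1:])]
    rho = ols_slope_through_origin(lagged, diffs)
    resid = [d - rho * l for d, l in zip(diffs, lagged)]
    n = len(diffs)
    sigma2 = sum(r * r for r in resid) / (n - 1)  # one estimated parameter
    std_err = (sigma2 / sum(l * l for l in lagged)) ** 0.5
    return rho / std_err


# Synthetic example: a random-walk "stock" and an "index" tied to it
rng = random.Random(42)
stock = [20.0]
for _ in range(499):
    stock.append(stock[-1] + rng.gauss(0, 0.2))
index = [150.0 * p + rng.gauss(0, 5.0) for p in stock]

# Step 1: coefficient of the linear relationship index = beta * stock
beta = ols_slope_through_origin(stock, index)
# Step 2: residual of that relationship, then a unit-root test on it
residual = [y - beta * x for x, y in zip(stock, index)]
stat = dickey_fuller_stat(residual)

print("beta = %.2f, DF statistic = %.2f" % (beta, stat))
# -2.58 is an approximate 1% critical value for this no-constant variant
print("residual looks stationary at 1%:", stat < -2.58)
```

A strongly negative statistic means the residual keeps reverting toward zero, which is the sense in which a stock "matches" the index here: the two price levels do not drift apart over time.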

The index and stock data come from tushare: http://tushare.org/trading.html

Over the period 2014-10-1 to 2017-10-1, only 15 stocks were found to move closely with the HS300 index. The codes of these highly synchronized stocks are:

['002024', '600038', '002415', '601333', '000060', '600406', '600332', '000540', '000402', '601988', '601998', '601328', '601111', '600048', '000069']

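Here are a few of the charts, although in some of them the rises and falls do not look closely matched. Of the three charts below, the third matches poorly at times.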

[Three charts comparing the HS300 index close with individual stock closes; images not preserved]
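One possible reason a chart can pass the test yet look poorly matched is that the test works on price levels, not on day-to-day moves. As a complementary check, the correlation of daily returns can be computed. The sketch below is an assumption-laden illustration and not part of the original program: the helper names are hypothetical and the price lists are made-up numbers, not market data.

```python
def daily_returns(prices):
    """Simple day-over-day returns: p[t] / p[t-1] - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices[:-1], prices[1:])]


def pearson_corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical closing levels, for illustration only (not real market data)
index_close = [3000.0, 3030.0, 3015.0, 3060.0, 3045.0]
stock_close = [10.00, 10.12, 10.05, 10.02, 10.20]

corr = pearson_corr(daily_returns(index_close), daily_returns(stock_close))
print("daily-return correlation: %.3f" % corr)
```

A pair can have a stationary residual in levels and still show a modest return correlation, so the two measures answer different questions.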

The Python program follows:

# -*- coding: utf-8 -*-
'''
Version: V17.1.0
Date: 2017-11-5
@Author: Cheney
'''

# Fetch data from tushare and, for the period 2014.10-2017.10, find the HS300
# constituents whose closing prices move similarly to the HS300 index

# Part I
import datetime
import numpy as np
import pandas as pd
import tushare as ts
import matplotlib.pyplot as plt
import traceback
import statsmodels.tsa.stattools as sts
import statsmodels.api as sm

t = datetime.datetime.now()
print('Program is starting ... %s' % t)


def plot_price_relation(df, start, end, st_a, st_b='hs300'):
    '''
    Draw HS300 Index and stocks price relation plot
    df -- DataFrame, index is date, columns are stock and hs300 index close
    start and end -- start and end date for the stock and HS300 Index comparison
    st_a, st_b -- stock code and hs300 code or label
    '''
    # Restrict the plot to the requested window (ISO date strings sort correctly)
    df = df.loc[start:end]
    fig, (ax, bx) = plt.subplots(nrows=2)
    x_date = [datetime.datetime.strptime(d, '%Y-%m-%d').date() for d in df.index]
    ax.plot(x_date, df[st_b], label=st_b, c='g')
    ax.set_title("%s index and stock %s daily prices relation" % (st_b, st_a))
    ax.set_xticklabels([])
    ax.set_ylabel("HS300Index")
    ax.grid(True)
    ax.legend(loc='best')
    bx.plot(x_date, df[st_a], label=st_a, c='b')
    bx.set_xlabel("Year/Month")
    bx.set_ylabel("Stock Price")
    bx.grid(True)
    bx.legend(loc='best')
    fig.autofmt_xdate()
    # Save figures in a folder (which must already exist) or show them directly
    plt.savefig('hs30index_pair_stock_plot/%s+%s.png' % (st_b, st_a))
    # plt.show()


def get_df_close(stocka, stockb, start=None, end=None):
    # Return a DataFrame of closing prices indexed by date, one column per code
    # stocka and stockb -- codes, like '600036' or 'hs300'
    # start and end -- optional 'YYYY-MM-DD' bounds passed on to tushare
    sta = ts.get_hist_data(stocka, start=start, end=end)
    stb = ts.get_hist_data(stockb, start=start, end=end)
    # Build a new DataFrame holding the close of the stock and the HS300 Index
    df = pd.concat([sta, stb], axis=1)
    # Sort by date so that forward-fill carries the previous day's close
    df = df['close'].sort_index().ffill()
    df.columns = [stocka, stockb]
    return df


# Part II
if __name__ == "__main__":
    start = datetime.datetime(2014, 10, 1).strftime('%Y-%m-%d')
    end = datetime.datetime(2017, 10, 1).strftime('%Y-%m-%d')

    # Get HS300 stocks code list
    hs_name = 'hs300'
    hs = ts.get_hs300s()
    hs_list = hs['code']
    stockADF = {}
    for code in hs_list:
        # Get the stock and hs300 index close data
        df = get_df_close(code, hs_name, start, end)

        # Fit the linear model hs300 = beta * stock (no intercept)
        x_value = df[code]
        y = list(df[hs_name])
        try:
            # If the model cannot be fitted (e.g. missing data), skip this stock
            res = sm.OLS(y, x_value).fit()
            betaCoef = np.asarray(res.params)[0]
            if np.isnan(betaCoef):
                raise ValueError('beta is NaN')
        except Exception:
            print("Can't get the regression params of stock %s and %s" % (code, hs_name))
            traceback.print_exc()
            continue

        # Residual of the linear model, then the ADF test on it
        df['res'] = df[hs_name] - betaCoef * df[code]
        tempStockADF = sts.adfuller(df['res'])
        # Save the ADF statistic and the 1% critical value for the plotting step
        stockADF[code + hs_name] = [tempStockADF[0], tempStockADF[4]['1%']]

    # A statistic below the 1% critical value suggests a stationary residual
    for key, value in stockADF.items():
        if value[0] < value[1]:
            print("The best pairs stocks %s, ADF values %s and percent-1 %s" % (key, value[0], value[1]))
            code, hs_name = key[:6], key[-5:]
            df = get_df_close(code, hs_name, start, end)
            plot_price_relation(df, start, end, code, hs_name)

    print('Program total running time is %s' % (datetime.datetime.now() - t))
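The final loop of the program reports qualifying pairs in whatever order the dictionary yields them. A small variation, sketched here under the assumption that results are stored as `{key: [adf_statistic, critical_value_1pct]}` like `stockADF`, sorts the qualifying pairs so the most negative statistic comes first. The function name `select_pairs` and the sample numbers are hypothetical.

```python
def select_pairs(stock_adf):
    """Keep pairs whose ADF statistic is below the critical value,
    sorted from the most negative statistic upward."""
    passed = [
        (key, stat, crit)
        for key, (stat, crit) in stock_adf.items()
        if stat < crit
    ]
    return sorted(passed, key=lambda item: item[1])


# Made-up values for illustration only (not results from real data)
sample = {
    "AAAAAAhs300": [-4.2, -3.44],
    "BBBBBBhs300": [-2.1, -3.44],
    "CCCCCChs300": [-5.0, -3.44],
}

for key, stat, crit in select_pairs(sample):
    print("%s: ADF %.2f < 1%% critical value %.2f" % (key, stat, crit))
```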

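The above is a small piece of what I have picked up while learning quantitative trading. It surely has shortcomings, and guidance from more experienced readers is very welcome.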
