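Data analysis project (Python): importing stock data, finding up days, finding down days, computing returns, and a dual moving average strategy
1. Importing stock data: code example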
import pandas as pd
import numpy as np
import tushare as ts  # financial data API package

# Fetch the historical quotes of a single stock.
# code: the stock code, passed as a string
# df = ts.get_k_data(code='600519', start='2000-01-01')
# print(df)  # show the downloaded data
'''
            date      open     close      high       low     volume    code
0     2001-08-27     5.392     5.554     5.902     5.132  406318.00  600519
1     2001-08-28     5.467     5.759     5.781     5.407  129647.79  600519
...          ...       ...       ...       ...       ...        ...     ...
4391  2022-01-18  1861.680  1892.270  1908.850  1852.000   31216.00  600519
4392  2022-01-19  1919.000  1912.000  1919.000  1888.880   26371.00  600519

[4393 rows x 7 columns]
'''

# Save the data downloaded from the internet to a local file.
# df.to_csv("./maotai.csv")  # the to_xxx methods write the data out
df = pd.read_csv("./maotai.csv")
# print(df.head())
'''
   Unnamed: 0        date   open  close   high    low     volume    code
0           0  2001-08-27  5.392  5.554  5.902  5.132  406318.00  600519
1           1  2001-08-28  5.467  5.759  5.781  5.407  129647.79  600519
2           2  2001-08-29  5.777  5.684  5.781  5.640   53252.75  600519
3           3  2001-08-30  5.668  5.796  5.860  5.624   48013.06  600519
4           4  2001-08-31  5.804  5.782  5.877  5.749   23231.48  600519
'''
df.drop(labels="Unnamed: 0", axis=1, inplace=True)  # drop the leftover index column, which carries no information
# print(df.head())
'''
         date   open  close   high    low     volume    code
0  2001-08-27  5.392  5.554  5.902  5.132  406318.00  600519
1  2001-08-28  5.467  5.759  5.781  5.407  129647.79  600519
2  2001-08-29  5.777  5.684  5.781  5.640   53252.75  600519
3  2001-08-30  5.668  5.796  5.860  5.624   48013.06  600519
4  2001-08-31  5.804  5.782  5.877  5.749   23231.48  600519
'''
# print(df.info())
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4876 entries, 0 to 4875
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   date    4876 non-null   object
 1   open    4876 non-null   float64
 2   close   4876 non-null   float64
 3   high    4876 non-null   float64
 4   low     4876 non-null   float64
 5   volume  4876 non-null   float64
 6   code    4876 non-null   int64
dtypes: float64(5), int64(1), object(1)
memory usage: 266.8+ KB
'''
df['date'] = pd.to_datetime(df['date'])  # convert the date column from strings to datetime
# print(df.info())
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4876 entries, 0 to 4875
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   date    4876 non-null   datetime64[ns]
 1   open    4876 non-null   float64
 2   close   4876 non-null   float64
 3   high    4876 non-null   float64
 4   low     4876 non-null   float64
 5   volume  4876 non-null   float64
 6   code    4876 non-null   int64
dtypes: datetime64[ns](1), float64(5), int64(1)
memory usage: 266.8 KB
'''
df.set_index('date', inplace=True)  # use the date as the row index
# print(df)
'''
                open     close      high       low     volume    code
date
2001-08-27     5.392     5.554     5.902     5.132  406318.00  600519
2001-08-28     5.467     5.759     5.781     5.407  129647.79  600519
...              ...       ...       ...       ...        ...     ...
2022-01-18  1861.680  1892.270  1908.850  1852.000   31216.00  600519
2022-01-19  1919.000  1912.000  1919.000  1888.880   26371.00  600519

[4876 rows x 6 columns]
'''
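The load-and-clean steps above (drop the stray index column, parse the dates, set the date index) can probably be folded into the read and write calls themselves. The sketch below is one way to do it, not the author's method: it assumes you control how the CSV is written, and it uses an in-memory buffer with three rows copied from the sample output instead of the real maotai.csv.

```python
import io

import pandas as pd

# Three rows copied from the sample output above; they stand in for the real file.
raw = pd.DataFrame({
    "date": ["2001-08-27", "2001-08-28", "2001-08-29"],
    "open": [5.392, 5.467, 5.777],
    "close": [5.554, 5.759, 5.684],
    "high": [5.902, 5.781, 5.781],
    "low": [5.132, 5.407, 5.640],
    "volume": [406318.00, 129647.79, 53252.75],
    "code": [600519, 600519, 600519],
})

# An in-memory buffer replaces "./maotai.csv" so the sketch runs without a file.
buffer = io.StringIO()
raw.to_csv(buffer, index=False)  # index=False: no "Unnamed: 0" column on reload
buffer.seek(0)

# parse_dates converts the column to datetime; index_col makes it the row index.
df = pd.read_csv(buffer, parse_dates=["date"], index_col="date")
print(df.index.dtype)
print(list(df.columns))
```

If the file has already been written with its index, passing index_col=0 to pd.read_csv is another way to avoid the extra column.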
2. Finding up days: code example
import pandas as pd
import numpy as np

df = pd.read_csv("./maotai.csv")
# print(df.info())
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4876 entries, 0 to 4875
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Unnamed: 0  4876 non-null   int64
 1   date        4876 non-null   object
 2   open        4876 non-null   float64
 3   close       4876 non-null   float64
 4   high        4876 non-null   float64
 5   low         4876 non-null   float64
 6   volume      4876 non-null   float64
 7   code        4876 non-null   int64
dtypes: float64(5), int64(2), object(1)
memory usage: 304.9+ KB
'''
df['date'] = pd.to_datetime(df['date'])  # convert the date column from strings to datetime
df.set_index('date', inplace=True)  # use the date as the index
# print(df.head())
'''
            Unnamed: 0   open  close   high    low     volume    code
date
2001-08-27           0  5.392  5.554  5.902  5.132  406318.00  600519
2001-08-28           1  5.467  5.759  5.781  5.407  129647.79  600519
2001-08-29           2  5.777  5.684  5.781  5.640   53252.75  600519
2001-08-30           3  5.668  5.796  5.860  5.624   48013.06  600519
2001-08-31           4  5.804  5.782  5.877  5.749   23231.48  600519
'''
# Find every date on which the stock closed more than 3% above its open.
incr_stock = (df['close']-df['open'])/df['open']>0.03
print(incr_stock)
'''
date
2001-08-27     True
2001-08-28     True
...
2022-01-19    False
Length: 4876, dtype: bool
'''
# When an analysis step produces a boolean Series, pass it to df.loc as the row indexer:
# only the rows where it is True are kept.
true_incr_stock = df.loc[incr_stock]
print(true_incr_stock)
'''
            Unnamed: 0      open     close  ...       low     volume    code
date                                        ...
2001-08-27           0     5.392     5.554  ...     5.132  406318.00  600519
2001-08-28           1     5.467     5.759  ...     5.407  129647.79  600519
...                ...       ...       ...  ...       ...        ...     ...
2021-12-23        4857  2053.000  2120.000  ...  2025.000   39099.00  600519

[339 rows x 7 columns]
'''
# Get the corresponding dates.
incr_stock_date = true_incr_stock.index
print(incr_stock_date)
'''
Index(['2001-08-27', '2001-08-28', '2001-09-10', '2001-12-21', '2002-01-18',
       ...
       '2021-09-24', '2021-09-27', '2021-10-13', '2021-12-08', '2021-12-23'],
      dtype='object', name='date', length=339)
'''
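The boolean-mask pattern used above can be wrapped in a small helper so the threshold becomes a parameter. This is a minimal sketch on made-up prices: the function name days_up_more_than and the four sample rows are illustrative assumptions, not part of the original script or data.

```python
import pandas as pd


def days_up_more_than(df: pd.DataFrame, threshold: float) -> pd.DatetimeIndex:
    """Return the dates whose close is more than `threshold` above the open."""
    change = (df["close"] - df["open"]) / df["open"]  # intraday change as a fraction
    return df.index[change > threshold]               # boolean mask applied to the index


# Made-up prices: intraday changes of +4%, +2%, -1% and +10%.
dates = pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06"])
prices = pd.DataFrame(
    {"open": [100.0, 100.0, 100.0, 100.0], "close": [104.0, 102.0, 99.0, 110.0]},
    index=dates,
)

up_days = days_up_more_than(prices, 0.03)
print(up_days)
```

Indexing df.index directly with the mask skips the intermediate DataFrame when only the dates are needed.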
3. Finding down days: code example
import pandas as pd

df = pd.read_csv("./maotai.csv")
# print(df.info())
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4876 entries, 0 to 4875
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Unnamed: 0  4876 non-null   int64
 1   date        4876 non-null   object
 2   open        4876 non-null   float64
 3   close       4876 non-null   float64
 4   high        4876 non-null   float64
 5   low         4876 non-null   float64
 6   volume      4876 non-null   float64
 7   code        4876 non-null   int64
dtypes: float64(5), int64(2), object(1)
memory usage: 304.9+ KB
'''
df['date'] = pd.to_datetime(df['date'])  # convert the date column from strings to datetime
df.set_index('date', inplace=True)
# Find every date on which the stock closed more than 2% below its open.
decr_stock = (df['open']-df['close'])/df['open'] > 0.02
# print(decr_stock)
'''
date
2001-08-27    False
2001-08-28    False
...
2022-01-18    False
2022-01-19    False
Length: 4876, dtype: bool
'''
true_decr_stock = df.loc[decr_stock]
# print(true_decr_stock)
'''
            Unnamed: 0      open     close  ...       low    volume    code
date                                        ...
2001-09-07           9     5.702     5.574  ...     5.570  31552.25  600519
2001-10-10          27     5.827     5.640  ...     5.629  17548.69  600519
...                ...       ...       ...  ...       ...       ...     ...
2021-12-29        4861  2150.000  2041.000  ...  2041.000  54049.00  600519
2022-01-13        4871  1964.820  1877.390  ...  1865.560  56747.00  600519

[455 rows x 7 columns]
'''
decr_stock_date = true_decr_stock.index
# print(decr_stock_date)
'''
Index(['2001-09-07', '2001-10-10', '2001-10-24', '2001-10-25', '2001-11-07',
       ...
       '2021-11-10', '2021-11-30', '2021-12-17', '2021-12-29', '2022-01-13'],
      dtype='object', name='date', length=455)
'''
# print(df['close'])
'''
date
2001-08-27       5.554
2001-08-28       5.759
...
2022-01-18    1892.270
2022-01-19    1912.000
Name: close, Length: 4876, dtype: float64
'''
# print(df['close'].shift(1))  # shift every value down one row, so each row holds the previous day's close
'''
date
2001-08-27         NaN
2001-08-28       5.554
...
2022-01-18    1861.610
2022-01-19    1892.270
Name: close, Length: 4876, dtype: float64
'''
# Find every date on which the stock opened more than 2% below the previous day's close.
# print((df['close'].shift(1)-df['open'])/df['close'].shift(1)>0.02)
'''
date
2001-08-27    False
2001-08-28    False
...
2022-01-19    False
Length: 4876, dtype: bool
'''
# print(df.loc[(df['close'].shift(1)-df['open'])/df['close'].shift(1)>0.02])
'''
            Unnamed: 0      open     close  ...       low    volume    code
date                                        ...
2001-09-12          12     5.520     5.621  ...     5.515  25045.19  600519
2002-06-26         192     5.824     5.757  ...     5.712  15423.00  600519
...                ...       ...       ...  ...       ...       ...     ...
2021-11-01        4819  1780.000  1803.000  ...  1760.000  41690.00  600519

[90 rows x 7 columns]
'''
print(df.loc[(df['close'].shift(1)-df['open'])/df['close'].shift(1)>0.02].index)
'''
Index(['2001-09-12', '2002-06-26', '2002-12-13', '2004-07-01', '2004-10-29',
       ...
       '2021-02-26', '2021-03-04', '2021-04-28', '2021-08-20', '2021-11-01'],
      dtype='object', name='date')
'''
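The shift(1) step is easier to follow on a handful of rows. The sketch below uses four made-up days (not rows from maotai.csv) and names the intermediate Series, so the alignment between each day's open and the previous day's close is visible. The first row has no previous close, so its ratio is NaN, and a comparison with NaN evaluates to False.

```python
import pandas as pd

# Made-up prices for four consecutive trading days.
dates = pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06"])
df = pd.DataFrame(
    {"open": [99.0, 97.0, 91.0, 90.0], "close": [100.0, 100.0, 90.0, 95.0]},
    index=dates,
)

prev_close = df["close"].shift(1)                  # previous day's close, NaN on the first row
gap_down = (prev_close - df["open"]) / prev_close  # fraction by which the open fell below it
mask = gap_down > 0.02                             # NaN > 0.02 is False, so the first row drops out

gap_days = df.index[mask]
print(pd.DataFrame({"prev_close": prev_close, "open": df["open"], "gap_down": gap_down}))
print(gap_days)
```

Naming prev_close once also avoids evaluating df['close'].shift(1) twice in the same expression.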
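4. Computing returns: code example
import pandas as pd
import numpy as np

df = pd.read_csv("./maotai.csv")
df['date'] = pd.to_datetime(df['date'])  # convert the date column from strings to datetime
df.set_index("date", inplace=True)  # use the date as the index

# Starting from 2010-01-01, buy one lot (100 shares) on the first trading day of every month
# and sell all shares held on the last trading day of every year. What is the profit by 2020-02?
new_df = df.loc['2010-01':'2020-02']
# print(new_df)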
'''
            Unnamed: 0      open     close  ...       low    volume    code
date                                        ...
2010-01-04        1953   109.760   108.446  ...   108.044  44304.88  600519
2010-01-05        1954   109.116   108.127  ...   107.846  31513.18  600519
...                ...       ...       ...  ...       ...       ...     ...
2020-01-23        4393  1076.000  1052.800  ...  1037.000  53468.00  600519

[2441 rows x 7 columns]
'''
# Buying: find the first trading day of every month.
df_monthly = new_df.resample('M').first()  # resample by month; pandas 2.2+ prefers the alias 'ME'
# print(df_monthly)  # the index labels are month-end dates, not the real trading dates, but the values are those of each month's first trading day
'''
            Unnamed: 0      open     close  ...       low     volume    code
date                                        ...
2010-01-31        1953   109.760   108.446  ...   108.044   44304.88  600519
2010-02-28        1973   107.769   107.776  ...   106.576   29655.94  600519
...                ...       ...       ...  ...       ...        ...     ...
2020-01-31        4378  1128.000  1130.000  ...  1116.000  148099.00  600519
2020-02-29        4394   985.000  1003.920  ...   980.000  123442.00  600519

[122 rows x 7 columns]
'''
# print(type(df_monthly))  # <class 'pandas.core.frame.DataFrame'>
cost = df_monthly['open'].sum()*100  # money spent buying: one lot (100 shares) per month at the opening price
# print(cost)  # 4010206.1

# Last trading day of every year; [:-1] cuts off the final row (2020), which is not a complete year.
df_yearly = new_df.resample('A').last()[:-1]  # pandas 2.2+ prefers the alias 'YE'
# print(df_yearly)
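'''
            Unnamed: 0      open     close      high       low   volume    code
date
2010-12-31        2192   117.103   118.469   118.701   116.620  46084.0  600519
2011-12-31        2435   138.039   138.468   139.600   136.105  29460.0  600519
...
2019-12-31        4377  1183.000  1183.000  1188.000  1176.510  22588.0  600519
'''
earn = df_yearly['open'].sum()*1200  # money received from selling: 12 lots (1200 shares) at each year end
# print(earn)  # 4368184.8

# The shares still held at the end must be valued and counted in the total profit.
# Use the most recent closing price, new_df['close'].iloc[-1], as the price of the remaining shares.
last_earn = 200*new_df['close'].iloc[-1] + earn - cost  # 200 shares: the two lots bought in 2020-01 and 2020-02
print(type(new_df['close'].iloc[-1]))  # .iloc[-1] takes the last value by position, whatever the index type
print(last_earn)  # final profit: 569378.6999999997

# Plain [-1] falls back to positional lookup only when the index is non-numeric
# (recent pandas releases deprecate this fallback), for example:
# d_f = pd.DataFrame(np.arange(12).reshape(3, 4),
# columns=list('abcd'), index=['x', 'y', 'z'])
# print(d_f)
# print(d_f['a'][-1])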
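The constants 1200 and 200 in the script above encode two facts about the sample: every complete year contains 12 monthly purchases, and the last two purchases (2020-01 and 2020-02) are never sold. The sketch below derives both counts from the data instead. It is a hedged variant, not the author's code: the function name monthly_buy_yearly_sell is hypothetical, the prices are synthetic, and it keeps the script's conventions of selling at the year-end opening price and treating the final year as incomplete.

```python
import numpy as np
import pandas as pd


def monthly_buy_yearly_sell(df: pd.DataFrame, lot: int = 100) -> float:
    """Buy one lot at the open of each month's first trading day, sell the year's
    purchases at the open of each year's last trading day (the final year is
    assumed incomplete and is not sold), and value leftovers at the last close."""
    first_open = df["open"].groupby(df.index.to_period("M")).first()
    cost = first_open.sum() * lot

    lots_per_year = first_open.groupby(first_open.index.year).size()
    year_end_open = df["open"].groupby(df.index.year).last()
    sold_years = year_end_open.index[:-1]  # every year except the last one
    proceeds = (year_end_open.loc[sold_years] * lots_per_year.loc[sold_years] * lot).sum()

    leftover_value = lots_per_year.iloc[-1] * lot * df["close"].iloc[-1]
    return float(proceeds + leftover_value - cost)


# Synthetic data: open is 10 throughout 2019 and 20 in 2020; close is open + 1.
idx = pd.bdate_range("2019-01-01", "2020-02-28")
open_price = np.where(idx.year == 2019, 10.0, 20.0)
prices = pd.DataFrame({"open": open_price, "close": open_price + 1.0}, index=idx)

profit = monthly_buy_yearly_sell(prices)
print(profit)
```

On this synthetic series the purchases cost 16000, the 2019 year-end sale brings in 12000, and the 200 leftover shares are worth 4200, so the function returns 200.0. Grouping by to_period("M") rather than calling resample also keeps the code independent of the 'M' versus 'ME' alias change.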
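5. Dual moving average strategy: code example
import pandas as pd
from matplotlib import pyplot as plt

df = pd.read_csv("./maotai.csv").drop(labels="Unnamed: 0", axis=1)
df['date'] = pd.to_datetime(df["date"])  # convert the date column from strings to datetime
df.set_index(df['date'], inplace=True)  # use the date as the row index (the date column itself is kept)
# print(df)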
'''
                 date      open     close  ...       low     volume    code
date                                       ...
2001-08-27 2001-08-27     5.392     5.554  ...     5.132  406318.00  600519
2001-08-28 2001-08-28     5.467     5.759  ...     5.407  129647.79  600519
...               ...       ...       ...  ...       ...        ...     ...
2022-01-19 2022-01-19  1919.000  1912.000  ...  1888.880   26371.00  600519

[4876 rows x 7 columns]
'''
# Moving average: MA = (C1 + C2 + C3 + ... + CN) / N, where C is the closing price and N the number of days.
# A rolling window computes this mean over every run of N consecutive days.
# Commonly used windows: 5, 10, 30, 60, 120, 240
five_mean = df['close'].rolling(5).mean()
month_mean = df['close'].rolling(30).mean()
# print(five_mean)
'''
date
2001-08-27         NaN
2001-08-28         NaN
2001-08-29         NaN
2001-08-30         NaN
2001-08-31       5.715
...
2022-01-19    1882.054
Name: close, Length: 4876, dtype: float64
'''
plt.figure(figsize=(20, 8), dpi=60)
# plt.plot(five_mean[50:180])  # plot only the 130 values between positions 50 and 180
# plt.plot(month_mean[50:180])  # plot only the 130 values between positions 50 and 180
_x = five_mean.index
_y = five_mean.values
_yy = month_mean.values
_x = [i.strftime("%Y%m%d") for i in _x]  # format each date as YYYYMMDD
plt.plot(range(len(_x)), _y)
plt.plot(range(len(_x)), _yy)
plt.xticks(list(range(len(_x)))[::100], list(_x)[::100], rotation=45)
# plt.show()

# Find the golden-cross dates and the death-cross dates.
# Golden cross: the point where the short-term average crosses above the long-term average.
# Death cross: the point where the short-term average crosses below the long-term average.
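# Drop the first 30 rows, which include the NaN warm-up period of the 30-day average.
five_mean = five_mean.iloc[30:]
month_mean = month_mean.iloc[30:]
# print(five_mean)
'''
date
2001-10-15       5.6382
2001-10-16       5.5602
...
2022-01-19    1882.0540
Name: close, Length: 4846, dtype: float64
'''
s1 = five_mean < month_mean  # short-term average below the long-term average
s2 = five_mean > month_mean  # short-term average above the long-term average
# print(s1)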
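'''
date
2001-10-15    True
2001-10-16    True
...
2022-01-18    True
2022-01-19    True
Name: close, Length: 4846, dtype: bool
'''
# Death cross: the short average is below the long average today and was above it yesterday.
# fill_value=False keeps the shifted Series boolean instead of introducing NaN in the first row.
death_ex = s1 & s2.shift(1, fill_value=False)
df = df.iloc[30:]  # keep df aligned with the trimmed averages
death_date = df.loc[death_ex]  # rows on which a death cross occurs
# print(death_date)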
'''
                 date      open     close      high       low    volume    code
date
2002-01-17 2002-01-17     5.656     5.477     5.701     5.421  24180.31  600519
2002-01-30 2002-01-30     5.562     5.559     5.668     5.499   8419.22  600519
...               ...       ...       ...       ...       ...       ...     ...
2022-01-06 2022-01-06  2022.010  1982.220  2036.000  1938.510  51795.00  600519

[105 rows x 7 columns]
'''
death_index = death_date.index  # dates of the death crosses
# print(death_index)
'''
DatetimeIndex(['2002-01-17', '2002-01-30', '2002-03-29', '2002-07-29',
               ...
               '2021-03-01', '2021-04-15', '2021-05-06', '2021-06-22',
               '2021-11-04', '2022-01-06'],
              dtype='datetime64[ns]', name='date', length=105, freq=None)
'''
# Golden cross: the short average is not below the long average today and was not above it yesterday.
golden_ex = ~(s1 | s2.shift(1, fill_value=False))
golden_date = df.loc[golden_ex]
# print(golden_date)
'''
                 date      open     close      high       low    volume    code
date
2001-11-22 2001-11-22     5.502     5.487     5.515     5.454   4198.79  600519
2002-01-24 2002-01-24     5.837     5.826     5.916     5.749  26615.80  600519
...               ...       ...       ...       ...       ...       ...     ...
2021-11-23 2021-11-23  1852.000  1896.430  1917.000  1852.000  45782.00  600519

[105 rows x 7 columns]
'''
golden_index = golden_date.index  # dates of the golden crosses
# print(golden_index)
'''
DatetimeIndex(['2001-11-22', '2002-01-24', '2002-02-04', '2002-06-21',
               '2002-12-05', '2003-01-16', '2003-04-15', '2003-05-30',
               ...
               '2020-11-05', '2021-04-02', '2021-04-16', '2021-05-20',
               '2021-09-16', '2021-11-23'],
              dtype='datetime64[ns]', name='date', length=105, freq=None)
'''
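The crossover test can be checked by hand on a short series. The sketch below is a variant of the logic above rather than a copy of it: it marks a cross wherever the "short average above long average" flag changes between consecutive days, using diff() on the flag. The function name find_crosses, the 2-day and 3-day windows, and the price series are all illustrative assumptions chosen so that the result is easy to verify.

```python
import pandas as pd


def find_crosses(close: pd.Series, short: int, long: int):
    """Return (golden_dates, death_dates) for a short/long moving-average pair."""
    short_ma = close.rolling(short).mean()
    long_ma = close.rolling(long).mean()

    valid = long_ma.notna()                  # skip the warm-up rows of the long window
    above = short_ma[valid] > long_ma[valid]  # True while the short average is on top

    change = above.astype(int).diff()        # +1: moved above, -1: moved below, NaN on first row
    golden = above.index[change == 1]
    death = above.index[change == -1]
    return golden, death


# Made-up closing prices that rise, fall, then rise again.
dates = pd.date_range("2022-01-03", periods=13, freq="D")
close = pd.Series([1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5], index=dates, dtype=float)

golden, death = find_crosses(close, short=2, long=3)
print("golden:", list(golden))
print("death:", list(death))
```

On this series the 2-day average drops below the 3-day average on 2022-01-09 (a death cross) and climbs back above it on 2022-01-13 (a golden cross). Unlike the script above, this variant counts a day on which the two averages are exactly equal as "not above".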