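Function: pearsonr(x, y)

Purpose:
Computes the degree of correlation between a feature and the target variable.

Parameters:
1) Input: x is the feature, y is the target variable.
2) Output: r, the correlation coefficient, in the range [-1, 1]; p-value, the p-value.
Note: the smaller the p-value, the more statistically significant the correlation. The p-value is generally reliable only when the sample has more than about 500 observations.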

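The Pearson correlation coefficient is computed as:

$$ r = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{var}(X)\,\operatorname{var}(Y)}} $$

where cov(X, Y) is the covariance of X and Y, and var(X) and var(Y) are their variances.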
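To make the formula concrete, here is a minimal sketch in pure Python that computes r directly from the covariance and the two variances. The function name `pearson_r` and the sample lists are illustrative assumptions, not part of any library.

```python
import math


def pearson_r(xs, ys):
    """Pearson r = cov(X, Y) / sqrt(var(X) * var(Y)) (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either input is constant (zero variance).
    return cov_xy / math.sqrt(var_x * var_y)


perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # exactly linear, increasing
inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # exactly linear, decreasing
print(perfect, inverse)
```

A perfectly increasing linear relationship gives r = 1 and a perfectly decreasing one gives r = -1; everything else falls in between.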

Python implementation

import numpy as np
from scipy.stats import pearsonr

np.random.seed(0)
size = 300
x = np.random.normal(0, 1, size)
# Same signal with two noise levels: the noisier y is, the weaker the correlation
print("Lower noise", pearsonr(x, x + np.random.normal(0, 1, size)))
print("Higher noise", pearsonr(x, x + np.random.normal(0, 10, size)))

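Output (older SciPy versions print a plain tuple as shown here; recent releases print a PearsonRResult object holding the same two values):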

Lower noise (0.71824836862138408, 7.3240173129983507e-49)
Higher noise (0.057964292079338155, 0.31700993885324752)
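The two returned values are usually unpacked and the p-value compared against a threshold. The sketch below assumes a conventional threshold of 0.05 and uses its own arbitrarily seeded random data, so its numbers will differ from the output above.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
x = rng.normal(0, 1, 300)
y = x + rng.normal(0, 1, 300)   # signal plus noise of equal scale

# Unpacking works whether pearsonr returns a tuple or a result object
r, p_value = pearsonr(x, y)

alpha = 0.05  # conventional significance threshold (an assumption of this sketch)
is_significant = p_value < alpha
print(f"r = {r:.3f}, p = {p_value:.3g}, significant at {alpha}: {is_significant}")
```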

Computing correlation coefficients with pandas

corr() computes the correlation coefficients between variables

df.head()
    Guba     XQ    BCI  Count  Value
0  0.021  0.098  0.175  0.077  0.057
1  0.031  0.097  0.192  0.087  0.069
2  0.018  0.101  0.193  0.075  0.069
3  0.017  0.112  0.203  0.077  0.063
4  0.042  0.158  0.222  0.335  0.567

# 1. Pearson correlation coefficient
df.corr()
           Guba         XQ        BCI      Count      Value
Guba   1.000000   0.175604  -0.014611   0.200896   0.256166
XQ     0.175604   1.000000  -0.390358   0.654250   0.482809
BCI   -0.014611  -0.390358   1.000000  -0.259319  -0.156440
Count  0.200896   0.654250  -0.259319   1.000000   0.832961
Value  0.256166   0.482809  -0.156440   0.832961   1.000000

# 2. Kendall Tau correlation coefficient
df.corr('kendall')
           Guba         XQ        BCI      Count      Value
Guba   1.000000   0.153904  -0.012438   0.133122   0.090707
XQ     0.153904   1.000000  -0.244304   0.374908   0.255377
BCI   -0.012438  -0.244304   1.000000  -0.157442  -0.091950
Count  0.133122   0.374908  -0.157442   1.000000   0.720916
Value  0.090707   0.255377  -0.091950   0.720916   1.000000

# 3. Spearman rank correlation
df.corr('spearman')
           Guba         XQ        BCI      Count      Value
Guba   1.000000   0.219124  -0.017204   0.189752   0.143163
XQ     0.219124   1.000000  -0.358981   0.563938   0.427756
BCI   -0.017204  -0.358981   1.000000  -0.241880  -0.140010
Count  0.189752   0.563938  -0.241880   1.000000   0.877732
Value  0.143163   0.427756  -0.140010   0.877732   1.000000

# 4. Significance test
import scipy.stats as stats
# The first value of the result is the Pearson correlation coefficient,
# the second is the p-value, so here the Guba and Value columns are significantly correlated
stats.pearsonr(df['Guba'], df['Value'])
(0.256165703418037, 8.10519823509109e-07)
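pandas' corr() returns only the coefficients, while the matching scipy.stats functions also return a p-value. The following sketch, run on synthetic data invented for illustration (the `demo` frame and its columns `a` and `b` are assumptions), checks that the two libraries agree for all three methods.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)  # arbitrary seed
demo = pd.DataFrame({"a": rng.normal(size=50)})
# b depends on a monotonically but not linearly, plus a little noise
demo["b"] = demo["a"] ** 3 + rng.normal(scale=0.1, size=50)

results = {}
for method, scipy_func in [("pearson", stats.pearsonr),
                           ("kendall", stats.kendalltau),
                           ("spearman", stats.spearmanr)]:
    from_pandas = demo.corr(method=method).loc["a", "b"]
    from_scipy, p_value = scipy_func(demo["a"], demo["b"])
    results[method] = (from_pandas, from_scipy)
    print(f"{method:8s} pandas={from_pandas:.6f} scipy={from_scipy:.6f} p={p_value:.3g}")
```

Because the relationship is monotonic but curved, the rank-based Spearman coefficient will typically come out higher than the Pearson coefficient on data like this.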

Computing the correlation coefficient of two columns of data

import pandas as pd
import matplotlib.pyplot as plt
# Mean gust speed for each hour
all_gust_spd_mean_list = [8.21529411764706, 7.872941176470587, 7.829411764705882, 8.354117647058825, 9.025882352941174, 9.384523809523811, 9.57294117647059, 9.274117647058821, 9.050588235294118, 9.314117647058827, 8.924705882352939, 9.25176470588235, 8.978823529411764, 8.39176470588235, 7.715294117647061, 7.477647058823529, 7.272941176470586, 7.38470588235294, 7.396470588235295, 7.97261904761905, 7.716666666666666, 7.7809523809523835, 7.816666666666668, 7.897590361445783, 8.200000000000001, 8.04761904761905, 7.474999999999999, 9.855952380952383, 11.120000000000001, 10.979761904761906, 10.922619047619051, 10.841176470588234, 9.31566265060241, 8.867058823529415, 9.068235294117642, 8.774698795180722, 8.629411764705884, 8.292941176470586, 7.640000000000007, 7.422352941176469, 7.464705882352944, 8.210588235294113, 8.558823529411763, 8.93095238095238, 9.001176470588234, 8.538095238095238, 8.965882352941172, 9.855294117647057, 8.318918918918921, 9.217647058823525, 8.86470588235294, 8.840000000000002, 9.44235294117647, 9.352380952380953, 9.307058823529408, 9.64047619047619, 9.408333333333333, 9.585882352941175, 8.901190476190477, 7.698823529411764, 7.988235294117645, 9.091764705882353, 9.294117647058819, 8.996470588235297, 9.63764705882353, 9.091764705882353, 8.937647058823533, 8.838823529411764, 8.637647058823534, 8.46, 8.374117647058824, 8.24117647058823, 8.245238095238093, 8.365882352941174, 8.50235294117647, 8.291764705882352, 8.088235294117647, 7.889411764705883, 7.594117647058826, 7.216470588235293, 7.097647058823533, 7.305882352941181, 7.489411764705882, 6.815294117647058, 7.971428571428569, 7.424705882352936, 6.910588235294117, 6.071764705882354, 7.44117647058823, 7.667857142857143, 7.881176470588237, 7.929411764705881, 8.12142857142857, 8.822352941176472, 9.083529411764703, 9.028235294117646, 9.310714285714285, 9.035294117647057, 8.450588235294116, 8.414285714285713, 7.311764705882355, 6.840000000000001, 7.238095238095239, 6.641176470588236, 6.8047619047619055, 
6.58705882352941, 6.826190476190474, 6.568235294117643, 7.060000000000001, 7.686904761904761, 8.348235294117643, 8.503529411764701, 8.287058823529414, 8.354117647058823, 7.624705882352941, 7.286904761904765, 7.361176470588235, 7.477647058823531, 7.343529411764706]
# Standard deviation of gust direction for each hour
all_gust_agl_dev_list = [0.7507438242046189, 0.768823513771462, 0.849877567310481, 0.8413581558472801, 0.8571319461950748, 0.8665002025305942, 0.9053739533298005, 0.8866979720735791, 0.8045677876888446, 0.873463882661469, 0.832383480871403, 0.778659970340069, 0.7357031045047981, 0.7974723911258534, 0.8039727543149432, 0.8709723763624072, 0.8727745464337923, 0.7896422160341138, 0.8165093346129041, 0.8821296270775546, 0.9193591477905156, 0.8546566314487358, 0.8595040204296921, 0.8075641299052398, 0.7996745617071098, 0.7930869411601498, 0.7578880032016914, 0.9107571156507569, 0.8461201382346486, 0.7553646348127085, 0.8510861123303187, 0.7282631202385544, 0.8588017730198183, 0.7923449370076744, 0.8265083209111689, 0.9599970229643688, 0.8195276021290412, 0.7882592259148272, 0.8036464793287409, 0.8237184691421926, 0.8846862360656914, 0.8136869244513337, 0.8516383375155133, 0.7760301715652644, 0.8644231334629017, 0.831330440569484, 0.8061342111854616, 0.7345896810176235, 1.205089147978776, 0.8266315966774649, 0.8137345300107962, 0.8186966603954983, 0.7836182115343135, 0.8406438908681332, 0.7717723331806998, 0.7932664155269176, 0.7266183593077442, 0.719063143819583, 0.8846434855533486, 0.817552510948495, 0.7571575934024827, 0.865326265251608, 0.9099784335052563, 0.8591794583996128, 0.9295389095340467, 0.8787300860744375, 0.8724277968300532, 0.95284132003256, 0.9288772059881606, 0.8690944948691984, 0.8327213470469693, 0.8339075062700629, 0.886835675339985, 0.8439137877550847, 0.7985495396895048, 0.8406267016063169, 0.8477871130878305, 0.8844025576348077, 0.9186363354492758, 0.8888539157167654, 0.9079462071375304, 0.8699806402308554, 0.8531937701209343, 0.8833108936555343, 0.9317958602705915, 0.9393618445471649, 0.9556065912926689, 0.967220118643412, 0.8882194173154115, 0.9361538853249073, 0.7872261833965604, 0.8608377368219552, 0.8787718518619395, 0.8169189082396561, 0.7965901553530427, 0.8838665737610132, 0.8844338861256802, 0.9008484784943429, 0.8612318707072047, 
0.8623792153658019, 1.0033494995180463, 0.9901213381586231, 0.8780115045650467, 0.9172682690843976, 0.9653905755824115, 0.9199829176728873, 0.9180048223906779, 0.9172043382441968, 0.9267783259554074, 0.9231225672912022, 0.7945054721199195, 0.8655558517080688, 0.8306327906597787, 0.8457559701865576, 0.8038459124570336, 0.8519646989317945, 0.7735358658599594, 0.8612134954656397, 0.8879135146161856]
g_s_m = pd.Series(all_gust_spd_mean_list)  # Wrap the lists in Series so pandas can work with them
g_a_d = pd.Series(all_gust_agl_dev_list)
corr_gust = round(g_s_m.corr(g_a_d), 4)  # Pearson correlation coefficient, rounded to 4 decimal places
print('corr_gust :', corr_gust)
# Finally, draw a scatter plot of the two lists to get a visual feel for the coefficient
plt.scatter(all_gust_spd_mean_list, all_gust_agl_dev_list)
plt.title('corr_gust :' + str(corr_gust))  # Show the coefficient in the plot title
plt.show()

Running the program above gives:

corr_gust : -0.3481

Computing the correlation matrix of tabular data

import pandas as pd
import numpy as np

if __name__ == '__main__':
    unstrtf_lst = [
        [2.136, 1.778, 1.746, 2.565, 1.873, 2.413, 1.813, 1.72, 1.932, 1.987, 2.035, 2.178, 2.05, 2.016, 1.645, 1.756, 1.886, 2.106, 2.138, 1.914, 1.984, 1.906, 1.871, 1.939, 1.81, 1.93, 1.898, 1.802, 2.008, 1.724, 1.823, 1.636, 1.774, 2.055, 1.934, 1.629, 2.519, 2.093, 2.004, 1.793, 1.564, 1.962, 2.176, 1.846, 1.816, 2.018, 1.708, 2.465, 1.899, 1.523, 1.41, 2.102, 2.065, 2.402, 2.091, 1.867, 1.77, 1.466, 2.029, 1.659, 1.626, 1.977, 1.837, 2.13, 2.241, 2.184, 2.345, 1.833, 2.113, 1.764, 1.859, 1.868, 1.835, 1.906, 2.237, 1.846, 1.871, 1.769, 1.928, 1.831, 1.875, 2.039, 2.24, 1.835, 1.851],
        [2.171, 1.831, 1.714, 2.507, 1.793, 2.526, 1.829, 1.705, 1.954, 2.017, 2.022, 2.16, 2.059, 1.966, 1.661, 1.752, 1.884, 2.203, 2.182, 1.97, 2.003, 1.875, 1.852, 1.884, 1.774, 1.916, 1.936, 1.809, 1.926, 1.717, 1.841, 1.59, 1.781, 2.016, 1.898, 1.657, 2.458, 2.134, 2.032, 1.785, 1.575, 1.959, 2.11, 1.854, 1.826, 1.992, 1.706, 2.419, 1.854, 1.514, 1.37, 2.084, 2.024, 2.398, 1.955, 1.859, 1.759, 1.441, 2.059, 1.653, 1.583, 1.987, 1.84, 2.106, 2.262, 2.13, 2.371, 1.776, 2.117, 1.733, 1.814, 1.839, 1.822, 1.883, 2.23, 1.803, 1.894, 1.783, 1.911, 1.813, 1.85, 2.004, 2.191, 1.823, 1.809],
        [2.157, 1.873, 1.802, 2.761, 1.733, 2.506, 1.842, 1.765, 1.938, 2.058, 1.932, 2.196, 2.004, 2.126, 1.664, 1.698, 1.899, 2.073, 2.117, 2.083, 1.972, 1.969, 1.865, 1.937, 1.752, 1.939, 1.927, 1.804, 2.07, 1.725, 1.846, 1.5, 1.804, 2.1, 1.932, 1.773, 2.431, 2.088, 2.08, 1.812, 1.592, 1.953, 2.044, 2.019, 1.846, 2.061, 1.771, 2.254, 1.891, 1.536, 1.356, 1.952, 2.222, 2.427, 2.015, 1.873, 1.79, 1.384, 1.981, 1.665, 1.815, 2.006, 1.869, 2.102, 2.249, 2.27, 2.296, 1.814, 2.099, 1.702, 1.688, 1.89, 1.82, 1.927, 2.162, 1.825, 1.998, 1.811, 2.0, 1.842, 1.793, 2.115, 2.301, 1.789, 1.826],
        [2.127, 1.744, 1.747, 2.548, 1.939, 2.296, 1.808, 1.71, 1.901, 1.906, 2.074, 2.167, 2.113, 2.044, 1.632, 1.821, 1.94, 2.076, 2.114, 1.837, 1.978, 1.904, 1.872, 1.98, 1.886, 1.923, 1.875, 1.799, 1.992, 1.704, 1.812, 1.715, 1.756, 2.061, 1.94, 1.554, 2.592, 2.065, 1.983, 1.802, 1.57, 1.955, 2.215, 1.765, 1.796, 2.006, 1.662, 2.573, 1.915, 1.543, 1.439, 2.16, 2.012, 2.42, 2.268, 1.886, 1.767, 1.527, 2.073, 1.65, 1.567, 2.016, 1.819, 2.153, 2.225, 2.237, 2.327, 1.877, 2.115, 1.804, 1.939, 1.867, 1.84, 1.905, 2.302, 1.883, 1.798, 1.725, 1.893, 1.846, 1.916, 2.025, 2.268, 1.867, 1.877],
        [2.089, 1.664, 1.72, 2.441, 2.031, 2.321, 1.773, 1.702, 1.935, 1.968, 2.119, 2.191, 2.023, 1.925, 1.621, 1.75, 1.822, 2.074, 2.139, 1.764, 1.982, 1.873, 1.895, 1.955, 1.829, 1.945, 1.853, 1.794, 2.046, 1.75, 1.793, 1.741, 1.752, 2.042, 1.965, 1.532, 2.598, 2.086, 1.923, 1.771, 1.517, 1.98, 2.338, 1.743, 1.794, 2.014, 1.693, 2.618, 1.938, 1.5, 1.476, 2.216, 2.003, 2.361, 2.13, 1.85, 1.764, 1.513, 2.001, 1.669, 1.538, 1.897, 1.819, 2.163, 2.226, 2.099, 2.386, 1.865, 2.121, 1.818, 2.0, 1.876, 1.858, 1.908, 2.254, 1.874, 1.791, 1.759, 1.908, 1.822, 1.944, 2.012, 2.201, 1.863, 1.892],
    ]
    column_lst = ['whole_year', 'spring', 'summer', 'autumn', 'winter']
    # Compute the pairwise correlation coefficients between the lists
    data_dict = {}  # Build a dict to prepare for constructing the DataFrame
    for col, gf_lst in zip(column_lst, unstrtf_lst):
        data_dict[col] = gf_lst
    unstrtf_df = pd.DataFrame(data_dict)
    cor1 = unstrtf_df.corr()  # Compute the correlation coefficients; the result is a matrix
    print(cor1)
    print(unstrtf_df.columns.tolist())

The result is:

            whole_year    spring    summer    autumn    winter
whole_year    1.000000  0.986011  0.943254  0.980358  0.965415
spring        0.986011  1.000000  0.944394  0.945710  0.930887
summer        0.943254  0.944394  1.000000  0.876008  0.833568
autumn        0.980358  0.945710  0.876008  1.000000  0.977426
winter        0.965415  0.930887  0.833568  0.977426  1.000000
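A correlation matrix is symmetric with ones on the diagonal, so only the entries above the diagonal carry information. As a rough sketch, the matrix printed above can be re-entered by hand (values copied from that output) and scanned for the most and least correlated pairs; the variable names here are illustrative.

```python
import pandas as pd

cols = ["whole_year", "spring", "summer", "autumn", "winter"]
# Values copied from the printed correlation matrix above
cor1 = pd.DataFrame(
    [[1.000000, 0.986011, 0.943254, 0.980358, 0.965415],
     [0.986011, 1.000000, 0.944394, 0.945710, 0.930887],
     [0.943254, 0.944394, 1.000000, 0.876008, 0.833568],
     [0.980358, 0.945710, 0.876008, 1.000000, 0.977426],
     [0.965415, 0.930887, 0.833568, 0.977426, 1.000000]],
    index=cols, columns=cols,
)

# Visit each unordered pair once (entries above the diagonal)
pairs = [(float(cor1.loc[a, b]), a, b)
         for i, a in enumerate(cols) for b in cols[i + 1:]]
pairs.sort(reverse=True)  # highest coefficient first

strongest = pairs[0]
weakest = pairs[-1]
print("strongest:", strongest)
print("weakest:", weakest)
```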

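Computing correlation coefficients with numpy

import numpy as np
# u1 is a matrix; build it yourself or take it from a DataFrame, e.g. u1 = a_df.values
np.corrcoef(u1)  # correlation coefficients between all rows of the matrix
np.corrcoef(u1.T)  # correlation coefficients between all columns of the matrix
np.around(np.corrcoef(u1), decimals=3)  # round the resulting matrix to 3 decimal places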
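The row-versus-column distinction is easy to get wrong, so here is a small sketch on random data (the array `u1` and its shape are assumptions for illustration). It shows the shapes produced by each call and that passing rowvar=False is equivalent to transposing.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
u1 = rng.normal(size=(3, 20))    # 3 rows, 20 columns

by_rows = np.corrcoef(u1)                   # each row is a variable: 3 x 3 result
by_cols = np.corrcoef(u1.T)                 # each column is a variable: 20 x 20 result
same_as_T = np.corrcoef(u1, rowvar=False)   # treat columns as variables without transposing

print(by_rows.shape, by_cols.shape)
print(np.around(by_rows, decimals=3))
```

When u1 comes from a DataFrame via .values, the variables are normally the columns, so rowvar=False (or u1.T) is the form that matches DataFrame.corr().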
