Notes from the book Statistics for the Behavioral Sciences, 10e

■ Biased and Unbiased Statistics
Earlier we noted that sample variability tends to underestimate the variability in the corresponding population. To correct for this problem we adjusted the formula for sample variance by dividing by n - 1 instead of dividing by n. The result of the adjustment is that sample variance provides a much more accurate representation of the population variance. Specifically, dividing by n - 1 produces a sample variance that provides an unbiased estimate of the corresponding population variance. This does not mean that each individual sample variance will be exactly equal to its population variance. In fact, some sample variances will overestimate the population value and some will underestimate it. However, the average of all the sample variances will be equal to the population variance. This is the idea behind the concept of an unbiased statistic.

A sample statistic is unbiased if the average value of the statistic is equal to the population parameter. (The average value of the statistic is obtained from all the possible samples for a specific sample size, n.)

A sample statistic is biased if the average value of the statistic either underestimates or overestimates the corresponding population parameter.

The following example demonstrates the concept of biased and unbiased statistics.

EXAMPLE 4.9
We begin with a population that consists of exactly N = 6 scores: 0, 0, 3, 3, 9, 9. With a
few calculations you should be able to verify that this population has a mean of μ = 4 and
a variance of σ² = 14.
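
The "few calculations" can be checked with a short sketch. The variable names below are illustrative and not from the book; the only assumption is the usual population formula, in which the sum of squared deviations (SS) is divided by N.

```python
# Population from Example 4.9
population = [0, 0, 3, 3, 9, 9]
N = len(population)

mu = sum(population) / N                      # population mean
ss = sum((x - mu) ** 2 for x in population)   # SS: sum of squared deviations
sigma_sq = ss / N                             # population variance divides SS by N

print(mu, ss, sigma_sq)  # 4.0 84.0 14.0
```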

Next, we select samples of n = 2 scores from this population. In fact, we obtain every
single possible sample with n = 2. The complete set of samples is listed in Table 4.1. Notice
that the samples are listed systematically to ensure that every possible sample is included.
We begin by listing all the samples that have X = 0 as the first score, then all the samples
with X = 3 as the first score, and so on. Notice that the table shows a total of 9 samples.
Finally, we have computed the mean and the variance for each sample. Note that the
sample variance has been computed two different ways. First, we make no correction for
bias and compute each sample variance as the average of the squared deviations by simply
dividing SS by n. Second, we compute the correct sample variances for which SS is divided
by n - 1 to produce an unbiased measure of variance. You should verify our calculations
by computing one or two of the values for yourself. The complete set of sample means and
sample variances is presented in Table 4.1.
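
Table 4.1 itself is not reproduced in these notes, so the following is a minimal sketch of how its rows could be regenerated. It assumes the 9 samples are all ordered pairs drawn with replacement from the three distinct score values 0, 3, and 9, which matches the listing order described above and the totals quoted later in the example. The helper name `describe` is hypothetical.

```python
from itertools import product

values = [0, 3, 9]  # distinct score values in the population (assumed equally likely)
n = 2               # sample size

def describe(sample):
    """Return (mean, SS/n, SS/(n-1)) for one sample."""
    size = len(sample)
    mean = sum(sample) / size
    ss = sum((x - mean) ** 2 for x in sample)
    return mean, ss / size, ss / (size - 1)

# product() lists every pair with 0 first, then 3 first, then 9 first
samples = list(product(values, repeat=n))
rows = [(s, *describe(s)) for s in samples]

print("sample   mean   SS/n  SS/(n-1)")
for (first, second), mean, biased, unbiased in rows:
    print(f"({first}, {second})  {mean:5.2f}  {biased:5.2f}  {unbiased:8.2f}")
```

For instance, the sample (0, 3) has a mean of 1.5 and SS = 4.5, so dividing by n gives 2.25 while dividing by n - 1 gives 4.5.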

First, consider the column of biased sample variances, which were calculated by dividing
by n. These 9 sample variances add up to a total of 63, which produces an average value
of 63/9 = 7. The original population variance, however, is 14. Note that the average
of the sample variances is not equal to the population variance. If the sample variance is
computed by dividing by n, the resulting values will not produce an accurate estimate of
the population variance. On average, these sample variances underestimate the population
variance and, therefore, are biased statistics.

Next, consider the column of sample variances that are computed using n - 1. Although
the population has a variance of 14, you should notice that none of the samples has
a variance exactly equal to 14. However, if you consider the complete set of sample variances,
you will find that the 9 values add up to a total of 126, which produces an average
value of 126/9 = 14.00. Thus, the average of the sample variances is exactly equal to the
original population variance. On average, the sample variance (computed using n - 1) produces
an accurate, unbiased estimate of the population variance.
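
The two column totals discussed above (63 for division by n, 126 for division by n - 1) can be checked with a small sketch. As an assumption, the 9 samples are taken to be all ordered pairs of the values 0, 3, and 9 drawn with replacement; the function name `sum_of_squares` is illustrative.

```python
from itertools import product

values = [0, 3, 9]                          # assumed equally likely score values
samples = list(product(values, repeat=2))   # all 9 samples of size n = 2

def sum_of_squares(sample):
    """SS: sum of squared deviations from the sample mean."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample)

biased = [sum_of_squares(s) / len(s) for s in samples]          # SS / n
unbiased = [sum_of_squares(s) / (len(s) - 1) for s in samples]  # SS / (n - 1)

print(sum(biased), sum(biased) / len(samples))      # 63.0 7.0
print(sum(unbiased), sum(unbiased) / len(samples))  # 126.0 14.0
```

Dividing by n averages out to 7, half the population variance of 14, while dividing by n - 1 averages out to exactly 14 even though no single sample variance equals 14.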

Finally, direct your attention to the column of sample means. For this example, the
original population has a mean of μ = 4. Although none of the samples has a mean exactly
equal to 4, if you consider the complete set of sample means, you will find that the 9 sample
means add up to a total of 36, so the average of the sample means is 36/9 = 4. Note that the
average of the sample means is exactly equal to the population mean. Again, this is what
is meant by the concept of an unbiased statistic. On average, the sample values provide
an accurate representation of the population.
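
The same kind of check works for the column of sample means. This sketch again assumes the 9 samples are all ordered pairs of 0, 3, and 9 drawn with replacement; the variable names are illustrative.

```python
from itertools import product

values = [0, 3, 9]                          # assumed equally likely score values
samples = list(product(values, repeat=2))   # all 9 samples of size n = 2

sample_means = [sum(s) / len(s) for s in samples]
total = sum(sample_means)
average = total / len(sample_means)

print(total, average)  # 36.0 4.0
```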

In summary, both the sample mean and the sample variance (using n - 1) are examples
of unbiased statistics. This fact makes the sample mean and sample variance extremely valuable for use as inferential statistics. Although no individual sample is likely to have a mean
and variance exactly equal to the population values, both the sample mean and the sample
variance, on average, do provide accurate estimates of the corresponding population values.
