Introduction to dnorm, pnorm, qnorm, and rnorm for new biostatisticians

Today I was in Dan’s office hours and someone asked, “what is the equivalent in R of the back of the stats textbook table of probabilities and their corresponding Z-scores?” (This is an example of the kind of table the student was talking about.) This question indicated to me that although we’ve been asked to use some of the distribution functions in past homeworks, there may be some misunderstanding about how these functions work.

Right now I’m going to focus on the functions for the normal distribution, but you can find a list of all distribution functions by typing help(Distributions) into your R console.


dnorm

As we all know the probability density for the normal distribution is:

$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The function dnorm returns the value of the probability density function for the normal distribution given parameters for x, μ, and σ. Some examples of using dnorm are below:

# This is a comment. Anything I write after the octothorpe is not executed.
# This is the same as computing the pdf of the normal with x = 0, mu = 0 and
# sigma = 1. The dnorm function takes three main arguments, as do all of the
# *norm functions in R.
dnorm(0, mean = 0, sd = 1)
## [1] 0.3989423
# The line of code below does the same thing as the line of code above, since
# mean = 0 and sd = 1 are the default arguments for the dnorm function.
dnorm(0)
## [1] 0.3989423
# Another example of dnorm where parameters have been changed.
dnorm(2, mean = 5, sd = 3)
## [1] 0.08065691
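
As a quick sanity check, the density formula can be written out by hand and compared with dnorm. This is only a sketch: the helper name normal_pdf is my own invention and is not part of R.

```r
# Hypothetical helper (not part of base R): the normal pdf written out by hand.
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# Compare the hand-written formula with dnorm at a few points.
x <- c(-1, 0, 2)
by_hand <- normal_pdf(x, mu = 5, sigma = 3)
built_in <- dnorm(x, mean = 5, sd = 3)

# all.equal compares numbers up to a small floating-point tolerance.
isTRUE(all.equal(by_hand, built_in))
```

If the formula and dnorm agree, the last line evaluates to TRUE.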

Although x represents the independent variable of the pdf for the normal distribution, for the standard normal distribution (mean = 0, sd = 1) it’s also useful to think of x as a Z-score. Let me show you what I mean by graphing the pdf of the normal distribution with dnorm.

# First I'll make a vector of Z-scores
z_scores <- seq(-3, 3, by = .1)
# Let's print the vector
z_scores
##  [1] -3.0 -2.9 -2.8 -2.7 -2.6 -2.5 -2.4 -2.3 -2.2 -2.1 -2.0 -1.9 -1.8 -1.7
## [15] -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3
## [29] -0.2 -0.1  0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1
## [43]  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0  2.1  2.2  2.3  2.4  2.5
## [57]  2.6  2.7  2.8  2.9  3.0
# Let's make a vector of the values the function takes given those Z-scores.
# Remember for dnorm the default value for mean is 0 and for sd is 1.
dvalues <- dnorm(z_scores)
# Let's examine those values
dvalues
##  [1] 0.004431848 0.005952532 0.007915452 0.010420935 0.013582969
##  [6] 0.017528300 0.022394530 0.028327038 0.035474593 0.043983596
## [11] 0.053990967 0.065615815 0.078950158 0.094049077 0.110920835
## [16] 0.129517596 0.149727466 0.171368592 0.194186055 0.217852177
## [21] 0.241970725 0.266085250 0.289691553 0.312253933 0.333224603
## [26] 0.352065327 0.368270140 0.381387815 0.391042694 0.396952547
## [31] 0.398942280 0.396952547 0.391042694 0.381387815 0.368270140
## [36] 0.352065327 0.333224603 0.312253933 0.289691553 0.266085250
## [41] 0.241970725 0.217852177 0.194186055 0.171368592 0.149727466
## [46] 0.129517596 0.110920835 0.094049077 0.078950158 0.065615815
## [51] 0.053990967 0.043983596 0.035474593 0.028327038 0.022394530
## [56] 0.017528300 0.013582969 0.010420935 0.007915452 0.005952532
## [61] 0.004431848
# Now we'll plot these values
plot(dvalues, # Plot where y = values and x = index of the value in the vector
     xaxt = "n", # Don't label the x-axis
     type = "l", # Make it a line plot
     main = "pdf of the Standard Normal",
     xlab = "Z-score")
# These commands label the x-axis
axis(1, at = which(dvalues == dnorm(0)), labels = c(0))
axis(1, at = which(dvalues == dnorm(1)), labels = c(-1, 1))
axis(1, at = which(dvalues == dnorm(2)), labels = c(-2, 2))

As you can see, dnorm will give us the “height” of the pdf of the normal distribution at whatever Z-score we provide as an argument to dnorm.
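
The link between a raw value and its Z-score can also be checked numerically. The sketch below assumes an example distribution with mean 70 and sd 5 (values I picked for illustration). It shows that the height of that density at x equals the standard normal height at the Z-score of x, rescaled by 1 / sd.

```r
# Assumed example values: a normal distribution with mean 70 and sd 5.
mu <- 70
sigma <- 5
x <- c(60, 70, 82.5)

# Convert each x to a Z-score.
z <- (x - mu) / sigma

# Height of the N(70, 5) density at x ...
direct <- dnorm(x, mean = mu, sd = sigma)
# ... equals the standard normal height at z, rescaled by 1 / sigma.
via_z <- dnorm(z) / sigma

isTRUE(all.equal(direct, via_z))
```

Keep in mind that these heights are densities, not probabilities. Probabilities come from areas under the curve.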


pnorm

The function pnorm returns the integral from −∞ to q of the pdf of the normal distribution where q is a Z-score. Try to guess the value of pnorm(0). (pnorm has the same default mean and sd arguments as dnorm.)

# To be clear about the arguments in this example:
# q = 0, mean = 0, sd = 1
pnorm(0) 
## [1] 0.5
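
To see that pnorm really is the area under the pdf, one option is to integrate dnorm numerically with R's integrate function and compare. This is a rough sketch: the choice of q = 1.5 is arbitrary, and numerical integration is only accurate up to a small error.

```r
# Numerically integrate the standard normal pdf from -Inf up to q.
q <- 1.5
area <- integrate(dnorm, lower = -Inf, upper = q)

# integrate() returns a list; the estimate is stored in $value.
area$value
pnorm(q)

# The two agree up to the numerical integration error.
abs(area$value - pnorm(q)) < 1e-4
```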

The pnorm function also takes the argument lower.tail. If lower.tail is set equal to FALSE then pnorm returns the integral from q to ∞ of the pdf of the normal distribution. Note that pnorm(q) is the same as 1 - pnorm(q, lower.tail = FALSE).

pnorm(2)
## [1] 0.9772499
pnorm(2, mean = 5, sd = 3)
## [1] 0.1586553
pnorm(2, mean = 5, sd = 3, lower.tail = FALSE)
## [1] 0.8413447
1 - pnorm(2, mean = 5, sd = 3, lower.tail = FALSE)
## [1] 0.1586553
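
A common use of these areas is the probability that a value falls between two points, which is a difference of two pnorm calls. The sketch below reuses mean = 5 and sd = 3 from the calls above. The endpoints 2 and 8 are my own choice, picked because they sit one standard deviation either side of the mean.

```r
# Assumed example: X is normal with mean 5 and sd 3, as in the calls above.
mu <- 5
sigma <- 3

# P(2 < X < 8): area to the left of 8 minus area to the left of 2.
p_between <- pnorm(8, mean = mu, sd = sigma) - pnorm(2, mean = mu, sd = sigma)

# The same probability after converting both endpoints to Z-scores.
p_between_z <- pnorm((8 - mu) / sigma) - pnorm((2 - mu) / sigma)

p_between
p_between_z
```

Both values come out at roughly 0.68, the familiar "about 68% within one standard deviation" rule.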

pnorm is the function that replaces the table of probabilities and Z-scores at the back of the statistics textbook. Let’s take our vector of Z-scores from before (z_scores) and compute a new vector of cumulative probabilities using pnorm. Any guesses about what this plot will look like?

pvalues <- pnorm(z_scores)
# Now we'll plot these values
plot(pvalues, # Plot where y = values and x = index of the value in the vector
     xaxt = "n", # Don't label the x-axis
     type = "l", # Make it a line plot
     main = "cdf of the Standard Normal",
     xlab = "Quantiles",
     ylab = "Cumulative Probability")
# These commands label the x-axis
axis(1, at = which(pvalues == pnorm(-2)), labels = round(pnorm(-2), 2))
axis(1, at = which(pvalues == pnorm(-1)), labels = round(pnorm(-1), 2))
axis(1, at = which(pvalues == pnorm(0)), labels = c(.5))
axis(1, at = which(pvalues == pnorm(1)), labels = round(pnorm(1), 2))
axis(1, at = which(pvalues == pnorm(2)), labels = round(pnorm(2), 2))

It’s the plot of the cumulative distribution function of the normal distribution! Isn’t that neat?
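
Coming back to the question from office hours, here is one way to rebuild a small piece of the textbook Z-table with pnorm. Treat it as a sketch: the layout (rows for the first decimal of the Z-score, columns for the second decimal) is an assumption about how such tables are usually printed, and the names row_z, col_z, z_grid, and z_table are mine.

```r
# Row labels: Z-score to one decimal place. Column labels: the second decimal.
row_z <- seq(0, 0.5, by = 0.1)
col_z <- seq(0, 0.09, by = 0.01)

# outer() adds every row value to every column value, giving a grid of Z-scores.
z_grid <- outer(row_z, col_z, "+")

# pnorm is vectorized, so it fills in the whole table at once.
z_table <- round(pnorm(z_grid), 4)
dimnames(z_table) <- list(format(row_z, nsmall = 1), format(col_z, nsmall = 2))

z_table
```

Each cell holds the area to the left of the Z-score formed by adding its row and column labels. Extending row_z up to 3 or so would give a table closer to the full textbook version.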


qnorm

The qnorm function is simply the inverse of the cdf, which you can also think of as the inverse of pnorm! You can use qnorm to determine the answer to the question: What is the Z-score of the pth quantile of the normal distribution?

# What is the Z-score of the 50th percentile of the normal distribution?
qnorm(.5)
## [1] 0
# What is the Z-score of the 96th percentile of the normal distribution?
qnorm(.96)
## [1] 1.750686
# What is the Z-score of the 99th percentile of the normal distribution?
qnorm(.99)
## [1] 2.326348
# They're truly inverses!
pnorm(qnorm(0))
## [1] 0
qnorm(pnorm(0))
## [1] 0
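
One place qnorm shows up constantly is in finding critical values. The sketch below assumes a two-sided test at the 5% level, and an example distribution with mean 70 and sd 5 for the last two calls. Both choices are mine, for illustration only.

```r
# Two-sided 95% critical value: the Z-score with 2.5% of the area above it.
alpha <- 0.05
z_crit <- qnorm(1 - alpha / 2)
z_crit

# lower.tail = FALSE asks for the Z-score with the given area to its right.
qnorm(alpha / 2, lower.tail = FALSE)

# With mean and sd supplied, qnorm returns a value on the original scale,
# which is the same as mean + sd * (standard normal quantile).
# Assumed example values: mean = 70, sd = 5.
qnorm(0.9, mean = 70, sd = 5)
70 + 5 * qnorm(0.9)
```

The first two calls should both give the familiar value of about 1.96.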

Let’s plot qnorm and pnorm next to each other to further illustrate the fact that they are inverses.

# This is for getting two graphs next to each other.
# no.readonly = TRUE saves only the settings that can be restored later.
oldpar <- par(no.readonly = TRUE)
par(mfrow=c(1,2))
# Let's make a vector of quantiles: from 0 to 1 by increments of .05
quantiles <- seq(0, 1, by = .05)
quantiles
##  [1] 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65
## [15] 0.70 0.75 0.80 0.85 0.90 0.95 1.00
# Now we'll find the Z-score at each quantile
qvalues <- qnorm(quantiles)
qvalues
##  [1]       -Inf -1.6448536 -1.2815516 -1.0364334 -0.8416212 -0.6744898
##  [7] -0.5244005 -0.3853205 -0.2533471 -0.1256613  0.0000000  0.1256613
## [13]  0.2533471  0.3853205  0.5244005  0.6744898  0.8416212  1.0364334
## [19]  1.2815516  1.6448536        Inf
# Plot the Z-scores returned by qnorm
plot(qvalues,
     type = "l", # We want a line graph
     xaxt = "n", # No x-axis
     xlab = "Probability",
     ylab = "Z-scores")
# Same pnorm plot from before
plot(pvalues, # Plot where y = values and x = index of the value in the vector
     xaxt = "n", # Don't label the x-axis
     type = "l", # Make it a line plot
     main = "cdf of the Standard Normal",
     xlab = "Quantiles",
     ylab = "Cumulative Probability")
# These commands label the x-axis
axis(1, at = which(pvalues == pnorm(-2)), labels = round(pnorm(-2), 2))
axis(1, at = which(pvalues == pnorm(-1)), labels = round(pnorm(-1), 2))
axis(1, at = which(pvalues == pnorm(0)), labels = c(.5))
axis(1, at = which(pvalues == pnorm(1)), labels = round(pnorm(1), 2))
axis(1, at = which(pvalues == pnorm(2)), labels = round(pnorm(2), 2))

# Restore old plotting settings
par(oldpar)

rnorm

If you want to generate a vector of normally distributed random numbers, rnorm is the function you should use. The first argument n is the number of numbers you want to generate, followed by the standard mean and sd arguments. Let’s illustrate the weak law of large numbers using rnorm.

# set.seed is a function that takes a number as an argument and sets a seed from
# which random numbers are generated. It's important to set a seed so that your
# code is reproducible. If you wanted to you could always set your seed to the
# same number. I like to set seeds to the "date" which is really just
# the arithmetic equation "month minus day minus year". So today's seed
# is -2006.
set.seed(10-1-2015)
rnorm(5)
## [1] -0.7197035 -1.4442137 -1.0120381  1.4577066 -0.1212466
# If I set the seed to the same seed again, I'll generate the same vector of
# numbers.
set.seed(10-1-2015)
rnorm(5)
## [1] -0.7197035 -1.4442137 -1.0120381  1.4577066 -0.1212466
# Now onto using rnorm
# Let's generate three different vectors of random numbers from a normal
# distribution
n10 <- rnorm(10, mean = 70, sd = 5)
n100 <- rnorm(100, mean = 70, sd = 5)
n10000 <- rnorm(10000, mean = 70, sd = 5)
# Let's just look at one of the vectors
n10
##  [1] 54.70832 72.89000 70.27049 69.16508 72.97937 67.91004 67.77183
##  [8] 72.29231 74.33411 63.57151

Which histogram do you think will be most centered around the true mean of 70?

# This is for getting three graphs next to each other.
# no.readonly = TRUE saves only the settings that can be restored later.
oldpar <- par(no.readonly = TRUE)
par(mfrow=c(1,3))
# The breaks argument suggests roughly how many bars are in the histogram
hist(n10, breaks = 5)
hist(n100, breaks = 20)
hist(n10000, breaks = 100)

# Restore old plotting settings
par(oldpar)
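
The histograms give a visual impression, but the weak law of large numbers can also be looked at numerically by comparing sample means. This is a sketch under my own assumptions: the seed, the sample sizes, and the names sizes and sample_means are all illustrative choices.

```r
# Assumed seed, chosen only so the sketch is reproducible.
set.seed(2015)

# Sample sizes to compare.
sizes <- c(10, 100, 10000, 1000000)

# For each size, draw from N(70, 5) and record the sample mean.
sample_means <- sapply(sizes, function(n) mean(rnorm(n, mean = 70, sd = 5)))

# Distance of each sample mean from the true mean of 70.
data.frame(n = sizes, sample_mean = sample_means,
           abs_error = abs(sample_means - 70))
```

The abs_error column tends to shrink as n grows, although for any single seed it need not decrease at every step.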

Closing thoughts

These concepts generally hold true for all the distribution functions built into R. You can learn more about all of the distribution functions by typing help(Distributions) into the R console. If you have any questions about this demonstration or about R programming please send me an email. If you’d like to change or contribute to this document I welcome pull requests on GitHub. This document and all code contained within is licensed CC0.
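
To illustrate how the same d/p/q/r naming pattern carries over, here is a short sketch using the binomial, exponential, and uniform families from base R. The parameter values are arbitrary choices of mine.

```r
# Binomial: the cdf at 3 is the sum of the probability mass at 0, 1, 2, and 3.
# Assumed example: 10 trials with success probability 0.5.
pbinom(3, size = 10, prob = 0.5)
sum(dbinom(0:3, size = 10, prob = 0.5))

# Exponential: q* inverts p*, just as qnorm inverts pnorm.
# Assumed example: rate = 2.
qexp(pexp(1.2, rate = 2), rate = 2)

# Uniform: r* draws random values; here, 5 draws between 0 and 1.
set.seed(1)
u <- runif(5)
u
```

One difference worth noting: for discrete distributions such as the binomial, the d function returns a probability mass rather than a density.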

Reposted from: https://www.cnblogs.com/leezx/p/8635494.html
