T test, Z test, or MWU test (Wilcoxon rank-sum test)?
The principles and code implementations of the T test, Z test, and MWU test, and how to choose among them
Posted by Chunfu Shawn on
2022/10/21
Last Updated by Chunfu Shawn on
2022/10/21
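I. T test
1. Concept
The t test, also known as Student's t test, is a hypothesis-testing tool for evaluating the means of one or two populations. It applies when the data are approximately normally distributed but the population variance is unknown. A t test can be used to assess whether a group differs from a known value (one-sample t test), whether two groups differ from each other (independent two-sample t test), or whether there is a significant difference between paired measurements (paired, or dependent-samples, t test).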
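The three variants described above can be sketched in R with `t.test`. This is a minimal illustration on simulated data; the vectors `group_a`, `group_b`, `before`, and `after` are hypothetical and not taken from any real dataset.

```r
set.seed(42)  # fixed seed so the simulated data are reproducible

# Hypothetical data: two independent groups and one before/after pair
group_a <- rnorm(20, mean = 5.0, sd = 1)
group_b <- rnorm(20, mean = 5.8, sd = 1)
before  <- rnorm(15, mean = 10, sd = 2)
after   <- before + rnorm(15, mean = 0.5, sd = 1)

# One-sample t test: does the mean of group_a differ from the known value 5?
one_sample <- t.test(group_a, mu = 5)

# Independent two-sample t test: do the two group means differ?
# (R applies the Welch correction unless var.equal = TRUE.)
two_sample <- t.test(group_a, group_b)

# Paired t test: is the mean of the within-pair differences zero?
paired <- t.test(before, after, paired = TRUE)

# Each result is an "htest" object; the p-value is one of its components
print(one_sample$p.value)
print(two_sample$p.value)
print(paired$p.value)
```

A paired t test is equivalent to a one-sample t test on the within-pair differences, so `t.test(before - after)` gives the same t statistic as `paired` here.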
(1) Description:
Performs one- and two-sample t tests on vectors of data.
(2) Usage:
    t.test(x, ...)

    # S3 method for default
    t.test(x, y = NULL,
           alternative = c("two.sided", "less", "greater"),
           mu = 0, paired = FALSE, var.equal = FALSE,
           conf.level = 0.95, ...)

    # S3 method for formula
    t.test(formula, data, subset, na.action, ...)
(3) Arguments:
x: a (non-empty) numeric vector of data values.

y: an optional (non-empty) numeric vector of data values.

alternative: a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.

mu: a number indicating the true value of the mean (or difference in means if you are performing a two sample test).

paired: a logical indicating whether you want a paired t-test.

var.equal: a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.

conf.level: confidence level of the interval.

formula: a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups.

data: an optional matrix or data frame (or similar: see [model.frame](https://www.rdocumentation.org/link/model.frame?package=stats&version=3.6.2)) containing the variables in formula. By default the variables are taken from environment(formula).

subset: an optional vector specifying a subset of observations to be used.

na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
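As a rough illustration of how these arguments fit together, the sketch below runs the same comparison through the formula interface and the default interface, then varies `alternative` and `conf.level`. The data frame `df` and its columns `value` and `group` are hypothetical names made up for this example.

```r
set.seed(1)  # fixed seed so the simulated data are reproducible

# Hypothetical long-format data: a numeric column and a two-level factor
df <- data.frame(
  value = c(rnorm(12, mean = 10), rnorm(12, mean = 11)),
  group = factor(rep(c("control", "treated"), each = 12))
)

# Formula interface: lhs is the numeric variable, rhs the two-level factor
res_formula <- t.test(value ~ group, data = df)

# Default interface: the same test on two explicit vectors
x <- df$value[df$group == "control"]
y <- df$value[df$group == "treated"]
res_default <- t.test(x, y)

# One-sided alternative: the first level ("control") has a smaller mean
res_less <- t.test(value ~ group, data = df, alternative = "less")

# A wider confidence interval for the difference in means
res_99 <- t.test(value ~ group, data = df, conf.level = 0.99)

print(res_formula$p.value)
print(res_less$p.value)
print(res_99$conf.int)
```

With the formula interface, the direction of a one-sided test follows the order of the factor levels, so it is worth checking `levels(df$group)` before choosing "less" or "greater".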
(4) Details
The formula interface is only applicable for the 2-sample tests.
alternative = "greater" is the alternative that x has a larger mean than y.

If paired is TRUE then both x and y must be specified and they must be the same length. Missing values are silently removed (in pairs if paired is TRUE).

If var.equal is TRUE then the pooled estimate of the variance is used. By default, if var.equal is FALSE then the variance is estimated separately for both groups and the Welch modification to the degrees of freedom is used.

If the input data are effectively constant (compared to the larger of the two means) an error is generated.
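The behaviours listed in these details can be checked with a small sketch. All numbers below are made-up illustrative values, not real measurements.

```r
# Hypothetical measurements for two independent groups
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 4.7, 5.5)
y <- c(6.2, 7.9, 5.1, 8.4, 6.6, 9.0, 4.8, 7.3)

# Default: separate variance estimates with Welch-modified degrees of freedom
welch <- t.test(x, y)

# var.equal = TRUE: pooled variance, df = length(x) + length(y) - 2
pooled <- t.test(x, y, var.equal = TRUE)

print(welch$parameter)   # Welch df, generally non-integer
print(pooled$parameter)  # pooled df

# With paired = TRUE, a pair is dropped if either member is NA
before <- c(10.2, NA, 9.8, 11.1, 10.5)
after  <- c(10.9, 10.4, NA, 11.8, 11.0)
paired <- t.test(before, after, paired = TRUE)
print(paired$parameter)  # df = number of complete pairs - 1

# Effectively constant input raises an error; capture its message
const_msg <- tryCatch(
  t.test(c(1, 1, 1, 1), mu = 0),
  error = function(e) conditionMessage(e)
)
print(const_msg)
```

The Welch degrees of freedom never exceed the pooled value, which is why the default is the more conservative choice when the two variances may differ.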
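II. Z test
1. Concept
Z distribution: this is the normal distribution. By the central limit theorem, and as sampling simulations show, when many samples of a fixed size n are drawn from a normally distributed population, the sample means are themselves normally distributed, i.e. $\bar{X} \sim N(\mu, \sigma^2/n)$. So, applying the Z transformation to the distribution of the sample mean also converts it to the standard normal distribution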