Catalog

* History
* Use
* Assumptions
* Unpaired and paired two-sample t-tests
* Independent (unpaired) samples
* Paired samples
* Calculation
* One-sample t-test
* Slope of a regression line
* Independent two-sample t-test
* Alternatives to the t-test for location problems
* Multivariate testing
* Algorithm implementation
* Example

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis.

A t-test is most commonly applied when the test statistic would follow a normal distribution <https://en.wikipedia.org/wiki/Normal_distribution> if the value of a scaling term <https://en.wikipedia.org/wiki/Scale_parameter> in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data <https://en.wikipedia.org/wiki/Data>, the test statistic (under certain conditions) follows a Student's t-distribution. The t-test can be used, for example, to determine whether two sets of data are significantly <https://en.wikipedia.org/wiki/Statistical_significance> different from each other.

 

History

The statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness brewery in Dublin, Ireland. "Student" was his pen name.

Gosset was hired owing to Claude Guinness's policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness's industrial processes. Gosset devised the t-test as an economical way to monitor the quality of stout. The t-test work was submitted to the journal Biometrika and published in 1908. Company policy at Guinness forbade its chemists from publishing their findings, so Gosset published his statistical work under the pen name "Student" (for a detailed history of this pseudonym, see Student's t-distribution; it should not be confused with the literal term "student").

Guinness had a policy of allowing technical staff leave for study (so-called "study leave"), which Gosset used during the first two terms of the 1906-1907 academic year in Professor Karl Pearson's Biometric Laboratory at University College London. Gosset's identity was then known to fellow statisticians and to editor-in-chief Karl Pearson.

 

Use

The most frequently used t-tests are:

* A one-sample location test of whether the mean of a population has a value specified in a null hypothesis.
* A two-sample location test of the null hypothesis that the means of two populations are equal. All such tests are usually called Student's t-tests, though strictly speaking that name should only be used if the variances of the two populations are also assumed to be equal; the form of the test used when this assumption is dropped is sometimes called Welch's t-test. These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping.[8]
* A test of the null hypothesis that the difference between two responses measured on the same statistical unit has a mean value of zero. For example, suppose we measure the size of a cancer patient's tumor before and after a treatment. If the treatment is effective, we expect the tumor size to decrease for many of the patients after treatment. This is often referred to as the "paired" or "repeated measures" t-test:[8][9] see paired difference test.
* A test of whether the slope of a regression line differs significantly from 0.
 

Assumptions

Most t-test statistics have the form t = Z/s, where Z and s are functions of the data. Typically, Z is designed to be sensitive to the alternative hypothesis (that is, its magnitude tends to be larger when the alternative hypothesis is true), whereas s is a scaling parameter that allows the distribution of t to be determined.

For example, in the one-sample t-test

    t = Z/s = (X̄ − μ) / (σ̂ / √n)

where X̄ is the sample mean of a sample X₁, X₂, …, Xₙ of size n, s = σ̂/√n is the standard error of the mean, σ̂ is the estimate of the population standard deviation σ of the data, and μ is the population mean.

 

The assumptions underlying a t-test are:

* X̄ follows a normal distribution with mean μ and variance σ²/n;
* s²(n − 1)/σ² follows a χ² distribution with n − 1 degrees of freedom (more generally, ps² follows a χ² distribution with p degrees of freedom under the null hypothesis, where p is a positive constant);
* Z and s are independent.
 

In a specific type of t-test, these conditions are consequences of the population being studied, and of the way in which the data are sampled. For example, in the t-test comparing the means of two independent samples, the following assumptions should be met:

* Each of the two populations being compared should follow a normal distribution. This can be tested using a normality test, such as the Shapiro-Wilk or Kolmogorov-Smirnov test, or it can be assessed graphically using a normal quantile plot.
* If using Student's original definition of the t-test, the two populations being compared should have the same variance (testable using the F-test, Levene's test, Bartlett's test, or the Brown-Forsythe test; or assessable graphically using a Q-Q plot). If the sample sizes in the two groups being compared are equal, Student's original t-test is highly robust to the presence of unequal variances. Welch's t-test is insensitive to equality of the variances regardless of whether the sample sizes are similar.
* The data used to carry out the test should be sampled independently from the two populations being compared. This is in general not testable from the data, but if the data are known to be sampled dependently (that is, as clusters), then the classical t-tests discussed here can give misleading results.

Most two-sample t-tests are robust to all but large deviations from these assumptions.
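These assumption checks are easy to run in Python. The sketch below (the simulated normal samples and the use of scipy.stats.shapiro and scipy.stats.levene are illustrative choices, not part of the original text) tests each group for normality and the pair of groups for equal variances:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=5.0, scale=10.0, size=200)  # sample from population 1
b = rng.normal(loc=5.0, scale=10.0, size=200)  # sample from population 2

# Shapiro-Wilk normality test for each group: a small p-value is
# evidence against normality.
_, p_norm_a = stats.shapiro(a)
_, p_norm_b = stats.shapiro(b)

# Levene's test for equal variances: a small p-value is evidence
# against the equal-variance assumption.
_, p_var = stats.levene(a, b)

print(p_norm_a, p_norm_b, p_var)
```

Large p-values here would leave the normality and equal-variance assumptions unchallenged; they do not prove the assumptions hold.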

 

Unpaired and paired two-sample t-tests


Two-sample t-tests for a difference in means involve either independent samples (unpaired samples) or paired samples. Paired t-tests are a form of blocking, and have greater power than unpaired tests when the paired units are similar with respect to "noise factors" that are independent of membership in the two groups being compared. In a different context, paired t-tests can be used to reduce the effects of confounding factors in an observational study.

 

Independent (unpaired) samples

The independent samples t-test is used when two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared. For example, suppose we are evaluating the effect of a medical treatment, and we enroll 100 subjects into our study, then randomly assign 50 subjects to the treatment group and 50 subjects to the control group. In this case, we have two independent samples and would use the unpaired form of the t-test. The randomization is not essential here: if we contacted 100 people by phone and obtained each person's age and gender, and then used a two-sample t-test to see whether the mean ages differ by gender, this would also be an independent samples t-test, even though the data are observational.

 

Paired samples

Paired samples t-tests typically consist of a sample of matched pairs of similar units, or one group of units that has been tested twice (a "repeated measures" t-test).

A typical example of the repeated measures t-test would be where subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure-lowering medication. By comparing the same patients' numbers before and after treatment, we are effectively using each patient as their own control. That way the correct rejection of the null hypothesis (here: of no difference made by the treatment) can become much more likely, with statistical power increasing simply because the random inter-patient variation has now been eliminated. However, the increase in statistical power comes at a price: more tests are required, since each subject has to be tested twice. Because half of the sample now depends on the other half, the paired version of Student's t-test has only n/2 − 1 degrees of freedom (with n being the total number of observations). Pairs become individual test units, and the sample has to be doubled to achieve the same number of degrees of freedom. Normally, there are n − 1 degrees of freedom (with n being the total number of observations).

A paired samples t-test based on a "matched-pairs sample" results from an unpaired sample that is subsequently used to form a paired sample, by using additional variables that were measured along with the variable of interest. The matching is carried out by identifying pairs of values consisting of one observation from each of the two samples, where the pair is similar in terms of the other measured variables. This approach is sometimes used in observational studies to reduce or eliminate the effects of confounding factors.

Paired samples t-tests are often referred to as "dependent samples t-tests".

 

Calculation


Explicit expressions that can be used to carry out various t-tests are given below. In each case, the formula for a test statistic that either exactly follows or closely approximates a t-distribution under the null hypothesis is given. Also, the appropriate degrees of freedom are given in each case. Each of these statistics can be used to carry out either a one-tailed or a two-tailed test.

Once the t value and degrees of freedom are determined, a p-value can be found using a table of values from Student's t-distribution. If the calculated p-value is below the threshold chosen for statistical significance (usually the 0.10, the 0.05, or the 0.01 level), then the null hypothesis is rejected in favor of the alternative hypothesis.

One-sample t-test

In testing the null hypothesis that the population mean is equal to a specified value μ₀, one uses the statistic

    t = (x̄ − μ₀) / (s / √n)

where x̄ is the sample mean, s is the sample standard deviation <https://en.wikipedia.org/wiki/Standard_deviation#Estimation> of the sample, and n is the sample size. The degrees of freedom used in this test are n − 1.

Although the parent population does not need to be normally distributed, the distribution of the population of sample means x̄ is assumed to be normal. By the central limit theorem <https://en.wikipedia.org/wiki/Central_limit_theorem>, if the sampling of the parent population is independent and the second moment of the parent population exists, then the sample mean will be approximately normal in the limit of large samples. (The degree of approximation depends on how close the parent population is to a normal distribution, and on the sample size n.)
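As a quick illustration (the data below are made up), the one-sample formula can be computed by hand and checked against scipy.stats.ttest_1samp:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements; test H0: population mean equals mu0 = 50.
x = np.array([49.2, 50.8, 47.9, 52.3, 51.1, 48.5, 50.2, 49.6])
mu0 = 50.0

n = x.size
# t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation
t_manual = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
df = n - 1  # degrees of freedom

# The same test via SciPy for comparison.
t_scipy, p_value = stats.ttest_1samp(x, mu0)
print(t_manual, t_scipy, df)
```

The two t values agree; SciPy additionally returns the two-tailed p-value.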

 

Slope of a regression line

Suppose one is fitting the model

    Y = α + βx + ε

where x is known, α and β are unknown, ε is a normally distributed random variable with mean 0 and unknown variance σ², and Y is the outcome of interest. We want to test the null hypothesis that the slope β is equal to some specified value β₀ (often taken to be 0, in which case the null hypothesis is that x and y are uncorrelated).

Let α̂ and β̂ be the least-squares estimators, and let SE(β̂) be the standard error of the slope estimator. Then

    t = (β̂ − β₀) / SE(β̂)

has a t-distribution with n − 2 degrees of freedom if the null hypothesis is true.

The standard error of the slope coefficient

    SE(β̂) = √( (1/(n − 2)) · Σᵢ (yᵢ − ŷᵢ)² ) / √( Σᵢ (xᵢ − x̄)² )

can be written in terms of the residuals

    ε̂ᵢ = yᵢ − ŷᵢ = yᵢ − (α̂ + β̂xᵢ),    SSR = Σᵢ ε̂ᵢ²  (the sum of squared residuals).

Then t is given by:

    t = (β̂ − β₀) √(n − 2) / √( SSR / Σᵢ (xᵢ − x̄)² )

Another way to determine t (when β₀ = 0) is:

    t = r √(n − 2) / √(1 − r²)

where r is the Pearson correlation coefficient.

The t-score for the intercept can be determined from the t-score for the slope:

    t_intercept = (α̂ / β̂) · t_slope / √(s_x² + x̄²)

where s_x² = (1/n) Σᵢ (xᵢ − x̄)² is the sample variance.
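These formulas can be seen in action with scipy.stats.linregress, whose stderr field is the standard error of the slope (the (x, y) data below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical (x, y) data; test H0: slope beta = 0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

res = stats.linregress(x, y)      # least-squares fit of y = alpha + beta*x
t_slope = res.slope / res.stderr  # t = (beta_hat - 0) / SE(beta_hat)
df = x.size - 2                   # n - 2 degrees of freedom

# Equivalent expression through the Pearson correlation coefficient r:
r = res.rvalue
t_from_r = r * np.sqrt(df) / np.sqrt(1.0 - r * r)
print(t_slope, t_from_r, df)
```

Both routes yield the same t value, as the algebra above predicts.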

 

 

Independent two-sample t-test

Equal sample sizes, equal variance

Given two groups (1, 2), this test is only applicable when:

* the two sample sizes (that is, the number n of participants in each group) are equal;
* it can be assumed that the two distributions have the same variance.

Violations of these assumptions are discussed below.

The t statistic to test whether the means are different can be calculated as follows:

    t = (x̄₁ − x̄₂) / (s_p √(2/n))

where

    s_p = √( (s₁² + s₂²) / 2 )

Here n = n₁ = n₂, s_p is the pooled standard deviation, and s₁² and s₂² are the unbiased estimators of the variances of the two samples. The denominator of t is the standard error of the difference between the two means.

For significance testing, the degrees of freedom for this test is 2n − 2, where n is the number of participants in each group.

 

Equal or unequal sample sizes, equal variance

This test is used only when it can be assumed that the two distributions have the same variance. (When this assumption is violated, see below.) Note that the previous formulas are a special case of the formulas below; one recovers them when both samples are equal in size: n = n₁ = n₂.

The t statistic to test whether the means are different can be calculated as follows:

    t = (x̄₁ − x̄₂) / (s_p · √(1/n₁ + 1/n₂))

where

    s_p = √( ((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2) )

is an estimator of the pooled standard deviation of the two samples: it is defined in this way so that its square is an unbiased estimator of the common variance, whether or not the population means are the same. In these formulas, nᵢ − 1 is the number of degrees of freedom for each group, and the total sample size minus two (that is, n₁ + n₂ − 2) is the total number of degrees of freedom, which is used in significance testing.
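A minimal sketch (with simulated data) computing the pooled-variance t statistic by hand and comparing it with scipy.stats.ttest_ind, which applies this formula when equal_var=True (the default):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x1 = rng.normal(loc=5.0, scale=10.0, size=40)  # group 1
x2 = rng.normal(loc=5.0, scale=10.0, size=55)  # group 2 (different size)

n1, n2 = x1.size, x2.size
# Pooled standard deviation: its square is an unbiased estimate of the
# common variance.
sp = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1))
             / (n1 + n2 - 2))
t_manual = (x1.mean() - x2.mean()) / (sp * np.sqrt(1.0 / n1 + 1.0 / n2))

# SciPy's pooled-variance test for comparison.
t_scipy, p_value = stats.ttest_ind(x1, x2)
print(t_manual, t_scipy)
```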

 

Equal or unequal sample sizes, unequal variances

This test, also known as Welch's t-test, is used only when the two population variances are not assumed to be equal (the two sample sizes may or may not be equal), and hence the variances must be estimated separately. The t statistic to test whether the population means are different is calculated as:

    t = (x̄₁ − x̄₂) / s_Δ̄

where

    s_Δ̄ = √( s₁²/n₁ + s₂²/n₂ )

Here sᵢ² is the unbiased estimator of the variance of each of the two samples, with nᵢ = the number of participants in group i (1 or 2). Note that in this case s_Δ̄² is not a pooled variance. For use in significance testing, the distribution of the test statistic is approximated as an ordinary Student's t-distribution with the degrees of freedom calculated as

    d.f. = (s₁²/n₁ + s₂²/n₂)² / ( (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) )

This is known as the Welch-Satterthwaite equation. The true distribution of the test statistic actually depends (slightly) on the two unknown population variances (see Behrens-Fisher problem).
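The Welch statistic and the Welch-Satterthwaite degrees of freedom can likewise be computed by hand (a sketch with simulated data, not a definitive implementation) and checked against scipy.stats.ttest_ind with equal_var=False:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x1 = rng.normal(loc=5.0, scale=10.0, size=30)  # smaller variance
x2 = rng.normal(loc=5.0, scale=25.0, size=45)  # larger variance

# Per-group variance of the mean: s_i^2 / n_i
v1 = x1.var(ddof=1) / x1.size
v2 = x2.var(ddof=1) / x2.size

t_welch = (x1.mean() - x2.mean()) / np.sqrt(v1 + v2)

# Welch-Satterthwaite approximation for the degrees of freedom:
df = (v1 + v2) ** 2 / (v1 ** 2 / (x1.size - 1) + v2 ** 2 / (x2.size - 1))

# SciPy's Welch test for comparison.
t_scipy, p_value = stats.ttest_ind(x1, x2, equal_var=False)
print(t_welch, t_scipy, df)
```

Note that the approximate degrees of freedom are generally not an integer.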

 

Dependent t-test for paired samples

This test is used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures), or when there are two samples that have been matched or "paired". This is an example of a paired difference test.

    t = (X̄_D − μ₀) / (s_D / √n)

For this equation, the differences between all pairs must be calculated. The pairs are either one person's pre-test and post-test scores, or pairs of persons matched into meaningful groups (for instance, drawn from the same family or age group: see table). The average (X̄_D) and standard deviation (s_D) of those differences are used in the equation. The constant μ₀ is zero if we want to test whether the average of the differences is significantly different from zero. The degrees of freedom used is n − 1, where n represents the number of pairs.
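A small sketch with made-up before/after measurements shows that this formula agrees with scipy.stats.ttest_rel:

```python
import numpy as np
from scipy import stats

# Hypothetical systolic blood pressure for 10 subjects, before and after
# treatment (made-up numbers for illustration).
before = np.array([148., 142., 136., 155., 160., 141., 149., 152., 138., 144.])
after  = np.array([140., 138., 135., 148., 151., 139., 141., 150., 132., 138.])

d = after - before               # per-pair differences
n = d.size                       # number of pairs; df = n - 1
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# SciPy's paired test on the same data.
t_scipy, p_value = stats.ttest_rel(after, before)
print(t_manual, t_scipy)
```

Equivalently, a one-sample t-test on the differences d against μ₀ = 0 gives the same result.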



 

Alternatives to the t-test for location problems

The t-test provides an exact test for the equality of the means of two normal populations with unknown, but equal, variances. (Welch's t-test is a nearly exact test for the case where the data are normal but the variances may differ.) For moderately large samples and a one-tailed test, the t-test is relatively robust to moderate violations of the normality assumption.

For exactness, the t-test and z-test require normality of the sample means, and the t-test additionally requires that the sample variance follow a scaled χ² distribution, and that the sample mean and sample variance be statistically independent. Normality of the individual data values is not required if these conditions are met. By the central limit theorem, sample means of moderately large samples are often well-approximated by a normal distribution even if the data are not normally distributed. For non-normal data, however, the distribution of the sample variance may deviate substantially from a χ² distribution. If the sample size is large, Slutsky's theorem implies that the distribution of the sample variance has little effect on the distribution of the test statistic; but if the data are substantially non-normal and the sample size is small, the t-test can give misleading results. For theory related to one particular family of non-normal distributions, see Location testing for Gaussian scale mixture distributions.

When the normality assumption does not hold, a nonparametric alternative to the t-test can often have better statistical power. Similarly, in the presence of outliers, the t-test is not robust. For example, for two independent samples, when the data distributions are asymmetric (that is, the distributions are skewed) or the distributions have large tails, the Wilcoxon rank-sum test (also known as the Mann-Whitney U test) can have three to four times higher power than the t-test. The nonparametric counterpart of the paired samples t-test is the Wilcoxon signed-rank test for paired samples. For a discussion on choosing between the t-test and nonparametric alternatives, see Sawilowsky (2005).

One-way analysis of variance (ANOVA) generalizes the two-sample t-test when the data belong to more than two groups.
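For illustration, the nonparametric alternatives mentioned above are available in scipy.stats as mannwhitneyu and wilcoxon. The heavy-tailed simulated data below is an assumption chosen to mimic the large-tail setting where rank tests can outperform the t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Heavy-tailed samples (Student's t with 2 degrees of freedom); the second
# group is shifted in location.
g1 = rng.standard_t(df=2, size=50)
g2 = rng.standard_t(df=2, size=50) + 1.0

# Wilcoxon rank-sum / Mann-Whitney U test for independent samples.
u_stat, p_u = stats.mannwhitneyu(g1, g2, alternative="two-sided")

# Wilcoxon signed-rank test, the paired-sample counterpart.
before = rng.standard_t(df=2, size=30)
after = before + 0.5 + rng.standard_t(df=2, size=30)
w_stat, p_w = stats.wilcoxon(before, after)
print(p_u, p_w)
```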

 

Multivariate testing

A generalization of Student's t statistic, called Hotelling's t-squared statistic, allows the testing of hypotheses on multiple (often correlated) measures within the same sample. For instance, a researcher might submit a number of subjects to a personality test consisting of multiple personality scales (e.g. the Minnesota Multiphasic Personality Inventory). Because measures of this type are usually positively correlated, it is not advisable to conduct separate univariate t-tests, as these would neglect the covariance among measures and inflate the chance of falsely rejecting at least one hypothesis (Type I error). In this case a single multivariate test is preferable for hypothesis testing. One approach is Fisher's method for combining multiple tests, with alpha reduced for the positive correlation among tests. Another is Hotelling's T² statistic, which follows a T² distribution. In practice, however, the T² distribution is rarely used directly, since tabulated values for T² are hard to find; usually, T² is converted instead to an F statistic.

For a one-sample multivariate test, the hypothesis is that the mean vector (μ) is equal to a given vector (μ₀). The test statistic is Hotelling's t²:

    t² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀)

where n is the sample size, x̄ is the vector of column means, and S is the m × m sample covariance matrix.

For a two-sample multivariate test, the hypothesis is that the mean vectors (μ₁, μ₂) of the two samples are equal. The test statistic is Hotelling's two-sample t²:

    t² = (n₁ n₂ / (n₁ + n₂)) (x̄₁ − x̄₂)ᵀ S_pooled⁻¹ (x̄₁ − x̄₂)
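Hotelling's one-sample t² is straightforward to compute directly with NumPy. The sketch below (simulated data; the F conversion uses the standard relation F = (n − m) t² / (m(n − 1)) with (m, n − m) degrees of freedom) is an illustration, not a definitive implementation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, m = 50, 3  # n observations of m correlated measures
cov = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
X = rng.multivariate_normal(mean=[0.0, 0.0, 0.0], cov=cov, size=n)
mu0 = np.zeros(m)  # H0: the mean vector equals mu0

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)  # m x m sample covariance matrix

# t^2 = n (xbar - mu0)' S^{-1} (xbar - mu0)
diff = xbar - mu0
t2 = n * diff @ np.linalg.solve(S, diff)

# Convert to an F statistic with (m, n - m) degrees of freedom.
F = (n - m) / (m * (n - 1)) * t2
p_value = stats.f.sf(F, m, n - m)
print(t2, F, p_value)
```

Using np.linalg.solve instead of explicitly inverting S is the usual numerically preferable choice.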



 

Algorithm implementation

The relevant function in SciPy, Python's scientific computing library:

scipy.stats.ttest_ind(a, b, axis=0, equal_var=True)

Calculates the T-test for the means of two independent samples of scores.

This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.

Parameters:

a, b : array_like
    The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).

axis : int or None, optional
    Axis along which to compute the test. If None, compute over the whole arrays, a and b.

equal_var : bool, optional
    If True (default), perform a standard independent 2-sample test that assumes equal population variances [1]. If False, perform Welch's t-test, which does not assume equal population variance [2].
    New in version 0.11.0.

nan_policy : {'propagate', 'raise', 'omit'}, optional
    Defines how to handle when input contains nan. 'propagate' returns nan, 'raise' throws an error, 'omit' performs the calculations ignoring nan values. Default is 'propagate'.

Returns:

statistic : float or array
    The calculated t-statistic.

pvalue : float or array
    The two-tailed p-value.

 

Example

from scipy import stats
import numpy as np
np.random.seed(12345678)

Test with samples with identical means:

rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
rvs2 = stats.norm.rvs(loc=5, scale=10, size=500)
stats.ttest_ind(rvs1, rvs2)
#(0.26833823296239279, 0.78849443369564776)
stats.ttest_ind(rvs1, rvs2, equal_var=False)
#(0.26833823296239279, 0.78849452749500748)

ttest_ind underestimates p for unequal variances:

rvs3 = stats.norm.rvs(loc=5, scale=20, size=500)
stats.ttest_ind(rvs1, rvs3)
#(-0.46580283298287162, 0.64145827413436174)
stats.ttest_ind(rvs1, rvs3, equal_var=False)
#(-0.46580283298287162, 0.64149646246569292)

When n1 != n2, the equal-variance t-statistic is no longer equal to the unequal-variance t-statistic:

rvs4 = stats.norm.rvs(loc=5, scale=20, size=100)
stats.ttest_ind(rvs1, rvs4)
#(-0.99882539442782481, 0.3182832709103896)
stats.ttest_ind(rvs1, rvs4, equal_var=False)
#(-0.69712570584654099, 0.48716927725402048)

T-test with different means, variances, and n:

rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)
stats.ttest_ind(rvs1, rvs5)
#(-1.4679669854490653, 0.14263895620529152)
stats.ttest_ind(rvs1, rvs5, equal_var=False)
#(-0.94365973617132992, 0.34744170334794122)
 

References:

https://en.wikipedia.org/wiki/Student%27s_t-test

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html#scipy.stats.ttest_ind