1
00:00:00,480 --> 00:00:04,110
This is the final lesson we will do on testing.
2
00:00:04,110 --> 00:00:06,390
The last case we'll examine here is the one with
3
00:00:06,390 --> 00:00:09,090
independent samples and unknown variances,
4
00:00:09,090 --> 00:00:10,683
which are assumed to be equal.
5
00:00:12,060 --> 00:00:13,530
I'll quickly brush up your memory
6
00:00:13,530 --> 00:00:16,430
on the data set we did in the confidence interval section.
7
00:00:17,430 --> 00:00:19,620
You were trying to see if apples in New York
8
00:00:19,620 --> 00:00:22,560
are as expensive as the ones in LA.
9
00:00:22,560 --> 00:00:24,990
You went to 10 grocery shops in New York
10
00:00:24,990 --> 00:00:26,490
and your friend Paul,
11
00:00:26,490 --> 00:00:27,690
who lives in LA,
12
00:00:27,690 --> 00:00:29,613
went to eight grocery shops there.
13
00:00:30,750 --> 00:00:33,303
You got all the prices and put them in a table.
14
00:00:34,380 --> 00:00:37,080
You don't know what the population variance of apple prices is.
15
00:00:37,080 --> 00:00:40,053
But you assume it should be the same for New York and LA.
16
00:00:41,850 --> 00:00:44,823
Let's state the null and alternative hypotheses.
17
00:00:46,260 --> 00:00:50,730
H zero: Mu in New York is equal to Mu in LA.
18
00:00:50,730 --> 00:00:53,220
Or Mu in New York minus Mu in LA
19
00:00:53,220 --> 00:00:54,603
is equal to zero.
20
00:00:56,700 --> 00:01:01,350
H one: Mu in New York is different to Mu in LA.
21
00:01:01,350 --> 00:01:05,613
Mu in New York minus Mu in LA differs from zero.
22
00:01:07,290 --> 00:01:09,720
All right, that's our data set.
23
00:01:09,720 --> 00:01:11,970
We have also calculated the sample means,
24
00:01:11,970 --> 00:01:14,523
standard deviations and sample sizes.
25
00:01:15,900 --> 00:01:18,390
What can we do when the variance is unknown
26
00:01:18,390 --> 00:01:20,073
but assumed to be equal?
27
00:01:21,450 --> 00:01:24,450
Earlier, we used the pooled variance formula.
28
00:01:24,450 --> 00:01:26,820
Well, here it is again.
29
00:01:26,820 --> 00:01:27,653
Remember?
30
00:01:30,510 --> 00:01:33,150
All right, it's all about plugging in numbers
31
00:01:33,150 --> 00:01:35,550
so I'll save you the trouble.
32
00:01:35,550 --> 00:01:38,103
The pooled variance is 0.05.
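
The formula on screen is not captured in the subtitles, so here is a minimal sketch of the standard pooled variance calculation. The function name `pooled_variance` is hypothetical, and the standard deviations of 0.2 are rough stand-ins taken from the "around 0.2" remark later in the lesson, so the result will not match the 0.05 quoted above exactly.

```python
def pooled_variance(n_x, s_x, n_y, s_y):
    """Pooled sample variance for two independent samples.

    Weighted average of the two sample variances, with weights
    equal to each sample's degrees of freedom (n - 1).
    """
    return ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)


# Sample sizes come from the lesson: 10 shops in New York, 8 in LA.
# The standard deviations are placeholders (assumption), not the real data.
approx = pooled_variance(10, 0.2, 8, 0.2)
print(round(approx, 4))
```

With the exact sample standard deviations from the table, the same function should reproduce the value used in the lesson.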
33
00:01:40,440 --> 00:01:42,870
One last thing we need is the standard error
34
00:01:42,870 --> 00:01:44,460
of the difference of means.
35
00:01:44,460 --> 00:01:46,563
It is given by the following formula.
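
The formula itself is shown on screen rather than spoken. As a sketch, the usual standard error for a difference of means under equal variances is the square root of the pooled variance divided by each sample size, summed. The function name `standard_error` is hypothetical; the inputs are the pooled variance and sample sizes stated in the lesson.

```python
import math


def standard_error(pooled_var, n_x, n_y):
    """Standard error of the difference of two sample means,
    assuming both populations share the same variance."""
    return math.sqrt(pooled_var / n_x + pooled_var / n_y)


# Pooled variance 0.05, 10 shops in New York, 8 shops in LA.
se = standard_error(0.05, 10, 8)
print(round(se, 3))
```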
36
00:01:49,740 --> 00:01:51,270
I'm going faster than usual,
37
00:01:51,270 --> 00:01:53,160
as we've seen all of this before.
38
00:01:53,160 --> 00:01:55,710
Moreover, testing is about understanding;
39
00:01:55,710 --> 00:01:57,780
computation is routine.
40
00:01:57,780 --> 00:02:00,033
So, let's start testing, shall we?
41
00:02:01,560 --> 00:02:04,320
Small samples, variance unknown.
42
00:02:04,320 --> 00:02:06,060
Which statistic do we need?
43
00:02:06,060 --> 00:02:06,990
Exactly.
44
00:02:06,990 --> 00:02:08,793
It's the T statistic again.
45
00:02:10,139 --> 00:02:11,763
How many degrees of freedom?
46
00:02:12,900 --> 00:02:14,640
You may recall it from earlier.
47
00:02:14,640 --> 00:02:16,590
It was the combined sample size
48
00:02:16,590 --> 00:02:18,660
minus the number of variables.
49
00:02:18,660 --> 00:02:22,953
So 10 plus eight minus two, which gives us 16.
50
00:02:25,050 --> 00:02:27,213
Let's see the T statistic formula.
51
00:02:28,200 --> 00:02:31,020
Once again, the difference between sample means
52
00:02:31,020 --> 00:02:34,020
minus the difference between hypothesized true means
53
00:02:34,020 --> 00:02:35,793
divided by the standard error.
54
00:02:38,160 --> 00:02:39,690
After plugging in everything,
55
00:02:39,690 --> 00:02:42,573
we get a test statistic of 6.53.
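
The plugging-in step can be sketched as follows, using only the rounded figures quoted in the lesson. Which mean belongs to which city is an assumption here (3.94 for New York, 3.25 for LA); for a two-sided test the sign does not change the conclusion. Because the inputs are rounded, the result lands near 6.5 rather than exactly on 6.53.

```python
import math

# Summary figures quoted in the lesson (rounded).
mean_ny, mean_la = 3.94, 3.25   # assumption: 3.94 is New York, 3.25 is LA
n_ny, n_la = 10, 8
pooled_var = 0.05
hypothesized_diff = 0.0         # H0: mu_NY - mu_LA = 0

# Degrees of freedom: combined sample size minus two.
df = n_ny + n_la - 2

# Standard error of the difference of means under equal variances.
se = math.sqrt(pooled_var / n_ny + pooled_var / n_la)

# T statistic: (observed difference - hypothesized difference) / standard error.
t_stat = (mean_ny - mean_la - hypothesized_diff) / se

print(df, round(t_stat, 2))
```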
56
00:02:44,910 --> 00:02:46,233
Do we need to compare it?
57
00:02:47,580 --> 00:02:51,360
This is by far the most extreme test statistic we have seen.
58
00:02:51,360 --> 00:02:53,960
You will have a hard time finding it in the T table.
59
00:02:55,230 --> 00:02:57,030
For common tests, a rule of thumb is
60
00:02:57,030 --> 00:02:59,670
to reject the null hypothesis when the T-score is
61
00:02:59,670 --> 00:03:00,693
bigger than two.
62
00:03:02,160 --> 00:03:04,710
Generally, for Z-score and T-score
63
00:03:04,710 --> 00:03:07,863
a value that is higher than four is extremely significant.
64
00:03:10,140 --> 00:03:12,033
Let's see the two-sided P value.
65
00:03:13,260 --> 00:03:17,160
The P value of this test is lower than 0.001,
66
00:03:17,160 --> 00:03:20,643
somewhere around 0.000001.
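
A p-value this small is hard to read off a T table, so here is a minimal sketch that computes it numerically with the standard library only. It integrates the Student's T density with Simpson's rule; the function names `t_pdf` and `two_sided_p_value` are hypothetical, and the step count is an arbitrary choice.

```python
import math


def t_pdf(x, df):
    """Density of Student's T distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half


# Test statistic and degrees of freedom from the lesson.
p = two_sided_p_value(6.53, 16)
print(f"{p:.2e}")
```

If SciPy is available, `2 * scipy.stats.t.sf(6.53, 16)` is the more usual way to get the same quantity.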
67
00:03:21,900 --> 00:03:24,810
In our lesson about P value, we said that researchers
68
00:03:24,810 --> 00:03:28,080
are always looking for those three zeros after the dot.
69
00:03:28,080 --> 00:03:30,750
It means that the test is extremely significant
70
00:03:30,750 --> 00:03:33,150
and the probability of making a type one error
71
00:03:33,150 --> 00:03:34,443
is virtually zero.
72
00:03:35,760 --> 00:03:39,150
Therefore, we reject the null hypothesis at all common
73
00:03:39,150 --> 00:03:41,073
and uncommon levels of significance.
74
00:03:42,390 --> 00:03:44,700
There is strong statistical evidence that the price
75
00:03:44,700 --> 00:03:47,343
of apples in New York differs from that in LA.
76
00:03:49,800 --> 00:03:51,780
But such an extreme result may also mean
77
00:03:51,780 --> 00:03:55,230
that the hypothesis is pointless or poorly designed.
78
00:03:55,230 --> 00:03:59,130
From the mean values of 3.94 and 3.25, and
79
00:03:59,130 --> 00:04:03,330
with such small and close standard deviations of around 0.2,
80
00:04:03,330 --> 00:04:05,640
we could easily say that the prices are different.
81
00:04:05,640 --> 00:04:06,993
No testing needed.
82
00:04:08,400 --> 00:04:11,100
A much more interesting question would be if the price
83
00:04:11,100 --> 00:04:14,673
of apples in New York is 20% higher than that in LA.
84
00:04:16,079 --> 00:04:19,350
I will leave you this exercise for homework.
85
00:04:19,350 --> 00:04:23,130
All right, we are done with hypothesis testing.
86
00:04:23,130 --> 00:04:24,903
Cheers, and thanks for watching.