014 Test for the mean. Independent Samples (Part 2)

This is the final lesson we will do on testing. The last case we'll examine here is the one with independent samples and unknown variances, which are assumed to be equal.

I'll quickly refresh your memory of the data set we used in the confidence interval section. You were trying to see if apples in New York are as expensive as the ones in LA. You went to 10 grocery shops in New York, and your friend Paul, who lives in LA, went to eight grocery shops there. You got all the prices and put them in a table. You don't know what the population variance of apple prices is, but you assume it should be the same for New York and LA.

Let's state the null and alternative hypotheses.

H zero: Mu in New York is equal to Mu in LA, or Mu in New York minus Mu in LA is equal to zero.

H one: Mu in New York is different from Mu in LA, or Mu in New York minus Mu in LA differs from zero.

All right, that's our data set. We have also calculated the sample means, standard deviations and sample sizes.
What can we do when the variance is unknown but assumed to be equal? Earlier, we used the pooled variance formula. Well, here it is again. Remember?

All right, it's all about plugging in numbers, so I'll save you the trouble. The pooled variance is 0.05.

One last thing we need is the standard error of the difference of means. It is given by the following formula.

I'm going faster than usual, as we've seen all of this before. Moreover, testing is about understanding; computation is routine. So, let's start testing, shall we?

Small samples, variance unknown. Which statistic do we need? Exactly. It's the T statistic again.

How many degrees of freedom? You may recall it from earlier. It was the combined sample size minus the number of variables. So 10 plus eight minus two, which gives us 16.

Let's see the T statistic formula. Once again, the difference between sample means minus the difference between hypothesized true means, divided by the standard error.
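The pooled variance, standard error, degrees of freedom, and T statistic described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the raw price table is not reproduced in the transcript, and the inputs are the rounded summary figures quoted in the lesson (means of 3.94 and 3.25, pooled variance 0.05). Assigning 3.94 to New York is an assumption. Because the inputs are rounded, the result lands near 6.5 rather than exactly on the figure quoted in the lesson.

```python
import math


def pooled_variance(n1, s1, n2, s2):
    """Standard pooled sample variance for two independent samples
    whose population variances are assumed equal."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def standard_error(pooled_var, n1, n2):
    """Standard error of the difference of two sample means."""
    return math.sqrt(pooled_var / n1 + pooled_var / n2)


def t_statistic(mean1, mean2, se, hypothesized_diff=0.0):
    """(difference of sample means - hypothesized difference) / standard error."""
    return ((mean1 - mean2) - hypothesized_diff) / se


# Rounded summary figures quoted in the lesson.
n_ny, n_la = 10, 8               # sample sizes
mean_ny, mean_la = 3.94, 3.25    # assumption: 3.94 is the New York mean
pooled_var = 0.05                # rounded pooled variance from the lesson

df = n_ny + n_la - 2             # 10 + 8 - 2
se = standard_error(pooled_var, n_ny, n_la)
t = t_statistic(mean_ny, mean_la, se)

print(f"degrees of freedom: {df}")
print(f"standard error:     {se:.4f}")
print(f"T statistic:        {t:.2f}")
```

With the unrounded sample standard deviations, `pooled_variance` would supply the value that is hard-coded here as 0.05.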
After plugging in everything, we get a test statistic of 6.53.

Do we need to compare it? This is by far the most extreme test statistic we have seen. You will have a hard time finding it in the T table. For common tests, a rule of thumb is to reject the null hypothesis when the T-score is bigger than two. Generally, for Z-scores and T-scores, a value that is higher than four is extremely significant.

Let's see the two-sided P value. The P value of this test rounds to 0.000; it is somewhere around 0.000001.

In our lesson about P value, we said that researchers are always looking for those three zeros after the dot. It means that the test is extremely significant and the probability of making a type one error is virtually zero.

Therefore, we reject the null hypothesis at all common and uncommon levels of significance. There is strong statistical evidence that the price of apples in New York differs from that in LA.

But such an extreme result may also mean that the hypothesis is pointless or poorly designed.
From the mean values of 3.94 and 3.25, and with such small and close standard deviations of around 0.2, we could easily say that the prices are different. No testing needed.

A much more interesting question would be whether the price of apples in New York is 20% higher than that in LA. I will leave you this exercise for homework.

All right, we are done with hypothesis testing. Cheers, and thanks for watching.
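The two-sided P value mentioned in the lesson can be checked without a T table. The sketch below is a standard-library stand-in for a statistics package (a routine such as `scipy.stats.t.sf` would be the usual choice): it integrates the Student's t density numerically with Simpson's rule. The function names and the number of integration steps are assumptions made for this example, not part of the lesson.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=20000):
    """P(|T| > |t_stat|), computed as 1 - 2 * integral of the density
    from 0 to |t_stat| using Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2   # Simpson weights: 4 on odd, 2 on even nodes
        total += weight * t_pdf(i * h, df)
    half_central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * half_central_area)


# Test statistic and degrees of freedom quoted in the lesson.
p = two_sided_p_value(6.53, 16)
print(f"two-sided P value: {p:.2e}")
```

The result is far below 0.001, which is why the null hypothesis is rejected at every common significance level. As a sanity check, a T-score of about 2.12 with 16 degrees of freedom gives a two-sided P value of about 0.05, in line with the rule of thumb about T-scores bigger than two.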
