014 Test for the mean. Independent Samples (Part 2)

This is the final lesson we will do on testing. The last case we'll examine here is the one with independent samples and unknown variances, which are assumed to be equal.

I'll quickly refresh your memory of the data set we used in the confidence interval section. You were trying to see if apples in New York are as expensive as the ones in LA. You went to 10 grocery shops in New York, and your friend Paul, who lives in LA, went to eight grocery shops there. You got all the prices and put them in a table. You don't know what the population variance of apple prices is, but you assume it should be the same for New York and LA.

Let's state the null and alternative hypotheses.

H zero: Mu in New York is equal to Mu in LA. Or, Mu in New York minus Mu in LA is equal to zero.

H one: Mu in New York is different from Mu in LA. Or, Mu in New York minus Mu in LA differs from zero.

All right, that's our data set. We have also calculated the sample means, standard deviations and sample sizes.
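
In symbols, the two hypotheses stated above can be written as follows. The subscripts NY and LA are labels chosen here for illustration and are not necessarily the notation used on the slides.

```latex
\begin{aligned}
H_0 &: \mu_{NY} = \mu_{LA} \quad\Longleftrightarrow\quad \mu_{NY} - \mu_{LA} = 0 \\
H_1 &: \mu_{NY} \neq \mu_{LA} \quad\Longleftrightarrow\quad \mu_{NY} - \mu_{LA} \neq 0
\end{aligned}
```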

What can we do when the variance is unknown but assumed to be equal? Earlier, we used the pooled variance formula. Well, here it is again. Remember?

All right, it's all about plugging in numbers, so I'll save you the trouble. The pooled variance is 0.05.

One last thing we need is the standard error of the difference of means. It is given by the following formula.

I'm going faster than usual, as we've seen all of this before. Moreover, testing is about understanding; computation is routine. So, let's start testing, shall we?

Small samples, variance unknown. Which statistic do we need? Exactly. It's the T statistic again.

How many degrees of freedom? You may recall it from earlier. It was the combined sample size minus the number of variables. So 10 plus eight minus two, which gives us 16.

Let's see the T statistic formula. Once again, it is the difference between sample means minus the hypothesized difference between true means, divided by the standard error.
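
The formulas shown on screen are not captured in the subtitles. As a sketch, the standard textbook forms of the quantities just described are given below. The notation is an assumption and may differ from the slides: x denotes the New York sample, y the LA sample, n the sample sizes, s squared the sample variances, and mu zero the hypothesized difference between the true means (zero here).

```latex
\begin{aligned}
s_p^2 &= \frac{(n_x - 1)\,s_x^2 + (n_y - 1)\,s_y^2}{n_x + n_y - 2}
  && \text{(pooled variance)} \\[4pt]
SE &= \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}
  && \text{(standard error of the difference of means)} \\[4pt]
\text{d.f.} &= n_x + n_y - 2 = 10 + 8 - 2 = 16 \\[4pt]
T &= \frac{(\bar{x} - \bar{y}) - \mu_0}{SE}
  && \text{(T statistic)}
\end{aligned}
```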

After plugging in everything, we get a test statistic of 6.53.

Do we need to compare it? This is by far the most extreme test statistic we have seen. You will have a hard time finding it in the T table. For common tests, a rule of thumb is to reject the null hypothesis when the T-score is bigger than two. Generally, for Z-scores and T-scores, a value that is higher than four is extremely significant.

Let's see the two-sided P value. The P value of this test is lower than 0.000, somewhere around 0.000001. In our lesson about P value, we said that researchers are always looking for those three zeros after the dot. It means that the test is extremely significant and the probability of making a type one error is virtually zero.

Therefore, we reject the null hypothesis at all common and uncommon levels of significance. There is strong statistical evidence that the price of apples in New York differs from that in LA.

But such an extreme result may also mean that the hypothesis is pointless or poorly designed.
From the mean values of 3.94 and 3.25, and with such small and close standard deviations of around 0.2, we could easily say that the prices are different. No testing needed.

A much more interesting question would be whether the price of apples in New York is 20% higher than that in LA. I will leave you this exercise for homework.

All right, we are done with hypothesis testing. Cheers, and thanks for watching.
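
As a rough check on the numbers quoted in this lesson, the following Python sketch recomputes the standard error, the T statistic and the two-sided P value from the rounded summary figures (means 3.94 and 3.25, pooled variance 0.05, sample sizes 10 and 8). It assumes that 3.94 is the New York mean and 3.25 the LA mean, which the lesson implies but does not state outright. The variable and function names are illustrative, and the P value is obtained by numerically integrating the t density rather than by any method shown in the course. Because the inputs are rounded, the result lands near 6.5 rather than exactly on the quoted 6.53, and the P value can likewise differ somewhat from the quoted figure while still being far below 0.001.

```python
import math

# Summary figures quoted in the lesson (rounded as spoken).
mean_ny, mean_la = 3.94, 3.25   # assumption: first mean is New York, second is LA
n_ny, n_la = 10, 8              # number of grocery shops visited in each city
pooled_var = 0.05               # pooled variance quoted in the lesson
hypothesized_diff = 0.0         # H0: mu_NY - mu_LA = 0

# Degrees of freedom: combined sample size minus two.
df = n_ny + n_la - 2

# Standard error of the difference of means under equal variances.
std_error = math.sqrt(pooled_var / n_ny + pooled_var / n_la)

# T statistic: (observed difference - hypothesized difference) / standard error.
t_stat = ((mean_ny - mean_la) - hypothesized_diff) / std_error


def t_pdf(x, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)


def two_sided_p(t, nu, width=100.0, steps=20000):
    """Two-sided p-value: Simpson's rule on the upper tail [|t|, |t| + width].

    The tail beyond |t| + width is ignored; for moderate nu it is negligible.
    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = width / steps
    total = t_pdf(a, nu) + t_pdf(a + width, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, nu)
    return 2 * total * h / 3


p_value = two_sided_p(t_stat, df)
print(f"df = {df}, SE = {std_error:.4f}, T = {t_stat:.2f}, p = {p_value:.2e}")
```

If SciPy is available, `scipy.stats.t.sf` would give the tail probability directly; the hand-rolled integration is used here only to keep the sketch free of third-party dependencies.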