015 Variance

Instructor: Next on our to-do list are the measures of variability. There are many ways to quantify variability. However, we will focus on the most common ones: variance, standard deviation, and coefficient of variation.

In the field of statistics, we will typically use different formulas when working with population data and sample data. Let's think about this for a bit.

When you have the whole population, each data point is known, so you are 100% sure of the measures you are calculating. When you take a sample of this population and you compute a sample statistic, it is interpreted as an approximation of the population parameter. Moreover, if you extract 10 different samples from the same population, you will get 10 different measures. Statisticians have solved the problem by adjusting the algebraic formulas for many statistics to reflect this issue. Therefore, we will explore both population and sample formulas, as they are both used.
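The point about different samples giving different measures can be seen in a small simulation. This is only a sketch: the population (the integers 1 to 100), the sample size of 5, and the fixed seed are all assumptions made for illustration and do not come from the lecture.

```python
import random

# Hypothetical population: the integers 1 to 100 (not from the lecture).
population = list(range(1, 101))
population_mean = sum(population) / len(population)

# A fixed seed keeps the run repeatable; the seed value itself is arbitrary.
rng = random.Random(0)

sample_means = []
for _ in range(10):
    # Draw 5 observations without replacement and record their mean.
    sample = rng.sample(population, 5)
    sample_means.append(sum(sample) / len(sample))

print("population mean:", population_mean)
print("sample means:", sample_means)
```

Each sample mean is an approximation of the single population mean, and the ten approximations generally differ from one another.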
You must be asking yourself why the mean, median, and mode each have a single formula. Well, actually, the sample mean is the average of the sample data points, while the population mean is the average of the population data points. So technically, there are two different formulas, but they are computed in the same way.

Okay, now after this short clarification, it's time to get on to variance.

Variance measures the dispersion of a set of data points around their mean value. Population variance, denoted by sigma squared, is equal to the sum of squared differences between the observed values and the population mean, divided by the total number of observations.

Sample variance, on the other hand, is denoted by s squared and is equal to the sum of squared differences between the observed sample values and the sample mean, divided by the number of sample observations minus one.

All right. When you are getting acquainted with statistics, it is hard to grasp everything right away. Therefore, let's stop for a second to examine the formula for the population and try to clarify its meaning.
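Written out, the two definitions just described look as follows. The transcript only names sigma squared and s squared; the other symbols (N for the population size, n for the sample size, x_i for an observation, mu for the population mean, and x-bar for the sample mean) are conventional notation assumed here.

```latex
% Population variance: squared deviations from the population mean, divided by N
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Sample variance: squared deviations from the sample mean, divided by n - 1
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The numerators have the same form; only the denominator differs.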
The main part of the formula is its numerator, so that's what we want to comprehend: the sum of the squared differences between the observations and the mean.

Hmm, so the closer a number is to the mean, the lower the result we will obtain, right? And the further away from the mean it lies, the larger this difference. Easy.

But why do we elevate to the second degree? Squaring the differences has two main purposes.

First, by squaring the numbers, we always get non-negative values. Without going too deep into the mathematics of it, it is intuitive that dispersion cannot be negative. Dispersion is about distance, and distance cannot be negative. If, on the other hand, we calculate the differences and do not elevate them to the second degree, we would obtain both positive and negative values that, when summed, would cancel out, leaving us with no information about the dispersion.

Second, squaring amplifies the effect of large differences. For example, if the mean is zero and you have an observation of 100, the squared spread is 10,000.

All right, enough dry theory.
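Both purposes of squaring can be checked with a few lines of Python. This is a minimal sketch: the data set and the comparison observation of 10 are hypothetical values picked for illustration, while the observation of 100 with a mean of zero is the lecture's own example.

```python
# Hypothetical data chosen for illustration; it does not come from the lecture.
data = [2, 4, 4, 6, 9]
mean = sum(data) / len(data)

# Purpose 1: raw deviations cancel out, squared deviations do not.
raw = [x - mean for x in data]
squared = [d ** 2 for d in raw]

raw_sum = sum(raw)
squared_sum = sum(squared)

print("sum of raw deviations:", raw_sum)
print("sum of squared deviations:", squared_sum)

# Purpose 2: squaring amplifies large differences.
# The lecture's example uses a mean of 0 and an observation of 100.
far = (100 - 0) ** 2
# Hypothetical closer observation of 10, added for comparison.
near = (10 - 0) ** 2

print("squared spread at 100:", far)
print("squared spread at 10:", near)
```

An observation ten times farther from the mean contributes a hundred times as much to the numerator.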
It is time for a practical example.

We have a population of five observations: one, two, three, four, and five. Let's find its variance.

We start by calculating the mean: one plus two plus three plus four plus five, divided by five, equals three.

Then we apply the formula we just saw: one minus three, squared, plus two minus three, squared, plus three minus three, squared, plus four minus three, squared, plus five minus three, squared. The sum of these components is then divided by five. When we do the math, we get two. So the population variance of the data set is two.

But what about the sample variance? This would only be suitable if we were told that these five observations were a sample drawn from a population. So let's imagine that's the case.

The sample mean is, once again, three. The numerator is the same, but the denominator is going to be four instead of five, giving us a sample variance of 2.5.

To conclude the variance topic, we should interpret the result. Why is the sample variance bigger than the population variance?
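Before answering that question, the arithmetic of the worked example can be reproduced in Python. The two helper functions below are hypothetical names written for this sketch; the standard library's `statistics.pvariance` and `statistics.variance` are used only as a cross-check.

```python
import statistics

data = [1, 2, 3, 4, 5]  # the five observations from the lecture


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


pop_var = population_variance(data)
samp_var = sample_variance(data)

print("population variance:", pop_var)   # 2.0
print("sample variance:", samp_var)      # 2.5

# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```

The numerator is 10 in both cases; dividing by 5 gives 2, and dividing by 4 gives 2.5.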
In the first case, we knew the population. That is, we had all the data, and we calculated the variance. In the second case, we were told that one, two, three, four, and five was a sample drawn from a bigger population.

Imagine the population of this sample were these ten numbers: one, one, one, two, three, four, five, five, five, and five. Clearly the numbers are the same, but there is a concentration around the two extremes of the data set, one and five. The variance of this population is 2.96.

So our sample variance has rightly been corrected upwards in order to reflect the higher potential variability. This is the reason why there are different formulas for sample and population data.

This was a very important lesson, so please make sure that you have understood it well. You can reinforce what you learned here by doing the exercise available in the course resources section. Remember, the subject of statistics is only understood when practiced.

Thanks for watching.
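The comparison with the imagined ten-number population can be checked in the same way. This is a sketch of one specific case taken from the lecture, not a general proof that dividing by n minus one always lands closer to the true value.

```python
import statistics

# The imagined population from the lecture (ten numbers).
population = [1, 1, 1, 2, 3, 4, 5, 5, 5, 5]
# The sample drawn from it.
sample = [1, 2, 3, 4, 5]

true_var = statistics.pvariance(population)   # divides by N = 10
divide_by_n = statistics.pvariance(sample)    # sample treated as a population
divide_by_n_minus_1 = statistics.variance(sample)

print("population variance:", round(true_var, 2))
print("sample, dividing by n:", divide_by_n)
print("sample, dividing by n - 1:", divide_by_n_minus_1)

# How far each sample-based figure is from the population variance.
gap_n = abs(true_var - divide_by_n)
gap_n_minus_1 = abs(true_var - divide_by_n_minus_1)
print("gaps:", round(gap_n, 2), round(gap_n_minus_1, 2))
```

For this population, the figure obtained by dividing by n minus one (2.5) is closer to 2.96 than the figure obtained by dividing by n (2).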
