03 - Analyze data using variance and standard deviation

- [Instructor] In this movie, I will show you how to calculate two important measures: variance and standard deviation. Variance is one measure of error, that is, of the distance of values from the mean. Standard deviation is the square root of the variance, and as we'll see elsewhere in this course, it's extremely useful. My sample file is 01_03_Variance and you can find it in the Chapter01 folder of the Exercise Files collection.

In this workbook, I have a dataset in cells A4 through A13. And in it, I have values that reflect orders for a company. I've already calculated the mean or average in cell C1.

To start, I need to calculate the error or distance from the mean of each of my order values. So I'll click in cell B4 and I'll type equal. And I want to start with the value in A4, which is the order, and I will subtract the value in C1, which is the average. So if the order is lower than the average, then we'll get a negative value.
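
For readers following along outside Excel, the error calculation can be sketched in Python. The order values below are made up for illustration (the workbook's actual data in A4:A13 is not given in the transcript), and the variable names are assumptions.

```python
# Hypothetical order values standing in for cells A4:A13.
orders = [1200, 3400, 5600, 7800, 9500]

# Counterpart of the average stored in C1.
mean = sum(orders) / len(orders)

# Counterpart of "A4 minus C1" copied down column B:
# each order minus the same fixed mean.
errors = [order - mean for order in orders]

print(mean)    # 5500.0
print(errors)  # [-4300.0, -2100.0, 100.0, 2300.0, 4000.0]
```

Orders below the mean produce negative errors, as the instructor notes.
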
I do want the cell reference to A4 to change but I don't want C1 to change 'cause we'll always be looking at the same average. So I'll press F4 to get an absolute reference. Press Enter, and I get 3,090. And we're rounding to whole numbers here.

Now I can copy that formula to the rest of the range. So I'll click cell B4 and double-click the fill handle at the bottom right corner of the cell and you can see that the formula has been copied down.

Now to calculate the variance accurately, I need to square the error. So in cell C4, I'll type equal and that will just be the value in B4 squared. So I'll type a caret, indicating I want an exponent, and then the number two. Press Enter, and I get that value there. I'll click cell C4, double-click the fill handle again, and copy it down.

One very common question is why you need to square the error to calculate variance. Well, let's see what happens if you don't. I'll go to cell B14 and then I will add up all of the errors. So I'll press Alt + equal to create an AutoSum formula.
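
The squaring step, and the sum of raw errors being set up in B14, can be sketched the same way in Python. This is a minimal illustration with made-up order values, not the workbook's data.

```python
# Hypothetical order values standing in for cells A4:A13.
orders = [1200, 3400, 5600, 7800, 9500]
mean = sum(orders) / len(orders)
errors = [order - mean for order in orders]

# Counterpart of squaring B4 with the caret and copying down column C.
squared_errors = [error ** 2 for error in errors]

# Counterpart of the AutoSum in B14: positive and negative errors cancel.
sum_of_errors = sum(errors)

print(squared_errors)  # [18490000.0, 4410000.0, 10000.0, 5290000.0, 16000000.0]
print(sum_of_errors)   # 0.0
```

With other data, floating-point rounding can leave a tiny nonzero remainder instead of exactly 0.0.
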
And then press Tab and you can see that the sum of the errors is zero. And that's because when you take an average, the sum of all the differences from it must be zero. That happens by rule.

So I will replace the formula in cell C14 and go ahead and click that cell if you happen to click away. Press Alt + equal to get my AutoSum for the values from C4 to C13, Enter, and there is that sum.

Now I can calculate the variance. So I'll go to cell C16 and I'll type an equal sign and that will be C14, which again is the sum of my squared errors. And then a forward slash for division, left parenthesis, and it'll be a count for the number of values in C4 through C13 minus one. Then type a right parenthesis and Enter. And I get a variance of 7,100,000 and a little bit more.

Now if I want to calculate the standard deviation, I can go to cell C18 and take the square root of the value in C16. So I'll type equal and square root, SQRT, and that's the value in C16.
Right parenthesis and Tab so I don't scroll down in the workbook and I get a standard deviation of 2,666. Great, that value makes a lot of sense for a dataset that has a mean of 5,482.6 and a maximum value of about 9,500.

There are quicker ways to calculate the variance and standard deviation. So I'll go ahead and show those to you now. I'll click in cell F16, type an equal sign and I will use the VAR.S function. So I'll type VAR.S and we're assuming that we have a sample and not all of the data in a population. This is where the -1 comes in. We are assuming that we don't have all of our data so that we're dividing by a slightly smaller number than we would otherwise, so we have a slightly larger, more conservative estimate. So I'll type a left parenthesis and we are taking the variance of the sample in cells A4 to A13. Right parenthesis and Enter. And as you can see, we get 7,106,604.
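
The manual calculation and the built-in shortcut can be compared in Python as well. This sketch uses made-up order values and the standard library's `statistics` module: `statistics.variance` divides by n - 1, like VAR.S, while `statistics.pvariance` divides by n, as for a full population.

```python
import statistics

# Hypothetical order values standing in for cells A4:A13.
orders = [1200, 3400, 5600, 7800, 9500]
n = len(orders)
mean = sum(orders) / n

# Sum of squared errors, the counterpart of the total in C14.
sum_sq = sum((order - mean) ** 2 for order in orders)

# Sample variance: sum of squared errors divided by (count - 1).
sample_var = sum_sq / (n - 1)
# Population variance: the same sum divided by the full count.
population_var = sum_sq / n

print(sample_var)                   # 11050000.0
print(population_var)               # 8840000.0
print(statistics.variance(orders))  # sample variance from the library
```

Dividing by n - 1 gives the larger of the two values, which is the more conservative estimate the instructor describes.
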
And if I go down to cell F18, and type equal STDEV.S, again for the sample, and A4 through A13, right parenthesis and Tab, then we get a standard deviation of 2,666.

I wanted to show you the mechanics of how variance and standard deviation are calculated so you have an idea of what's going on when Excel makes those calculations for you. With variance and standard deviation in mind, you will get a lot out of the rest of the course.
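
As a quick check on the numbers in the movie, the square root of the reported variance, 7,106,604, rounds to the reported standard deviation of 2,666. The sketch below also shows the Python counterpart of the STDEV.S shortcut, using made-up order values rather than the workbook's data.

```python
import math
import statistics

# Variance reported in the transcript (the VAR.S result).
reported_variance = 7_106_604
print(round(math.sqrt(reported_variance)))  # 2666

# Hypothetical order values standing in for cells A4:A13.
orders = [1200, 3400, 5600, 7800, 9500]

# statistics.stdev is the sample standard deviation (n - 1 divisor).
sample_sd = statistics.stdev(orders)

# It agrees with the square root of the sample variance.
print(math.isclose(sample_sd, math.sqrt(statistics.variance(orders))))  # True
```
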
