4. IMPORTANT: Correlation vs. Causation

All right. So as we talk more about AI, machine learning, and statistical modeling, it's important to remember that these tools can be incredibly powerful when they're used appropriately, but incredibly dangerous when they aren't.

Now, the thing about tools like this Key Influencers visual is that they're great because they help make machine learning accessible to everyday users, but a little scary because those same users often lack the foundational knowledge to understand what's happening behind the curtain, how to properly interpret the results, or how to make intelligent decisions based on them.

So, speaking of interpreting results, I think this is a good time to take a step back, take a pause, and review one of the most important rules in statistics and analytics: correlation does not imply causation.

Now, I'm sure many of you have heard this before, especially if you work in data or analytics. But let's take two minutes and break this down. Correlation is when two variables, X and Y, move together, kind of like this: they move in the same direction. Causation, on the other hand, is when one variable, X, causes variable Y. In other words, there's a clear cause-and-effect relationship here.
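To put a number on "move together," a common choice is the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below is only an illustration: the function name `pearson_r` and the sample values are made up for this example and do not come from the lesson.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: y tends to rise whenever x rises.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
print(f"r = {r:.3f}")  # close to +1: the two variables move together
```

A value near +1 only says that the two series rise and fall together. It says nothing about whether X causes Y, whether Y causes X, or whether something else drives both.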
What if I were to show you a scatter plot like this, where we're plotting violent crime rate on the y-axis and some mystery variable on the x-axis? Based on these 25 or 30 observations, we've got a very tight correlation, a pretty clear linear relationship between the two variables. So clearly they move in the same direction. Clearly they're correlated. But you might be tempted to think that this x-axis variable is the driver behind violent crimes, that it's causing violent crimes, and that if only we could cut back on whatever this variable is, we might be able to make our streets safer.

The problem with that is that we're looking at ice cream cones sold. You may be scratching your head. You may be a little confused, and that's totally understandable, because the human brain is biased to look for cause-and-effect relationships where in fact they don't exist.

So if you're still wondering what's going on here and how this could possibly be true, here's a hint: our y-axis, violent crime rate, could just as easily be drowning deaths, or forest fires, or even the dreaded crab attack. So think about what those things have in common, and you'll probably start to realize that this has nothing to do with ice cream at all and everything to do with temperature.

As temperatures rise, you have more people out later, gathering in public spaces.
So crime rates increase. You also have more people going to the beach and swimming in the ocean, so drowning deaths increase, and so on and so forth. So because ice cream sales are such a close proxy for temperature, we've created a false narrative that paints a completely misleading story.

So, key takeaways here: ice cream does not turn you into a violent criminal. It does not make you drown. It does not start forest fires, and it certainly does not encourage crabs to attack you. Now, obviously these are silly examples, but the core principle holds true, and it's a really important one to keep in mind.

I'll leave you with one more real-world business case here. Consider a scenario like this: maybe you run a startup. You've been live for about four months, and you're plotting your weekly marketing spend, which you've been ramping up, against your total revenue. Now, if you were to infer causation based on this chart, you might think that ramping up your marketing spend is a surefire way to drive more revenue. And that may be the case. It may be true, but it also may not. So the idea is you've got to think about the other factors that might be at play here over this three- or four-month period.
Maybe you've also been ramping up a new sales team, or maybe your organic traffic has been growing due to referrals or PR or something like that.

So, bottom line here: be thoughtful about how you interpret these results, and please use caution before you make big decisions based on these findings.
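The ice cream example can be reproduced with a small simulation. This is a rough sketch under made-up assumptions: every coefficient, noise level, and variable name below is hypothetical, chosen only so that temperature drives both series while ice cream sales have no effect on crime at all. It then removes the linear effect of temperature from each series (a simple partial correlation) to see what is left.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 500

# Hypothetical data-generating process: temperature is the only shared driver.
temperature = [rng.uniform(0, 35) for _ in range(n)]
ice_cream = [20 + 3.0 * t + rng.gauss(0, 10) for t in temperature]
crime = [50 + 1.5 * t + rng.gauss(0, 8) for t in temperature]

# Raw correlation: strong, even though ice cream never enters the crime formula.
raw_r = pearson_r(ice_cream, crime)

# Partial correlation: correlate what remains after accounting for temperature.
partial_r = pearson_r(residuals(ice_cream, temperature),
                      residuals(crime, temperature))

print(f"raw correlation:             {raw_r:.2f}")
print(f"controlling for temperature: {partial_r:.2f}")
```

In this setup the raw correlation comes out strongly positive, and the correlation after controlling for temperature lands close to zero. The same check applies to the startup scenario: before crediting marketing spend for revenue growth, it is worth asking what else changed over the same period (a new sales team, organic traffic) and whether the relationship survives once those factors are accounted for.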
