002 Types of Probability Distributions_en (transcript)

Hello again. In this lecture, we are going to talk about various types of probability distributions and what kinds of events they can be used to describe.

Certain distributions share features, so we group them into types. Some, like rolling a die or picking a card, have a finite number of outcomes. They follow Discrete distributions, and we use the formulas we already introduced to calculate their probabilities and expected values. Others, like recording time and distance in track and field, have infinitely many outcomes. They follow Continuous distributions, and we use different formulas from the ones we have mentioned so far.

Throughout this video, we are going to examine the characteristics of some of the most common distributions. For each one, we will focus on an important aspect of it or on when it is used.

Before we get into the specifics, you will need to know the notation we use when defining distributions. We start by writing down the variable name for our set of values, followed by the tilde sign. After that comes a capital letter denoting the type of the distribution, and then some characteristics of the data set in parentheses. The characteristics are usually the mean and variance, but they may vary depending on the type of the distribution.
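For example, X ~ N(μ, σ²) reads as "X follows a Normal distribution with mean μ and variance σ²". The short Python sketch below shows one way this notation maps to code, assuming SciPy is available; the specific numbers are arbitrary placeholders rather than values from the lecture.

# A minimal sketch of the notation X ~ N(mu, sigma^2) in code, assuming SciPy.
# The numbers are placeholders chosen purely for illustration.
from scipy import stats

mu = 500            # mean
variance = 100      # variance; note that SciPy's norm expects the standard deviation
sigma = variance ** 0.5

X = stats.norm(loc=mu, scale=sigma)   # X ~ N(500, 100)

print(X.mean())     # 500.0
print(X.var())      # 100.0
print(X.std())      # 10.0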
All right, let us start by talking about the discrete ones. We will give an overview of them, and then we will devote a separate lecture to each one.

So far we have looked at problems relating to drawing cards from a deck or flipping a coin. Both examples show events where all outcomes are equally likely. Such outcomes are called equiprobable, and these sorts of events follow a Uniform distribution.

Then there are events with only two possible outcomes, true or false. They follow a Bernoulli distribution. Regardless of whether one outcome is more likely to occur, any event with two outcomes can be transformed into a Bernoulli event: we simply assign one of them to be true and the other one to be false. Imagine we are required to elect a captain for our college sports team. The team consists of seven domestic students and three international students. We assign the captain being domestic to be true and the captain being international to be false. Since the outcome can now only be true or false, we have a Bernoulli distribution.

Now, if we carry out a similar experiment several times in a row, we are dealing with a Binomial distribution. Just like with the Bernoulli distribution, each iteration has only two possible outcomes, but now we have many iterations. For example, we could be flipping the coin we mentioned earlier three times and trying to calculate the likelihood of getting heads twice.

Lastly, we should mention the Poisson distribution. We use it when we want to test how unusual an event frequency is for a given interval. For example, imagine we know that so far LeBron James has averaged 35 points per game during the regular season. We want to know how likely it is that he will score 12 points in the first quarter of his next game. Since the frequency changes, so should our expectations for the outcome. Using the Poisson distribution, we are able to determine the chance of LeBron scoring exactly 12 points for the specified time interval.
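Here is a short Python sketch of these last two examples, again assuming SciPy. The coin-flip numbers come straight from the example above; the first-quarter scoring rate of 35 / 4 = 8.75 points is a simplifying assumption of ours, since the lecture does not say how the per-game average scales down to a single quarter.

# Worked Binomial and Poisson examples, assuming SciPy.
from scipy import stats

# Binomial: X ~ B(n = 3, p = 0.5), i.e. three fair coin flips.
# Probability of getting heads exactly twice:
p_two_heads = stats.binom.pmf(k=2, n=3, p=0.5)
print(p_two_heads)        # 0.375, i.e. 3/8

# Poisson: assume the 35 points-per-game average spreads evenly over four
# quarters, giving a first-quarter rate of 35 / 4 = 8.75 (our assumption).
rate_first_quarter = 35 / 4
p_twelve_points = stats.poisson.pmf(k=12, mu=rate_first_quarter)
print(p_twelve_points)    # roughly 0.067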
Great. Now on to the Continuous distributions. One thing to remember is that since we are dealing with continuous outcomes, the probability distribution would be a curve, as opposed to unconnected individual bars.

The first one we will talk about is the Normal distribution. The outcomes of many events in nature closely resemble this distribution, hence the name normal. For instance, according to numerous reports throughout the last few decades, the weight of an adult male polar bear is usually around 500 kilograms. However, there have been records of individual specimens weighing anywhere between 350 kilograms and 700 kilograms. Extreme values like 350 and 700 are called outliers, and they do not feature very frequently in Normal distributions.

Sometimes we have limited data for events that resemble a Normal distribution. In those cases, we use the Student's T distribution. It serves as a small-sample approximation of a Normal distribution. Another difference is that the Student's T accommodates extreme values significantly better. Graphically, that is represented by the curve having fatter tails, which results in a larger number of values located far away from the mean. Now, imagine only looking at the recorded weights of the last 10 sightings across Alaska and Canada. The lower number of elements would make the occurrence of any extreme value represent a much bigger part of the population than it should, so the curve would more closely resemble a Student's T distribution than a Normal distribution.
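To put numbers on the "fatter tails" idea, here is a brief Python sketch, assuming SciPy, that compares how much probability a standard Normal and a Student's T distribution place more than two units away from the mean. The choice of 9 degrees of freedom loosely echoes the ten-sighting example and is purely illustrative.

# Tail mass of a standard Normal versus a Student's T distribution, assuming SciPy.
# df = 9 loosely echoes the "last 10 sightings" example; it is an illustrative
# choice, not a value given in the lecture.
from scipy import stats

cutoff = 2.0  # distance from the mean, in standardized units

normal_tail = 2 * stats.norm.sf(cutoff)   # P(|Z| > 2) under the standard Normal
t_tail = 2 * stats.t.sf(cutoff, df=9)     # P(|T| > 2) under Student's T with 9 d.f.

print(round(normal_tail, 4))  # about 0.0455
print(round(t_tail, 4))       # roughly 0.077, noticeably more tail mass than the Normal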
Good job everyone.

Another Continuous distribution we would like to introduce is the Chi-Squared distribution. It is the first asymmetric Continuous distribution we are dealing with, as it only consists of non-negative values. Graphically, that means the Chi-Squared distribution always starts from zero on the left. Depending on the average and maximum values within the set, the curve of the Chi-Squared graph is typically skewed to the right. Unlike the previous two distributions, the Chi-Squared does not often mirror real-life events. However, it is often used in hypothesis testing to help determine goodness of fit.

The next distribution on our list is the Exponential distribution. The Exponential distribution is usually present when we are dealing with events that change rapidly early on. An easy-to-understand example is how online news articles generate hits: they get most of their clicks while the topic is still fresh. The more time passes, the more irrelevant the topic becomes, and interest dies off.

The last Continuous distribution we will mention is the Logistic distribution. We often find it useful in forecast analysis, when we try to determine a cutoff point for a successful outcome. For instance, take a competitive esport like Dota 2. We can use the Logistic distribution to determine how much of an in-game advantage at the 10-minute mark is necessary to confidently predict victory for either team. Just like with other types of forecasting, our predictions would never reach true certainty, but more on that later.

Whoa, good job, folks. In the next video, we are going to focus on Discrete distributions. We will introduce formulas for computing expected values and standard deviations before looking into each distribution individually. Thanks for watching.
