1
00:00:03,210 --> 00:00:04,770
Instructor: Hello again.
2
00:00:04,770 --> 00:00:05,992
When we started this section of the course
3
00:00:05,992 --> 00:00:08,880
we mentioned how some events have infinitely
4
00:00:08,880 --> 00:00:11,160
many consecutive outcomes.
5
00:00:11,160 --> 00:00:13,800
We call such distributions continuous
6
00:00:13,800 --> 00:00:16,623
and they differ vastly from discrete ones.
7
00:00:17,550 --> 00:00:20,880
For starters, their sample space is infinite.
8
00:00:20,880 --> 00:00:22,560
Therefore we cannot record the
9
00:00:22,560 --> 00:00:25,440
frequency of each distinct value.
10
00:00:25,440 --> 00:00:27,690
Thus we can no longer represent
11
00:00:27,690 --> 00:00:29,340
these distributions with a table.
12
00:00:30,240 --> 00:00:33,093
What we can do is represent them with a graph,
13
00:00:35,602 --> 00:00:37,110
more precisely the graph of the
14
00:00:39,177 --> 00:00:40,973
probability density function, or PDF, for short.
15
00:00:42,712 --> 00:00:45,720
We denote it as lowercase f of Y, where Y is an element
16
00:00:45,720 --> 00:00:46,983
of the sample space.
17
00:00:48,090 --> 00:00:50,430
As the name suggests, the function depicts
18
00:00:50,430 --> 00:00:53,672
the associated probability for every possible value, Y.
19
00:00:53,672 --> 00:00:57,420
Since it expresses probability
20
00:00:57,420 --> 00:00:59,340
the value it associates with an element
21
00:00:59,340 --> 00:01:02,913
of the sample space would be greater than or equal to zero.
22
00:01:04,920 --> 00:01:08,130
Great, the graphs for continuous distributions
23
00:01:08,130 --> 00:01:11,520
slightly resemble the ones for discrete distributions.
24
00:01:11,520 --> 00:01:13,260
However, there are more elements
25
00:01:13,260 --> 00:01:17,310
in the sample space so there are more bars on the graph.
26
00:01:17,310 --> 00:01:21,093
Furthermore, the more bars, the narrower each one must be.
27
00:01:22,020 --> 00:01:24,060
This results in a smooth curve
28
00:01:24,060 --> 00:01:26,820
that goes along the top of these bars.
29
00:01:26,820 --> 00:01:29,610
We call this the probability distribution curve
30
00:01:29,610 --> 00:01:32,313
since it shows the likelihood of each outcome.
31
00:01:34,260 --> 00:01:35,760
Now onto some further differences
32
00:01:35,760 --> 00:01:37,713
between discrete and continuous.
33
00:01:39,240 --> 00:01:42,360
Imagine we use the favored-over-all formula
34
00:01:42,360 --> 00:01:45,810
to calculate probabilities for such variables.
35
00:01:45,810 --> 00:01:48,420
Since the sample space is infinite, the likelihood
36
00:01:48,420 --> 00:01:51,123
of each individual one would be extremely small.
37
00:01:51,990 --> 00:01:55,710
If we assume the numerator stays constant, algebra dictates
38
00:01:55,710 --> 00:01:57,960
that the greater the denominator becomes
39
00:01:57,960 --> 00:01:59,943
the closer the fraction is to zero.
40
00:02:00,870 --> 00:02:04,620
For reference one third is closer to zero than a half
41
00:02:04,620 --> 00:02:07,413
and a quarter is closer to zero than either of them.
42
00:02:08,250 --> 00:02:09,330
Since the denominator
43
00:02:09,330 --> 00:02:10,979
of the favored-over-all formula
44
00:02:10,979 --> 00:02:13,440
would be so big, it is commonly accepted
45
00:02:13,440 --> 00:02:17,130
that such probabilities are extremely insignificant.
46
00:02:17,130 --> 00:02:18,960
In fact, we assume their likelihood
47
00:02:18,960 --> 00:02:21,543
of occurring to be essentially zero.
48
00:02:22,380 --> 00:02:23,970
Thus, it is accepted
49
00:02:23,970 --> 00:02:26,460
that the probability for any individual value
50
00:02:26,460 --> 00:02:29,373
from a continuous distribution is equal to zero.
51
00:02:31,530 --> 00:02:33,630
This assumption is crucial in understanding
52
00:02:33,630 --> 00:02:36,720
why the likelihood of an event being strictly greater
53
00:02:36,720 --> 00:02:40,440
than X is equal to the likelihood of the event being greater
54
00:02:40,440 --> 00:02:44,853
than or equal to X for some value X within the sample space.
55
00:02:45,720 --> 00:02:47,520
For example, the probability
56
00:02:47,520 --> 00:02:49,260
of a college student running a mile
57
00:02:49,260 --> 00:02:50,970
in under six minutes is
58
00:02:50,970 --> 00:02:54,540
the same as them running it in at most six minutes.
59
00:02:54,540 --> 00:02:56,400
That is because we consider the likelihood
60
00:02:56,400 --> 00:02:59,793
of finishing in exactly six minutes to be zero.
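A minimal Python sketch of this idea. The normal model for mile times (mean 7 minutes, standard deviation 1 minute) and the helper name prob_between are assumptions made purely for illustration; they do not come from the lecture. It shows the probability of a window around six minutes shrinking toward zero as the window narrows, so "under six" and "at most six" agree.

```python
from statistics import NormalDist

# Hypothetical model (not from the lecture): mile times in minutes.
mile_time = NormalDist(mu=7.0, sigma=1.0)

def prob_between(dist, a, b):
    """P(a <= Y <= b), computed as a difference of CDF values."""
    return dist.cdf(b) - dist.cdf(a)

# Probability of ever narrower windows centred on exactly six minutes.
window_probs = [prob_between(mile_time, 6 - h, 6 + h)
                for h in (0.1, 0.001, 0.00001)]

# A single point is a window of zero width, so its probability is zero.
p_exactly_six = prob_between(mile_time, 6.0, 6.0)

# "At most six" and "under six" differ only by that single point.
p_at_most_six = mile_time.cdf(6.0)
p_under_six = p_at_most_six - p_exactly_six

print(window_probs, p_exactly_six, p_under_six == p_at_most_six)
```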
61
00:03:01,230 --> 00:03:03,030
That wasn't too complicated, was it?
62
00:03:04,410 --> 00:03:05,760
So far we've been using
63
00:03:05,760 --> 00:03:08,370
the term probability function to refer
64
00:03:08,370 --> 00:03:11,910
to the probability density function of a distribution.
65
00:03:11,910 --> 00:03:13,230
All the graphs we explored
66
00:03:13,230 --> 00:03:16,473
for discrete distributions were depicting their PDFs.
67
00:03:17,640 --> 00:03:19,410
Now we need to introduce
68
00:03:19,410 --> 00:03:21,690
the cumulative distribution function
69
00:03:21,690 --> 00:03:23,523
or CDF for short.
70
00:03:24,690 --> 00:03:26,310
Since it is cumulative
71
00:03:26,310 --> 00:03:30,780
this function encompasses everything up to a certain value.
72
00:03:30,780 --> 00:03:33,352
We denote the CDF as capital F of Y
73
00:03:33,352 --> 00:03:36,543
for any continuous random variable Y.
74
00:03:38,160 --> 00:03:40,830
As the name suggests, it represents probability
75
00:03:40,830 --> 00:03:43,050
of the random variable being lower than
76
00:03:43,050 --> 00:03:45,453
or equal to a specific value.
77
00:03:46,680 --> 00:03:48,180
Since no value could be lower
78
00:03:48,180 --> 00:03:50,940
than or equal to negative infinity,
79
00:03:50,940 --> 00:03:54,273
the CDF value for negative infinity would equal zero.
80
00:03:55,230 --> 00:03:58,210
Similarly, since any value would be lower
81
00:03:59,132 --> 00:04:00,990
than plus infinity, we would get a one
82
00:04:00,990 --> 00:04:04,233
if we plug plus infinity into the distribution function.
83
00:04:05,610 --> 00:04:08,460
Discrete distributions also have CDFs
84
00:04:08,460 --> 00:04:11,040
but they're used far less frequently.
85
00:04:11,040 --> 00:04:12,660
That is because we can always add
86
00:04:12,660 --> 00:04:14,790
up the PDF values associated
87
00:04:14,790 --> 00:04:17,540
with the individual probabilities we are interested in.
88
00:04:19,740 --> 00:04:21,510
Good job, folks.
89
00:04:21,510 --> 00:04:24,120
The CDF is especially useful when we want
90
00:04:24,120 --> 00:04:26,673
to estimate the probability of some interval.
91
00:04:28,452 --> 00:04:30,900
Graphically the area under the density curve would represent
92
00:04:30,900 --> 00:04:33,573
the chance of getting a value within that interval.
93
00:04:34,470 --> 00:04:36,930
We find this area by computing the integral
94
00:04:36,930 --> 00:04:40,923
of the density curve over the interval from A to B.
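As a rough sketch of this area calculation, the Python below approximates the integral of a density from A to B with a simple midpoint rule and compares it with the difference of two CDF values. The standard normal distribution, the endpoints -1 and 2, and the helper name integrate are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Assumed example distribution and interval.
dist = NormalDist(mu=0.0, sigma=1.0)
a, b = -1.0, 2.0

# Area under the density curve between a and b.
area_under_pdf = integrate(dist.pdf, a, b)

# The same probability read off the CDF: F(b) - F(a).
cdf_difference = dist.cdf(b) - dist.cdf(a)

print(area_under_pdf, cdf_difference)
```

The two numbers should agree to several decimal places, which is why the CDF is convenient for interval probabilities: no integration is needed once it is known.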
95
00:04:42,600 --> 00:04:45,360
For those of you who do not know how to calculate integrals,
96
00:04:45,360 --> 00:04:50,360
you can use some free online software like WolframAlpha.com.
97
00:04:50,400 --> 00:04:53,881
If you understand probability correctly, determining
98
00:04:53,881 --> 00:04:56,831
and calculating these integrals should feel very intuitive.
99
00:04:58,650 --> 00:04:59,483
All right.
100
00:04:59,483 --> 00:05:01,740
Notice how the cumulative probability is simply
101
00:05:01,740 --> 00:05:03,600
the probability of the interval
102
00:05:03,600 --> 00:05:05,613
from negative infinity to Y.
103
00:05:06,870 --> 00:05:09,180
For those who know calculus, this suggests
104
00:05:09,180 --> 00:05:12,690
that the CDF for a specific value Y is equal
105
00:05:12,690 --> 00:05:15,210
to the integral of the density function
106
00:05:15,210 --> 00:05:18,783
over the interval from minus infinity to Y.
107
00:05:20,280 --> 00:05:23,673
This gives us a way to obtain the CDF from the PDF.
108
00:05:24,720 --> 00:05:28,320
The opposite of integration is differentiation.
109
00:05:28,320 --> 00:05:31,770
So to obtain a PDF from a CDF, we would have
110
00:05:31,770 --> 00:05:36,300
to find its first derivative. In more technical terms,
111
00:05:36,300 --> 00:05:39,900
the PDF for any element of the sample space Y
112
00:05:39,900 --> 00:05:42,540
equals the first derivative of the CDF
113
00:05:42,540 --> 00:05:43,833
with respect to Y.
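The relationship can be checked numerically. This sketch, which assumes a standard normal distribution and a hypothetical helper named derivative, estimates the slope of the CDF with a central finite difference and compares it with the PDF at a few sample points.

```python
from statistics import NormalDist

def derivative(F, y, h=1e-5):
    """Central finite-difference estimate of F'(y)."""
    return (F(y + h) - F(y - h)) / (2 * h)

# Assumed example distribution: the standard normal.
dist = NormalDist()

points = [-1.5, 0.0, 0.7]
slopes = [derivative(dist.cdf, y) for y in points]   # slope of the CDF
densities = [dist.pdf(y) for y in points]            # PDF values

for y, s, d in zip(points, slopes, densities):
    print(f"y={y:+.1f}  dF/dy={s:.6f}  f(y)={d:.6f}")
```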
114
00:05:48,582 --> 00:05:51,180
Okay. Oftentimes, when dealing with continuous variables,
115
00:05:51,180 --> 00:05:54,660
we are only given their probability density functions.
116
00:05:54,660 --> 00:05:56,387
To understand what its graph looks like, we
117
00:05:56,387 --> 00:05:57,450
should be able
118
00:05:57,450 --> 00:06:01,053
to compute the expected value and variance for any PDF.
119
00:06:03,090 --> 00:06:05,193
Let's start with expected values.
120
00:06:06,870 --> 00:06:11,190
The probability of each individual element Y is zero.
121
00:06:11,190 --> 00:06:12,360
Therefore, we cannot apply
122
00:06:12,360 --> 00:06:15,960
the summation formula we used for discrete outcomes.
123
00:06:15,960 --> 00:06:17,910
When dealing with continuous distributions,
124
00:06:17,910 --> 00:06:20,850
the expected value is an integral.
125
00:06:20,850 --> 00:06:23,610
More specifically, it is an integral of the product
126
00:06:23,610 --> 00:06:27,690
of any element Y and its associated PDF value
127
00:06:27,690 --> 00:06:30,010
over the interval from negative infinity
128
00:06:31,270 --> 00:06:32,183
to positive infinity.
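A small Python sketch of this integral, under stated assumptions: the distribution is a normal with mean 3 and standard deviation 2 (chosen only for illustration), the infinite limits are replaced by a wide finite range of 12 standard deviations on each side, and integrate is a hypothetical midpoint-rule helper.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Assumed example distribution.
dist = NormalDist(mu=3.0, sigma=2.0)

# Finite stand-in for (-infinity, +infinity); the density outside is negligible.
lo = dist.mean - 12 * dist.stdev
hi = dist.mean + 12 * dist.stdev

# E(Y) = integral of y * f(y) dy
expected_value = integrate(lambda y: y * dist.pdf(y), lo, hi)

print(expected_value)  # should be very close to the mean, 3.0
```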
129
00:06:33,510 --> 00:06:37,500
Right, now let us quickly discuss the variance.
130
00:06:37,500 --> 00:06:39,630
Luckily for us, we can still apply the same
131
00:06:39,630 --> 00:06:44,010
variance formula we used earlier for discrete distributions.
132
00:06:44,010 --> 00:06:45,930
Namely, the variance is equal
133
00:06:45,930 --> 00:06:49,740
to the expected value of the squared variable minus
134
00:06:49,740 --> 00:06:52,593
the squared expected value of the variable.
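The variance formula can be sketched the same way: compute the expected value of the squared variable and subtract the square of the expected value. The normal distribution with mean 3 and standard deviation 2, the truncated integration range, and the helper name integrate are all illustrative assumptions.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Assumed example distribution; its true variance is sigma squared, i.e. 4.
dist = NormalDist(mu=3.0, sigma=2.0)
lo = dist.mean - 12 * dist.stdev
hi = dist.mean + 12 * dist.stdev

# E(Y): integral of y * f(y)
expected_value = integrate(lambda y: y * dist.pdf(y), lo, hi)

# E(Y^2): integral of y^2 * f(y)
second_moment = integrate(lambda y: y * y * dist.pdf(y), lo, hi)

# Var(Y) = E(Y^2) - (E(Y))^2
variance = second_moment - expected_value ** 2

print(second_moment, variance)
```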
135
00:06:54,690 --> 00:06:55,833
Marvelous work.
136
00:06:57,270 --> 00:06:58,980
We now know the main characteristics
137
00:06:58,980 --> 00:07:01,020
of any continuous distribution
138
00:07:01,020 --> 00:07:04,170
so we can begin exploring specific types.
139
00:07:04,170 --> 00:07:06,330
In the next lecture, we will introduce
140
00:07:06,330 --> 00:07:09,930
the normal distribution and its main features.
141
00:07:09,930 --> 00:07:11,043
Thanks for watching.