1
00:00:00,004 --> 00:00:01,007
- [Instructor] In the previous movie,
2
00:00:01,007 --> 00:00:03,009
I showed you a way to draw a graph
3
00:00:03,009 --> 00:00:06,000
that visualizes the way probability
4
00:00:06,000 --> 00:00:09,002
is combined to perform Bayesian analysis.
5
00:00:09,002 --> 00:00:11,002
In this movie, I will show you how to create
6
00:00:11,002 --> 00:00:13,000
a classification matrix,
7
00:00:13,000 --> 00:00:15,000
which will take us one step closer
8
00:00:15,000 --> 00:00:17,006
to implementing this analysis in Excel.
9
00:00:17,006 --> 00:00:19,002
So let's review what we know
10
00:00:19,002 --> 00:00:22,007
about our base rates and our accuracy.
11
00:00:22,007 --> 00:00:27,005
First, we know that the base rate of green cabs is 85%.
12
00:00:27,005 --> 00:00:30,009
That means that 15% of the cabs will be blue.
13
00:00:30,009 --> 00:00:33,004
Also, we know that witnesses are accurate
14
00:00:33,004 --> 00:00:37,003
in color identification 80% of the time.
15
00:00:37,003 --> 00:00:41,001
And again, the cab can be green or blue.
16
00:00:41,001 --> 00:00:43,007
Also, the witness can be accurate or not.
17
00:00:43,007 --> 00:00:46,000
And those are the factors that we will use
18
00:00:46,000 --> 00:00:48,007
to create a two-by-two probability
19
00:00:48,007 --> 00:00:50,004
or a classification matrix.
20
00:00:50,004 --> 00:00:51,006
So we know the probability
21
00:00:51,006 --> 00:00:55,000
of arriving at each combination.
22
00:00:55,000 --> 00:00:59,004
So let's take a look at a classification matrix.
23
00:00:59,004 --> 00:01:02,000
Here's the data again, I won't repeat it,
24
00:01:02,000 --> 00:01:04,009
but here is what the matrix looks like.
25
00:01:04,009 --> 00:01:07,006
And it is the compound probability
26
00:01:07,006 --> 00:01:11,008
of reaching each of these four states.
27
00:01:11,008 --> 00:01:16,009
So a green cab will be reported as green 68% of the time.
28
00:01:16,009 --> 00:01:21,008
And that's the product of 0.85 and 0.8.
29
00:01:21,008 --> 00:01:24,005
In a similar way, if a cab is blue,
30
00:01:24,005 --> 00:01:30,001
it will be reported as blue 12% of the time or 0.12.
31
00:01:30,001 --> 00:01:32,000
Adding those two values reflects
32
00:01:32,000 --> 00:01:34,002
the amount of time that the witness
33
00:01:34,002 --> 00:01:36,006
will be correct: 80% of the time.
34
00:01:36,006 --> 00:01:39,004
And you can see that the other two cells
35
00:01:39,004 --> 00:01:42,005
have the values 0.17 and 0.03.
36
00:01:42,005 --> 00:01:47,006
Those add to 0.2, and those are the incorrect guesses.
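The four cells described above can be reproduced with a minimal sketch. It assumes the 85%/15% base rates and the 80% witness accuracy quoted earlier; the names `base_rate`, `accuracy`, and `joint_matrix` are illustrative and do not come from the course, which builds the matrix in Excel.

```python
# Assumed inputs, taken from the figures quoted in the narration.
base_rate = {"green": 0.85, "blue": 0.15}
accuracy = 0.80  # probability that the witness reports the true color


def joint_matrix(base_rate, accuracy):
    """Return {(actual, reported): probability} for the two-by-two matrix."""
    colors = list(base_rate)
    matrix = {}
    for actual in colors:
        for reported in colors:
            # The witness is right with probability `accuracy`, wrong otherwise.
            p_report = accuracy if reported == actual else 1.0 - accuracy
            matrix[(actual, reported)] = base_rate[actual] * p_report
    return matrix


matrix = joint_matrix(base_rate, accuracy)
for (actual, reported), p in matrix.items():
    print(f"actual={actual:5s} reported={reported:5s} p={p:.2f}")

# Correct reports sit on the diagonal, incorrect ones off it.
p_correct = matrix[("green", "green")] + matrix[("blue", "blue")]
p_incorrect = matrix[("green", "blue")] + matrix[("blue", "green")]
print(f"correct={p_correct:.2f} incorrect={p_incorrect:.2f}")
```

Each cell is the base rate of the actual color multiplied by the chance of that report, so the four cells sum to 1.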
37
00:01:47,006 --> 00:01:49,009
Now let's take a look at the probabilities
38
00:01:49,009 --> 00:01:51,005
of each of these scenarios
39
00:01:51,005 --> 00:01:55,002
from the standpoint of witness accuracy.
40
00:01:55,002 --> 00:01:58,005
Here again is the classification matrix.
41
00:01:58,005 --> 00:02:02,001
And let's do the calculation that we did
42
00:02:02,001 --> 00:02:04,003
for a cab reported as blue
43
00:02:04,003 --> 00:02:06,007
and finding the probability it's actually blue
44
00:02:06,007 --> 00:02:09,000
for each of the four scenarios.
45
00:02:09,000 --> 00:02:10,006
So we have the probability
46
00:02:10,006 --> 00:02:15,007
of a cab being blue and reported as blue as 41%.
47
00:02:15,007 --> 00:02:19,002
That's the calculation I described earlier.
48
00:02:19,002 --> 00:02:20,003
On the other hand,
49
00:02:20,003 --> 00:02:24,001
the witness will identify a blue cab as green.
50
00:02:24,001 --> 00:02:27,003
That's the second item, 59% of the time.
51
00:02:27,003 --> 00:02:33,001
And you can see that 41% and 59% add up to 100%.
52
00:02:33,001 --> 00:02:35,001
So if the cab's blue, that's the percent
53
00:02:35,001 --> 00:02:38,003
of the time that the witness will be correct.
54
00:02:38,003 --> 00:02:41,002
Then at the bottom, we have the same scenarios
55
00:02:41,002 --> 00:02:43,005
but we're assuming the cab is green.
56
00:02:43,005 --> 00:02:46,002
So if the cab is green and the witness reports
57
00:02:46,002 --> 00:02:49,006
it is green, that will happen 96% of the time.
58
00:02:49,006 --> 00:02:53,001
On the other hand, the witness will identify
59
00:02:53,001 --> 00:02:57,003
a green cab as blue only 4% of the time.
60
00:02:57,003 --> 00:02:58,008
And what makes these calculations
61
00:02:58,008 --> 00:03:01,000
so different is the base rate.
62
00:03:01,000 --> 00:03:03,004
Only 15% of cabs are blue
63
00:03:03,004 --> 00:03:06,008
and that makes the impact of the probability
64
00:03:06,008 --> 00:03:10,000
of an improper identification that much greater.
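The 41%, 59%, 96%, and 4% figures are consistent with conditioning on what the witness reported, so the sketch below assumes that reading: each cell of the matrix is divided by the total probability of its report. The function name `posterior_given_report` is hypothetical, and the inputs are the base rates and accuracy quoted in the narration.

```python
# Assumed inputs, taken from the figures quoted in the narration.
base_rate = {"green": 0.85, "blue": 0.15}
accuracy = 0.80  # probability that the witness reports the true color


def posterior_given_report(base_rate, accuracy):
    """Return {reported: {actual: P(actual | reported)}} using Bayes' rule."""
    colors = list(base_rate)
    result = {}
    for reported in colors:
        # Joint probability of each actual color together with this report.
        joint = {
            actual: base_rate[actual]
            * (accuracy if actual == reported else 1.0 - accuracy)
            for actual in colors
        }
        total = sum(joint.values())  # overall probability of this report
        result[reported] = {actual: p / total for actual, p in joint.items()}
    return result


posterior = posterior_given_report(base_rate, accuracy)
for reported, by_actual in posterior.items():
    for actual, p in by_actual.items():
        print(f"reported={reported:5s} actual={actual:5s} p={p:.0%}")
```

Because only 15% of cabs are blue, the 0.17 chance of a green cab being misreported as blue outweighs the 0.12 chance of a blue cab being reported correctly, which is why a "blue" report is right well under half the time.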