1
00:00:00,004 --> 00:00:02,007
- Previously in this course, I have discussed
2
00:00:02,007 --> 00:00:05,008
what are known as descriptive statistics.
3
00:00:05,008 --> 00:00:08,003
Descriptive statistics, as the name implies,
4
00:00:08,003 --> 00:00:10,003
provide facts about your data.
5
00:00:10,003 --> 00:00:11,008
Those can include measures
6
00:00:11,008 --> 00:00:15,009
such as medians, means, variance and standard deviation.
7
00:00:15,009 --> 00:00:18,001
You can then use these facts about your data
8
00:00:18,001 --> 00:00:21,003
to make estimates at a known confidence level.
9
00:00:21,003 --> 00:00:23,007
For example, you might think that most of your customers
10
00:00:23,007 --> 00:00:26,005
live within a 25-mile radius,
11
00:00:26,005 --> 00:00:28,007
plus or minus three or four miles.
12
00:00:28,007 --> 00:00:32,001
And you might be wondering, what else is there?
13
00:00:32,001 --> 00:00:36,009
It turns out there's a lot more. Let me give you an example.
14
00:00:36,009 --> 00:00:39,001
Let's assume, for the sake of argument,
15
00:00:39,001 --> 00:00:40,009
that you might have the flu.
16
00:00:40,009 --> 00:00:44,002
And your test for the flu has come back positive.
17
00:00:44,002 --> 00:00:47,002
The test has the following characteristics
18
00:00:47,002 --> 00:00:50,005
and all of these will be important for our analysis.
19
00:00:50,005 --> 00:00:54,007
It correctly identifies people who have the flu 85% of the time.
20
00:00:54,007 --> 00:00:57,002
It identifies healthy individuals
21
00:00:57,002 --> 00:00:59,008
as having the flu 10% of the time.
22
00:00:59,008 --> 00:01:01,009
So in other words, if you do have the flu,
23
00:01:01,009 --> 00:01:05,002
you'll get a positive result 85% of the time
24
00:01:05,002 --> 00:01:07,009
and a negative result 15% of the time.
25
00:01:07,009 --> 00:01:09,004
If you don't have the flu,
26
00:01:09,004 --> 00:01:12,003
then you'll get a positive result 10% of the time,
27
00:01:12,003 --> 00:01:14,001
a false positive.
28
00:01:14,001 --> 00:01:18,005
And you'll get the correct negative result the other 90% of the time.
29
00:01:18,005 --> 00:01:22,000
We also have a base rate which assumes
30
00:01:22,000 --> 00:01:24,000
that about 1% of the population
31
00:01:24,000 --> 00:01:28,001
actually has the flu at the time you take the test.
32
00:01:28,001 --> 00:01:29,005
So the question is,
33
00:01:29,005 --> 00:01:33,005
what is the probability that you actually have the flu?
34
00:01:33,005 --> 00:01:35,006
For this, we use Bayes' Rule.
35
00:01:35,006 --> 00:01:38,009
We combine the accuracy, false positive rate and base rate
36
00:01:38,009 --> 00:01:40,004
to find the result.
37
00:01:40,004 --> 00:01:44,004
So if 1% of the population has the flu,
38
00:01:44,004 --> 00:01:47,002
and the test is 85% accurate,
39
00:01:47,002 --> 00:01:51,000
but there's also a 10% false positive rate,
40
00:01:51,000 --> 00:01:53,005
a positive test means the probability
41
00:01:53,005 --> 00:01:57,003
that you actually have the flu is
42
00:01:57,003 --> 00:02:00,004
7.91%.
43
00:02:00,004 --> 00:02:01,006
How we got to that number
44
00:02:01,006 --> 00:02:04,000
is the subject of the rest of this chapter.
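
As a preview of that calculation, here is a minimal Python sketch of Bayes' Rule applied to the flu test numbers above. The function name `posterior_given_positive` and its parameter names are illustrative assumptions, not something from the course. The sketch treats the 85% accuracy figure as the chance of a positive result when you do have the flu.

```python
def posterior_given_positive(sensitivity, false_positive_rate, base_rate):
    """Return P(flu | positive test) using Bayes' Rule.

    sensitivity:         P(positive | flu), assumed here to be the 85% figure
    false_positive_rate: P(positive | no flu)
    base_rate:           P(flu) in the population
    """
    # Share of the population that has the flu AND tests positive
    true_positives = sensitivity * base_rate
    # Share of the population that is healthy AND tests positive
    false_positives = false_positive_rate * (1.0 - base_rate)
    # Of everyone who tests positive, the fraction that actually has the flu
    return true_positives / (true_positives + false_positives)


probability = posterior_given_positive(
    sensitivity=0.85,
    false_positive_rate=0.10,
    base_rate=0.01,
)
print(f"{probability:.2%}")  # 7.91%
```

The result is low because healthy people vastly outnumber sick ones: 10% of the healthy 99% (0.099) produces far more positives than 85% of the sick 1% (0.0085).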