1
00:00:00,840 --> 00:00:02,460
Instructor: Correlation adjusts covariance
2
00:00:02,460 --> 00:00:03,900
so that the relationship between
3
00:00:03,900 --> 00:00:05,670
the two variables becomes easy
4
00:00:05,670 --> 00:00:07,113
and intuitive to interpret.
5
00:00:07,950 --> 00:00:08,783
The formulas
6
00:00:08,783 --> 00:00:11,790
for the correlation coefficient are the covariance divided
7
00:00:11,790 --> 00:00:13,650
by the product of the standard deviations
8
00:00:13,650 --> 00:00:15,292
of the two variables.
9
00:00:15,292 --> 00:00:17,670
This is either sample or population,
10
00:00:17,670 --> 00:00:20,082
depending on the data you are working with.
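Written out, the two versions of the formula described above look as follows. The symbols are an assumed, conventional notation that does not appear in the lesson itself: $s_{xy}$ and $\sigma_{xy}$ stand for the sample and population covariance, and $s_x, s_y$ and $\sigma_x, \sigma_y$ for the corresponding standard deviations.

```latex
% Sample correlation coefficient
r = \frac{s_{xy}}{s_x \, s_y}

% Population correlation coefficient
\rho = \frac{\sigma_{xy}}{\sigma_x \, \sigma_y}
```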
11
00:00:20,082 --> 00:00:22,440
We already have the standard deviations
12
00:00:22,440 --> 00:00:23,910
of the two data sets.
13
00:00:23,910 --> 00:00:25,770
Now, we'll use the formula
14
00:00:25,770 --> 00:00:28,533
in order to find the sample correlation coefficient.
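As a rough illustration of this calculation, here is a minimal Python sketch. The function name `sample_correlation` and the size and price figures are hypothetical and are not the lesson's data set, so the result will not match the value reported in the lesson.

```python
import math

def sample_correlation(x, y):
    """Sample correlation: sample covariance divided by the product of
    the two sample standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations divide by n - 1.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov_xy / (s_x * s_y)

# Hypothetical house sizes and prices, for illustration only.
size = [650, 785, 1200, 720, 975]
price = [772, 998, 1200, 800, 895]

r = sample_correlation(size, price)
print(round(r, 2))
```

The sketch assumes neither list is constant; a constant list has a standard deviation of zero, and the division would fail.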
15
00:00:29,610 --> 00:00:30,450
Mathematically,
16
00:00:30,450 --> 00:00:32,708
there is no way to obtain a correlation value greater
17
00:00:32,708 --> 00:00:35,103
than one or less than minus one.
18
00:00:35,970 --> 00:00:38,190
Remember the coefficient of variation we talked
19
00:00:38,190 --> 00:00:39,870
about a couple of lessons ago?
20
00:00:39,870 --> 00:00:42,479
Well, this concept is similar.
21
00:00:42,479 --> 00:00:45,270
We manipulated the strange covariance value
22
00:00:45,270 --> 00:00:47,670
in order to get something intuitive.
23
00:00:47,670 --> 00:00:49,590
Let's examine it for a bit.
24
00:00:49,590 --> 00:00:52,980
We got a sample correlation coefficient of 0.87,
25
00:00:52,980 --> 00:00:56,937
so there is a strong relationship between the two variables.
26
00:00:56,937 --> 00:00:59,220
A correlation of one, also known
27
00:00:59,220 --> 00:01:01,230
as perfect positive correlation,
28
00:01:01,230 --> 00:01:02,597
means that the entire variability
29
00:01:02,597 --> 00:01:06,180
of one variable is explained by the other variable.
30
00:01:06,180 --> 00:01:10,410
However, logically we know that size determines the price.
31
00:01:10,410 --> 00:01:12,690
On average, the bigger the house you build,
32
00:01:12,690 --> 00:01:14,850
the more expensive it will be.
33
00:01:14,850 --> 00:01:17,640
This relationship goes only this way.
34
00:01:17,640 --> 00:01:19,230
Once a house is built,
35
00:01:19,230 --> 00:01:21,750
if for some reason it becomes more expensive,
36
00:01:21,750 --> 00:01:23,430
its size doesn't increase,
37
00:01:23,430 --> 00:01:26,159
although there is a positive correlation.
38
00:01:26,159 --> 00:01:30,210
Okay, a correlation of zero between two variables means
39
00:01:30,210 --> 00:01:33,060
that there is no linear relationship between them.
40
00:01:33,060 --> 00:01:34,860
We would expect a correlation of zero
41
00:01:34,860 --> 00:01:36,900
between the price of coffee in Brazil
42
00:01:36,900 --> 00:01:39,690
and the price of houses in London, right?
43
00:01:39,690 --> 00:01:42,963
The two variables don't have anything in common.
44
00:01:42,963 --> 00:01:47,487
Finally, we can have a negative correlation coefficient.
45
00:01:47,487 --> 00:01:50,414
It can be a perfect negative correlation of minus one
46
00:01:50,414 --> 00:01:53,880
or much more likely an imperfect negative correlation
47
00:01:53,880 --> 00:01:56,936
of a value between minus one and zero.
48
00:01:56,936 --> 00:01:59,100
Think of the following businesses.
49
00:01:59,100 --> 00:02:00,720
A company producing ice cream
50
00:02:00,720 --> 00:02:02,457
and a company selling umbrellas.
51
00:02:02,457 --> 00:02:05,310
Ice cream tends to be sold more when the weather is very
52
00:02:05,310 --> 00:02:08,370
good and people buy umbrellas when it's rainy.
53
00:02:08,370 --> 00:02:11,520
Obviously there is a negative correlation between the two,
54
00:02:11,520 --> 00:02:14,580
and hence, when one of the companies makes more money,
55
00:02:14,580 --> 00:02:16,182
the other won't.
56
00:02:16,182 --> 00:02:18,996
All right, before we continue, we must note
57
00:02:18,996 --> 00:02:22,020
that the correlation between two variables X
58
00:02:22,020 --> 00:02:26,010
and Y is the same as the correlation between Y and X.
59
00:02:26,010 --> 00:02:28,590
The formula is completely symmetrical with respect
60
00:02:28,590 --> 00:02:30,180
to both variables.
61
00:02:30,180 --> 00:02:33,510
Therefore, the correlation of price and size is the same
62
00:02:33,510 --> 00:02:36,032
as the one of size and price.
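The symmetry can be checked numerically. The following is a small sketch under assumptions: the `correlation` helper and the size and price figures are hypothetical and not taken from the lesson.

```python
import math

def correlation(x, y):
    """Correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # The n - 1 factors of the covariance and of the standard deviations
    # cancel, so only sums of products remain.
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Hypothetical house sizes and prices, for illustration only.
size = [650, 785, 1200, 720, 975]
price = [772, 998, 1200, 800, 895]

r_size_price = correlation(size, price)
r_price_size = correlation(price, size)
print(math.isclose(r_size_price, r_price_size))  # True
```

Swapping the arguments only reorders the factors inside each product, which is why the two results agree. The check says nothing about which variable causes the other.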
63
00:02:36,032 --> 00:02:38,550
This leads us to causality.
64
00:02:38,550 --> 00:02:41,070
It is very important for any analyst or researcher
65
00:02:41,070 --> 00:02:44,220
to understand the direction of causal relationships.
66
00:02:44,220 --> 00:02:46,920
In the housing business, size causes the price,
67
00:02:46,920 --> 00:02:48,690
and not vice versa.
68
00:02:48,690 --> 00:02:51,330
We will explore this topic in much more detail
69
00:02:51,330 --> 00:02:53,643
in the regression analysis section later on.
70
00:02:54,540 --> 00:02:55,980
For now, you only need
71
00:02:55,980 --> 00:03:00,688
to realize that correlation does not imply causation.
72
00:03:00,688 --> 00:03:03,600
Okay, very good.
73
00:03:03,600 --> 00:03:04,920
With this example,
74
00:03:04,920 --> 00:03:08,070
we conclude the section on descriptive statistics.
75
00:03:08,070 --> 00:03:09,120
In the next lesson,
76
00:03:09,120 --> 00:03:11,520
you will see a real-life database example
77
00:03:11,520 --> 00:03:14,970
that applies all the knowledge you acquired in this section.
78
00:03:14,970 --> 00:03:16,923
You definitely don't wanna miss it.