1
00:00:11,140 --> 00:00:16,750
So in this lecture, we are going to discuss some common time series transformations. If you're
2
00:00:16,750 --> 00:00:22,000
familiar with machine learning, then you know that it's often useful to transform your data before
3
00:00:22,000 --> 00:00:24,000
passing it into a machine learning model.
4
00:00:24,550 --> 00:00:32,380
For example, standardization or min-max scaling. For time series, we'll be discussing three common transformations:
5
00:00:32,620 --> 00:00:36,700
the power transform, the log transform, and the Box-Cox transform.
6
00:00:37,240 --> 00:00:40,210
As you'll see, these all essentially serve the same purpose.
7
00:00:44,910 --> 00:00:50,160
So let's start with the power transform. The power transform involves raising all your data points
8
00:00:50,160 --> 00:00:55,530
to a power. For example, by raising every data point to the power of one half, you'll be taking the
9
00:00:55,530 --> 00:00:56,900
square root of your data set.
10
00:00:57,810 --> 00:00:59,100
So why is this useful?
11
00:01:00,030 --> 00:01:03,560
Well, imagine that your data appears to grow quadratically in time.
12
00:01:04,020 --> 00:01:09,090
If you take the square root, the result would be that you transform your data to grow linearly.
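A minimal sketch of that idea, assuming NumPy and a made-up quadratic series:

```python
import numpy as np

# Made-up series that grows quadratically in time
t = np.arange(1, 101).astype(float)
y = t ** 2

# Power transform with power 1/2: take the square root of every point
y_sqrt = y ** 0.5

# The transformed series now grows linearly (it equals t itself)
print(np.allclose(y_sqrt, t))  # True
```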
13
00:01:09,630 --> 00:01:11,070
So why is that useful?
14
00:01:11,670 --> 00:01:16,250
Well, you'll soon learn about some machine learning models that can learn linear trends very well,
15
00:01:16,680 --> 00:01:20,480
but there's no such model for quadratic trends or cubic trends and so forth.
16
00:01:21,090 --> 00:01:26,430
Thus, by transforming your data to appear like it has a linear trend, you give your model a better
17
00:01:26,430 --> 00:01:31,650
chance of forecasting future data points and modeling the true nature of the time series more closely.
18
00:01:36,370 --> 00:01:42,760
So another transformation with a similar purpose is the log transform. Like the power transform, it
19
00:01:42,760 --> 00:01:46,160
basically ends up squashing your data into a smaller range.
20
00:01:46,600 --> 00:01:51,940
In fact, a lot of the time I'll just end up using the log transform by default without considering
21
00:01:51,940 --> 00:01:52,880
other options.
22
00:01:53,800 --> 00:01:58,530
One common application of the log transform is in finance.
23
00:01:58,540 --> 00:02:02,620
In finance, it's common to model stock prices as following a lognormal distribution.
24
00:02:03,640 --> 00:02:07,960
It's also common to model log returns instead of returns based on percentages.
25
00:02:08,770 --> 00:02:12,340
As an example, this is the basis for the famous Black-Scholes formula.
26
00:02:13,750 --> 00:02:19,630
Note that one possible issue with the log transform is that it doesn't accept zero or negative values
27
00:02:19,630 --> 00:02:20,270
as input.
28
00:02:21,190 --> 00:02:27,450
For this reason, it can only be used for data which is strictly positive. For data that might be non-negative,
29
00:02:27,460 --> 00:02:30,610
it's common to simply add one before taking the log.
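A small sketch of that trick, assuming NumPy and made-up non-negative data: `np.log1p` computes log(1 + y) directly, and `np.expm1` inverts it.

```python
import numpy as np

# Made-up non-negative data that includes a zero, so a plain log would fail
y = np.array([0.0, 1.0, 10.0, 100.0])

# "Add one before taking the log": np.log1p computes log(1 + y)
y_log = np.log1p(y)

# np.expm1 undoes the transform: exp(y_log) - 1 recovers the original data
y_back = np.expm1(y_log)
print(np.allclose(y_back, y))  # True
```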
30
00:02:35,170 --> 00:02:41,350
OK, so a third transform we're going to discuss is the Box-Cox transform, which generalises the concept
31
00:02:41,350 --> 00:02:44,270
of both the power transform and the log transform.
32
00:02:44,740 --> 00:02:50,020
You can see that it involves this parameter lambda, which is the power to use when taking the transform.
33
00:02:51,010 --> 00:02:52,630
So why does this make sense?
34
00:02:53,200 --> 00:02:58,900
This makes sense because the natural logarithm is actually the limit of this specific power transform
35
00:02:59,140 --> 00:03:00,730
as the power approaches zero.
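You can check that limit numerically. This sketch, with a made-up value, evaluates the lambda-not-zero branch of the Box-Cox transform, (y**lam - 1) / lam, for shrinking lambda:

```python
import math

y = 5.0

# Box-Cox transform for lambda != 0: (y**lam - 1) / lam
for lam in (1.0, 0.1, 0.01, 0.001):
    print(lam, (y ** lam - 1) / lam)

# As lambda approaches zero, the values converge to the natural log
print("ln(y) =", math.log(y))
```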
36
00:03:01,990 --> 00:03:06,850
Now, in SciPy, the boxcox function will automatically choose the value of lambda for us.
37
00:03:07,210 --> 00:03:10,620
So we don't need to worry about finding the optimal value ourselves.
38
00:03:11,080 --> 00:03:15,880
But if you're interested in learning how this value is chosen, I'd encourage you to check out the SciPy
39
00:03:15,880 --> 00:03:20,440
documentation as well as this article I've included in the extra reading.
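As a sketch of that automatic choice, assuming SciPy is installed and using made-up strictly positive data, `scipy.stats.boxcox` both transforms the data and returns the lambda it picked by maximum likelihood when you don't pass one:

```python
import numpy as np
from scipy.stats import boxcox

# Made-up strictly positive data (Box-Cox requires positive input)
rng = np.random.default_rng(42)
y = rng.lognormal(mean=0.0, sigma=1.0, size=500)

# With no lmbda argument, boxcox estimates lambda by maximum likelihood
y_transformed, lam = boxcox(y)
print("estimated lambda:", lam)
```

Since the made-up data here is lognormal, the estimated lambda should come out near zero, i.e. close to a plain log transform.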
40
00:03:25,150 --> 00:03:30,790
So one common reason people give for why they use the Box-Cox transform is that they want to make
41
00:03:30,790 --> 00:03:32,510
the data normally distributed.
42
00:03:33,370 --> 00:03:37,510
However, note that this motivation does not apply to raw time series.
43
00:03:38,020 --> 00:03:39,110
So why is this?
44
00:03:39,940 --> 00:03:42,850
Well, remember that time series data is dynamic.
45
00:03:43,000 --> 00:03:44,300
It changes in time.
46
00:03:44,350 --> 00:03:45,470
It can have a trend.
47
00:03:46,000 --> 00:03:51,340
So when you take time series data and plot a histogram hoping that it will be normal, this is actually
48
00:03:51,340 --> 00:03:55,290
the wrong thing to do. We'll discuss this more later in the course.
49
00:03:55,300 --> 00:04:01,300
But in order to take data over time and plot its distribution or histogram, we need that data to be
50
00:04:01,300 --> 00:04:02,140
stationary.
51
00:04:02,740 --> 00:04:06,490
Stationary essentially means the distribution doesn't change over time.
52
00:04:07,780 --> 00:04:09,430
So why is this a requirement?
53
00:04:10,060 --> 00:04:14,380
Well, imagine you have some data which simply follows a line that grows at a constant rate.
54
00:04:15,010 --> 00:04:17,670
Does plotting the histogram of this data make sense?
55
00:04:18,100 --> 00:04:19,060
The answer is no.
56
00:04:19,720 --> 00:04:22,030
Would we want this to be normally distributed?
57
00:04:22,420 --> 00:04:23,440
The answer is no.
58
00:04:23,950 --> 00:04:27,310
In fact, this data is described much better by a linear trend.
59
00:04:27,820 --> 00:04:33,460
The point of plotting a histogram is to understand the distribution of the data, but the distribution
60
00:04:33,460 --> 00:04:38,020
at the bottom of this plot is clearly different from the distribution at the top of this plot.
61
00:04:38,650 --> 00:04:43,390
Therefore, it makes no sense to mix this data together into a single histogram.
62
00:04:43,780 --> 00:04:46,330
This does not tell us how the data is distributed.
63
00:04:50,960 --> 00:04:56,510
The final topic I want to discuss in this lecture is why the log transform is deeply fundamental.
64
00:04:56,990 --> 00:05:01,490
Not only is it useful mathematically, but it also seems to be part of nature itself.
65
00:05:02,180 --> 00:05:04,170
One example of this is perception.
66
00:05:04,730 --> 00:05:10,070
For example, although a normal conversation is ten thousand times louder than a whisper, it doesn't
67
00:05:10,070 --> 00:05:12,750
have ten thousand times the effect on your senses.
68
00:05:13,310 --> 00:05:18,050
That's why we use the decibel scale to measure sound, which is essentially a log transform.
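As a one-line worked example, a ten-thousand-fold power ratio corresponds to only 40 decibels, since the decibel scale is ten times the base-10 logarithm of the ratio:

```python
import math

# The decibel scale is a log transform: 10 * log10 of a power ratio
ratio = 10_000  # conversation vs. whisper, per the lecture
db = 10 * math.log10(ratio)
print(db)  # 40.0
```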
69
00:05:19,810 --> 00:05:25,540
Another example of how the logarithm seems to simply be a part of nature is how we as humans interpret
70
00:05:25,540 --> 00:05:26,210
numbers.
71
00:05:26,740 --> 00:05:31,630
For example, if you have one thousand dollars in the bank, then losing one thousand dollars would
72
00:05:31,630 --> 00:05:32,710
be a pretty big deal.
73
00:05:33,190 --> 00:05:38,080
But if you have one billion dollars in the bank, spending one thousand dollars on a pair of jeans would
74
00:05:38,080 --> 00:05:39,300
feel completely normal.
75
00:05:40,330 --> 00:05:45,490
Another way to think of this is imagine going from zero dollars in wealth to one million.
76
00:05:45,880 --> 00:05:47,050
That's a pretty big jump.
77
00:05:47,620 --> 00:05:49,570
How about one million to two million?
78
00:05:49,990 --> 00:05:54,400
Although you still made the same amount of money, its utility is less.
79
00:05:54,400 --> 00:05:58,690
So one might model the utility of wealth as the logarithm of the wealth.