6. Power, Log, and Box-Cox Transformations

So in this lecture, we are going to discuss some common time series transformations. If you're familiar with machine learning, then you know that it's often useful to transform your data before passing it into a machine learning model, for example with standardization or min-max scaling. For time series, we'll be discussing three common transformations: the power transform, the log transform, and the Box-Cox transform. As you'll see, these all essentially serve the same purpose.

So let's start with the power transform. The power transform involves raising all your data points to a power. For example, by raising every data point to the power of one half, you'll be taking the square root of your data set.

So why is this useful? Well, imagine that your data appears to grow quadratically in time. If you take the square root, the result is that you transform your data to grow linearly. So why is that useful? Well, you'll soon learn about some machine learning models that can learn linear trends very well, but there's no model for quadratic trends or cubic trends and so forth. Thus, by transforming your data to appear like it has a linear trend, you give your model a better chance of forecasting future data points and modeling the true nature of the time series more closely.

So another transformation with a similar purpose is the log transform. Like the power transform, it basically ends up squashing your data into a smaller range. In fact, a lot of the time I'll just end up using the log transform by default without considering other options.

One common application of the log transform is in finance. In finance, it's common to model stock prices as following a log-normal distribution. It's also common to model log returns instead of returns based on percentages. As an example, this is the basis for the famous Black-Scholes formula.

Note that one possible issue with the log transform is that it doesn't accept zero or negative values as input. For this reason, it can only be used for data which is strictly positive. For data that might be zero but is never negative, it's common to simply add one before taking the log.

OK, so a third transform we're going to discuss is the Box-Cox transform, which generalizes the concept of both the power transform and the log transform. You can see that it involves this parameter lambda, which is the power to use when taking the transform.
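The power and log transforms described above are one-liners in NumPy. Here is a minimal sketch (my own illustration with made-up toy data, not code from the lecture):

```python
import numpy as np

# Toy series that grows quadratically in time (noise-free, to keep it simple).
t = np.arange(1, 101, dtype=float)
series = t ** 2

# Power transform: raise every data point to a power.
# A power of 1/2 is the square root, which turns quadratic growth into linear growth.
power_transformed = series ** 0.5

# Log transform: squashes the data into a much smaller range.
# Only defined for strictly positive values.
log_transformed = np.log(series)

# For data that can be zero (but never negative), add 1 before taking the log;
# np.log1p(x) computes log(1 + x) in one step.
log1p_transformed = np.log1p(series)
```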
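For reference, since the slide the lecture points to isn't reproduced in a transcript, the Box-Cox transform is conventionally written as

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
\ln y, & \lambda = 0,
\end{cases}
\qquad \text{with} \qquad
\lim_{\lambda \to 0} \frac{y^{\lambda} - 1}{\lambda} = \ln y .
```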
So why does this make sense? This makes sense because the natural logarithm is actually the limit of this specific power transform as the power approaches zero.

Now, SciPy's boxcox function will automatically choose the value of lambda for us, so we don't need to worry about finding the optimal value ourselves. But if you're interested in learning how this value is chosen, I'd encourage you to check out the SciPy documentation, as well as this article I've included in the extra reading.

So one common reason people give for why they use the Box-Cox transform is that they want to make the data normally distributed. However, note that this motivation does not apply to raw time series. So why is this? Well, remember that time series data is dynamic: it changes in time, and it can have a trend. So when you take time series data and plot a histogram hoping that it will be normal, this is actually the wrong thing to do. We'll discuss this more later in the course, but in order to take data over time and plot its distribution or histogram, we need that data to be stationary. Stationary essentially means that the distribution doesn't change over time.

So why is this a requirement? Well, imagine you have some data which simply follows a line that grows at a constant rate. Does plotting the histogram of this data make sense? The answer is no. Would we want this to be normally distributed? The answer is no. In fact, this data is much better described by a linear trend. The point of plotting a histogram is to understand the distribution of the data, but the distribution at the bottom of this plot is clearly different from the distribution at the top of this plot. Therefore, it makes no sense to mix this data together into a single histogram: it does not tell us how the data is distributed.
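The histogram argument can also be seen numerically. A toy demonstration of my own: split a trending series in half and compare the two halves' summary statistics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data that follows a line growing at a constant rate, plus small noise.
t = np.arange(1000, dtype=float)
series = 2.0 * t + rng.normal(scale=5.0, size=1000)

first, second = series[:500], series[500:]
print(f"first half:  mean={first.mean():7.1f}  std={first.std():5.1f}")
print(f"second half: mean={second.mean():7.1f}  std={second.std():5.1f}")
# The means differ wildly (about 500 vs. about 1500), so pooling everything
# into one histogram mixes two different distributions and tells us nothing
# about "the" distribution of the data.
```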
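And as a concrete sketch of the automatic lambda selection mentioned earlier (assuming the lecture means scipy.stats.boxcox, which is the standard implementation; the toy data here is mine):

```python
import numpy as np
from scipy import stats

# Box-Cox requires strictly positive input; use a toy exponential-growth series.
t = np.arange(1, 101, dtype=float)
series = np.exp(0.05 * t)

# With no lambda given, SciPy estimates it by maximum likelihood and
# returns it alongside the transformed data.
transformed, lmbda = stats.boxcox(series)
print(f"chosen lambda: {lmbda:.3f}")  # near 0 here, i.e. close to a log transform
```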
The final topic I want to discuss in this lecture is why the log transform is deeply fundamental. Not only is it useful mathematically, but it also seems to be part of nature itself. One example of this is perception. For example, although a normal conversation is ten thousand times louder than a whisper, it doesn't have ten thousand times the effect on your senses. That's why we use the decibel scale to measure sound, which is essentially a log transform.

Another example of how the logarithm seems to simply be a part of nature is how we as humans interpret numbers. For example, if you have one thousand dollars in the bank, then losing one thousand dollars would be a pretty big deal. But if you have one billion dollars in the bank, spending one thousand dollars on a pair of jeans would feel completely normal.

Another way to think of this is to imagine going from zero dollars in wealth to one million. That's a pretty big jump. How about one million to two million? Although you still made the same amount of money, its utility is less. So one might model the utility of wealth as the logarithm of the wealth.
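The diminishing-utility point is easy to check with the log itself (a quick illustration of my own, assuming log utility):

```python
import numpy as np

# Equal one-million-dollar jumps in wealth...
wealth = np.array([1e6, 2e6, 3e6, 4e6])

# ...but each successive jump adds less log utility.
print(np.diff(np.log(wealth)))  # [0.693 0.405 0.288]
```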
