1
00:00:11,130 --> 00:00:17,160
OK, so in this lecture, we are going to look at how to implement the naive forecast as well as evaluate
2
00:00:17,160 --> 00:00:19,110
it using the metrics we learned about.
3
00:00:20,810 --> 00:00:23,690
OK, so let's start by importing NumPy and pandas.
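A minimal sketch of that import cell (only NumPy and pandas are mentioned at this point):

```python
# Standard imports used throughout this notebook
import numpy as np
import pandas as pd
```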
4
00:00:29,950 --> 00:00:35,470
The next step is to upgrade scikit-learn. The reason for this is that the current version of scikit-learn
5
00:00:35,470 --> 00:00:38,640
installed in Google Colab does not have the MAPE metric.
6
00:00:39,070 --> 00:00:43,570
So as I frequently mentioned in my courses, machine learning is a field that moves fast.
7
00:00:43,870 --> 00:00:45,610
This kind of thing is completely normal.
8
00:00:45,610 --> 00:00:47,070
Libraries change all the time.
9
00:00:53,910 --> 00:01:00,420
The next step is to import our metrics from scikit-learn, so we have the MAPE, the MAE, the R-squared
10
00:01:00,420 --> 00:01:01,330
and the MSE.
11
00:01:02,190 --> 00:01:05,880
You'll notice that this does not include the RMSE nor the SMAPE.
12
00:01:06,180 --> 00:01:07,710
We'll see how to deal with those later.
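That import cell would look roughly like this; the exact import list is an assumption based on the four metrics named above:

```python
# Import the metrics discussed: MAPE, MAE, R^2, and MSE.
# RMSE and SMAPE are deliberately absent -- sklearn's RMSE handling
# varies by version, and SMAPE has no sklearn implementation at all.
from sklearn.metrics import (
    mean_absolute_percentage_error,
    mean_absolute_error,
    r2_score,
    mean_squared_error,
)
```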
13
00:01:12,930 --> 00:01:19,050
So for this exercise, we'll be using prices from the S&P 500, which can be downloaded from my website.
14
00:01:25,750 --> 00:01:30,730
The next step is to call pd.read_csv in order to get a data frame of our data.
15
00:01:34,000 --> 00:01:38,740
As always, I'd like to do a df.head to get some sense for the data we're working with.
16
00:01:42,970 --> 00:01:47,070
So we can see all the expected columns open, high, low and so forth.
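A sketch of these steps; the real notebook calls pd.read_csv on the file from the course website, so the tiny inline frame below is just a stand-in with the same column layout:

```python
import pandas as pd

# Stand-in for pd.read_csv on the course's S&P 500 file -- we fabricate
# a few rows with the same open/high/low/close columns.
df = pd.DataFrame({
    'open':  [100.0, 101.5, 102.0, 101.0],
    'high':  [101.0, 102.5, 103.0, 103.5],
    'low':   [ 99.5, 100.5, 101.0, 100.5],
    'close': [100.5, 102.0, 101.5, 103.0],
})
print(df.head())
```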
17
00:01:51,070 --> 00:01:56,980
The next step is to generate our predictions, so basically the naive forecast is a forecast where we
18
00:01:57,130 --> 00:02:03,370
simply predict the previous value. We can accomplish this by calling the shift function on the close
19
00:02:03,370 --> 00:02:03,580
column.
20
00:02:03,580 --> 00:02:06,580
And we'll call this new column close prediction.
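In code, roughly, with stand-in prices (the real ones come from the CSV):

```python
import pandas as pd

# Stand-in close prices; the real ones come from the course CSV.
df = pd.DataFrame({'close': [100.5, 102.0, 101.5, 103.0]})

# Naive forecast: each prediction is simply the previous day's close.
df['close_prediction'] = df['close'].shift(1)
print(df.head())
```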
21
00:02:10,470 --> 00:02:14,550
The next step is to call df.head once again to see our new column.
22
00:02:19,930 --> 00:02:25,750
Notice that the first row now contains NaN (not a number), since, of course, there is no previous value for
23
00:02:25,750 --> 00:02:26,560
the first row.
24
00:02:30,150 --> 00:02:36,030
OK, so for convenience, we're going to assign the true close prices to a variable called y_true and
25
00:02:36,030 --> 00:02:39,210
the predicted close prices to a variable called y_pred.
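Sketched with the stand-in data; dropping the NaN row first is an assumption on my part, since the scikit-learn metric functions reject NaN inputs:

```python
import pandas as pd

df = pd.DataFrame({'close': [100.5, 102.0, 101.5, 103.0]})
df['close_prediction'] = df['close'].shift(1)

# Drop the first row (its prediction is NaN), then use the same
# argument names that scikit-learn uses.
valid = df.dropna()
y_true = valid['close']
y_pred = valid['close_prediction']
```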
26
00:02:40,140 --> 00:02:42,470
This is what these arguments are called inside scikit-learn.
27
00:02:42,660 --> 00:02:45,720
So I felt it would be appropriate to call them the same thing.
28
00:02:51,380 --> 00:02:56,810
OK, so the next portion of this notebook will be to look at our metrics. The main purpose of this
29
00:02:56,810 --> 00:03:01,070
is not just to understand how to do this in code, since that's pretty easy.
30
00:03:01,460 --> 00:03:05,070
But I want you to pay attention to how these values relate to each other.
31
00:03:05,540 --> 00:03:09,590
Think about what would be considered a good value and what would be considered bad.
32
00:03:12,410 --> 00:03:17,090
OK, so let's start with the sum of squared errors. Since there's no function for this, we're going
33
00:03:17,090 --> 00:03:18,630
to calculate it ourselves.
34
00:03:19,280 --> 00:03:24,920
So since y_true and y_pred are effectively one-dimensional arrays, we can just take the difference
35
00:03:24,920 --> 00:03:27,460
and do a dot product with the same difference.
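A sketch with small stand-in arrays (the notebook's real data is what gives the roughly-6000 figure mentioned next):

```python
import numpy as np

# Stand-in arrays; in the notebook these come from the data frame.
y_true = np.array([102.0, 101.5, 103.0])
y_pred = np.array([100.5, 102.0, 101.5])

# Sum of squared errors: dot the residual vector with itself.
residual = y_true - y_pred
sse = residual.dot(residual)
print(sse)
```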
36
00:03:30,640 --> 00:03:32,990
OK, so the result is about 6000.
37
00:03:33,430 --> 00:03:36,490
Of course, we don't necessarily know whether this is bad or good.
38
00:03:36,490 --> 00:03:37,630
It's just a number.
39
00:03:39,560 --> 00:03:44,090
The next step is to calculate the mean squared error, where we use our scikit-learn function.
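With the same stand-in arrays, the call looks like this:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([102.0, 101.5, 103.0])
y_pred = np.array([100.5, 102.0, 101.5])

# MSE: the sum of squared errors divided by the number of points.
mse = mean_squared_error(y_true, y_pred)
print(mse)
```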
40
00:03:46,620 --> 00:03:52,410
OK, so this is the result, as you can see, this brings the number down to a more reasonable range.
41
00:03:55,090 --> 00:03:59,180
Now, as Python coders, we can't be afraid of implementing things ourselves.
42
00:03:59,590 --> 00:04:03,460
Some students get absolutely frightened when they see that you're not using a library.
43
00:04:03,610 --> 00:04:06,230
But I urge all students not to take this approach.
44
00:04:06,670 --> 00:04:09,090
In fact, implementing the MSE is trivial.
45
00:04:09,400 --> 00:04:13,840
It's just what we had before, divided by the length of either y_true or y_pred.
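That is, with the stand-in arrays:

```python
import numpy as np

y_true = np.array([102.0, 101.5, 103.0])
y_pred = np.array([100.5, 102.0, 101.5])

residual = y_true - y_pred
# The SSE from before, divided by the number of points.
mse = residual.dot(residual) / len(y_true)
print(mse)
```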
46
00:04:17,010 --> 00:04:19,850
OK, and so we get the same answer as expected.
47
00:04:22,930 --> 00:04:28,750
The next step is to calculate the root mean squared error. Now, surprisingly, this is done by the mean
48
00:04:28,750 --> 00:04:32,900
squared error function, where you can pass in the argument squared equal to False.
49
00:04:33,370 --> 00:04:34,270
So let's try that.
50
00:04:37,050 --> 00:04:40,370
OK, so we get about one point six, seven, which makes sense.
51
00:04:43,430 --> 00:04:48,230
And of course, we can just take the square root of our previous calculation, so let's try that also.
52
00:04:50,650 --> 00:04:52,900
And we get the same answer as expected.
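Both routes, sketched with the stand-in arrays. Note that squared=False works in the scikit-learn version used here, but later versions removed that argument in favor of a separate root_mean_squared_error function, so taking the square root yourself is the more portable form:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([102.0, 101.5, 103.0])
y_pred = np.array([100.5, 102.0, 101.5])

# Older sklearn: mean_squared_error(y_true, y_pred, squared=False).
# Portable alternative: take the square root of the MSE ourselves.
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse)
```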
53
00:04:55,850 --> 00:05:00,950
The next step is to calculate the mean absolute error. Since we have a scikit-learn function for
54
00:05:00,950 --> 00:05:02,840
this, we are going to make use of it.
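With the stand-in arrays, that call is:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([102.0, 101.5, 103.0])
y_pred = np.array([100.5, 102.0, 101.5])

# MAE: the average of the absolute residuals.
mae = mean_absolute_error(y_true, y_pred)
print(mae)
```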
55
00:05:06,020 --> 00:05:11,270
OK, so you can see that although the root mean square error and the mean absolute error have the same
56
00:05:11,270 --> 00:05:13,880
units, they do not give you the same value.
57
00:05:17,070 --> 00:05:21,560
Now, we know that all the previous metrics we've seen are scale-dependent.
58
00:05:22,140 --> 00:05:26,990
So the next step is to look at the R-squared, which does not depend on the scale of the data.
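A sketch using a synthetic trending series as a stand-in for the real close prices; the strong trend is what lets the naive forecast score so highly:

```python
import numpy as np
from sklearn.metrics import r2_score

# Synthetic trending 'prices' stand in for the real close column.
close = 100.0 + np.arange(100) + np.sin(np.arange(100))
y_true, y_pred = close[1:], close[:-1]   # naive forecast

r2 = r2_score(y_true, y_pred)
print(r2)  # close to 1: consecutive prices barely differ
```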
59
00:05:30,660 --> 00:05:35,820
OK, so this should be surprising. As you recall, we said that the best R-squared is one.
60
00:05:36,240 --> 00:05:39,670
Whereas the R-squared of simply predicting the mean value is zero.
61
00:05:40,770 --> 00:05:45,550
It turns out that our naive forecast gets an R-squared of zero point nine nine nine.
62
00:05:46,080 --> 00:05:51,300
This kind of makes sense since stock prices don't vary that wildly from one day to the next.
63
00:05:51,690 --> 00:05:56,260
So predicting the last value in the series should give us pretty good predictions.
64
00:05:56,820 --> 00:06:01,800
However, in another sense, these are also very bad predictions because they are really just the dumbest
65
00:06:01,800 --> 00:06:02,840
predictions possible.
66
00:06:03,510 --> 00:06:09,120
So let this be a lesson that if you see a model that happens to predict stock prices very well, don't
67
00:06:09,120 --> 00:06:11,430
assume that such a model is actually useful.
68
00:06:14,840 --> 00:06:19,130
The next step is to compute the MAPE, which is another scale-invariant metric.
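Sketched with the same synthetic stand-in series:

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Synthetic trending 'prices' stand in for the real close column.
close = 100.0 + np.arange(100) + np.sin(np.arange(100))
y_true, y_pred = close[1:], close[:-1]   # naive forecast

# Note: sklearn returns a fraction, not a percentage.
mape = mean_absolute_percentage_error(y_true, y_pred)
print(mape)  # near zero: errors are tiny relative to the price level
```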
69
00:06:22,120 --> 00:06:26,620
OK, so this is nearly zero, which makes sense since the R-squared is nearly one.
70
00:06:30,110 --> 00:06:32,090
The next step is to compute the SMAPE.
71
00:06:32,600 --> 00:06:38,240
Now you'll notice that I didn't import this function from scikit-learn. This is because no such function
72
00:06:38,240 --> 00:06:38,850
exists.
73
00:06:39,290 --> 00:06:42,800
So this is the power of being able to implement these things yourself.
74
00:06:43,190 --> 00:06:48,320
Someone who does not have these skills would probably start by going to Google and then checking stack
75
00:06:48,320 --> 00:06:49,670
overflow and so forth.
76
00:06:49,910 --> 00:06:54,500
They might end up wasting their whole day trying to figure out where the function for SMAPE is.
77
00:06:54,890 --> 00:06:59,840
But when you do have these skills, you're able to get this done in just a few lines of code and a few
78
00:06:59,840 --> 00:07:00,950
seconds of effort.
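A minimal SMAPE implementation along those lines; the exact formula variant (averaging the absolute values in the denominator) is the common definition but is an assumption here:

```python
import numpy as np

def smape(y_true, y_pred):
    # Symmetric MAPE: divide each absolute error by the average
    # of |y_true| and |y_pred| instead of |y_true| alone.
    numerator = np.abs(y_true - y_pred)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2
    return np.mean(numerator / denominator)

# Synthetic trending 'prices' stand in for the real close column.
close = 100.0 + np.arange(100) + np.sin(np.arange(100))
y_true, y_pred = close[1:], close[:-1]   # naive forecast
print(smape(y_true, y_pred))
```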
79
00:07:04,480 --> 00:07:09,430
OK, and we see that the result is pretty close to the non-symmetric MAPE, which makes sense.