0
00:00:00,940 --> 00:00:02,470
[Autogenerated] So here's where we're at.
1
00:00:02,470 --> 00:00:04,230
We know that we need to split our data
2
00:00:04,230 --> 00:00:07,120
into training, validation, and test data.
3
00:00:07,120 --> 00:00:09,039
We'll produce N candidate models for
4
00:00:09,039 --> 00:00:11,220
running N training and validation
5
00:00:11,220 --> 00:00:13,970
processes, but just one test process to
6
00:00:13,970 --> 00:00:15,779
evaluate the final model that we have
7
00:00:15,779 --> 00:00:19,039
chosen. This is referred to as singular
8
00:00:19,039 --> 00:00:22,030
cross-validation: you split the original
9
00:00:22,030 --> 00:00:25,309
data into training, test, and a single
10
00:00:25,309 --> 00:00:28,629
validation set. Let's visually see how we
11
00:00:28,629 --> 00:00:31,190
use these three subsets of our data to
12
00:00:31,190 --> 00:00:33,600
get the best possible model. We train
13
00:00:33,600 --> 00:00:35,500
the different candidate models on the
14
00:00:35,500 --> 00:00:37,390
training data and evaluate them on the
15
00:00:37,390 --> 00:00:39,969
validation data. This process is called
16
00:00:39,969 --> 00:00:42,700
hyperparameter tuning. Each candidate
17
00:00:42,700 --> 00:00:44,840
model will have different design
18
00:00:44,840 --> 00:00:46,840
parameters. You're trying to figure out
19
00:00:46,840 --> 00:00:49,340
which design of your model works well for
20
00:00:49,340 --> 00:00:52,179
your data. And finally, after you've used
21
00:00:52,179 --> 00:00:54,520
hyperparameter tuning to find the best
22
00:00:54,520 --> 00:00:57,369
design for your model, you'll do a final
23
00:00:57,369 --> 00:00:59,820
evaluation on the test data, so you know
24
00:00:59,820 --> 00:01:02,479
this is how your model performs. The use
25
00:01:02,479 --> 00:01:05,040
of a holdout validation set is a huge
26
00:01:05,040 --> 00:01:06,790
improvement over what we were doing
27
00:01:06,790 --> 00:01:09,209
earlier. However, there is still a problem.
28
00:01:09,209 --> 00:01:11,030
The model's performance on the validation
29
00:01:11,030 --> 00:01:14,030
set gets incorporated into the model
30
00:01:14,030 --> 00:01:17,890
itself, and this may introduce bias. So
31
00:01:17,890 --> 00:01:21,180
the validation set data becomes part of the
32
00:01:21,180 --> 00:01:23,939
model's design, and that's not good.
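The singular (holdout) workflow described so far can be sketched in code. This is a minimal illustration, assuming scikit-learn is available; the synthetic data and logistic-regression candidates are purely hypothetical stand-ins for your own candidate models:

```python
# Sketch of singular (holdout) cross-validation, assuming scikit-learn;
# the data and candidate models here are illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# One split into training, validation, and test data.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0)

# N candidate models: here, the same estimator with different design
# parameters (the regularization strength C).
candidates = {c: LogisticRegression(C=c) for c in (0.01, 0.1, 1.0, 10.0)}

# Hyperparameter tuning: train on training data, score on validation data.
val_scores = {}
for c, model in candidates.items():
    model.fit(X_train, y_train)
    val_scores[c] = model.score(X_val, y_val)
best_c = max(val_scores, key=val_scores.get)

# One final evaluation of the chosen model on the test data.
test_score = candidates[best_c].score(X_test, y_test)
print(best_c, round(test_score, 3))
```

Note that the same validation set scores every candidate, which is exactly the source of the bias being discussed.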
33
00:01:23,939 --> 00:01:25,819
What we're trying to get is a model that
34
00:01:25,819 --> 00:01:28,760
is as robust as we can make it, which is
35
00:01:28,760 --> 00:01:31,540
why an alternative to using singular cross
36
00:01:31,540 --> 00:01:34,840
validation is k-fold cross-validation.
37
00:01:34,840 --> 00:01:36,719
Here, you don't have a single set of
38
00:01:36,719 --> 00:01:39,180
validation data. To generate each candidate
39
00:01:39,180 --> 00:01:41,200
model, you repeatedly train and
40
00:01:41,200 --> 00:01:43,250
validate using different subsets of
41
00:01:43,250 --> 00:01:46,189
training data. Now, this might not seem
42
00:01:46,189 --> 00:01:48,239
intuitive to you at first, but we'll see
43
00:01:48,239 --> 00:01:49,810
it visually and you'll understand what's
44
00:01:49,810 --> 00:01:52,510
going on. K-fold cross-validation
45
00:01:52,510 --> 00:01:54,299
tends to be very computationally
46
00:01:54,299 --> 00:01:57,620
intensive but very robust. It does not
47
00:01:57,620 --> 00:02:00,019
waste data. All of the data is used
48
00:02:00,019 --> 00:02:03,109
well to generate a good model. Let's
49
00:02:03,109 --> 00:02:05,140
visually understand how k fold cross
50
00:02:05,140 --> 00:02:07,599
validation works. You have all of the data
51
00:02:07,599 --> 00:02:09,120
available to you in the real world. You
52
00:02:09,120 --> 00:02:12,020
split it into training data and test
53
00:02:12,020 --> 00:02:14,349
data. Test data is what you use to
54
00:02:14,349 --> 00:02:16,780
perform a final evaluation on the model.
55
00:02:16,780 --> 00:02:19,159
Now, instead of using the same validation
56
00:02:19,159 --> 00:02:21,349
data to evaluate different candidate
57
00:02:21,349 --> 00:02:23,560
models, you'll split your training
58
00:02:23,560 --> 00:02:25,960
data into different folds. Here I have
59
00:02:25,960 --> 00:02:28,419
five folds. This is five-fold cross
60
00:02:28,419 --> 00:02:31,219
validation. With five-fold cross-validation,
61
00:02:31,219 --> 00:02:33,830
for each candidate model, you'll train
62
00:02:33,830 --> 00:02:37,240
your model five times. The first time, folds
63
00:02:37,240 --> 00:02:40,039
2, 3, 4, and 5 will be the training data.
64
00:02:40,039 --> 00:02:42,740
Fold 1 will be the validation data.
65
00:02:42,740 --> 00:02:44,860
You'll then train the same candidate
66
00:02:44,860 --> 00:02:47,569
model with a different subset of training
67
00:02:47,569 --> 00:02:50,819
data. Folds 1, 3, 4, and 5 comprise the
68
00:02:50,819 --> 00:02:53,110
training data; fold 2 is the validation
69
00:02:53,110 --> 00:02:56,139
data. You'll then do a third round of training
70
00:02:56,139 --> 00:02:58,240
for the same candidate model. This time,
71
00:02:58,240 --> 00:03:00,840
fold 3 is the validation data and the
72
00:03:00,840 --> 00:03:03,270
remaining folds make up your training data,
73
00:03:03,270 --> 00:03:05,460
and you'll continue this for split four
74
00:03:05,460 --> 00:03:08,629
and split five as well. So when you use
75
00:03:08,629 --> 00:03:11,219
five-fold cross-validation for a single
76
00:03:11,219 --> 00:03:13,969
candidate model, you've run five training
77
00:03:13,969 --> 00:03:17,439
processes and five validation processes.
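The five train/validation splits just described can be generated mechanically. Here's a small sketch, assuming scikit-learn's `KFold` (the toy array stands in for your training data, with the test data already held out):

```python
# A minimal sketch of producing five folds, assuming scikit-learn.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # toy training data (test data held out)

kf = KFold(n_splits=5)
for i, (train_idx, val_idx) in enumerate(kf.split(X), start=1):
    # In split i, fold i is the validation data; the remaining
    # four folds make up the training data.
    print(f"split {i}: train rows {train_idx}, validate rows {val_idx}")
```

In the first split, fold 1 is the validation data and folds 2 through 5 are the training data, matching the walkthrough above.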
78
00:03:17,439 --> 00:03:19,800
Training and validation are run on each
79
00:03:19,800 --> 00:03:23,030
fold of your training data. Once you run
80
00:03:23,030 --> 00:03:25,060
these five different training and
81
00:03:25,060 --> 00:03:27,509
validation processes, you average the
82
00:03:27,509 --> 00:03:29,699
performance of this candidate model
83
00:03:29,699 --> 00:03:32,639
across all folds, so you'll get one
84
00:03:32,639 --> 00:03:35,680
average score. And for this particular
85
00:03:35,680 --> 00:03:38,020
candidate model, this average performance
86
00:03:38,020 --> 00:03:40,650
score is what you'll use to find the best
87
00:03:40,650 --> 00:03:43,250
candidate model: which candidate model has
88
00:03:43,250 --> 00:03:45,759
the best average performance score across
89
00:03:45,759 --> 00:03:48,259
all folds of training and validation?
90
00:03:48,259 --> 00:03:50,110
Once you've trained all of your candidate
91
00:03:50,110 --> 00:03:52,610
models on all of these folds and averaged
92
00:03:52,610 --> 00:03:54,740
their performance scores, you'll take the
93
00:03:54,740 --> 00:03:57,569
best one that you found and evaluate it on the
94
00:03:57,569 --> 00:04:00,990
test data. Thus, with k-fold cross
95
00:04:00,990 --> 00:04:03,520
validation, since the validation data
96
00:04:03,520 --> 00:04:06,770
changes in each fold of training, it's
97
00:04:06,770 --> 00:04:08,840
impossible for the information in the
98
00:04:08,840 --> 00:04:13,000
validation data to become incorporated as part of the model.
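The full selection procedure described in this clip can be sketched end to end. This is an illustration assuming scikit-learn, with hypothetical candidates and synthetic data; `cross_val_score` runs the five training and five validation processes per candidate and returns the per-fold scores to average:

```python
# Sketch of k-fold model selection, assuming scikit-learn; the data
# and candidate models are illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# First split off the test data for the one final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# For each candidate, run five training and five validation processes,
# then average the performance score across all folds.
avg_scores = {}
for c in (0.01, 0.1, 1.0, 10.0):
    scores = cross_val_score(LogisticRegression(C=c), X_train, y_train, cv=5)
    avg_scores[c] = scores.mean()

# The best average score picks the candidate; the chosen model is then
# retrained on all training data and evaluated once on the test data.
best_c = max(avg_scores, key=avg_scores.get)
final_model = LogisticRegression(C=best_c).fit(X_train, y_train)
print(best_c, round(final_model.score(X_test, y_test), 3))
```

Because each candidate is validated on five different folds, no single validation set can leak into the model's design.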