English subtitles for 05_regularized-logistic-regression.en

In this video, you'll see how to implement regularized logistic regression. Just as the gradient update for logistic regression seemed surprisingly similar to the gradient update for linear regression, you'll find that the gradient descent update for regularized logistic regression also looks similar to the update for regularized linear regression. Let's take a look.

Here is the idea. We saw earlier that logistic regression can be prone to overfitting if you fit it with very high order polynomial features like this. Here, z is a high order polynomial that gets passed into the sigmoid function like so to compute f. In particular, you can end up with a decision boundary that is overly complex and overfits the training set. More generally, when you train logistic regression with a lot of features, whether polynomial features or some other features, there can be a higher risk of overfitting.

This was the cost function for logistic regression. If you want to modify it to use regularization, all you need to do is add the following term: lambda, the regularization parameter, over 2m, times the sum from j equals 1 through n (where n is, as usual, the number of features) of wj squared. When you minimize this cost function as a function of w and b, it has the effect of penalizing the parameters w_1, w_2 through w_n and preventing them from becoming too large. If you do this, then even though you're fitting a high order polynomial with a lot of parameters, you still get a decision boundary that looks like this: something more reasonable for separating positive and negative examples, while also, hopefully, generalizing to new examples not in the training set. That is what regularization gives you, even when you have a lot of features. So how can you actually implement this? How can you minimize this cost function J of w, b that includes the regularization term?
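To make that concrete, here is a minimal NumPy sketch of the regularized cost just described. The function name, the `lambda_` keyword, and the array shapes are illustrative choices for this sketch, not the optional lab's actual code; the substantive points are that f comes from the sigmoid of z = w·x + b and that b is left out of the regularization sum.

```python
import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost_logistic_reg(X, y, w, b, lambda_=1.0):
    # X: (m, n) features, y: (m,) labels in {0, 1}, w: (n,) weights, b: scalar bias
    m = X.shape[0]
    f = sigmoid(X @ w + b)                       # predictions f_wb(x) for all m examples
    cost = -np.mean(y * np.log(f) + (1 - y) * np.log(1 - f))
    reg = (lambda_ / (2 * m)) * np.sum(w ** 2)   # penalizes w_1..w_n; b is not regularized
    return cost + reg
```

Passing a larger `lambda_` puts more weight on keeping the parameters small, which is exactly the trade-off described above.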
Well, let's use gradient descent as before. Here's the cost function that you want to minimize. To implement gradient descent, as before, we'll carry out the following simultaneous updates of wj and b. These are the usual update rules for gradient descent. Just like regularized linear regression, when you compute these derivative terms, the only thing that changes is that the derivative with respect to wj gets this additional term, lambda over m times wj, added at the end. Again, this looks a lot like the update for regularized linear regression. In fact, it is the exact same equation, except that the definition of f is no longer the linear function; it is the logistic function applied to z. Similar to linear regression, we regularize only the parameters wj, but not the parameter b, which is why there is no change to the update you make for b.
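Here is a corresponding sketch of those simultaneous updates, reusing `np` and `sigmoid` from the previous snippet. Again, the function names, the learning rate `alpha`, and the default values are illustrative, not the lab's exact implementation; the one line that differs from regularized linear regression is that the error term uses the sigmoid of z.

```python
def compute_gradient_logistic_reg(X, y, w, b, lambda_=1.0):
    # dJ/dw_j = (1/m) * sum_i (f(x_i) - y_i) * x_ij + (lambda/m) * w_j
    # dJ/db   = (1/m) * sum_i (f(x_i) - y_i)        (no regularization term for b)
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y          # only change vs. linear regression: f is the sigmoid of z
    dj_dw = (X.T @ err) / m + (lambda_ / m) * w
    dj_db = np.mean(err)
    return dj_dw, dj_db

def gradient_descent(X, y, w, b, alpha=0.1, lambda_=1.0, num_iters=1000):
    # Simultaneous updates: both gradients are computed from the current (w, b)
    # before either parameter is changed.
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient_logistic_reg(X, y, w, b, lambda_)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b
```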
In the final optional lab of this week, you'll revisit overfitting. In the interactive plot in the optional lab, you can now choose to regularize your models, both regression and classification, by enabling regularization during gradient descent and selecting a value for lambda. Please take a look at the code for implementing regularized logistic regression in particular, because you'll implement this yourself in the practice lab at the end of this week.

Now you know how to implement regularized logistic regression. When I walk around Silicon Valley, there are many engineers using machine learning to create a ton of value, sometimes making a lot of money for their companies. I know you've only been studying this material for a few weeks, but if you understand and can apply linear regression and logistic regression, that's actually all you need to create some very valuable applications. While the specific learning algorithms you use are important, knowing things like when and how to reduce overfitting turns out to be one of the very valuable skills in the real world as well.

I want to say congratulations on how far you've come, and great job for getting all the way to the end of this video. I hope you'll also work through the practice labs and quizzes. Having said that, there are still many more exciting things to learn. In the second course of this specialization, you'll learn about neural networks, also called deep learning algorithms. Neural networks are responsible for many of the latest breakthroughs in AI today, from practical speech recognition to computers accurately recognizing objects in images, to self-driving cars. The way neural networks get built actually uses a lot of what you've already learned, like cost functions, gradient descent, and sigmoid functions. Again, congratulations on reaching the end of this third and final week of Course 1. I hope you have [inaudible], and I will see you in next week's material on neural networks.
