All language subtitles for 005 Haar-like Features-en

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese Download
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,600 --> 00:00:05,820 Hello welcome back to the course on computer vision we talk about the horror like features the name 2 00:00:05,880 --> 00:00:14,970 Haar comes from Alfred Haar who was a Hungarian mathematician in the 19th century and he developed a 3 00:00:15,240 --> 00:00:24,870 mathematical concept called the haar wavelet which is a basis of way for functions which act similar 4 00:00:24,870 --> 00:00:30,900 to the four year transformation but we're not going to get into detail on that right now because we 5 00:00:30,900 --> 00:00:35,790 don't really need that we just need the horror like features which have since been derived from the 6 00:00:35,790 --> 00:00:42,840 hard wavelet and our that's why they called Horo like features they just kind of like the descendants 7 00:00:42,870 --> 00:00:48,650 of the haar wavelet and so we're going to look at the HARLICK features straight away. 8 00:00:48,840 --> 00:00:50,840 So what are these features. 9 00:00:51,060 --> 00:00:52,520 They are. 10 00:00:52,530 --> 00:00:53,340 Let's first look at them. 11 00:00:53,370 --> 00:00:59,010 They are the edge features the line features and the four rectangle features these are the Horlick features 12 00:00:59,010 --> 00:01:04,460 that Viola Jones male and Jones identified in their research in their paper. 13 00:01:04,590 --> 00:01:07,590 And so what are they. 14 00:01:07,630 --> 00:01:11,220 What does it mean to be a feature. 15 00:01:11,220 --> 00:01:14,100 Basically what we have is pixels. 16 00:01:14,100 --> 00:01:20,220 So on an image you might have somewhere for example like a edge. 17 00:01:20,250 --> 00:01:25,200 Right so an edge of feature like for instance an edge of an image for example just imagine like a photo 18 00:01:25,200 --> 00:01:26,580 of a table or something. 19 00:01:26,700 --> 00:01:33,990 Then part of the table where you can see the table top might be light and or lighter than the part where 20 00:01:33,990 --> 00:01:39,360 the table ends might be darker because it's further away it's the floor or the carpet might be a different 21 00:01:39,360 --> 00:01:40,320 color or something. 22 00:01:40,500 --> 00:01:48,120 And so you can identify that that's the end of the table because you know there's a line of pixels which 23 00:01:48,120 --> 00:01:55,920 are brighter line of pixels which are dark in terms of the age in terms of line features you might you 24 00:01:55,920 --> 00:02:01,620 know might identify some objects and it doesn't have to be just like one line of pixel this could be 25 00:02:01,620 --> 00:02:06,030 like three columns of pixels and then you know 15 rows of pixels. 26 00:02:06,270 --> 00:02:08,210 So these are scalable. 27 00:02:08,520 --> 00:02:11,680 They can be longer or they can be wider and so on. 28 00:02:11,850 --> 00:02:17,160 So let's have a look at how all of these work out in the in terms of a face. 29 00:02:17,260 --> 00:02:17,460 Right. 30 00:02:17,460 --> 00:02:23,760 So here we've got our friend J.D. and let's see if we can identify some Horlick features here and that 31 00:02:23,790 --> 00:02:27,840 will help us understand better why exactly they were picked. 32 00:02:27,840 --> 00:02:31,040 What's a role what purpose they serve. 33 00:02:31,230 --> 00:02:32,310 So have a look here. 34 00:02:32,310 --> 00:02:39,250 So if you look at his mouth you can see here there is a tiny tiny tiny blackline then. 35 00:02:39,290 --> 00:02:43,530 So that line that part is darker than the lips. 36 00:02:43,530 --> 00:02:46,130 So that's a horror like feature. 37 00:02:46,230 --> 00:02:52,220 You can see that that's a feature that's a line feature because that part is darker than that. 38 00:02:52,230 --> 00:02:53,630 Those two parts bottom and top. 39 00:02:53,640 --> 00:03:03,370 And this is all very rough very very very kind of rough approach to this image but nevertheless it works. 40 00:03:03,540 --> 00:03:04,570 And you will see why it works. 41 00:03:04,580 --> 00:03:08,420 Further down as we talk about things like the integral image and so on. 42 00:03:08,430 --> 00:03:13,890 But for now let's just let's just stick to this bar so we kind of like breaking out these features we're 43 00:03:13,890 --> 00:03:15,430 trying to see them in this image. 44 00:03:15,480 --> 00:03:17,040 So what else can we see here. 45 00:03:17,070 --> 00:03:19,560 Can you see any other of the features that we discussed. 46 00:03:19,680 --> 00:03:22,300 If you if we go back we've got the edge features. 47 00:03:22,440 --> 00:03:25,760 So we just identified this one we just identified a line feature. 48 00:03:26,070 --> 00:03:32,450 Can we find maybe an age feature in several There's a line feature they'll look at the eyebrow you can 49 00:03:32,450 --> 00:03:41,390 see that the eyebrow is darker than the forehead so let's put a feature there so you can see that that 50 00:03:41,570 --> 00:03:46,700 signifies the edge of the forehead that line that eyebrow. 51 00:03:46,910 --> 00:03:59,000 And so therefore the hard feature of an edge can serve a as a concern to show where this break this 52 00:03:59,330 --> 00:04:00,510 division happens. 53 00:04:00,860 --> 00:04:05,600 And that's normally going to be the case for in most photos that you process. 54 00:04:05,840 --> 00:04:09,980 Most of the time the eyebrow is going to be darker in the face. 55 00:04:10,060 --> 00:04:15,770 So for you know some some rare instances or some non frequent instances in most cases are going to be 56 00:04:15,770 --> 00:04:19,750 like this and we'll I will talk about how all of these features were together. 57 00:04:19,850 --> 00:04:25,430 So it doesn't have to be like every single time it's is like this but they're going to have a kind of 58 00:04:25,430 --> 00:04:30,020 like a voting system that the more features the more likely it is a face. 59 00:04:30,140 --> 00:04:31,300 Let's have a look at some more. 60 00:04:31,520 --> 00:04:35,350 What can we see on for example on the nose. 61 00:04:35,450 --> 00:04:40,040 Can we see that part of the nose is brighter than the other part. 62 00:04:40,040 --> 00:04:42,400 So you can see here. 63 00:04:42,930 --> 00:04:44,780 Let's see on the left here. 64 00:04:44,800 --> 00:04:45,740 The nose is brighter. 65 00:04:45,740 --> 00:04:48,660 The bridge of the nose is brighter than this part. 66 00:04:48,710 --> 00:04:51,130 And that's normally going to be the case. 67 00:04:51,170 --> 00:04:57,560 So if you look at pretty much any photo in grayscale you will often find that the bridge of the nose 68 00:04:57,710 --> 00:05:02,240 is brighter than either just one side or even both sides. 69 00:05:02,240 --> 00:05:03,110 It depends on the lighting. 70 00:05:03,110 --> 00:05:08,360 Sometimes you might find here like a line type of feature so this part is wide and then you've got dark 71 00:05:08,780 --> 00:05:09,330 and dark. 72 00:05:09,350 --> 00:05:14,390 That's just how the human face reflects light in this case the lights coming from the left. 73 00:05:14,480 --> 00:05:17,500 And that's why the left part is a bit brighter overall. 74 00:05:17,540 --> 00:05:18,360 The right parts are. 75 00:05:18,390 --> 00:05:20,230 But nevertheless you can see this edge. 76 00:05:20,270 --> 00:05:23,160 So here you are nobody have a line or an edge feature. 77 00:05:23,390 --> 00:05:25,880 That's exactly what we're drawing here. 78 00:05:25,880 --> 00:05:28,060 There's our edge feature. 79 00:05:28,840 --> 00:05:30,050 And then let's have a look another one. 80 00:05:30,050 --> 00:05:36,410 You can see the eye for instance because though the outside outside of the outer part of the eye is 81 00:05:36,420 --> 00:05:39,350 white and the inner part is not white. 82 00:05:39,500 --> 00:05:45,740 It means it's dark or so it doesn't have to be like black and white it's black and white would be ideal. 83 00:05:45,740 --> 00:05:50,540 So for instance the eyebrow in the forehead in this case are very very close to black and white and 84 00:05:50,600 --> 00:05:52,540 because of the eye it's less close. 85 00:05:52,550 --> 00:05:58,460 But nevertheless it still works you can see that you can identify that feature inside the eye as you 86 00:05:58,460 --> 00:06:03,020 could could go keep going like that you could find another one for the other eye than eyebrow. 87 00:06:03,200 --> 00:06:08,060 You could you just keep finding these features and some of them would be dependent on the lighting some 88 00:06:08,060 --> 00:06:10,550 of them would be dependent on the person inside. 89 00:06:10,570 --> 00:06:19,580 But overall you'd have kind of like a set of features that are predominantly present on most human faces 90 00:06:19,580 --> 00:06:27,880 or like you have like a set of features most of which are present on any face meaning that if you you 91 00:06:27,890 --> 00:06:33,110 have to have like at least 80 or 90 percent of the features like speaking roughly of course we'll get 92 00:06:33,110 --> 00:06:35,220 into more detail about this further down. 93 00:06:35,420 --> 00:06:39,860 But you'd have a speaker after you'd have to have like 80 or 90 percent of those features actually and 94 00:06:39,860 --> 00:06:44,690 fight for it something to be a face would be very rare that you would have a face that would have only 95 00:06:44,690 --> 00:06:45,770 half of those. 96 00:06:46,070 --> 00:06:50,600 So now let's see how exactly this works in terms of the pixels. 97 00:06:50,720 --> 00:06:56,310 So let's talk about the nose bridge featured or identified this age feature over here. 98 00:06:56,390 --> 00:07:00,480 So if we zoom in a little bit you can see it is a feature. 99 00:07:00,520 --> 00:07:02,520 Now I'm going to draw. 100 00:07:02,780 --> 00:07:11,040 I'm going to show the nose without the feature overlaid but instead we're going to have the pixels so 101 00:07:11,050 --> 00:07:18,540 of course this is a very simplified representation because there are many more pixels in this space. 102 00:07:18,550 --> 00:07:23,050 But just for argument's sake let's say that there's four pixels by eight pixels thirty two pixels in 103 00:07:23,050 --> 00:07:25,130 total and this nose. 104 00:07:25,240 --> 00:07:28,470 And now we're going to understand how the algorithm works. 105 00:07:28,540 --> 00:07:36,100 So let's ride out these pixels and here I've given them values between 0 and 10 there is 0 and 1 0 means 106 00:07:36,100 --> 00:07:36,880 very white. 107 00:07:36,880 --> 00:07:43,950 So 100 percent white 10 mean and 1 means 100 percent black and anywhere in between is greyscale. 108 00:07:43,960 --> 00:07:51,810 And of course greyscale is from 0 to 255 because it's like 256 values and so on. 109 00:07:51,820 --> 00:07:53,740 But we're not going to bother with writing them a lot. 110 00:07:53,740 --> 00:08:00,560 We're just going to normalize them two from 0 to 1 and just keep it that way. 111 00:08:00,940 --> 00:08:02,790 So it will be easy just for us to discuss. 112 00:08:02,840 --> 00:08:08,020 And so here you can look at this image and you can see that this is not 100 percent white. 113 00:08:08,020 --> 00:08:09,210 This would be 100 percent white. 114 00:08:09,370 --> 00:08:10,630 But this is close to white. 115 00:08:10,630 --> 00:08:18,010 So in an image like for argument's sake again this could be like 0.1 this pixel could have like a 0.1 116 00:08:18,010 --> 00:08:24,430 intensity given that zeros and white and one is 100 percent like this pixel is a bit darker. 117 00:08:24,430 --> 00:08:29,320 So is there a point to this pixel is about the same brightness or upon one's report ones or report one 118 00:08:29,560 --> 00:08:31,770 0.2 it's a bit darker over here. 119 00:08:31,870 --> 00:08:35,720 0.3 it's even darker if you compare to that one. 120 00:08:35,740 --> 00:08:37,450 This one is 0.1. 121 00:08:37,450 --> 00:08:42,040 This is a point and so on and so on so you can see that these are they're not all zeros but they're 122 00:08:42,040 --> 00:08:42,700 close. 123 00:08:42,760 --> 00:08:49,540 They're on the side you can see that this one is about 0.4 or somewhere on their 0.5. 124 00:08:49,540 --> 00:08:55,780 This gets darker 0.8 3.6 this was a bit brighter sort of potent force similar to that. 125 00:08:56,350 --> 00:09:01,310 Closer to that than to that and then to that for instance the reported fires are pretty high. 126 00:09:01,360 --> 00:09:03,610 So you can see kind of the values here. 127 00:09:03,610 --> 00:09:11,770 Now what's the algorithm will do what the algorithm will do is now that we've identified all of this 128 00:09:11,770 --> 00:09:17,680 so this is this is where we see the feature we can see that kind of like white and black but this is 129 00:09:17,680 --> 00:09:22,530 what is happening in reality not actually 100 percent black and Harmsen white that somewhere greyscale. 130 00:09:22,720 --> 00:09:25,860 So what the Belgians algorithm will do is it will compare. 131 00:09:25,870 --> 00:09:32,410 How close is this real world tonight or what is happening in the real world to the ideal Haar like feature 132 00:09:32,410 --> 00:09:33,970 that it's looking for the white and black. 133 00:09:33,970 --> 00:09:34,740 How close is it. 134 00:09:34,750 --> 00:09:35,880 So how would it do that. 135 00:09:36,070 --> 00:09:40,740 Well it will take the average of these Vaso First of all it will calculate the sum. 136 00:09:40,750 --> 00:09:45,820 That's the first thing it does it calculates the sum of all these pixels and then divide by the number 137 00:09:45,820 --> 00:09:52,180 of pixels it takes the average intensity of the white pixels and 0.1 6:06 takes the same thing for the 138 00:09:52,180 --> 00:09:57,360 black pixels takes the sum of all of these values and now takes the average. 139 00:09:57,370 --> 00:10:04,240 So as devised by whatever 16 here adds them up to as 16 gets average and this is the average intensity 140 00:10:04,240 --> 00:10:07,240 of all of these pixels Fizer 0.06. 141 00:10:07,480 --> 00:10:13,120 And now it's simply subtract one from the other subtracts this from that. 142 00:10:13,120 --> 00:10:17,680 So the black minus the white is 0.4 02. 143 00:10:17,920 --> 00:10:24,000 So these pixels and minus these pixels are the average minus averages 0.4. 144 00:10:24,490 --> 00:10:31,150 Now so the question is what would this difference be in the ideal scenario. 145 00:10:31,150 --> 00:10:38,890 Kind of the ideal scenario that where you would find exactly this feature where this would be all black 146 00:10:38,890 --> 00:10:40,110 and this will be all white. 147 00:10:40,390 --> 00:10:41,500 Well exactly right. 148 00:10:41,500 --> 00:10:43,510 If it is all black this is all one. 149 00:10:43,530 --> 00:10:45,380 If so why this is all zeros. 150 00:10:45,460 --> 00:10:50,320 So it'd be one minus zero the average be one the average here would be 0 1 0 would be 1. 151 00:10:50,320 --> 00:10:59,210 So basically in this in the in this approach the closer it is to one the closer the real world scenario. 152 00:10:59,230 --> 00:11:03,020 This one is to what is being looked for. 153 00:11:03,070 --> 00:11:10,390 Now of course we understand that we're humans we're not robots we don't have black and white delimited 154 00:11:10,390 --> 00:11:16,180 parts like that so when we're And you know it depends on the lighting depends on on all these different 155 00:11:16,180 --> 00:11:16,440 things. 156 00:11:16,440 --> 00:11:20,880 So of course you're never going to have an exact one here or an exact zero. 157 00:11:20,880 --> 00:11:26,200 This is always going to be some sort of gradients some sort of different colors shades and so on. 158 00:11:26,500 --> 00:11:33,250 That's why the bell Jones has some thresholds and through the training which we'll discuss further down 159 00:11:33,640 --> 00:11:36,090 through the training these thresholds identified. 160 00:11:36,100 --> 00:11:42,190 And for instance for this specific feature in this position it's identified that the minimum threshold 161 00:11:42,190 --> 00:11:43,920 might be 0.3. 162 00:11:44,110 --> 00:11:52,600 And so whenever you get a result a calculated value here that is greater than 0.3 that means that indeed 163 00:11:52,720 --> 00:11:57,820 this is somewhat similar to this HARLICK feature which we're looking for and we're going to say that 164 00:11:57,850 --> 00:12:02,530 yes this Harlech feature is present in the image where in that position where we were looking for it 165 00:12:03,280 --> 00:12:09,980 if on the other hand you get like 0.1 or basically a value that's less than your threshold then is the 166 00:12:10,020 --> 00:12:13,450 difference between the two colorings for instance here. 167 00:12:13,940 --> 00:12:18,580 For answers if you just basically applied looked for this feature because the algorithm doesn't know 168 00:12:18,580 --> 00:12:22,600 that it needs to look for it here it will just look for it everywhere and look for it here or look for 169 00:12:22,600 --> 00:12:23,100 a hero. 170 00:12:23,110 --> 00:12:27,790 It's looking for the nose and at the end all it knows about the nose is that the nose looks like this 171 00:12:28,210 --> 00:12:29,770 like this like this feature. 172 00:12:29,770 --> 00:12:31,000 So it's going to look for it everywhere. 173 00:12:31,000 --> 00:12:36,580 So if for instance algorithm were to apply this exact HARLICK feature some over here in the top left 174 00:12:36,580 --> 00:12:43,890 corner then the difference between the real world of whites and the real or black would be nothing right 175 00:12:43,900 --> 00:12:46,090 because it would be zero. 176 00:12:46,120 --> 00:12:51,430 In this case because it's exactly the same it's just looking at the wall in the white and the wall in 177 00:12:51,430 --> 00:12:52,140 the black. 178 00:12:52,360 --> 00:13:00,140 So it would look for the nose in many different places and then it might find a nose somewhere else. 179 00:13:00,160 --> 00:13:04,120 But then through other features it will understand that that's not the face. 180 00:13:04,240 --> 00:13:06,760 But for now what will it find here in. 181 00:13:06,850 --> 00:13:09,420 Once it gets to the actual nose it will see that yes. 182 00:13:09,450 --> 00:13:15,340 In this case it it surpasses the three threshold that exceeds the threshold so this we're going to say 183 00:13:15,340 --> 00:13:20,080 that there is this potentially or knows over here that might be a potential nose over there that might 184 00:13:20,080 --> 00:13:23,830 be a potential nose or here but it's definitely going to say that there is no nose in many other places 185 00:13:24,160 --> 00:13:27,780 and then through the other features and we'll learn more about this throughout the section. 186 00:13:27,790 --> 00:13:32,710 It will be able to identify where exactly the nose is where exactly the faces but that's that's the 187 00:13:32,710 --> 00:13:33,370 starting point. 188 00:13:33,370 --> 00:13:35,380 And so that's where the threshold comes in. 189 00:13:35,380 --> 00:13:43,270 The threshold is very useful to understand if this feature is actually present or if this is or we should 190 00:13:43,270 --> 00:13:47,640 move on and there is this feature is not present in that location we're looking. 191 00:13:47,650 --> 00:13:51,010 So that's how the Horlick features work. 192 00:13:51,130 --> 00:13:52,280 Very important. 193 00:13:52,320 --> 00:13:54,960 They lie in the foundation of this algorithm. 194 00:13:55,000 --> 00:14:01,840 They're kind of like once those features that were identified in the face the like for eyebrows and 195 00:14:01,840 --> 00:14:02,240 so on. 196 00:14:02,260 --> 00:14:09,160 Once you once the Aluf you don't have to do this by your algorithm identifies through training a a number 197 00:14:09,160 --> 00:14:17,410 of features quite a large number of features then that is kind of like forms the basis for to understand 198 00:14:17,410 --> 00:14:24,060 what elements roughly a face is constructed from and then it will be able to use those to detect the 199 00:14:24,060 --> 00:14:24,530 face. 200 00:14:24,640 --> 00:14:30,340 But in the in a nutshell this is how it works through the pixels through the white and black regions 201 00:14:30,400 --> 00:14:34,410 which it calculates averages for and then subtract. 202 00:14:34,900 --> 00:14:35,590 So there we go. 203 00:14:35,590 --> 00:14:39,040 Hope you enjoyed today said Tauriel and I'll look for see in the next one. 204 00:14:39,050 --> 00:14:40,770 And until then enjoy computer vision. 22398

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.