1
00:00:00,470 --> 00:00:04,170
Hello and welcome back to the course on computer vision. In today's tutorial we're talking about the
2
00:00:04,260 --> 00:00:05,580
integral image.
3
00:00:05,580 --> 00:00:12,770
So what is this image that is so important in computer vision and in the Viola-Jones algorithm?
4
00:00:12,930 --> 00:00:18,750
Well, previously we stopped off here, where we introduced the Haar-like features and found out how
5
00:00:18,750 --> 00:00:26,700
they are calculated. So, briefly, to recap: we have a certain feature that is commonly found in certain
6
00:00:27,480 --> 00:00:36,120
parts of the face. For instance, here we can see this line where one half is more white and the other half is more black.
7
00:00:36,780 --> 00:00:42,540
It is representative: in certain photos, in certain lighting, it can be present
8
00:00:42,570 --> 00:00:43,680
in the nose here.
9
00:00:43,890 --> 00:00:49,470
And then we calculated that through the actual pixel values in the image.
10
00:00:49,530 --> 00:00:50,790
That was when we talked about the threshold.
11
00:00:50,970 --> 00:00:57,480
So basically we found out that in order to calculate the value for the feature, to find out if it's
12
00:00:57,480 --> 00:01:05,880
present in the image or not, that is, to evaluate the criterion for a feature,
13
00:01:05,880 --> 00:01:10,950
we need to perform calculations: we need to calculate the sum of all of the values in this
14
00:01:10,950 --> 00:01:13,690
rectangle and the sum of all the values in this rectangle,
15
00:01:13,690 --> 00:01:14,930
and subtract one from the other.
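The brute-force calculation described here can be sketched in Python. This is only an illustration: the function names, the tiny 2 by 4 image and the choice of "right half minus left half" are assumptions made for the example, not something taken from the lecture slides.

```python
def rect_sum(image, top, left, height, width):
    """Add up every pixel in the rectangle, one value at a time."""
    total = 0
    for r in range(top, top + height):
        for c in range(left, left + width):
            total += image[r][c]
    return total


def two_rect_feature(image, top, left, height, width):
    """Sum of the right half minus sum of the left half (assumed sign convention)."""
    half = width // 2
    left_sum = rect_sum(image, top, left, height, half)
    right_sum = rect_sum(image, top, left + half, height, half)
    return right_sum - left_sum


# Made-up intensities on a 0 to 10 scale: a dark left half and a bright right half.
image = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]

value = two_rect_feature(image, 0, 0, 2, 4)
print(value)
```

Every pixel inside both rectangles is visited once, so the work grows with the area of the feature.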
16
00:01:15,510 --> 00:01:21,070
And as you can imagine this can be quite a costly exercise in terms of computations.
17
00:01:21,090 --> 00:01:27,080
It takes quite some time to add all these numbers up, especially if you have lots of these features that
18
00:01:27,100 --> 00:01:30,040
you're evaluating, and especially if these features are quite large.
19
00:01:30,570 --> 00:01:37,630
So there is a hack to doing this very quickly and this is where the integral image comes in.
20
00:01:37,650 --> 00:01:44,870
So let's just imagine that we have an image with a certain number of pixels, as shown here.
21
00:01:45,000 --> 00:01:51,290
And for simplicity's sake we're only going to look at the grayscale intensity, which runs from 0 to 1.
22
00:01:51,300 --> 00:01:56,460
We're just going to do it from 0 to 10, so we'll just multiply by ten, just so it's easier for us to
23
00:01:57,330 --> 00:02:01,910
look at this example and we don't have to deal with decimal points. So let's randomly fill in this image.
24
00:02:01,930 --> 00:02:04,140
It doesn't really matter what kind of image we're talking about.
25
00:02:04,290 --> 00:02:09,890
Any image can be represented in this type of form where we have a value for every single pixel.
26
00:02:10,140 --> 00:02:16,080
And so let's say we're calculating a Haar-like feature and, in order to evaluate
27
00:02:16,080 --> 00:02:23,700
that feature on this image, we need to calculate the sum of all of the intensities in this specific box
28
00:02:23,700 --> 00:02:24,860
that we've outlined.
29
00:02:24,870 --> 00:02:30,680
So what we would normally do is just take 10 plus 4 plus 9 plus 8 and so on, plus 4 plus
30
00:02:30,680 --> 00:02:34,960
1 plus 9 and so on. We'd add them all up and we would get the sum.
31
00:02:35,160 --> 00:02:40,890
As we discussed this can become potentially expensive if you're looking at large features like this
32
00:02:40,890 --> 00:02:46,620
one. This one is three by four, so that's 12 values that you have to add, and if they're
33
00:02:46,920 --> 00:02:50,140
larger in size then it's going to be even more values that you need to add.
34
00:02:50,160 --> 00:02:54,980
So basically, the larger the feature and the larger the image, the more calculations you have to do and
35
00:02:54,990 --> 00:03:02,840
the more expensive that calculation is in terms of time. And as we know, in computer vision, especially
36
00:03:02,850 --> 00:03:07,220
for real-time computer vision, time is very valuable.
37
00:03:07,260 --> 00:03:12,630
We want to minimize the amount of time we spend calculating. Plus, you need to evaluate not just one
38
00:03:12,630 --> 00:03:17,580
feature: there can be lots and lots of features that you want to evaluate in one image.
39
00:03:17,580 --> 00:03:21,360
So again another reason to minimize the time spent on this.
40
00:03:21,390 --> 00:03:24,880
So what we're going to do now is put this image to the side here.
41
00:03:24,900 --> 00:03:26,710
We're going to forget about the box for now.
42
00:03:26,730 --> 00:03:28,010
We'll get back to it.
43
00:03:28,050 --> 00:03:32,130
And now we're going to move to constructing the integral image. So here we've got the image on the left,
44
00:03:32,130 --> 00:03:32,670
and on the right
45
00:03:32,670 --> 00:03:38,830
we're going to create the integral image. The integral image is exactly the same size as the original image.
46
00:03:39,210 --> 00:03:45,540
And all you need to do to calculate the integral image is a very interesting operation.
47
00:03:45,540 --> 00:03:51,930
So for any given square in the integral image, and let's take this square for example, the value that will reside
48
00:03:51,930 --> 00:03:59,810
in that square is the sum of the values in the original image above and to the left.
49
00:04:00,030 --> 00:04:01,180
So the square is here.
50
00:04:01,230 --> 00:04:07,260
That means we take all of the values that are sitting here in the original image, add them up, and put
51
00:04:07,260 --> 00:04:09,440
their sum into the integral image.
52
00:04:09,450 --> 00:04:14,200
So in this case it's 1 plus 9, which is 10, and so on, plus 8, which gets us to 25, plus 0, still 25.
53
00:04:14,280 --> 00:04:16,450
And that's why we're going to put 25 here.
54
00:04:16,800 --> 00:04:23,030
Now here's another one, just to solidify this process. Let's just take a random square here.
55
00:04:23,040 --> 00:04:29,100
So in order to calculate this square we're going to have to add up all of the values. We
56
00:04:29,100 --> 00:04:30,330
take
57
00:04:30,670 --> 00:04:38,040
its counterpart over here, which is this square, and then go up and left, and everything in this rectangle
58
00:04:38,040 --> 00:04:40,710
we need to add up and put into that square.
59
00:04:40,710 --> 00:04:42,970
If you add them up you'll get 134.
60
00:04:43,110 --> 00:04:46,500
And that's the value that goes into this square in the integral image.
61
00:04:46,500 --> 00:04:50,970
Now what we're going to do is fill in the whole integral image exactly that way, so you
62
00:04:50,970 --> 00:04:54,790
can, if you would like to, pause this video and check any value.
63
00:04:54,780 --> 00:05:01,500
So for instance 201 here would be the sum of all of the values that are in this rectangle.
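A minimal sketch of this construction in Python, assuming each square of the integral image includes its own pixel along with everything above and to the left (which matches the way the values are used in this walkthrough). The function name and the 3 by 3 image are made up for the example and are not the image on the slide.

```python
def integral_image(image):
    """Return a same-size grid where each cell holds the sum of all pixels
    above and to the left of it, including the cell's own pixel."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_total = 0  # running sum of the current row, from the left edge
        for c in range(cols):
            row_total += image[r][c]
            above = ii[r - 1][c] if r > 0 else 0
            ii[r][c] = row_total + above
    return ii


# Made-up intensities on a 0 to 10 scale.
image = [
    [2, 4, 1],
    [3, 0, 5],
    [6, 1, 2],
]

ii = integral_image(image)
for row in ii:
    print(row)
# [2, 6, 7]
# [5, 9, 15]
# [11, 16, 24]
```

The bottom right cell, 24, is the sum of the whole image, and the whole grid is filled in a single pass over the pixels.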
64
00:05:01,670 --> 00:05:08,460
And so that is how you calculate the integral image. So why do we use the integral image?
65
00:05:08,550 --> 00:05:15,530
Because it's a very efficient hack for the exact problem that we have, the exact challenge we have
66
00:05:15,530 --> 00:05:18,850
of calculating those Haar-like features.
67
00:05:18,890 --> 00:05:25,100
And the reason for that is that Haar-like features are actually rectangles. So let's have a look at an
68
00:05:25,100 --> 00:05:25,670
example here.
69
00:05:25,670 --> 00:05:31,010
So back to the rectangle that we originally identified. We
70
00:05:31,010 --> 00:05:38,360
said that hypothetically we want to calculate the sum of all of the values inside this rectangle in
71
00:05:38,360 --> 00:05:40,900
order to evaluate a certain Haar-like feature.
72
00:05:41,120 --> 00:05:46,070
And so what we would do here is use the integral image to help us. Remember, we said that in
73
00:05:46,070 --> 00:05:51,200
this original image, if we took our original approach of just adding stuff up, we would have to do
74
00:05:51,200 --> 00:05:55,310
four times three, 12: you'd have to add up 12 values.
75
00:05:55,660 --> 00:06:00,350
Now let's see what the integral image can help us do. So in the integral image, what we'll do is take
76
00:06:00,620 --> 00:06:05,720
the value that represents this bottom right corner, but we'll take it in the integral image.
77
00:06:05,720 --> 00:06:06,660
So there it is.
78
00:06:06,920 --> 00:06:10,660
And so this value, what does it actually represent?
79
00:06:10,670 --> 00:06:13,940
It represents the sum of all of the values in here.
80
00:06:13,940 --> 00:06:15,560
So let's outline them.
81
00:06:15,560 --> 00:06:16,390
There they are.
82
00:06:16,400 --> 00:06:23,500
So the sum of all those values is 235. So let's write that down at the bottom: 235.
83
00:06:23,600 --> 00:06:29,570
Now we're going to take the value in the top right corner, just above the top right corner of the rectangle.
84
00:06:29,630 --> 00:06:35,000
And that value, that 83, is the sum of everything
85
00:06:35,270 --> 00:06:37,980
above our rectangle,
86
00:06:38,390 --> 00:06:41,140
in this area, in these first two lines.
87
00:06:41,180 --> 00:06:49,130
So if we subtract this new value from our existing value, from 235, then we will remove all of these
88
00:06:49,130 --> 00:06:49,420
values.
89
00:06:49,410 --> 00:06:50,560
So we'll take it out.
90
00:06:50,560 --> 00:06:55,840
So you can see, with 235 we've got this big red box that's shaded.
91
00:06:55,940 --> 00:07:00,150
If we subtract 83 we will get just this box.
92
00:07:00,200 --> 00:07:02,620
So let's write that down.
93
00:07:02,630 --> 00:07:09,020
Now we're going to take the value in the top left, just above the top left corner of the rectangle. So this
94
00:07:09,020 --> 00:07:15,340
value over here corresponds to this space in the image over here.
95
00:07:15,350 --> 00:07:19,700
And that means if we add it we will add back this rectangle.
96
00:07:20,150 --> 00:07:24,050
So it might be a bit counterintuitive but we're going to add it back because that's going to help us
97
00:07:24,050 --> 00:07:24,590
in the next step.
98
00:07:24,590 --> 00:07:30,020
So we add it back, and now finally we're going to take the value at the bottom left corner over here and
99
00:07:30,020 --> 00:07:35,720
we'll subtract that as well, because that represents the values in all of this rectangle.
100
00:07:35,720 --> 00:07:43,400
So if we subtract that and write that down, as you can see, what we're left with is exactly that rectangle
101
00:07:43,400 --> 00:07:44,770
that we were after in the first place.
102
00:07:44,840 --> 00:07:47,780
So we took that value and subtracted that value.
103
00:07:47,770 --> 00:07:52,850
So: we took the green bottom right value over here,
104
00:07:52,870 --> 00:07:56,840
235, subtracted 83, added 47, and subtracted 134.
105
00:07:57,230 --> 00:08:01,040
And the result is 65, and that is exactly what we were after.
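With the numbers from this walkthrough the calculation is 235 - 83 + 47 - 134 = 65. The same four-corner lookup can be sketched in Python as below. The function names, the inclusive row and column indexing and the 4 by 4 image are assumptions made for the example; the image on the slide is not reproduced here.

```python
def integral_image(image):
    """Each cell holds the sum of all pixels above and to the left, inclusive."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_total = 0
        for c in range(cols):
            row_total += image[r][c]
            ii[r][c] = row_total + (ii[r - 1][c] if r > 0 else 0)
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom and columns left..right (inclusive),
    using at most four lookups in the integral image."""
    total = ii[bottom][right]              # everything up to the bottom right corner
    if top > 0:
        total -= ii[top - 1][right]        # remove the strip above the rectangle
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]     # add back the corner that gets removed twice
    if left > 0:
        total -= ii[bottom][left - 1]      # remove the strip to the left
    return total


# Made-up intensities on a 0 to 10 scale.
image = [
    [5, 2, 3, 1],
    [1, 7, 0, 4],
    [6, 3, 8, 2],
    [4, 1, 5, 9],
]

ii = integral_image(image)
fast = rect_sum(ii, 1, 1, 2, 3)
brute = sum(image[r][c] for r in range(1, 3) for c in range(1, 4))
print(fast, brute)
```

Both approaches give the same sum for the 2 by 3 rectangle, but `rect_sum` reads four cells however large the rectangle is.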
106
00:08:01,040 --> 00:08:09,650
So as you can see we only needed to perform four operations in order to calculate this rectangle, the
107
00:08:09,650 --> 00:08:12,610
sum of the values inside this rectangle.
108
00:08:12,620 --> 00:08:17,690
So we only need to add or subtract four values, as opposed to 12 values if we were doing it by hand,
109
00:08:17,690 --> 00:08:20,700
if we had to add them all up here.
110
00:08:21,080 --> 00:08:27,650
And the beauty of this is you will always only need to look at four values regardless of the size of
111
00:08:27,650 --> 00:08:33,050
the rectangle so even if your rectangle is a thousand by a thousand you will still only need to look
112
00:08:33,050 --> 00:08:37,850
at four values in the integral image whereas in the original image you would have to deal with 1000
113
00:08:37,850 --> 00:08:40,550
times 1000: you'd have to deal with a million values.
114
00:08:41,030 --> 00:08:47,420
So as you can see, with the integral image the cost doesn't scale: the time it takes you to process the integral
115
00:08:47,420 --> 00:08:55,880
image doesn't change proportionately; in fact it doesn't change at all, regardless of the size of the feature
116
00:08:55,880 --> 00:08:56,560
that you're dealing with.
117
00:08:56,560 --> 00:09:01,790
And therefore it makes it very easy. And once again, it is only possible to use an integral image because,
118
00:09:01,790 --> 00:09:08,010
as you remember, all of the Haar-like features are actually rectangles; they are actually
119
00:09:08,180 --> 00:09:13,220
comprised of rectangles, black or white rectangles, so you will always need to look at rectangles. If there was, like,
120
00:09:13,220 --> 00:09:19,310
a circle in there, or some odd-looking shape, then it wouldn't be possible to use the
121
00:09:19,310 --> 00:09:20,640
integral image.
122
00:09:20,840 --> 00:09:27,530
And so that's why the Haar-like features are so powerful: even though they might not be ideal,
123
00:09:27,530 --> 00:09:33,830
they might not find the exact features of an image, and they might be a bit rigid
124
00:09:33,920 --> 00:09:40,700
in some places because they're just rectangles, at the same time they make up for that with their computational
125
00:09:40,700 --> 00:09:47,090
efficiency, because we're using the integral image. And that's one of the major advantages of Haar-like
126
00:09:47,090 --> 00:09:53,560
features. You might find better features, like circles or some other features that might be better
127
00:09:53,570 --> 00:09:59,140
classifiers overall, but because Haar-like features are much easier to calculate through
128
00:09:59,150 --> 00:10:04,520
the integral image, you can use a lot of them and you can evaluate them much, much
129
00:10:04,520 --> 00:10:06,230
quicker, and that's how they make up for it.
130
00:10:06,680 --> 00:10:09,630
So there we go that's how the integral image works.
131
00:10:09,740 --> 00:10:18,200
And that's why it's such a huge advantage for the Viola-Jones algorithm; that's why it employs it.
132
00:10:18,410 --> 00:10:21,600
And yeah, I hope you enjoyed today's tutorial and I look forward to seeing you next time.
133
00:10:21,600 --> 00:10:23,180
And until then enjoy computer vision.