Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,670 --> 00:00:06,790
Hello and welcome to this new tutorial today we're going to start with the big first step of the implementation
2
00:00:06,790 --> 00:00:11,760
of our deep convolutional Ganns that consists of defining the generator.
3
00:00:11,950 --> 00:00:13,210
That's the big first step.
4
00:00:13,330 --> 00:00:18,700
And the second big step will be to define the discriminator then we'll have our brains and then we will
5
00:00:18,700 --> 00:00:21,650
train these brains to generate fake images.
6
00:00:21,790 --> 00:00:23,890
But today we're going to start with this big first step.
7
00:00:23,900 --> 00:00:30,790
Defining the generator and we'll do it in two steps steps first we will make the class that will define
8
00:00:30,790 --> 00:00:33,830
the architecture of the neural network of the generator.
9
00:00:33,850 --> 00:00:39,790
So we will put in a sequence to different modules that is the different layers of the neural network
10
00:00:40,150 --> 00:00:45,130
then we'll make the forward function to propagate the signal inside and will that work.
11
00:00:45,130 --> 00:00:51,280
And then once this class is done we'll be able to create the generator itself that is the neural network
12
00:00:51,400 --> 00:00:57,690
of the generator by simply creating an object of the class an instance of the class.
13
00:00:57,910 --> 00:01:00,980
So let's do this let's make this class.
14
00:01:01,120 --> 00:01:08,290
So to define a class in Python We start with class and then we give the name of the class which is going
15
00:01:08,290 --> 00:01:14,550
to be G for regenerator and then we'll call the class for the discriminator D.
16
00:01:15,010 --> 00:01:21,840
And now we're going to use a trick an object oriented programming language which is inheritance.
17
00:01:21,840 --> 00:01:28,660
We are going to inherit from in that module which contains all the tools that allow us to build a neural
18
00:01:28,660 --> 00:01:35,260
network and it's called an end or modules because the modules are the different applications and connections
19
00:01:35,260 --> 00:01:37,390
you can make inside a neural network.
20
00:01:37,450 --> 00:01:40,730
For example one module can be a convolution.
21
00:01:40,840 --> 00:01:43,420
Another module can be a full connection.
22
00:01:43,600 --> 00:01:49,450
Therefore the name and to apply the inheritance there is nothing more simple we just need to put as
23
00:01:49,510 --> 00:01:51,060
argument of this class G.
24
00:01:51,190 --> 00:01:56,220
The module we want to inherit from which is and that module.
25
00:01:56,680 --> 00:01:58,570
There we go this one.
26
00:01:59,410 --> 00:02:02,510
And in that module so that introduces a new class.
27
00:02:02,530 --> 00:02:07,300
And now inside this new class we have to define the whole architecture.
28
00:02:07,480 --> 00:02:12,230
And besides we'll include the forward function to be able to afford propagate a signal.
29
00:02:12,250 --> 00:02:18,640
That's the advantage of making a class A class is an advance structure inside of which you can include
30
00:02:18,640 --> 00:02:22,540
some properties of the future instances of the class that will be created.
31
00:02:22,540 --> 00:02:24,570
That is the future object.
32
00:02:24,580 --> 00:02:30,250
And also some different functions that will be like some tools of your object.
33
00:02:30,280 --> 00:02:35,490
Quick and the best example for those of you who are seeing your class for the first time.
34
00:02:35,530 --> 00:02:42,460
Well we could define a self-driving car with a class in one class we could include all the properties
35
00:02:42,550 --> 00:02:48,610
of the Sandrine car and then different functions for example to move forward function to turn the function
36
00:02:48,730 --> 00:02:51,700
to turn right function the stop function.
37
00:02:51,700 --> 00:02:55,670
All these functions that would allow to control the self-driving car.
38
00:02:55,810 --> 00:03:01,300
Well here that's exactly the same we're making in class to not only define the architecture of the neural
39
00:03:01,300 --> 00:03:07,120
network of the generator but also we'll add the forward function which is like a tool that we can apply
40
00:03:07,210 --> 00:03:08,410
on the neural network.
41
00:03:08,530 --> 00:03:13,340
And this tool will be used to propagate the signal inside the neural network.
42
00:03:13,690 --> 00:03:19,360
So let's do this let's first start by defining the architecture of the neural network.
43
00:03:19,360 --> 00:03:28,990
And as for every neural network that we make with a class we do this inside that in its function the
44
00:03:28,990 --> 00:03:32,320
innate function is the function you always start with a class.
45
00:03:32,350 --> 00:03:37,660
It is the function that defines the properties of the future objects that will be created from your
46
00:03:37,660 --> 00:03:38,660
class.
47
00:03:39,190 --> 00:03:44,810
And speaking of this object that's exactly the argument that we have to employ right now.
48
00:03:44,860 --> 00:03:51,610
And this is called self self has been a mystery for a lot of people but there is absolutely no mystery
49
00:03:51,610 --> 00:03:52,270
about it.
50
00:03:52,270 --> 00:03:56,280
It just refers to the future object that will be created.
51
00:03:56,350 --> 00:04:02,500
It's just a common way to specify that when we create a variable we use the self to specify that this
52
00:04:02,500 --> 00:04:04,920
variable belongs to the object.
53
00:04:04,930 --> 00:04:06,870
So each time I put self.
54
00:04:06,910 --> 00:04:08,850
That means I am referring to the object.
55
00:04:08,890 --> 00:04:13,070
That means I'm taking the object and setting some properties to it.
56
00:04:13,680 --> 00:04:14,910
So self.
57
00:04:15,010 --> 00:04:22,780
And now we are arguing to activate the inheritance because we import the n and module to inherit from
58
00:04:22,780 --> 00:04:23,130
it.
59
00:04:23,260 --> 00:04:31,210
But now we have to activate this inheritance and to do this we use the super function inside of which
60
00:04:31,420 --> 00:04:37,410
we input our Class G which is inheriting from the end in that module right here.
61
00:04:37,660 --> 00:04:45,160
And then the second argument is self because we will use the tools of that module on our object because
62
00:04:45,160 --> 00:04:49,360
our object is nothing else than the new one that work at the generator and the neural network and the
63
00:04:49,360 --> 00:04:52,630
generator will be composed of the different modules.
64
00:04:52,780 --> 00:04:57,760
And in that module the conclusions the full connections and more.
65
00:04:57,760 --> 00:05:03,990
All right and then that's not over we just need to add a that and underscore underscore in it double
66
00:05:04,010 --> 00:05:06,820
underscore again and parenthesis.
67
00:05:06,840 --> 00:05:07,590
All right.
68
00:05:07,670 --> 00:05:10,570
Nothing more important to know about this.
69
00:05:10,610 --> 00:05:17,250
That's just what we have to type here to activate the inheritance of the end and that much our next
70
00:05:17,250 --> 00:05:19,060
step.
71
00:05:19,070 --> 00:05:23,870
Now the next step is to make a meta module what is a metal module.
72
00:05:23,870 --> 00:05:28,540
That's a huge module that will be composed of several modules.
73
00:05:28,670 --> 00:05:34,580
I remind that by module I mean the different layers the different connections inside the neural network.
74
00:05:34,580 --> 00:05:41,150
So now we'll create a big metal module which will be a sequence of several modules some convolutions
75
00:05:41,360 --> 00:05:46,390
full connections inversed convolutions and more integrate this metal module.
76
00:05:46,430 --> 00:05:53,300
Well we're going to take our object because this metal module is going to be a property of the object
77
00:05:53,750 --> 00:06:00,080
and therefore I'm taking my object and this metal module of the object will be called Maine.
78
00:06:00,750 --> 00:06:07,050
Main will be the metal module that will contain all the different modules in a sequence of layers.
79
00:06:07,070 --> 00:06:08,870
And speaking of sequence of layers.
80
00:06:08,960 --> 00:06:17,000
That's exactly what we'll introduce right now with the N-N that sequence show class inside of which
81
00:06:17,150 --> 00:06:20,360
will create the different modules.
82
00:06:20,360 --> 00:06:20,990
All right.
83
00:06:21,020 --> 00:06:26,870
So you probably recognize that sequential is actually a class so self that main is actually a new object
84
00:06:26,960 --> 00:06:33,080
of the sequential class but of course this object represents this method module composed of the different
85
00:06:33,080 --> 00:06:34,100
modules.
86
00:06:34,100 --> 00:06:40,460
All right so now we're starting to define the architecture of the neural network and let's start with
87
00:06:40,460 --> 00:06:41,450
the first module.
88
00:06:41,750 --> 00:06:47,690
According to you according to the intuition lectures you saw with curial what is going to be this first
89
00:06:47,690 --> 00:06:48,520
module.
90
00:06:48,890 --> 00:06:54,380
Well of course that's going to be a non-first convolution not a convolution.
91
00:06:54,440 --> 00:06:58,030
And in verse 1 that is exactly the inverse of a convolution.
92
00:06:58,250 --> 00:07:01,620
And why do we have to start with an inverse convolution.
93
00:07:01,820 --> 00:07:06,480
Well that's because the role of the generator is to generate some fake images.
94
00:07:06,680 --> 00:07:13,730
And therefore since CNN takes as input some images and returns as output a vector.
95
00:07:13,820 --> 00:07:16,940
Well and then CNN will do exactly the opposite.
96
00:07:17,030 --> 00:07:23,240
It will take as inputs a vector we will create that vector we'll call it noise later on in the training
97
00:07:23,570 --> 00:07:29,410
that will be the input of this inverse CNN and it will return a fake image.
98
00:07:29,570 --> 00:07:36,740
So you're going to see that the input of this inverse CNN that we're about to make will be a random
99
00:07:36,740 --> 00:07:39,380
vector of size 100.
100
00:07:39,830 --> 00:07:46,760
And to get the inverse of a convolution in PI torch we're going to use the can transpose to the class
101
00:07:47,030 --> 00:07:52,610
that is that this inverse convolution module that we're going to create will be an object of the current
102
00:07:52,820 --> 00:07:54,600
transpose to the class.
103
00:07:54,800 --> 00:07:55,850
So let's do it.
104
00:07:55,970 --> 00:08:02,210
We take our end in that module because this can transpose to the class is taken from the end module
105
00:08:02,610 --> 00:08:08,670
so end then dot com capitals C transpose to the.
106
00:08:08,670 --> 00:08:10,350
Here is this one.
107
00:08:10,350 --> 00:08:12,080
That's the inverse convolution.
108
00:08:12,150 --> 00:08:18,510
The first module of our big Maira module that is the first module of our neural network can be transposed
109
00:08:18,510 --> 00:08:19,030
to D.
110
00:08:19,050 --> 00:08:21,870
And now we need to input several arguments.
111
00:08:21,870 --> 00:08:23,670
Get ready for them as you can see.
112
00:08:23,670 --> 00:08:25,560
There are many of them.
113
00:08:25,590 --> 00:08:25,890
All right.
114
00:08:25,890 --> 00:08:28,110
So the first argument is 100.
115
00:08:28,260 --> 00:08:36,850
That's the size of the input 100 meaning that the inputs are inverse CNN will be a vector of size 100.
116
00:08:37,020 --> 00:08:39,840
Then the second argument is 512.
117
00:08:39,900 --> 00:08:43,170
And that's the number of feature maps of the output.
118
00:08:43,230 --> 00:08:45,400
Then the third argument is four.
119
00:08:45,540 --> 00:08:50,790
And that's the size of the kernel which means that the kernels will be squares of size four by four
120
00:08:51,330 --> 00:08:53,100
then the next argument is one.
121
00:08:53,190 --> 00:08:59,190
And that's the stride and the last argument is zero and that's the padding.
122
00:08:59,190 --> 00:09:03,440
By the way the architecture that I'm about to define right now I didn't invent it.
123
00:09:03,510 --> 00:09:09,450
It is coming from a lot of experimenting that was done by researchers and the machine learning and AI
124
00:09:09,480 --> 00:09:10,510
contributors.
125
00:09:10,530 --> 00:09:12,500
So I did not invent it.
126
00:09:12,540 --> 00:09:18,840
I just took an architecture that is open source and that turns out to work very well for adversarial
127
00:09:18,840 --> 00:09:19,560
networks.
128
00:09:19,680 --> 00:09:21,820
So don't worry about the choice of the numbers.
129
00:09:21,870 --> 00:09:24,560
They just come from experimentation.
130
00:09:24,610 --> 00:09:30,090
All right and then we have one last argument and I'm specifying it because it is actually about the
131
00:09:30,090 --> 00:09:32,740
bias which by default is equal to true.
132
00:09:32,880 --> 00:09:36,330
But I'm going to set it equal to false because we don't want any bias.
133
00:09:36,330 --> 00:09:37,970
It works better this way.
134
00:09:38,250 --> 00:09:43,060
And therefore I'm saying the bias here to false.
135
00:09:43,110 --> 00:09:49,770
So one hundred for the size of the input 512 feature maps and the output meaning the output of the inverse
136
00:09:49,770 --> 00:09:52,170
convolution not the final output.
137
00:09:52,170 --> 00:09:59,560
Then a kernel of size for four by four is stride of 1 a batting of zero and no buyers.
138
00:09:59,580 --> 00:10:02,810
That's the first Mudgal of our neural network.
139
00:10:02,820 --> 00:10:03,920
Congratulations.
140
00:10:03,960 --> 00:10:13,030
You just applied an inversed convolution all right and now we are going to normalize all the features
141
00:10:13,150 --> 00:10:17,900
along the dimension of the Bachche the diamonds and of the batch is 512.
142
00:10:18,010 --> 00:10:26,410
We have 512 feature maps and we're going to Bachan on each of these 512 feature maps and to do this
143
00:10:26,410 --> 00:10:33,210
we take again our and then Mudgal and then we apply the batch we should find in here.
144
00:10:33,210 --> 00:10:40,090
Here we go Bachan on 2D and as arguments we need to input the number of feature maps we want to bet
145
00:10:40,210 --> 00:10:40,560
on.
146
00:10:40,570 --> 00:10:47,490
And since the 512 feature maps Well we are going to bet on 512 feature maps.
147
00:10:47,610 --> 00:10:48,240
All right.
148
00:10:48,270 --> 00:10:52,100
So playing batched normalization is pretty simple.
149
00:10:52,120 --> 00:10:59,200
Then we are going to apply Ereli to rectification for the non-linearity of the neural network that is
150
00:10:59,200 --> 00:11:02,720
to break the linearity and to do this.
151
00:11:02,770 --> 00:11:10,500
Nothing more simple things to our end and module we take are in a module that and we apply the review
152
00:11:11,320 --> 00:11:18,360
rectifier activation function and as arguments we just need to input just like that.
153
00:11:18,390 --> 00:11:26,750
That applies really to rectification then we are going to apply another inverse convolution.
154
00:11:27,180 --> 00:11:33,280
So I'm simply going to take my end and can transpose to the module.
155
00:11:33,360 --> 00:11:34,390
Pasting it here.
156
00:11:34,440 --> 00:11:38,150
And then here we go with the series of arguments.
157
00:11:38,190 --> 00:11:45,270
So this time we don't have 100 for the size of input because the input is now the output of the previous
158
00:11:45,450 --> 00:11:53,400
module which was an inverse convolution but in the output we got 512 feature maps and these feature
159
00:11:53,400 --> 00:11:58,050
maps become the new input of our new inversed conclusion.
160
00:11:58,050 --> 00:12:02,430
So this time we don't input 100 year when put 512.
161
00:12:03,000 --> 00:12:06,680
That's the number of inputs of this new inverse convolution.
162
00:12:06,900 --> 00:12:12,860
Then we need to choose a new number feature maps in the output of this new inverse convolution.
163
00:12:13,020 --> 00:12:15,630
And this time we're not going to pick five phonons well.
164
00:12:15,630 --> 00:12:19,020
We're going to pick two hundred and fifty six.
165
00:12:19,020 --> 00:12:23,140
Again that's a choice motivated by experimentation.
166
00:12:23,190 --> 00:12:27,760
That's really not his job to figure out such an architecture.
167
00:12:27,780 --> 00:12:35,880
All right then we're going to choose a kernel size of four four by four then as I tried to this time
168
00:12:36,150 --> 00:12:46,520
and padding of one and then same we don't want any buyers and therefore adding Byars equals false perfect.
169
00:12:46,590 --> 00:12:48,500
That's our next module.
170
00:12:48,510 --> 00:12:51,000
We now have two inversed convolution.
171
00:12:51,060 --> 00:12:53,990
Our generator is getting into shape.
172
00:12:54,150 --> 00:13:00,390
So now we're going to do the same we're going to batched norm each of the new 256 feature maps.
173
00:13:00,390 --> 00:13:12,390
So I'm just going to take this and that batch Enorme 2D it here but then replace 512 by 256 because
174
00:13:12,390 --> 00:13:21,990
we're batch norming each of the new 256 feature maps and then we apply another rectification which is
175
00:13:21,990 --> 00:13:23,610
going to be the loop again.
176
00:13:23,640 --> 00:13:29,230
So here let's just copy this and paste it here.
177
00:13:30,490 --> 00:13:34,100
All right then guess what we're going to do again.
178
00:13:34,240 --> 00:13:37,800
We're going to apply another inverse convolution.
179
00:13:37,900 --> 00:13:40,650
So I am taking this.
180
00:13:40,840 --> 00:13:44,260
And then that can transpose to D.
181
00:13:44,290 --> 00:13:45,740
I am pasting it here.
182
00:13:46,000 --> 00:13:53,410
And then the input is actually the output of the previous operation that is the inverse convolution.
183
00:13:53,590 --> 00:14:02,760
So that's 256 the 256 outputs feature maps become the input of our new inversed convolution.
184
00:14:02,810 --> 00:14:04,550
So 256.
185
00:14:04,720 --> 00:14:07,850
And now guess what we want in the outputs.
186
00:14:08,170 --> 00:14:11,760
Well you probably understood the logic we wanted divided by two.
187
00:14:11,800 --> 00:14:16,540
So we're going to take one hundred and twenty eight output future maps.
188
00:14:16,540 --> 00:14:23,650
In this new inverse convolution and then the good news is that we're keeping a size four by four for
189
00:14:23,650 --> 00:14:24,500
the kernels.
190
00:14:24,640 --> 00:14:28,580
Let's try to add a padding of one and then no bias.
191
00:14:28,690 --> 00:14:29,740
Congratulations.
192
00:14:29,800 --> 00:14:32,870
You just applied your third inverse convolution.
193
00:14:33,220 --> 00:14:38,380
And so then that's the same we're going to apply a batch known to the two batched norm.
194
00:14:38,380 --> 00:14:43,510
Each of the new 128 feature maps.
195
00:14:43,510 --> 00:14:53,980
So here I am replacing tournament 56 by 128 and we are going to apply reglue rectification to make sure
196
00:14:53,980 --> 00:15:00,150
we break the linearity for the non-linearity of the neural network.
197
00:15:00,160 --> 00:15:00,580
All right.
198
00:15:00,580 --> 00:15:02,670
And then guess what we're going to do again.
199
00:15:02,770 --> 00:15:07,520
We are going to apply an inversed convolution again.
200
00:15:07,720 --> 00:15:16,450
So I'm taking this again I'm taking the cones transpose to 2D and I'm pasting in here and this time
201
00:15:16,480 --> 00:15:20,010
we have as input of this new inverse convolution.
202
00:15:20,070 --> 00:15:26,930
One hundred and twenty eight feature maps because that was the output of the previous inverse convolution.
203
00:15:27,130 --> 00:15:32,880
And this time guess what we want for the number of the output feature maps of this new inverse convolution.
204
00:15:33,130 --> 00:15:40,960
Well of course we want 64 new feature maps and then good news again we're keeping a size of four by
205
00:15:40,960 --> 00:15:44,280
four for the Colonel's a stride of two and a betting of one.
206
00:15:44,470 --> 00:15:47,140
And then nobody has a right.
207
00:15:47,140 --> 00:15:49,010
We're getting close to the end.
208
00:15:49,270 --> 00:15:52,140
We need again to apply a batch norm.
209
00:15:52,150 --> 00:15:53,900
So let's copy this again.
210
00:15:54,220 --> 00:16:04,210
We are going to batched norm each of the new 64 feature maps that are the output of our new inverse
211
00:16:04,210 --> 00:16:05,410
convolution.
212
00:16:05,410 --> 00:16:12,850
And then we are going to apply a really rectification to again break the linearity.
213
00:16:13,360 --> 00:16:16,180
Right there we go almost over.
214
00:16:16,180 --> 00:16:17,580
Don't get crazy too fast.
215
00:16:17,590 --> 00:16:21,410
We have one final inverse convolution to make.
216
00:16:21,430 --> 00:16:23,440
So that's exactly what we're going to do.
217
00:16:23,650 --> 00:16:32,140
We are going to apply our last module the inverse convolution and I'm just copying the name of the module
218
00:16:32,140 --> 00:16:36,200
because this time we are going to input different arguments.
219
00:16:36,220 --> 00:16:41,420
So the first argument is the number of inputs which is the output of the previous inverse convolution.
220
00:16:41,470 --> 00:16:43,160
So that's 64 again.
221
00:16:43,780 --> 00:16:49,630
And then for the output That's important since we're making the generator that is supposed to generate
222
00:16:49,630 --> 00:16:51,160
some fake images.
223
00:16:51,340 --> 00:16:56,530
And since these fake images are going to be with three channels while the output of the generator is
224
00:16:56,530 --> 00:17:02,500
exactly going to be the three channels of the fake images and therefore the number of dimensions for
225
00:17:02,500 --> 00:17:07,740
the output is going to be three corresponding to the three channels.
226
00:17:07,750 --> 00:17:08,150
All right.
227
00:17:08,170 --> 00:17:10,180
And then actually that's the same.
228
00:17:10,180 --> 00:17:14,700
We're going to use kernels of size four by four a stride of two the padding of one.
229
00:17:14,800 --> 00:17:16,250
And again no bias.
230
00:17:16,300 --> 00:17:23,130
So I can just copy this and paste it here and there we go.
231
00:17:23,140 --> 00:17:28,130
We have applied all our inversed convolution that's almost over.
232
00:17:28,150 --> 00:17:34,990
We just need to end this up with a hyperbolic tangent rectification to not only break the law in the
233
00:17:34,990 --> 00:17:40,120
area again but also to make sure we are between minus 1 and plus 1.
234
00:17:40,120 --> 00:17:44,720
Have a look at the age function you'll see that the values are between minus 1 and plus 1.
235
00:17:44,890 --> 00:17:47,620
And also it is centered around 0.
236
00:17:47,650 --> 00:17:53,420
And why do we want to output values to fall between minus 1 and plus 1 and being centered around 0.
237
00:17:53,560 --> 00:17:57,660
That's because we want the same standards as the images of the data set.
238
00:17:57,850 --> 00:18:03,550
And because the created images of the generator will become the input of the discriminator that will
239
00:18:03,550 --> 00:18:04,780
come after that.
240
00:18:04,780 --> 00:18:12,640
All right so now we just need to and this architecture with our n module again and then that we apply
241
00:18:13,180 --> 00:18:20,320
the term age to hyperbolic tangent rectification for the non-linearity of the neural network and to
242
00:18:20,320 --> 00:18:23,930
be between minus 1 and plus 1 centered around zero.
243
00:18:24,010 --> 00:18:24,420
Whew.
244
00:18:24,430 --> 00:18:25,800
Congratulations.
245
00:18:25,810 --> 00:18:28,000
You made a neural network.
246
00:18:28,000 --> 00:18:30,460
That was definitely not a piece of cake.
247
00:18:30,880 --> 00:18:37,150
So I guess now it would be nice to have a break in the next story will make the food function that will
248
00:18:37,150 --> 00:18:45,070
propagate the signal that is the input of the generator through all the different layers of this new
249
00:18:45,070 --> 00:18:46,270
will that work.
250
00:18:46,270 --> 00:18:49,290
So have a good break and I'll see you in the next tutorial.
251
00:18:49,360 --> 00:18:51,370
Until then enjoy the revision.
26621
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.