046 Applications of GANs (English transcript)

Hello and welcome back to the course on computer vision. Today we're going to talk about some applications of GANs. All right, so let's have a look.

GANs can be used for many different things. Here are a few examples: generating images, image modification, super resolution, assisting artists, photo-realistic images, speech generation, and face aging. Those are just a couple of examples of how GANs can be used, and we're going to look at a few of them in more detail. Not all of them, just a few, to give you some ideas for how you might use GANs in your own work, your own projects, or applications you might want to create.

All right, so generating images: what can GANs be used for in this space? You already saw how GANs generate images; that's the core use case for GANs and how they came to be. We saw that example with dogs, and you could see that it's really hard for a generative adversarial network to generate a realistic image of a dog. So let's look at a more successful example, where GANs actually do well. Just by looking at these, can you say what they are? They're all images of one certain thing, and you can probably already tell what it is. By the way, this is after just one epoch of training; it's taken from the paper I will link to at the end of this tutorial, so this is a very under-trained GAN and you can already roughly see what it is. Now I'll show you what these same images look like after more training, I think it was after 12 epochs, 12 training cycles. Let's see if it's easier to tell what's going on.

There you go. These are all generated images, and you can probably tell they are images of bedrooms. This is an example of where GANs do really well; some of these bedrooms look very realistic. Here you've got a bed, a pillow, a window; a big window here, and somewhere over there a bed; here a bed with some pillows and another window, and you can see the lighting and colors change; here a bed and a kind of closet, and over here a bed up close. So you can see it's doing well, and that's one of the cases where GANs shine: not just bedrooms specifically, but any very confined domain. Bedrooms, unlike dogs, are much more similar to one another. There can be plenty of variety, but far less than in images of dogs, where the dogs themselves, the backgrounds, and the poses are all different. For bedrooms it works quite well. If we flip back and forth between the two epochs a couple of times, you can pick an image and observe how it changes. Look at this one: it's quite pronounced here, whereas before it was not as well defined. Here's another one that looks quite good now and looked quite different before. As you can see, the more training you put in, the better; other parts, like these ones, don't look as well defined yet, but with more training they improve. So there you go: if you want to generate fake images of bedrooms, GANs are your go-to, and not just bedrooms but any tightly confined type of image or object where the variety isn't that great; there, GANs can do a pretty good job.
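To make the mechanics concrete, here is a minimal, illustrative sketch in PyTorch of the kind of convolutional generator a DCGAN uses for samples like these bedrooms. The layer widths, the 100-dimensional noise vector, and the 64x64 output are assumptions for the sake of the example rather than the exact configuration from the paper:

    # Minimal DCGAN-style generator sketch (PyTorch). Sizes are illustrative.
    import torch
    import torch.nn as nn

    class Generator(nn.Module):
        def __init__(self, z_dim=100, channels=3, feat=64):
            super().__init__()
            self.net = nn.Sequential(
                # project the 100-d noise vector to a 4x4 feature map
                nn.ConvTranspose2d(z_dim, feat * 8, 4, 1, 0, bias=False),
                nn.BatchNorm2d(feat * 8),
                nn.ReLU(True),
                # 4x4 -> 8x8
                nn.ConvTranspose2d(feat * 8, feat * 4, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feat * 4),
                nn.ReLU(True),
                # 8x8 -> 16x16
                nn.ConvTranspose2d(feat * 4, feat * 2, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feat * 2),
                nn.ReLU(True),
                # 16x16 -> 32x32
                nn.ConvTranspose2d(feat * 2, feat, 4, 2, 1, bias=False),
                nn.BatchNorm2d(feat),
                nn.ReLU(True),
                # 32x32 -> 64x64 RGB image with values in [-1, 1]
                nn.ConvTranspose2d(feat, channels, 4, 2, 1, bias=False),
                nn.Tanh(),
            )

        def forward(self, z):
            return self.net(z)

    # After adversarial training against a discriminator on bedroom photos,
    # new bedrooms are sampled by feeding fresh random noise through it.
    g = Generator()
    z = torch.randn(16, 100, 1, 1)   # 16 random latent vectors
    fake_bedrooms = g(z)             # shape: (16, 3, 64, 64)

The point to take away is that, once trained against a discriminator on a bedroom dataset, brand-new bedrooms come from nothing more than fresh random noise vectors.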
Image modification. OK, so what does that mean? Let's have a look at this example, which is from the same paper. Here we've got a smiling woman on the left, and all of these are images generated by the GAN; then a neutral woman in the middle, and then a neutral man. What they're showing is that, through the way a GAN generates images, each image is effectively assigned a vector: inside the neural network, every generated face corresponds to a latent vector, and you can do arithmetic with those vectors. Take the vector for a smiling woman, subtract the vector for a neutral woman, then add the vector for a neutral man, and you get a smiling man. Basically, subtracting the neutral woman from the smiling woman isolates a vector for the smile, and adding it on top of the neutral man gives you a man that is smiling. Very interesting, and a little bit creepy as well, but that's where things stand.

Here's another example: take a man with glasses, subtract a man without glasses (a completely different man, as you can see), then add a woman without glasses, and you get a woman with glasses. Very interesting. Similar results had already been seen with text. A very famous example: if you use neural networks to represent words as vectors, then take "king", subtract "man", and add "woman", you get something close to "queen". That had been achieved before, but it isn't as big a breakthrough as the same trick with images, because images don't just encode symbols from an alphabet; they represent something you can actually see in real life, and the change from "no glasses" to "glasses" here is very distinct. You can really see that it's a woman with glasses, even though none of these people ever existed; they are all GAN-generated images. They also show that if you try to do the same with plain arithmetic on the pixels, the result is nowhere near as good: you just get different images overlaid on top of each other, rather than the clean results we had over here.
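As a rough sketch of that latent arithmetic, assuming a trained generator g like the one sketched above: in the paper each concept vector is actually the average latent code of several generated samples showing that attribute, so the random vectors below are only stand-ins.

    # Latent-space arithmetic sketch, assuming a trained generator `g`.
    import torch

    z_smiling_woman = torch.randn(1, 100, 1, 1)   # placeholder latent codes; in
    z_neutral_woman = torch.randn(1, 100, 1, 1)   # practice these are averages of
    z_neutral_man   = torch.randn(1, 100, 1, 1)   # samples picked for each attribute

    # "smiling woman" - "neutral woman" isolates a smile direction,
    # which is then added to "neutral man".
    z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man

    g.eval()
    with torch.no_grad():
        img = g(z_smiling_man)   # decoded image should show a smiling man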
All right, so that's the image modification application. The next one is super resolution. Here's an example from Ian Goodfellow's presentation. They had an original image, which looked like this; then they reduced the resolution, making it grainy and less detailed; and then they applied different algorithms to bring the resolution back up: bicubic interpolation, SRResNet, and super resolution by a GAN. You can see that the GAN does a really good job; the bicubic result is a bit smoother, while the GAN output is quite crisp. To be fair, it would have been good to compare against the low-resolution image they created as the intermediate step, the one actually fed into these methods, but I couldn't find that image, so all we can compare against is the original. Overall it performs well, but if you look at the details, for instance the hat, there is a pattern in the original that isn't reproduced here; there are little figures, horses or elephants, that aren't there; some detail has been destroyed. Nevertheless, it does a good job and the result looks sharp. Now imagine you didn't have the original at all: you only have a low-resolution image and you want to recreate a higher-resolution version of it. You could use a GAN for that as well.
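For reference, the classical baseline in that comparison, shrinking the image and scaling it back up with bicubic interpolation, can be sketched in a few lines; the tensor here is only a stand-in for a real photo, and a GAN-based approach would replace the second interpolation with a trained generator:

    # Bicubic baseline sketch: downscale 4x, then upscale back with interpolation.
    import torch
    import torch.nn.functional as F

    hi_res = torch.rand(1, 3, 256, 256)   # stand-in for a real photo tensor

    low_res = F.interpolate(hi_res, scale_factor=0.25, mode="bicubic", align_corners=False)
    bicubic_up = F.interpolate(low_res, scale_factor=4, mode="bicubic", align_corners=False)

    # A super-resolution GAN would instead compute sr = generator(low_res),
    # trained so that a discriminator cannot tell `sr` from real high-res photos.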
OK, photo-realistic images. This is a very interesting one. There was an application online called pix2pix which let you draw something with just a few lines, then apply a generative adversarial network, literally just by pressing a button, and get a photo-realistic image out. The GAN takes the drawing as input; it has learned from its training data roughly what a human should look like, and it overlays realistic textures on top of your lines, turning the sketch into something that looks like a real photo. The service was so popular that it got over two million visitors on the website, and the people who put it up simply didn't have the funds to keep it running, so unfortunately they had to shut it down. But there's a really cool YouTube video of an artist playing around with it, which we'll link to at the end in the additional reading, or viewing in this case, so you'll be able to check it out; there's some very interesting stuff there. It's really crazy: from just a pencil line drawing, you get a realistic-looking photo like that.
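As a very reduced sketch of the pix2pix idea, the generator is conditioned on the drawing itself instead of on random noise: it takes the line drawing as input and outputs an RGB image. The real pix2pix uses a U-Net generator with skip connections and a PatchGAN discriminator trained on paired sketch/photo data; this toy encoder-decoder only illustrates the input-to-output mapping:

    # Toy image-to-image generator sketch (PyTorch), pix2pix-style conditioning.
    import torch
    import torch.nn as nn

    class EdgesToPhoto(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(              # 1-channel sketch -> features
                nn.Conv2d(1, 64, 4, 2, 1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
            )
            self.decoder = nn.Sequential(              # features -> 3-channel photo
                nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
            )

        def forward(self, sketch):
            return self.decoder(self.encoder(sketch))

    g = EdgesToPhoto()
    sketch = torch.rand(1, 1, 256, 256)   # a line drawing as a 1-channel image
    photo = g(sketch)                     # (1, 3, 256, 256), trained to look like a photo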
Face aging is another popular one. I think these are all Hollywood actors; I'm not actually sure, but they look like Hollywood actors. I don't really know the background of how exactly this figure was made; maybe they took the photos on the left as the starting point and then aged them. But basically, what is applied here is a generative adversarial network that generates a sequence of photos from a single photo: you supply a young photo and see how the person might age over time. Of course, the network has to go through training for the concept of aging to be embedded into it, but this is yet another application of generative adversarial networks. As you can see, there are quite a lot of different applications.

A couple of resources that I mentioned: the first is the paper "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", which introduces DCGANs, an enhancement on top of GANs with deeper convolutional networks. You can find it on arXiv, and a lot of the examples we talked about are given in that paper if you want to read more and see some additional images. The YouTube video I referenced is called "Artist vs pix2pix"; is it humor or horror? Check it out. If you just Google "artist vs pix2pix" you'll find this artist drawing line images and then applying the GAN to see how it overlays human skin, hair, and tones on top. Very interesting and, at the same time, very creepy.

All right, so that's applications of GANs. I hope you enjoyed this tutorial, and I'll see you next time. Until then, enjoy computer vision!
