Welcome back, everyone, to this lecture on the perceptron model, our starting point for understanding deep learning.

We're going to begin building up our model abstractions by exploring what real, single biological neurons look like. Then we'll model those mathematically as a single perceptron, see how we can group perceptrons, or artificial neurons, together to build out a multilayer perceptron model, and then see how that expands into a deep learning neural network framework. As we learn about more complex models, we'll also introduce some key mathematical concepts behind the actual functionality of those neurons: things like activation functions, gradient descent, and backpropagation.
But for now we're going to focus on just building out that single perceptron model based off a real biological neuron. If the whole idea behind deep learning is to have computers artificially mimic biological, natural intelligence, it's probably a good idea to build a general understanding of how biological neurons work. Here we can see some real stained neurons in a cerebral cortex, and basically what we're going to attempt to do is check out the way these biological neurons work and see if we can develop a simplified abstraction of them.

So here's an illustration of biological neurons. You can see that there's a lot going on, but don't worry, this isn't a biology class, and the main thing we're going to focus on is just the neuron itself.
The main things to see here are that there's some sort of nucleus, or center, of this neuron within the cell body, and that we have both inputs and outputs, dendrites and axons, spreading out from it. So let's take this illustration of a biological neuron, and what we're going to do is really simplify it, so much so that I'll actually create my own illustration, and I'm definitely not an artist. It's an extremely rough illustration, but don't worry, you'll see why I draw it like this later on in this lecture. The main things to focus on in the biological neuron are the dendrites, which we can think of as inputs going into the main nucleus of this neuron, and then the axon, which is some sort of output.

Now again, this isn't 100 percent biologically correct, but it feeds into our perceptron model. Basically, we're going to think of neurons biologically as accepting some sort of input signal, which can come from a variety of sources; then some sort of calculation happens in the nucleus of this neuron; and then it produces a single output at the axon. That output can then lead over to the dendrites of another neuron. Now, how do we take this very simplified biological neuron model and convert it into a mathematical model?
This is where the idea of the perceptron comes in. A perceptron was a form of neural network introduced all the way back in 1958 by Frank Rosenblatt. Amazingly, even back then in the 1950s, he saw huge potential, stating that a perceptron "may eventually be able to learn, make decisions, and translate languages." If you've ever used something like Google Translate, it actually uses a neural network as its main way of converting from one language to another, and Frank Rosenblatt saw the perceptron as one of the main base units for building out that sort of technology. And he thought of this all the way back in the late 1950s.

However, about a decade later, in the late 1960s, two researchers, Marvin Minsky and Seymour Papert, published a book titled Perceptrons, which suggested that there were actually severe limitations to what perceptrons would be able to do. A big part of that limitation was the computational power necessary to work with multilayer perceptron models: back in the late 1960s there wasn't really enough computational power to make full use of the perceptron as a mathematical model. This publication actually marks the beginning of what's known as the AI winter, when very little funding went into research on artificial intelligence and neural networks in the 1970s. That's why, even though this idea was spawned in the late 1950s, it has taken until the modern age to fully implement neural networks. Fortunately for us, we now live in the present, and we know the amazing power of neural networks, which all really stems from this simple perceptron model created all the way back in 1958.
So let's go ahead and convert that simple biological neuron model into the perceptron model, and then go through a simple example of how a perceptron model works. Recall our very simplified biological neuron model: we have incoming dendrites, then some sort of central nucleus, and a single output through the axon. What we're going to do is replace these biological units with mathematical ones. We'll define some set of inputs going into a single point, which we'll call the neuron, and that neuron then has a single output.

So let's work through a simple example of how this simple perceptron model actually works. Imagine we have two data points, x1 and x2. These two variables, x1 and x2, go in as inputs into this perceptron, and you can imagine that inside the perceptron the neuron has some sort of function f. It performs something on those x inputs and then outputs some y. So if, for example, that function is just the sum, then y = x1 + x2. This is a very, very simplified perceptron model; later on we'll expand on this with activation functions, but right now this is just showing you how we can go from a biological neuron to a perceptron model.
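To make that concrete, here's a minimal Python sketch of this sum-only perceptron; the function name and the sample values are mine, purely for illustration:

    # The simplest possible "perceptron": its internal function is just a sum.
    # There are no weights or bias yet, so nothing here can be adjusted or learned.
    def perceptron(x1, x2):
        return x1 + x2  # f(x1, x2) = x1 + x2, so y = x1 + x2

    print(perceptron(2.0, 3.0))  # prints 5.0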
So we have these inputs x1 and x2, we push them through some sort of function f(x), and then we output some y. Realistically, though, we would want to be able to adjust some parameter in order for the perceptron to learn; right now there's no real way for the perceptron to adjust itself in order to correct the output y. What we can do is add adjustable weights that multiply against these x inputs. So we'll take each input x and apply its own weight to it, and then we can adjust that weight as necessary in order to get the correct value of y, or the expected value of y if we're performing some sort of supervised learning. Maybe we're trying to solve a regression task, where y is an actual continuous label we're trying to predict, or maybe we're doing classification and y is a category. The most important part here is that we need to be able to take this perceptron model and adjust it somehow in order to get the correct value of y, and one way we can do this is by applying a weight to each input and then adjusting the weights.

We'll label these weights to correspond to their inputs: w1 for x1 and w2 for x2. That means, if our function is still just a simple addition, y = x1*w1 + x2*w2.
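Extending that sketch with one adjustable weight per input might look like this; the weight values here are hypothetical, chosen just to show how changing a weight changes the output:

    # Perceptron with one adjustable weight per input: y = x1*w1 + x2*w2.
    # Tuning w1 and w2 is what would let this model "learn".
    def perceptron(x1, x2, w1, w2):
        return x1 * w1 + x2 * w2

    print(perceptron(2.0, 3.0, w1=0.5, w2=-1.0))  # 2*0.5 + 3*(-1.0) = -2.0
    print(perceptron(2.0, 3.0, w1=0.5, w2=1.0))   # adjusting w2 changes y: 4.0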
Now we could update the weights in order to affect the output y, but what happens if an input x is actually zero? That means that no matter what you do to its weight, the weight won't actually change anything, because the product will always be zero. To fix that problem, we can add in a bias term b to the inputs. So even if x1 is zero or x2 is zero, which would mean that multiplying by a weight always produces zero regardless of how you update that weight, we essentially assign this neuron its own particular bias, so that the inputs end up not just getting multiplied by a weight but also having a bias added to them. Keep in mind the weights can be positive or negative, and the biases can also be positive or negative. A good way to think about the bias is that the product of an input times its weight, such as x1*w1, has to overcome the bias value in order to start having an effect on the output y.

So here, for this very simple example, we end up approximating the output y as y = (x1*w1 + b) + (x2*w2 + b). It's still just a very simple summation happening inside the neuron; later on we'll talk about the more realistic activation functions you'll have in typical networks.
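Here's the sketch again with the bias term added, using the per-input form y = (x1*w1 + b) + (x2*w2 + b) from this example (note that many texts instead add a single bias once to the whole sum):

    # Perceptron with weights and a bias term b.
    # Even when an input is zero, the bias still contributes to the output,
    # so the neuron is no longer stuck regardless of its weights.
    def perceptron(x1, x2, w1, w2, b):
        return (x1 * w1 + b) + (x2 * w2 + b)

    # x1 is zero, but the output still responds through the bias:
    print(perceptron(0.0, 3.0, w1=0.5, w2=-1.0, b=2.0))  # (0+2) + (-3+2) = 1.0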
Keep in mind we can expand this to a generalization for any number of inputs, all the way up to some input n. So now we've been able to see how we can model a biological neuron as a simple perceptron model. Our generalization simply starts at i = 1 and runs to any number of inputs n: we grab each particular input value xi, multiply it by some weight wi (which later on we'll be adjusting), and add a bias value b, so y is the sum over i of (xi*wi + b). Later on we're actually going to see how we can expand this model to have x be a tensor of information; recall that a tensor is an n-dimensional matrix.
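A generalized sketch for n inputs, using plain Python lists for clarity; a real implementation would typically store x and w as NumPy arrays or framework tensors:

    # Generalized perceptron: y = sum over i = 1..n of (x[i]*w[i] + b).
    def perceptron(xs, ws, b):
        return sum(x * w + b for x, w in zip(xs, ws))

    xs = [2.0, 3.0, -1.0]  # n = 3 input values
    ws = [0.5, -1.0, 2.0]  # one adjustable weight per input
    print(perceptron(xs, ws, b=1.0))  # (1+1) + (-3+1) + (-2+1) = -1.0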
So let's review what we've learned so far. We understand the very, very basics of a biological neuron: the fact that there are dendrites providing input to a nucleus, and then some axon as a single output. And what we've been able to do is take the research done in the late 1950s on the simple perceptron model and create a very simple mathematical model replicating those core concepts behind a neuron, where we have inputs, we apply some sort of weight to them as well as adding a bias term, we pass them through some sort of function, and then we get out that single output.

So let's go ahead, in the next lecture, and learn how we can expand a single perceptron into a full neural network. Then we'll see how we can add in more complex ideas such as activation functions and cost functions, and see how those work in a feedforward network, as well as backpropagation. OK, thanks, and I'll see you at the next lecture.