1
00:00:00,000 --> 00:00:04,000
If you want to learn about computer science and the art\n
2
00:00:04,000 --> 00:00:09,300
CS50 is considered by many to be one of the\n
3
00:00:09,300 --> 00:00:13,000
This is a Harvard University course\ntaught by Dr. David Malan
4
00:00:13,000 --> 00:00:17,000
and we are proud to bring it to\n
5
00:00:17,000 --> 00:00:23,000
Throughout a series of lectures, Dr. Malan will teach you \n
6
00:00:23,000 --> 00:00:27,300
And make sure to check the description for a lot of\n
7
00:01:45,801 --> 00:01:50,281
DAVID MALAN: All right, this is CS50,\n
8
00:01:50,281 --> 00:01:52,591
to the intellectual\nenterprises of computer science
9
00:01:52,590 --> 00:01:56,340
and the art of programming, back here\n
10
00:01:56,340 --> 00:01:58,410
for the first time in quite a while.
11
00:02:13,311 --> 00:02:16,911
And I took this class myself\nsome time ago, but almost didn't.
12
00:02:16,911 --> 00:02:20,121
It was sophomore fall and I\nwas sitting in on the class.
13
00:02:20,121 --> 00:02:22,431
And I was a little curious\nbut, eh, it didn't really
14
00:02:24,508 --> 00:02:26,841
I was definitely a computer\nperson, but computer science
15
00:02:26,841 --> 00:02:28,314
felt like something altogether.
16
00:02:28,313 --> 00:02:30,230
And I only got up the\nnerve to take the class
17
00:02:30,230 --> 00:02:32,870
ultimately, because the professor\nat the time, Brian Kernighan
18
00:02:32,871 --> 00:02:35,600
allowed me to take the\nclass pass/fail, initially.
19
00:02:35,600 --> 00:02:37,490
And that is what made\nall the difference.
20
00:02:37,491 --> 00:02:39,981
I quickly found that\ncomputer science is not just
21
00:02:39,980 --> 00:02:42,800
about programming and working\nin isolation on your computer.
22
00:02:42,800 --> 00:02:45,390
It's really about problem\nsolving more generally.
23
00:02:45,390 --> 00:02:48,080
And there was something\nabout homework, frankly
24
00:02:48,080 --> 00:02:51,470
that was, like, actually fun for perhaps\n
25
00:02:51,471 --> 00:02:53,996
And there was something\nabout this ability
26
00:02:53,996 --> 00:02:56,121
that I discovered, along\nwith all of my classmates
27
00:02:56,121 --> 00:03:00,373
to actually create something and bring\n
28
00:03:00,372 --> 00:03:03,080
and sort of bring to bear something\n
29
00:03:03,080 --> 00:03:06,260
but didn't really know how to harness,\n
30
00:03:06,260 --> 00:03:08,150
and definitely challenging\nand frustrating.
31
00:03:08,151 --> 00:03:10,753
Like, to this day,\nall these years later
32
00:03:10,752 --> 00:03:13,460
you're going to run up against\n
33
00:03:13,461 --> 00:03:15,111
in programming, that\njust drive you nuts.
34
00:03:15,110 --> 00:03:16,610
And you feel like you've hit a wall.
35
00:03:16,610 --> 00:03:18,950
But the trick really is\nto give it enough time
36
00:03:18,950 --> 00:03:21,180
to take a step back, take\na break when you need to.
37
00:03:21,181 --> 00:03:24,441
And there's nothing better, I daresay,\n
38
00:03:24,441 --> 00:03:26,169
and pride, really,\nwhen you get something
39
00:03:26,169 --> 00:03:28,461
to work, and in a class like\nthis, present, ultimately
40
00:03:28,461 --> 00:03:32,091
at term's end, something like\nyour very own final project.
41
00:03:32,091 --> 00:03:35,551
Now, this isn't to say that\nI took to it 100% perfectly.
42
00:03:35,550 --> 00:03:40,760
In fact, just this past week, I looked\n
43
00:03:40,760 --> 00:03:43,281
have from some 25 years\nago, and took a photo
44
00:03:43,281 --> 00:03:47,961
of what was apparently the very first\n
45
00:03:47,961 --> 00:03:50,271
and quickly received minus 2 points on.
46
00:03:50,270 --> 00:03:53,450
But this is a program that we'll\n
47
00:03:53,450 --> 00:03:57,740
does something quite simply like\n
48
00:03:58,408 --> 00:04:00,200
And to be fair, I\ntechnically hadn't really
49
00:04:00,200 --> 00:04:02,480
followed the directions, which is\n
50
00:04:02,480 --> 00:04:05,802
But if you just look at this, especially\n
51
00:04:05,802 --> 00:04:07,760
you might have heard\nabout programming language
52
00:04:07,760 --> 00:04:09,718
but you've never typed\nsomething like this out
53
00:04:09,718 --> 00:04:11,480
undoubtedly it's going to look cryptic.
54
00:04:11,480 --> 00:04:13,520
But unlike human\nlanguages, frankly, which
55
00:04:13,520 --> 00:04:17,480
are a lot more sophisticated, a\nlot more vocabulary, a lot more
56
00:04:17,480 --> 00:04:21,620
grammatical rules, programming, once\n
57
00:04:21,620 --> 00:04:24,733
it is and how it works and what these\n
58
00:04:24,733 --> 00:04:26,900
you'll see, after a few\nmonths of a class like this
59
00:04:26,901 --> 00:04:29,001
to start teaching\nyourself, subsequently
60
00:04:29,000 --> 00:04:32,730
other languages, as they may\ncome, in the coming years as well.
61
00:04:32,730 --> 00:04:36,050
So what ultimately matters\nin this particular course
62
00:04:36,050 --> 00:04:38,690
is not so much where you end\nup relative to your classmates
63
00:04:38,690 --> 00:04:41,900
but where you end up relative\nto yourself when you began.
64
00:04:41,901 --> 00:04:43,381
And indeed, you'll begin today.
65
00:04:43,380 --> 00:04:46,910
And the only experience that matters\n
66
00:04:46,911 --> 00:04:49,040
And so, consider where you are today.
67
00:04:49,040 --> 00:04:51,531
Consider, perhaps, just how\ncryptic something like that
68
00:04:52,850 --> 00:04:56,180
And take comfort in knowing just\n
69
00:04:56,180 --> 00:04:58,408
will be within your own grasp.
70
00:04:58,408 --> 00:05:01,700
And if you're thinking that, OK, surely\n
71
00:05:01,701 --> 00:05:05,421
to the right, behind me, knows more than\n
72
00:05:05,420 --> 00:05:10,100
2/3 of CS50 students have never taken\n
73
00:05:10,100 --> 00:05:14,730
you're in very good company\nthroughout this whole term.
74
00:05:14,730 --> 00:05:16,820
So then, what is computer science?
75
00:05:16,821 --> 00:05:18,441
I claim that it's problem solving.
76
00:05:18,440 --> 00:05:20,720
And the upside of that is\nthat problem solving is
77
00:05:20,721 --> 00:05:23,031
something we sort of do all the time.
78
00:05:23,031 --> 00:05:25,834
But a computer science\nclass, learning to program
79
00:05:25,834 --> 00:05:27,500
I think kind of cleans up your thoughts.
80
00:05:27,500 --> 00:05:31,040
It helps you learn how to think more\n
81
00:05:32,630 --> 00:05:34,463
Because, honestly, the\ncomputer is not going
82
00:05:34,463 --> 00:05:37,670
to do what you want unless you are\n
83
00:05:37,670 --> 00:05:39,896
And so, as such, there's\nthese fringe benefits
84
00:05:39,896 --> 00:05:42,771
of just learning to think like a\n
85
00:05:42,771 --> 00:05:45,561
And it doesn't take all\nthat much to start doing so.
86
00:05:45,560 --> 00:05:49,221
This, for instance, is perhaps the\n
87
00:05:49,221 --> 00:05:51,141
sure, but really problem\nsolving in general.
88
00:05:51,141 --> 00:05:54,471
Problems are all about taking input,\n
89
00:05:54,471 --> 00:05:56,360
You want to get the solution, a.k.a.
90
00:05:57,050 --> 00:05:59,750
And so, something interesting\nhas got to be happening in here
91
00:05:59,750 --> 00:06:03,240
in here, when you're trying to\nget from those inputs to outputs.
92
00:06:03,240 --> 00:06:05,810
Now, in the world of\ncomputers specifically
93
00:06:05,810 --> 00:06:09,680
we need to decide in advance how we\n
94
00:06:09,680 --> 00:06:13,723
We all just need to decide, whether\n
95
00:06:13,723 --> 00:06:16,640
else, that we're all going to speak\n
96
00:06:16,641 --> 00:06:18,841
of our human languages as well.
97
00:06:18,841 --> 00:06:22,550
And you may very well know that\ncomputers tend to speak only
98
00:06:26,740 --> 00:06:29,230
Assembly, one, but binary,\ntwo, might be your go-to.
99
00:06:29,230 --> 00:06:32,320
And binary, by implying two,\nmeans that the world of computers
100
00:06:32,321 --> 00:06:35,981
has just two digits at\nits disposal, 0 and 1.
101
00:06:35,980 --> 00:06:40,060
And indeed, we humans have many more\n
102
00:06:40,690 --> 00:06:43,120
But a computer indeed\nonly has zeros and ones.
103
00:06:43,120 --> 00:06:45,341
And yet, somehow they can do so much.
104
00:06:45,341 --> 00:06:47,711
They can crunch numbers in\nExcel, send text messages
105
00:06:47,711 --> 00:06:51,081
create images and artwork\nand movies and more.
106
00:06:51,081 --> 00:06:54,790
And so, how do you get from something\n
107
00:06:54,790 --> 00:06:56,920
to all of the stuff\nthat we're doing today
108
00:06:56,920 --> 00:06:58,870
in our pockets and laptops and desktops?
109
00:06:58,870 --> 00:07:01,570
Well, it turns out that\nwe can start quite simply.
110
00:07:01,571 --> 00:07:05,471
If a computer were to want to do\n
111
00:07:06,190 --> 00:07:09,190
Well, in our human world,\nwe might count doing this
112
00:07:09,190 --> 00:07:13,778
like 1, 2, 3, 4, 5, using so-called\n
113
00:07:13,778 --> 00:07:16,570
on your fingers where one finger\n
114
00:07:16,571 --> 00:07:18,521
if I'm, for instance, taking attendance.
115
00:07:18,521 --> 00:07:22,600
Now, we humans would typically\nactually count 1, 2, 3, 4, 5, 6.
116
00:07:22,600 --> 00:07:25,480
And we'd go past just those five\ndigits and count much higher
117
00:07:26,951 --> 00:07:29,871
But computers, somehow, only\nhave these zeros and ones.
118
00:07:29,870 --> 00:07:33,190
So if a computer only somehow\nspeaks binary, zeros and ones
119
00:07:33,190 --> 00:07:36,180
how does it even count\npast the number 1?
120
00:07:36,180 --> 00:07:38,740
Well, here are 3 zeros, of course.
121
00:07:38,740 --> 00:07:42,250
And if you translate this\nnumber in binary, 000
122
00:07:42,250 --> 00:07:46,150
to a more familiar number in decimal,\n
123
00:07:47,021 --> 00:07:49,871
If we were to represent, with\na computer, the number 1
124
00:07:49,870 --> 00:07:52,930
it would actually be 001,\nwhich, not surprisingly
125
00:07:52,930 --> 00:07:55,990
is exactly the same as we\nmight do in our human world
126
00:07:55,990 --> 00:07:59,500
but we might not bother writing\n
127
00:07:59,500 --> 00:08:02,170
But a computer, now, if it\nwants to count as high as two
128
00:08:03,701 --> 00:08:06,431
And so it has to use a different\npattern of zeros and ones.
129
00:08:08,620 --> 00:08:10,810
So this is not 10 with\na zero in front of it.
130
00:08:10,810 --> 00:08:13,281
It's indeed zero one zero\nin the context of binary.
131
00:08:13,281 --> 00:08:15,401
And if we want to count\nhigher now than two
132
00:08:15,401 --> 00:08:19,540
we're going to have to tweak these\n
133
00:08:19,540 --> 00:08:24,011
And then if we want 4\nor 5 or 6 or 7, we're
134
00:08:24,011 --> 00:08:26,921
just kind of toggling these\nzeros and ones, a.k.a.
135
00:08:26,920 --> 00:08:31,408
bits, for binary digits that represent,\n
136
00:08:31,408 --> 00:08:33,490
different numbers that you\nand I, as humans, know
137
00:08:33,490 --> 00:08:36,730
of course, as the so-called\ndecimal system, 0 through 9
138
00:08:36,730 --> 00:08:40,390
dec implying 10, 10 digits,\nthose zeros through nine.
139
00:08:40,390 --> 00:08:42,760
So why that particular pattern?
140
00:08:42,760 --> 00:08:44,680
And why these particular zeros and ones?
141
00:08:44,681 --> 00:08:48,011
Well, it turns out that\nrepresenting one thing or the other
142
00:08:48,010 --> 00:08:50,360
is just really simple for a computer.
143
00:08:50,860 --> 00:08:53,110
At the end of the day, they're\npowered by electricity.
144
00:08:53,110 --> 00:08:56,081
And it's a really simple thing to\n
145
00:08:56,081 --> 00:08:57,791
or don't store some electricity.
146
00:08:57,791 --> 00:09:00,911
Like, that's as simple as\nthe world can get, on or off.
147
00:09:03,110 --> 00:09:05,821
So, in fact, inside of a\ncomputer, a phone, anything
148
00:09:05,821 --> 00:09:07,571
these days that's\nelectronic, pretty much
149
00:09:07,571 --> 00:09:10,763
is some number of switches,\notherwise known as transistors.
150
00:09:11,471 --> 00:09:14,852
You've got thousands, millions of them\n
151
00:09:14,852 --> 00:09:17,810
And these are just tiny little switches\n
152
00:09:17,811 --> 00:09:20,471
And by turning those things\non and off in patterns
153
00:09:20,471 --> 00:09:24,274
a computer can count from 0 on up\n
154
00:09:24,274 --> 00:09:27,191
And so these switches, really, you\n
155
00:09:27,691 --> 00:09:29,983
Let me just borrow one of\nour little stage lights here.
156
00:09:32,260 --> 00:09:34,690
And so, I could just think\nof this as representing
157
00:09:34,691 --> 00:09:38,110
in my laptop, a transistor,\na switch, representing 0.
158
00:09:38,110 --> 00:09:43,120
But if I allow some electricity\n
159
00:09:43,120 --> 00:09:44,730
Well, how do I count higher than 1?
160
00:09:44,730 --> 00:09:46,341
I, of course, need another light bulb.
161
00:09:46,341 --> 00:09:48,531
So let me grab another one here.
162
00:09:48,530 --> 00:09:53,350
And if I put it in that same kind of\n
163
00:09:53,350 --> 00:09:57,140
That's sort of the old finger\ncounting way of unary, just 1, 2.
164
00:09:57,140 --> 00:09:59,140
I want to actually take\ninto account the pattern
165
00:09:59,140 --> 00:10:00,680
of these things being on and off.
166
00:10:00,681 --> 00:10:06,730
So if this was one a moment ago, what I\n
167
00:10:06,730 --> 00:10:10,660
and let the next one over be on, a.k.a.
168
00:10:12,071 --> 00:10:15,110
And let me get us a\nthird bit, if you will.
169
00:10:16,600 --> 00:10:20,440
Here is that same pattern now,\nstarting at the beginning with 3 bits.
170
00:10:25,721 --> 00:10:32,770
Here is 010, a.k.a., in our\nhuman world of decimal, 2.
171
00:10:32,770 --> 00:10:35,170
And then we could, of course,\nkeep counting further.
172
00:10:35,171 --> 00:10:37,941
This now would be 3 and dot dot dot.
173
00:10:37,941 --> 00:10:40,870
If this other bulb now goes\non, and that switch is turned
174
00:10:40,870 --> 00:10:43,360
and all three stay on--\nthis, again, was what number?
175
00:10:45,581 --> 00:10:49,811
So it's just as simple,\nrelatively, as that, if you will.
176
00:10:49,811 --> 00:10:53,980
But how is it that these\npatterns came to be?
177
00:10:53,980 --> 00:10:56,806
Well, these patterns actually\nfollow something very familiar.
178
00:10:56,806 --> 00:10:58,931
You and I don't really\nthink about it at this level
179
00:10:58,931 --> 00:11:02,801
anymore because we've probably been\n
180
00:11:04,270 --> 00:11:09,160
But if we consider something in\ndecimal, like the number 123
181
00:11:10,301 --> 00:11:12,971
This looks like 123 in decimal.
182
00:11:13,660 --> 00:11:17,800
It's really just three symbols,\n
183
00:11:17,801 --> 00:11:20,291
with a couple of curves, that\nyou and I now instinctively
184
00:11:22,150 --> 00:11:27,000
But if we do rewind a few years,\n
185
00:11:27,000 --> 00:11:30,480
because you're assigning meaning\nto each of these columns.
186
00:11:30,480 --> 00:11:33,331
The 3 is in the so-called ones place.
187
00:11:33,331 --> 00:11:36,931
The 2 is in the so-called tens place.
188
00:11:36,931 --> 00:11:39,841
And the 1 is in the\nso-called hundreds place.
189
00:11:39,841 --> 00:11:41,850
And then the math ensues\nquickly in your head.
190
00:11:41,850 --> 00:11:47,311
This is technically 100 times 1, plus\n
191
00:11:48,990 --> 00:11:54,390
And there we get the sort of\nmathematical notion we know as 123.
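[Editor's sketch] The column arithmetic described for 123 can be written out in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_decimal` is hypothetical and not from the lecture.

```python
def expand_decimal(number: str) -> list[tuple[int, int]]:
    """Pair each digit with its column weight: ones, tens, hundreds, ..."""
    columns = []
    # Walk from the rightmost digit so position 0 is the ones place.
    for position, digit in enumerate(reversed(number)):
        columns.append((10 ** position, int(digit)))
    # Put the columns back in reading order (hundreds first).
    return list(reversed(columns))


columns = expand_decimal("123")
print(columns)  # [(100, 1), (10, 2), (1, 3)]

# 100 times 1, plus 10 times 2, plus 1 times 3.
total = sum(weight * digit for weight, digit in columns)
print(total)  # 123
```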
192
00:11:54,390 --> 00:11:58,570
Well, nicely enough, in binary,\nit's actually the same thing.
193
00:11:58,571 --> 00:12:01,021
It's just these columns mean\na little something different.
194
00:12:01,020 --> 00:12:05,010
If you use three digits in decimal,\nand you have the ones place
195
00:12:05,010 --> 00:12:09,331
the tens place, and the hundreds place,\n
196
00:12:09,331 --> 00:12:11,230
They're technically just powers of 10.
197
00:12:11,230 --> 00:12:13,620
So 10 to the 0, 10 to\nthe 1, 10 to the 2.
198
00:12:14,520 --> 00:12:16,440
Decimal system, "dec" meaning 10.
199
00:12:16,441 --> 00:12:18,841
You have 10 digits, 0 through 9.
200
00:12:18,841 --> 00:12:21,390
In the binary system, if you're\ngoing to use three digits
201
00:12:21,390 --> 00:12:24,910
just change the bases if you're\nusing only zeros and ones.
202
00:12:24,910 --> 00:12:29,130
So now it's powers of 2, 2 to the\n
203
00:12:31,900 --> 00:12:36,120
And if you keep going, it's going\n
204
00:12:37,510 --> 00:12:40,260
So, why did we get these\npatterns that we did?
205
00:12:40,260 --> 00:12:46,290
Here's your 000 because it's 4 times\n
206
00:12:46,291 --> 00:12:49,471
This is why we got the\ndecimal number 1 in binary.
207
00:12:49,471 --> 00:12:53,341
This is why we got the number 2\nin binary, because it's 4 times
208
00:12:53,341 --> 00:13:01,051
0, plus 2 times 1, plus 1 times 0, and\n
209
00:13:01,831 --> 00:13:05,373
And, of course, if you wanted to\n
210
00:13:06,331 --> 00:13:08,831
What does a computer need to\ndo to count even higher than 7?
211
00:13:10,880 --> 00:13:12,530
Add another light bulb, another switch.
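[Editor's sketch] The same place-value idea with powers of 2 might look like the following in Python. The helper name `bits_to_number` is hypothetical; it simply weights the columns as fours, twos, and ones, as described above.

```python
def bits_to_number(pattern: str) -> int:
    """Weight each bit by a power of 2: ..., fours, twos, ones."""
    total = 0
    for position, bit in enumerate(reversed(pattern)):
        total += int(bit) * 2 ** position
    return total


# Every 3-bit pattern, from 000 up to 111, with its decimal value.
for value in range(8):
    pattern = format(value, "03b")
    print(pattern, bits_to_number(pattern))

# Three bits top out at 7, so counting to 8 needs a fourth bit.
print(format(8, "b"))  # 1000
```

With three switches the loop prints 000 through 111, that is, 0 through 7; the last line shows that 8 needs a fourth column.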
212
00:13:12,530 --> 00:13:14,810
And, indeed, computers\nhave standardized just how
213
00:13:14,811 --> 00:13:17,240
many zeros and ones,\nor bits or switches
214
00:13:17,240 --> 00:13:19,020
they throw at these kinds of problems.
215
00:13:19,020 --> 00:13:23,300
And, in fact, most computers would\n
216
00:13:23,301 --> 00:13:25,971
And even if you're only counting\nas high as three or seven
217
00:13:25,971 --> 00:13:28,431
you would still use eight and\nhave a whole bunch of zeros.
218
00:13:28,431 --> 00:13:31,551
But that's OK, because the\ncomputers these days certainly
219
00:13:31,551 --> 00:13:35,301
have so many more, thousands,\n
220
00:13:37,921 --> 00:13:41,671
All right, so, with that said, if\n
221
00:13:41,671 --> 00:13:44,161
or, frankly, as high\nas we want, that only
222
00:13:44,160 --> 00:13:46,770
seems to make computers\nuseful for things like Excel
223
00:13:48,030 --> 00:13:50,280
But computers, of course,\nlet you send text messages
224
00:13:50,280 --> 00:13:52,240
write documents, and so much more.
225
00:13:52,240 --> 00:13:55,530
So how would a computer represent\nsomething like a letter
226
00:13:55,530 --> 00:13:59,860
like the letter A of the English\n
227
00:14:03,721 --> 00:14:05,961
AUDIENCE: You can represent\nletters in numbers.
228
00:14:05,961 --> 00:14:08,571
DAVID MALAN: OK, so we could\nrepresent letters using numbers.
229
00:14:10,004 --> 00:14:11,421
What number should represent what?
230
00:14:11,421 --> 00:14:15,221
AUDIENCE: Say if you were starting\n
231
00:14:15,221 --> 00:14:18,745
you could say 1 is A, 2 is B, 3 is C.
232
00:14:19,620 --> 00:14:22,630
Yeah, we just all have to agree\nsomehow that one number is
233
00:14:22,630 --> 00:14:23,880
going to represent one letter.
234
00:14:23,880 --> 00:14:28,490
So 1 is A, 2 is B, 3 is\nC, Z is 26, and so forth.
235
00:14:28,490 --> 00:14:30,990
Maybe we can even take into\naccount uppercase and lowercase.
236
00:14:30,990 --> 00:14:34,230
We just have to agree and sort of\n
237
00:14:34,230 --> 00:14:36,461
And humans, indeed, did just that.
238
00:14:38,010 --> 00:14:40,110
It turns out they started\na little higher up.
239
00:14:40,110 --> 00:14:44,130
Capital A has been\nstandardized as the number 65.
240
00:14:44,130 --> 00:14:47,581
And capital B has been\nstandardized as the number 66.
241
00:14:47,581 --> 00:14:50,370
And you can kind of imagine\nhow it goes up from there.
242
00:14:50,370 --> 00:14:53,250
And that's because whatever\nyou're representing
243
00:14:53,250 --> 00:14:57,461
ultimately, can only be stored, at\n
244
00:14:57,461 --> 00:15:01,110
And so, some humans in a room before,\n
245
00:15:01,110 --> 00:15:05,100
or, really, this pattern of zeros\n
246
00:15:08,791 --> 00:15:12,481
So if that pattern of zeros and\nones ever appears in a computer
247
00:15:12,480 --> 00:15:17,910
it might be interpreted then as indeed\n
248
00:15:18,811 --> 00:15:23,191
But I worry, just to be clear, we\n
249
00:15:23,191 --> 00:15:25,681
It might seem, if I play\nthis naively, that, OK
250
00:15:25,681 --> 00:15:28,860
how do I now actually do\nmath with the number 65?
251
00:15:28,860 --> 00:15:33,571
If now Excel displays 65 as\nan A, let alone Bs and Cs.
252
00:15:33,571 --> 00:15:37,230
So how might a computer\ndo as you've proposed
253
00:15:37,230 --> 00:15:41,875
have this mapping from numbers to\n
254
00:15:41,875 --> 00:15:43,500
It feels like we've given something up.
255
00:15:44,000 --> 00:15:46,262
AUDIENCE: By having\na prefix for letters?
256
00:15:46,263 --> 00:15:47,596
DAVID MALAN: By having a prefix?
257
00:15:47,596 --> 00:15:48,887
AUDIENCE: You could have\nprefixes and suffixes.
258
00:15:48,886 --> 00:15:51,640
DAVID MALAN: OK, so we could\nperhaps have some kind of prefix
259
00:15:51,640 --> 00:15:53,230
like some pattern of zeros and ones--
260
00:15:53,230 --> 00:15:56,201
I like this-- that\nindicates to the computer
261
00:15:56,201 --> 00:15:58,870
here comes another pattern\nthat represents a letter.
262
00:15:58,870 --> 00:16:02,920
Here comes another pattern that\nrepresents a number or a letter.
263
00:16:05,620 --> 00:16:08,480
How might a computer\ndistinguish these two?
264
00:16:08,980 --> 00:16:11,360
AUDIENCE: Have a\ndifferent file format, so
265
00:16:11,360 --> 00:16:16,120
like, odd text or just\ncheck the graphic or--
266
00:16:16,120 --> 00:16:17,841
DAVID MALAN: Indeed, and that's spot-on.
267
00:16:17,841 --> 00:16:20,781
Nothing wrong with what you suggested,\n
268
00:16:20,780 --> 00:16:23,613
The reason we have all of these\n
269
00:16:23,614 --> 00:16:28,521
like JPEG and GIF and PNGs\nand Word documents, .docx
270
00:16:28,520 --> 00:16:32,270
and Excel files and so forth, is\n
271
00:16:32,270 --> 00:16:35,780
and decided, well, in the context\n
272
00:16:35,780 --> 00:16:38,600
more specifically, in the\ncontext of this type of program
273
00:16:38,600 --> 00:16:42,140
Excel versus Photoshop versus\nGoogle Docs or the like
274
00:16:42,140 --> 00:16:46,550
we shall interpret any patterns of\n
275
00:16:46,551 --> 00:16:51,051
for Excel, maybe letters in, like, a\n
276
00:16:51,051 --> 00:16:54,471
or maybe even colors of the rainbow\n
277
00:16:55,610 --> 00:16:58,490
And we'll see, when we\nourselves start programming
278
00:16:58,490 --> 00:17:00,471
you the programmer\nwill ultimately provide
279
00:17:00,471 --> 00:17:05,671
some hints to the computer that tells\n
280
00:17:05,671 --> 00:17:08,631
So, similar in spirit to that, but\n
281
00:17:09,260 --> 00:17:12,651
So this system here actually has a\n
282
00:17:12,651 --> 00:17:14,421
Code for Information Interchange.
283
00:17:14,421 --> 00:17:16,760
And indeed, it began here\nin the US, and that's
284
00:17:16,760 --> 00:17:19,550
why it's actually a little\nbiased toward A's through Z's
285
00:17:19,550 --> 00:17:21,300
and a bit of punctuation as well.
286
00:17:21,300 --> 00:17:22,920
And that quickly became a problem.
287
00:17:22,921 --> 00:17:26,780
But if we start simply now,\nin English, the mapping
288
00:17:26,780 --> 00:17:28,860
itself is fairly straightforward.
289
00:17:28,861 --> 00:17:33,291
So if A is 65, B is 66,\nand dot dot dot, suppose
290
00:17:33,290 --> 00:17:36,020
that you received a text\nmessage, an email, from a friend
291
00:17:36,020 --> 00:17:39,080
and underneath the hood,\nso to speak, if you kind of
292
00:17:39,080 --> 00:17:42,740
looked inside the computer, what you\n
293
00:17:42,740 --> 00:17:48,080
or this email happened to\nbe the numbers 72, 73, 33
294
00:17:48,080 --> 00:17:50,810
or, really, the underlying\npattern of zeros and ones.
295
00:17:50,810 --> 00:17:56,376
What might your friend have sent you\n
296
00:18:02,280 --> 00:18:06,780
Well, apparently, according to this\n
297
00:18:06,780 --> 00:18:09,421
It's not obvious from\nthis chart what the 33 is
298
00:18:09,421 --> 00:18:11,520
but indeed, this\npattern represents "hi."
299
00:18:11,520 --> 00:18:13,773
And anyone want to guess,\nor if you know, what 33 is?
300
00:18:13,773 --> 00:18:14,940
AUDIENCE: Exclamation point.
301
00:18:14,941 --> 00:18:16,020
DAVID MALAN: Exclamation point.
302
00:18:16,020 --> 00:18:18,562
And this is, frankly, not the\nkind of thing most people know.
303
00:18:18,563 --> 00:18:22,241
But it's easily accessible by a\n
304
00:18:23,671 --> 00:18:26,431
When I said that we just need to\n
305
00:18:27,510 --> 00:18:29,302
They wrote it down in\na book or in a chart.
306
00:18:29,303 --> 00:18:33,961
And, for instance, here is our\n72 for H, here is our 73 for I
307
00:18:33,961 --> 00:18:37,861
and here is our 33\nfor exclamation point.
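[Editor's sketch] The 72, 73, 33 example can be checked with Python's built-in ASCII support. This is a minimal illustration rather than anything shown in the lecture; the variable names are arbitrary.

```python
codes = [72, 73, 33]

# Each number as the 8-bit pattern a computer would store.
patterns = [format(code, "08b") for code in codes]
print(patterns)  # ['01001000', '01001001', '00100001']

# The same three numbers interpreted as ASCII text.
message = bytes(codes).decode("ascii")
print(message)  # HI!

# Going the other way: letters back to their numbers.
print([ord(letter) for letter in "AB"])  # [65, 66]
```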
308
00:18:37,861 --> 00:18:41,191
And computers, Macs, PCs,\niPhones, Android devices
309
00:18:41,191 --> 00:18:43,721
just know this mapping\nby heart, if you will.
310
00:18:43,721 --> 00:18:46,451
They've been designed to\nunderstand those letters.
311
00:18:46,451 --> 00:18:48,121
So here, I might have received "hi."
312
00:18:48,121 --> 00:18:51,750
Technically, what I've received is\n
313
00:18:51,750 --> 00:18:54,960
But it's important to note that when\n
314
00:18:54,960 --> 00:18:58,020
in any format, be it\nemail or text or a file
315
00:18:58,020 --> 00:19:00,661
they do tend to come\nin standard lengths
316
00:19:00,661 --> 00:19:04,351
with a certain number of\nzeros and ones altogether.
317
00:19:04,351 --> 00:19:07,200
And this happens to be 8 plus 8, plus 8.
318
00:19:07,200 --> 00:19:10,561
So just to get the message\n"hi, exclamation point,"
319
00:19:10,560 --> 00:19:15,000
you would have received at least,\nit would seem, some 24 bits.
320
00:19:15,000 --> 00:19:18,330
But frankly, bits are so tiny,\nliterally and mathematically
321
00:19:18,330 --> 00:19:21,060
that we don't tend to think or\n
322
00:19:21,060 --> 00:19:23,370
You're probably more\nfamiliar with bytes.
323
00:19:23,371 --> 00:19:27,480
B-Y-T-E-S is a byte,\nis a byte, is a byte.
324
00:19:29,340 --> 00:19:32,373
And even those, frankly, aren't\n
325
00:19:32,374 --> 00:19:34,290
How high can you count\nif you have eight bits?
326
00:19:39,270 --> 00:19:42,570
Unless you want to go\nnegative, that's fine.
327
00:19:45,151 --> 00:19:48,631
Long story short, if we actually got\n
328
00:19:48,631 --> 00:19:54,061
and ones, and we figured out what\n
329
00:19:54,060 --> 00:19:57,570
to in decimal, it would\nindeed be 255, or less
330
00:19:57,570 --> 00:20:00,280
if you want to represent\nnegative numbers as well.
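[Editor's sketch] The 255 figure can be verified by turning on all eight bits. The signed range shown afterwards assumes two's complement, a common convention that the lecture does not name at this point, so treat it as an assumption.

```python
all_ones = "11111111"  # eight bits, every switch on

# 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
largest = int(all_ones, 2)
print(largest)  # 255

# Assumption: two's complement, where one bit's worth of range
# is spent on negative numbers.
signed_low = -(2 ** 7)
signed_high = 2 ** 7 - 1
print(signed_low, signed_high)  # -128 127
```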
331
00:20:00,280 --> 00:20:04,140
So this is useful because now we can\n
332
00:20:04,141 --> 00:20:06,811
but, if the files are bigger,\nkilobytes is thousands of bytes
333
00:20:06,810 --> 00:20:10,590
megabytes is millions of bytes,\ngigabytes is billions of bytes
334
00:20:10,590 --> 00:20:14,080
terabytes are trillions\nof bytes, and so forth.
335
00:20:14,080 --> 00:20:20,520
We have a vocabulary for these\n
336
00:20:20,520 --> 00:20:24,350
The problem is that, if you're using\n
337
00:20:24,351 --> 00:20:27,801
byte per character, and\noriginally, only seven, you
338
00:20:27,800 --> 00:20:30,390
can only represent 255 characters.
339
00:20:30,391 --> 00:20:33,980
And that's actually 256 total\ncharacters, including zero.
340
00:20:33,980 --> 00:20:37,790
And that's fine if you're using\nliterally English, in this case
341
00:20:39,141 --> 00:20:42,081
But there's many human\nlanguages in the world
342
00:20:42,080 --> 00:20:45,410
that need many more symbols\nand, therefore, many more bits.
343
00:20:45,411 --> 00:20:48,291
So, thankfully, the world\ndecided that we'll indeed
344
00:20:48,290 --> 00:20:51,441
support not just the US\nEnglish keyboard, but all
345
00:20:51,441 --> 00:20:54,500
of the accented characters that\n
346
00:20:54,500 --> 00:20:57,740
And heck, if we use enough\nbits, zeros and ones
347
00:20:57,740 --> 00:21:01,730
not only can we represent all\nhuman languages in written form
348
00:21:01,730 --> 00:21:03,651
as well as some emotions\nalong the way, we
349
00:21:03,651 --> 00:21:06,621
can capture the latter with\nthese things called emojis.
350
00:21:06,621 --> 00:21:09,230
And indeed, these are very\nmuch in vogue these days.
351
00:21:09,230 --> 00:21:12,951
You probably send and/or receive\n
352
00:21:12,951 --> 00:21:16,520
These are just characters, like\nletters of an alphabet, patterns
353
00:21:16,520 --> 00:21:20,570
of zeros and ones that you're receiving,\n
354
00:21:20,570 --> 00:21:22,690
For instance, there\nare certain emojis that
355
00:21:22,691 --> 00:21:24,721
are represented with\ncertain patterns of bits.
356
00:21:24,721 --> 00:21:28,070
And when you receive them, your\n
357
00:21:30,000 --> 00:21:32,780
And this newer standard\nis called Unicode.
358
00:21:32,780 --> 00:21:35,270
So it's a superset of\nwhat we called ASCII.
359
00:21:35,270 --> 00:21:39,951
And Unicode is just a mapping of many\n
360
00:21:39,951 --> 00:21:42,320
or characters, more\ngenerally, that might
361
00:21:42,320 --> 00:21:45,140
use eight bits for\nbackwards compatibility
362
00:21:45,141 --> 00:21:49,821
with the old way of doing things with\n
363
00:21:49,820 --> 00:21:51,680
And if you have 16\nbits, you can actually
364
00:21:51,681 --> 00:21:54,713
represent more than\n65,000 possible letters.
365
00:21:54,713 --> 00:21:55,880
And that's getting up there.
366
00:21:55,881 --> 00:22:01,341
And heck, Unicode might even use 32\n
367
00:22:01,340 --> 00:22:03,230
and punctuation symbols and emojis.
368
00:22:03,230 --> 00:22:06,411
And that would give you up\nto 4 billion possibilities.
369
00:22:06,411 --> 00:22:09,980
And, I daresay, one of the reasons we\n
370
00:22:11,060 --> 00:22:14,060
I mean, we've got room for\nbillions more, literally.
371
00:22:14,060 --> 00:22:16,280
So, in fact, just as a\nlittle bit of trivia
372
00:22:16,280 --> 00:22:21,500
has anyone ever received this decimal\n
373
00:22:21,500 --> 00:22:25,971
has anyone ever received this pattern\n
374
00:22:25,971 --> 00:22:29,070
in a text or an email,\nperhaps this past year?
375
00:22:29,070 --> 00:22:33,951
Well, if you actually look this up,\n
376
00:22:33,951 --> 00:22:37,131
happens to represent\nface with medical mask.
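[Editor's sketch] Python's standard `unicodedata` module can look this character up by its Unicode name. The use of UTF-8 below is an assumption on the editor's part (it is one common way of storing Unicode characters as bytes) and is not stated in this part of the lecture.

```python
import unicodedata

# Look the character up by its standardized description.
mask = unicodedata.lookup("FACE WITH MEDICAL MASK")
print(f"U+{ord(mask):04X}")  # U+1F637

# Assumption: UTF-8 encoding, which stores this character in 4 bytes.
encoded = mask.encode("utf-8")
print(len(encoded) * 8)  # 32 bits

# The underlying pattern of zeros and ones, one byte at a time.
print(" ".join(format(byte, "08b") for byte in encoded))

# The same 32 bits read as a single decimal number.
print(int.from_bytes(encoded, "big"))
```

How the character is drawn is left to each device's font, which is why the same pattern of bits can look different on different phones.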
377
00:22:37,131 --> 00:22:40,941
And notice that if you've got\nan iPhone or an Android device
378
00:22:40,941 --> 00:22:43,290
you might be seeing different things.
379
00:22:43,290 --> 00:22:46,580
In fact, this is the Android\nversion of this, most recently.
380
00:22:46,580 --> 00:22:49,221
This is the iOS version\nof it, most recently.
381
00:22:49,221 --> 00:22:51,891
And there's bunches of other\ninterpretations by other companies
382
00:22:53,040 --> 00:22:55,730
So Unicode, as a\nconsortium, if you will
383
00:22:55,730 --> 00:22:58,861
has standardized the descriptions\nof what these things are.
384
00:22:58,861 --> 00:23:02,391
But the companies themselves,\nmanufacturers out there
385
00:23:02,391 --> 00:23:05,000
have generally interpreted\nit as they see fit.
386
00:23:05,000 --> 00:23:08,330
And this can lead to some\nhuman miscommunications.
387
00:23:08,330 --> 00:23:11,870
In fact, for like, literally,\n
388
00:23:11,871 --> 00:23:14,781
I started being in the habit of\n
389
00:23:14,780 --> 00:23:17,820
like this because I thought it was\n
390
00:23:17,820 --> 00:23:19,760
I didn't realize this\nis the emoji for hug
391
00:23:19,760 --> 00:23:24,350
because whatever device I was using\n
392
00:23:24,351 --> 00:23:27,171
And that's because of their\ninterpretation of the data.
393
00:23:27,171 --> 00:23:31,551
This has happened too when\nwhat was a gun became a water
394
00:23:31,550 --> 00:23:33,590
pistol in some manufacturers' eyes.
395
00:23:33,590 --> 00:23:37,971
And so it's an interesting dichotomy\n
396
00:23:37,971 --> 00:23:42,500
want to represent and how we\n
397
00:23:42,500 --> 00:23:45,891
Questions, then, on these\nrepresentations of formats
398
00:23:45,891 --> 00:23:49,021
be it numbers or letters, or soon more.
399
00:23:49,520 --> 00:23:52,140
AUDIENCE: Why is decimal\npopular for a computer
400
00:23:52,141 --> 00:23:54,739
if binary is the basis for everything?
401
00:23:54,739 --> 00:23:56,530
DAVID MALAN: Sorry,\nwhy is what so popular?
402
00:23:56,530 --> 00:23:59,310
AUDIENCE: Why is the decimal popular\n
403
00:23:59,310 --> 00:24:01,900
DAVID MALAN: Yeah, so we'll come\n
404
00:24:01,901 --> 00:24:03,811
There are other ways\nto represent numbers.
405
00:24:07,171 --> 00:24:12,480
And hexadecimal is yet a fourth that\n
406
00:24:12,480 --> 00:24:15,540
through 9 plus A, B, C,\nD, E, F. And somehow
407
00:24:15,540 --> 00:24:18,600
you can similarly count\neven higher with those.
408
00:24:18,601 --> 00:24:21,121
We'll see in a few weeks\nwhy this is compelling.
409
00:24:21,121 --> 00:24:24,781
But hexadecimal, long story\nshort, uses four bits per digit.
410
00:24:24,780 --> 00:24:28,290
And so, four bits, if you have two\n
411
00:24:28,290 --> 00:24:30,810
And it's just a very\nconvenient unit of measure.
412
00:24:30,810 --> 00:24:34,140
And it's also human convention in\n
413
00:24:34,141 --> 00:24:35,874
But we'll come back to that soon.
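A small sketch of the "four bits per digit" point, not something shown in the lecture itself. The function name `to_hex_and_bits` is hypothetical. It prints a few numbers as two hexadecimal digits next to the same value as eight bits, split into two groups of four.

```python
from typing import Tuple


def to_hex_and_bits(value: int) -> Tuple[str, str]:
    """Show one byte (0 through 255) as two hex digits and as eight bits."""
    if not 0 <= value <= 255:
        raise ValueError("expected a single byte, 0 through 255")
    return format(value, "02X"), format(value, "08b")


for number in (0, 9, 10, 15, 16, 255):
    hex_digits, bits = to_hex_and_bits(number)
    # Each hex digit lines up with one group of four bits.
    print(number, hex_digits, bits[:4], bits[4:])
```

Two hex digits cover exactly one byte, which is a large part of why the notation is convenient.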
414
00:24:36,540 --> 00:24:39,923
AUDIENCE: Do the lights on the\nstage supposedly say that--
415
00:24:39,923 --> 00:24:42,590
DAVID MALAN: Do the lights on the\nstage supposedly say anything?
416
00:24:42,590 --> 00:24:46,310
Well, if we had thought in advance\nto use maybe 64 light bulbs
417
00:24:46,310 --> 00:24:51,650
that would seem to give us 8\ntotal bytes on stage, 8 times 8
418
00:24:55,171 --> 00:24:58,447
Other questions on 0's and 1's?
419
00:24:58,446 --> 00:25:01,130
It's a little bright in here.
420
00:25:04,911 --> 00:25:08,931
Where everyone's pointing\nsomewhere specific.
421
00:25:11,391 --> 00:25:14,863
AUDIENCE: I was just going\nto ask about the 255 bits
422
00:25:14,863 --> 00:25:16,346
like with the maximum characters.
423
00:25:16,846 --> 00:25:19,555
DAVID MALAN: Ah, sure, and we'll\n
424
00:25:19,555 --> 00:25:22,131
in the coming days too,\nat a slower pace too
425
00:25:22,131 --> 00:25:26,074
we have, with eight bits, two\npossible values for the first
426
00:25:26,074 --> 00:25:28,490
and then two for the next, two\nfor the next, and so forth.
427
00:25:30,131 --> 00:25:32,230
That's 2 to the eighth\npower total, which
428
00:25:32,230 --> 00:25:36,730
means you can have 256 total\n
429
00:25:36,730 --> 00:25:40,580
But as we'll see soon computer\nscientists, programmers
430
00:25:40,580 --> 00:25:45,250
software often starts counting at 0 by\n
431
00:25:45,250 --> 00:25:50,980
patterns, 00000000 to represent\n
432
00:25:50,980 --> 00:25:56,830
you only have 255 other patterns left\n
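To make the counting concrete, here is a rough sketch (the name `bit_patterns` is an illustrative assumption) that lists every pattern of eight 0s and 1s and checks how many there are and what the largest one is worth.

```python
from itertools import product
from typing import List


def bit_patterns(bits: int) -> List[str]:
    """List every pattern of 0s and 1s that fits in the given number of bits."""
    return ["".join(p) for p in product("01", repeat=bits)]


patterns = bit_patterns(8)
print(len(patterns))          # how many distinct patterns eight bits allow
print(patterns[0])            # the all-zeros pattern, used for the number 0
print(int(patterns[-1], 2))   # the largest number those eight bits can hold
```

There are 256 patterns, but because one of them is spent on zero, the largest value is 255.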
433
00:25:59,421 --> 00:26:04,421
All right, so what then might we\n
434
00:26:04,990 --> 00:26:07,451
Well, we of course have things\nlike colors and programs
435
00:26:07,451 --> 00:26:09,682
like Photoshop and pictures and photos.
436
00:26:09,682 --> 00:26:11,141
Well let me ask the question again.
437
00:26:11,141 --> 00:26:14,351
How might a computer, do you think,\n
438
00:26:16,000 --> 00:26:19,510
Like what are our options if all we've\n
439
00:26:23,471 --> 00:26:27,070
RGB indeed is this acronym that\nrepresents some amount of red
440
00:26:27,070 --> 00:26:29,590
and some amount of green and\nblue and indeed computers
441
00:26:29,590 --> 00:26:32,380
can represent colors by just doing that.
442
00:26:32,381 --> 00:26:33,893
Remembering, for instance, this dot.
443
00:26:33,893 --> 00:26:36,851
This yellow dot on the screen that\n
444
00:26:36,851 --> 00:26:39,611
these days, well that's some amount\n
445
00:26:40,443 --> 00:26:42,340
And if you sort of mix\nthose colors together
446
00:26:42,340 --> 00:26:44,080
you can indeed get a very specific one.
447
00:26:44,080 --> 00:26:46,660
And we'll see in\njust a moment just that.
448
00:26:46,661 --> 00:26:51,408
So indeed earlier on, humans\nonly used seven bits total.
449
00:26:51,407 --> 00:26:54,490
And it was only once they decided,\n
450
00:26:54,490 --> 00:26:57,100
got extended ASCII and\nthat was initially in part
451
00:26:57,101 --> 00:27:00,941
a solution to the same problem of\n
452
00:27:00,941 --> 00:27:05,248
in those patterns of zeros and ones\n
453
00:27:06,080 --> 00:27:10,330
But even that wasn't enough and that's\n
454
00:27:11,891 --> 00:27:16,961
So if we come back now to\nthis one particular color.
455
00:27:16,961 --> 00:27:19,330
RGB was proposed as a scheme,\nbut how might this work?
456
00:27:19,330 --> 00:27:21,280
Well, consider for instance this.
457
00:27:21,280 --> 00:27:25,661
If we do indeed decide as a group to\n
458
00:27:25,661 --> 00:27:28,421
with some mixture of some red,\nsome green, and some blue
459
00:27:28,421 --> 00:27:33,311
we have to decide how to represent\n
460
00:27:33,310 --> 00:27:37,030
Well, it turns out if all we have\n
461
00:27:38,351 --> 00:27:44,501
For instance, suppose a computer we're\n
462
00:27:44,500 --> 00:27:47,471
no longer in the context of\nan email or a text message
463
00:27:47,471 --> 00:27:51,701
but now in the context of something\n
464
00:27:51,701 --> 00:27:55,331
and creating graphical files,\nmaybe this first number
465
00:27:55,330 --> 00:28:00,140
could be interpreted as representing\n
466
00:28:00,688 --> 00:28:02,021
And that's exactly what happens.
467
00:28:02,020 --> 00:28:05,740
You can think of the first digit as\n
468
00:28:05,740 --> 00:28:10,030
And so ultimately when you combine that\n
469
00:28:10,030 --> 00:28:14,230
that amount of blue, it turns out it's\n
470
00:28:14,230 --> 00:28:18,040
And indeed, you can come up\nwith numbers between 0 and 255
471
00:28:18,040 --> 00:28:21,461
for each of those colors to mix any\n
472
00:28:21,461 --> 00:28:23,330
And you can actually\nsee this in practice.
473
00:28:23,330 --> 00:28:26,530
Even though our screens,\nadmittedly, are getting really good
474
00:28:26,530 --> 00:28:30,739
on our phones and laptops such that you\n
475
00:28:30,739 --> 00:28:32,530
You might have heard\nthe term pixel before.
476
00:28:32,530 --> 00:28:34,661
Pixel's just a dot on\nthe screen and you've
477
00:28:34,661 --> 00:28:38,260
got thousands, millions of them these\n
478
00:28:38,260 --> 00:28:41,201
If I take even this\nemoji, which again happens
479
00:28:41,201 --> 00:28:46,211
to be one company's interpretation\nof a face with medical mask
480
00:28:46,211 --> 00:28:48,550
and zoom in a bit, maybe\nzoom in a bit more
481
00:28:48,550 --> 00:28:50,890
you can actually start\nto see these pixels.
482
00:28:50,891 --> 00:28:53,561
Things get pixelated\nbecause what you're seeing
483
00:28:53,560 --> 00:28:57,280
is each of the individual dots\n
484
00:28:57,280 --> 00:28:59,681
And apparently each of\nthese individual dots
485
00:28:59,681 --> 00:29:04,961
are probably using 24 bits, eight bits\n
486
00:29:04,961 --> 00:29:07,300
bits for blue, in some pattern.
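One possible way to picture those 24 bits, as a sketch only: real image formats differ in byte order and layout, and the names `pack_rgb` and `unpack_rgb` are invented here. The idea is simply eight bits of red, then eight of green, then eight of blue.

```python
from typing import Tuple


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Combine three 0-255 amounts into one 24-bit number (red, green, blue)."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each amount must fit in 8 bits (0 through 255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel: int) -> Tuple[int, int, int]:
    """Split a 24-bit number back into its red, green, and blue amounts."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


# Assumption: a pure yellow dot is full red plus full green and no blue.
yellow = pack_rgb(255, 255, 0)
print(format(yellow, "024b"))   # the 24 zeros and ones for this one pixel
print(unpack_rgb(yellow))
```

Multiply that by the thousands or millions of pixels on a screen and you get the size of an uncompressed image.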
487
00:29:07,300 --> 00:29:11,290
This program or some other like\n
488
00:29:11,290 --> 00:29:16,270
and it's white or yellow or\nblack or some brown in between.
489
00:29:16,270 --> 00:29:19,870
So if you look sort of awkwardly, but\n
490
00:29:19,871 --> 00:29:23,673
or maybe your TV, you can\nsee exactly this, too.
491
00:29:23,673 --> 00:29:25,631
All right, well, what\nabout things that we also
492
00:29:25,631 --> 00:29:27,256
watch every day on YouTube or the like?
493
00:29:28,540 --> 00:29:30,550
How would a computer,\nknowing what we know now
494
00:29:30,550 --> 00:29:32,545
represent something like a video?
495
00:29:35,080 --> 00:29:37,930
How might you represent a video\nusing only zeros and ones?
496
00:29:38,760 --> 00:29:43,260
AUDIENCE: As we can see here,\nthey represent images, right?
497
00:29:43,260 --> 00:29:47,760
[INAUDIBLE] sounds of\nthe 0 and 1s as well.
498
00:29:52,040 --> 00:29:55,410
To summarize, what video really\n
499
00:29:55,411 --> 00:29:58,131
It's not just one image, it's\nnot just one letter or a number
500
00:29:58,131 --> 00:30:01,911
it's presumably some kind of\nsequence because time is passing.
501
00:30:01,911 --> 00:30:05,780
So with a whole bunch of images,\nmaybe 24 maybe 30 per second
502
00:30:05,780 --> 00:30:08,480
if you fly them by the\nhuman's eyes, we can
503
00:30:08,480 --> 00:30:11,240
interpret them using our eyes\nand brain that there is now
504
00:30:11,240 --> 00:30:13,431
movement and therefore video.
505
00:30:13,431 --> 00:30:16,040
Similarly with audio or music.
506
00:30:16,040 --> 00:30:20,211
If we just came up with some convention\n
507
00:30:20,211 --> 00:30:23,361
on a musical instrument, could we have\n
508
00:30:23,361 --> 00:30:25,153
And this might be\nactually pretty familiar.
509
00:30:25,153 --> 00:30:29,901
Let me pull up a quick video here,\n
510
00:30:31,310 --> 00:30:32,990
You might remember from childhood.
511
00:30:54,961 --> 00:30:58,861
So granted that particular\nvideo is an actual video
512
00:30:58,861 --> 00:31:02,311
of a paper-based animation, but\n
513
00:31:02,310 --> 00:31:06,270
is some sequence of these images,\nwhich themselves of course
514
00:31:06,270 --> 00:31:10,320
are just zeros and ones because they're\n
515
00:31:10,320 --> 00:31:14,003
Now something like musical notes like\n
516
00:31:14,003 --> 00:31:16,170
might just naturally play\nthese on physical devices
517
00:31:16,171 --> 00:31:19,201
but computers can certainly\nrepresent those sounds, too.
518
00:31:19,201 --> 00:31:22,050
For instance, a popular\nformat for audio is
519
00:31:22,050 --> 00:31:24,780
called MIDI and MIDI\nmight just represent
520
00:31:24,780 --> 00:31:29,100
each note that you saw a moment ago\n
521
00:31:29,101 --> 00:31:32,491
But more generally, you might\nthink about music as having notes
522
00:31:32,490 --> 00:31:35,401
for instance, A through G, maybe\nsome flats and some sharps
523
00:31:35,401 --> 00:31:39,000
you might have the duration like how\n
524
00:31:39,000 --> 00:31:41,040
on a piano or some\nother device, and then
525
00:31:41,040 --> 00:31:43,770
just the volume like how hard\ndoes a human in the real world
526
00:31:43,770 --> 00:31:46,681
press down on that key and\ntherefore how loud is that sound?
527
00:31:46,681 --> 00:31:51,571
It would seem that just remembering\n
528
00:31:51,570 --> 00:31:57,090
we can then represent really all of\n
529
00:31:57,090 --> 00:32:00,420
So that then is really\na laundry list of ways
530
00:32:00,421 --> 00:32:02,551
that we can just represent information.
531
00:32:02,550 --> 00:32:05,313
Again, computers are digital, have\nall of these different formats
532
00:32:05,314 --> 00:32:07,980
but at the end of the day and as\nfancy as those devices in years
533
00:32:07,980 --> 00:32:11,760
are, it's just zeros and ones, tiny\n
534
00:32:11,760 --> 00:32:14,941
if you will, represented in some\nway and it's up to the software
535
00:32:14,941 --> 00:32:17,941
that you and I and others\nwrite to use those zeros
536
00:32:17,941 --> 00:32:21,421
and ones in ways we want to get\nthe computers to do something
537
00:32:22,861 --> 00:32:27,181
Questions, then, on this representation\n
538
00:32:27,181 --> 00:32:30,811
is ultimately what problem solving\n
539
00:32:30,810 --> 00:32:35,590
and producing new via\nsome process in between.
540
00:32:40,070 --> 00:32:43,999
AUDIENCE: Yeah, so we talked about how\n
541
00:32:43,999 --> 00:32:45,962
you to interpret information.
542
00:32:45,962 --> 00:32:50,873
How does a file format like .mp4\n
543
00:32:52,346 --> 00:32:53,971
DAVID MALAN: So a really good question.
544
00:32:53,971 --> 00:32:55,921
There are many other\nfile formats out there.
545
00:32:55,921 --> 00:32:58,730
You allude to MP4 for video\nand more generally the use
546
00:32:58,730 --> 00:33:01,340
are these things called\ncodecs and containers.
547
00:33:01,340 --> 00:33:04,910
It's not quite as simple when\nusing larger files, for instance
548
00:33:04,911 --> 00:33:08,449
in more modern formats that a\n
549
00:33:09,560 --> 00:33:13,160
If you stored that many images\nfor like a Hollywood movie
550
00:33:13,161 --> 00:33:17,431
like 24 or 30 of them per second,\n
551
00:33:17,431 --> 00:33:19,760
And if you've ever taken\nphotos on your phone
552
00:33:19,760 --> 00:33:23,780
you might know how many megabytes or\n
553
00:33:24,381 --> 00:33:27,771
So humans have developed over\nthe years fancier software
554
00:33:27,770 --> 00:33:32,480
that uses much more math to represent\n
555
00:33:32,480 --> 00:33:35,240
just using somehow shorter\npatterns of zeros and ones
556
00:33:35,240 --> 00:33:37,830
than our most simplistic\nrepresentation here.
557
00:33:37,830 --> 00:33:40,160
And they use what might\nbe called compression.
558
00:33:40,161 --> 00:33:42,561
If you've ever used a zip\nfile or something else
559
00:33:42,560 --> 00:33:45,170
somehow your computer is\nusing fewer zeros and ones
560
00:33:45,171 --> 00:33:47,061
to represent the same\namount of information
561
00:33:47,060 --> 00:33:49,522
ideally without losing any information.
562
00:33:49,522 --> 00:33:52,730
In the world of multimedia, which we'll\n
563
00:33:52,730 --> 00:33:56,330
there are both lossy and\nlossless formats out there.
564
00:33:56,330 --> 00:33:59,961
Lossless means you lose\nno information whatsoever.
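As a hedged illustration of lossless compression, the sketch below uses Python's standard `zlib` module, which implements the same family of compression used in zip files. It is not how MP4 or other video codecs work. The function name `round_trip` and the sample data are made up.

```python
import zlib
from typing import Tuple


def round_trip(data: bytes) -> Tuple[int, int, bool]:
    """Compress, decompress, and report (original size, compressed size, identical?)."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return len(data), len(compressed), restored == data


# Assumption: highly repetitive sample data, chosen because it compresses well.
sample = b"0101" * 1000
print(round_trip(sample))
```

Fewer bytes are stored, yet decompressing gives back exactly what went in, which is what lossless means.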
565
00:33:59,961 --> 00:34:05,361
But more commonly as you're alluding\n
566
00:34:05,361 --> 00:34:08,151
where you're actually throwing\naway some amount of quality.
567
00:34:08,150 --> 00:34:10,592
You're getting some amount\nof pixelation that might not
568
00:34:10,592 --> 00:34:13,550
look perfect to the human, but heck\n
569
00:34:14,541 --> 00:34:17,691
And in the world of multimedia,\n
570
00:34:17,690 --> 00:34:20,480
and other MPEG containers\nthat can combine
571
00:34:20,481 --> 00:34:24,561
different formats of video, different\n
572
00:34:24,561 --> 00:34:26,841
but there, too, do\ndesigners have discretion.
573
00:34:28,911 --> 00:34:32,311
Other questions, then, on\ninformation here as well?
574
00:34:32,811 --> 00:34:35,523
AUDIENCE: So I know\ncomputers used to be very big
575
00:34:35,523 --> 00:34:37,510
and taking up like a\nwhole room and stuff.
576
00:34:37,510 --> 00:34:41,545
Is the reason they've gotten\nsmaller because we can store
577
00:34:41,545 --> 00:34:43,826
this information piecemeal or what?
578
00:34:44,701 --> 00:34:47,659
I mean, back in the day you might\n
579
00:34:47,659 --> 00:34:50,340
tube, which is like some\nphysically large device that
580
00:34:50,340 --> 00:34:53,280
might have only stored some 0 or 1.
581
00:34:53,280 --> 00:34:55,920
Yes, it is the miniaturization\nof hardware these days
582
00:34:55,920 --> 00:35:00,721
that has allowed us to store as many\n
583
00:35:01,661 --> 00:35:03,583
And as we've built more\nfancy machines that
584
00:35:03,583 --> 00:35:06,000
can sort of design this hardware\nat an even smaller scale
585
00:35:06,001 --> 00:35:08,531
we're just packing more and\nmore into these devices.
586
00:35:08,530 --> 00:35:09,840
But there, too, is a trade off.
587
00:35:09,840 --> 00:35:13,110
For instance, you might know by\nusing your phone or your laptop
588
00:35:13,110 --> 00:35:15,810
for quite a while, maybe on\nyour lap, starts to get warm.
589
00:35:15,811 --> 00:35:17,941
So there are these literal\nphysical side effects
590
00:35:17,940 --> 00:35:20,370
of this where now some\nof our devices run hot.
591
00:35:20,371 --> 00:35:23,251
This is why like a data\ncenter in the real world
592
00:35:23,251 --> 00:35:25,591
might need more air conditioning\nthan a typical place
593
00:35:25,590 --> 00:35:28,712
because there are these\nphysical artifacts as well.
594
00:35:28,713 --> 00:35:31,921
In fact, if you'd like to see one of\n
595
00:35:31,920 --> 00:35:35,310
across the river here in now Allston\n
596
00:35:35,311 --> 00:35:40,621
is the Harvard Mark I computer that\n
597
00:35:41,981 --> 00:35:44,491
Well if we come back now\nto this first picture
598
00:35:44,490 --> 00:35:47,161
being computer science or\nreally problem solving
599
00:35:47,161 --> 00:35:49,110
I daresay we have more\nthan enough ways now
600
00:35:49,110 --> 00:35:53,190
to represent information, input and\n
601
00:35:53,190 --> 00:35:56,250
on something and thankfully all\nof those before us have given us
602
00:35:56,251 --> 00:35:57,961
things like ASCII and Unicode.
603
00:35:57,960 --> 00:36:01,150
Not to mention MP4s, word\ndocuments, and the like.
604
00:36:01,150 --> 00:36:05,250
But what's inside of this proverbial\n
605
00:36:05,251 --> 00:36:06,998
going in the outputs are coming?
606
00:36:06,998 --> 00:36:09,539
Well that's where we get this\nterm you might have heard, too.
607
00:36:09,539 --> 00:36:14,670
An algorithm, which is just step-by-step\n
608
00:36:14,670 --> 00:36:17,940
incarnated in the world\nof computers by software.
609
00:36:17,940 --> 00:36:20,910
When you write software\naka programs, you
610
00:36:20,911 --> 00:36:26,099
are implementing one or more algorithms,\n
611
00:36:26,099 --> 00:36:29,380
for solving some problem, and maybe\n
612
00:36:29,380 --> 00:36:31,422
but at the end of the day,\nno matter the language
613
00:36:31,422 --> 00:36:34,410
you use the computer is going\nto represent what you type
614
00:36:37,769 --> 00:36:40,260
So what might be a\nrepresentative algorithm?
615
00:36:40,260 --> 00:36:42,630
Nowadays you might use\nyour phone quite a bit
616
00:36:42,630 --> 00:36:45,480
to make calls or send texts\nor emails and therefore you
617
00:36:45,481 --> 00:36:48,001
have a whole bunch of\ncontacts in your address book.
618
00:36:48,001 --> 00:36:50,251
Nowadays, of course,\nthis is very digital
619
00:36:50,251 --> 00:36:53,341
but whether on iOS or\nAndroid or the like
620
00:36:53,340 --> 00:36:56,010
you might have a whole\nbunch of names, first name
621
00:36:56,010 --> 00:36:58,942
and/or last, as well as numbers\nand emails and the like.
622
00:36:58,943 --> 00:37:01,651
You might be in the habit of like\n
623
00:37:01,650 --> 00:37:05,190
all of those names to find\nthe person you want to call.
624
00:37:05,190 --> 00:37:09,000
It's probably sorted alphabetically by\n
625
00:37:11,431 --> 00:37:16,764
This is frankly quite the same as\n
626
00:37:16,764 --> 00:37:18,181
when we just used a physical book.
627
00:37:18,181 --> 00:37:20,014
In this physical book\nmight be a whole bunch
628
00:37:20,014 --> 00:37:22,171
of names alphabetically\nsorted from left to right
629
00:37:22,170 --> 00:37:24,340
corresponding to a\nwhole bunch of numbers.
630
00:37:24,340 --> 00:37:27,000
So suppose that in this\nold Harvard phone book
631
00:37:27,001 --> 00:37:29,161
we want to search for John Harvard.
632
00:37:29,161 --> 00:37:31,441
We might of course start\nquite simply at the beginning
633
00:37:31,440 --> 00:37:36,070
here, looking at one page at a\ntime, and this is an algorithm.
634
00:37:36,070 --> 00:37:40,320
This is like literally step-by-step\nlooking for the solution
635
00:37:41,490 --> 00:37:43,710
In that sense, if John\nHarvard's in the phone book
636
00:37:43,710 --> 00:37:47,663
is this algorithm page-by-page\ncorrect, would you say?
637
00:37:49,121 --> 00:37:51,041
Like if John Harvard's\nin the phone book
638
00:37:51,041 --> 00:37:54,641
obviously I'm eventually going to get to\n
639
00:37:56,141 --> 00:37:58,340
Is it well designed, would you say?
640
00:37:58,840 --> 00:38:01,990
I mean this is going to take forever\n
641
00:38:01,990 --> 00:38:03,190
depending how this thing's sorted.
642
00:38:03,190 --> 00:38:04,930
All right, well let\nme go a little faster.
643
00:38:04,931 --> 00:38:06,431
I'll start like two pages at a time.
644
00:38:06,431 --> 00:38:11,050
2, 4, 6, 8, 10, 12, and so forth.
645
00:38:11,050 --> 00:38:13,721
Sounds faster, is faster, is it correct?
646
00:38:14,411 --> 00:38:16,421
DAVID MALAN: OK, why is it not correct?
647
00:38:17,235 --> 00:38:19,043
AUDIENCE: So if you're\nstarting on page 1
648
00:38:19,043 --> 00:38:22,085
you're only going odd number of pages,\n
649
00:38:23,630 --> 00:38:26,297
If I start on an odd number of\npages and I'm going two at a time
650
00:38:26,297 --> 00:38:28,230
I might miss pages in between.
651
00:38:28,231 --> 00:38:30,493
And if I therefore conclude\nwhen I get to the back
652
00:38:30,492 --> 00:38:33,201
of the book there was no John\n
653
00:38:33,201 --> 00:38:35,451
This would be again one of these bugs.
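A rough sketch of that bug, under the simplifying assumption that each "page" holds a single name and the book is sorted. The function names and the sample names are hypothetical. The first function skips two pages at a time and can miss someone. The second doubles back one page once it has gone too far, the repair the lecture alludes to later.

```python
from typing import List, Optional


def search_by_twos(pages: List[str], name: str) -> Optional[int]:
    """Flawed: looks only at every other page, so it can skip right past the name."""
    for i in range(0, len(pages), 2):
        if pages[i] == name:
            return i
    return None


def search_by_twos_fixed(pages: List[str], name: str) -> Optional[int]:
    """Still two pages at a time, but double back one page after overshooting."""
    for i in range(0, len(pages), 2):
        if pages[i] == name:
            return i
        if pages[i] > name:
            # We passed where the name would be: check the page we skipped.
            if i > 0 and pages[i - 1] == name:
                return i - 1
            return None
    # Ran off the end: with an even page count, the last page was skipped.
    if pages and len(pages) % 2 == 0 and pages[-1] == name:
        return len(pages) - 1
    return None


book = ["Adams", "Baker", "Chen", "Harvard", "Malan", "Zhou"]
print(search_by_twos(book, "Harvard"))        # misses the skipped page
print(search_by_twos_fixed(book, "Harvard"))  # finds it after doubling back
```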
654
00:38:35,451 --> 00:38:39,260
But if I try a little harder,\nI feel like there's a solution.
655
00:38:39,260 --> 00:38:41,550
We don't have to completely\nthrow out this algorithm.
656
00:38:41,550 --> 00:38:44,240
I think we can probably go\nroughly twice as fast still.
657
00:38:44,240 --> 00:38:47,001
But what should we do\ninstead to fix this?
658
00:38:59,610 --> 00:39:02,985
So I think what many of us, most of us,\n
659
00:39:02,985 --> 00:39:05,610
these days, we might go roughly\nto the middle of the phone book
660
00:39:05,610 --> 00:39:06,902
just to kind of get us started.
661
00:39:06,902 --> 00:39:10,350
And now I'm looking down, I'm looking\n
662
00:39:10,351 --> 00:39:13,061
and it looks like I'm in the M section.
663
00:39:13,061 --> 00:39:15,759
So just to be clear,\nwhat should I do next?
664
00:39:21,657 --> 00:39:23,740
DAVID MALAN: OK, and\npresumably it is John Harvard
665
00:39:23,740 --> 00:39:24,949
would be to the left of this.
666
00:39:24,949 --> 00:39:27,931
So here's an opportunity to\nfiguratively and literally tear
667
00:39:27,931 --> 00:39:32,461
this particular problem in half,\nthrow half of the problem away.
668
00:39:32,460 --> 00:39:34,980
It's actually pretty easy\nif you just do it that way.
669
00:39:36,791 --> 00:39:41,141
But I've now just decreased the\n
670
00:39:41,141 --> 00:39:45,240
So if I started with 1,000 pages\nof phone numbers and names, now
671
00:39:46,590 --> 00:39:48,601
And already we haven't\nfound John Harvard
672
00:39:48,601 --> 00:39:50,751
but that's a big bite\nout of this problem.
673
00:39:50,751 --> 00:39:54,339
I do think it's correct because if\n
674
00:39:54,338 --> 00:39:56,130
he's definitely not\ngoing to be over there.
675
00:39:56,130 --> 00:40:00,150
I think if I repeat this again\n
676
00:40:00,150 --> 00:40:02,070
here I might have gone a little too far.
677
00:40:02,070 --> 00:40:03,581
Now I'm in like the E section.
678
00:40:03,581 --> 00:40:08,131
So let me tear the problem in half\n
679
00:40:08,130 --> 00:40:11,100
and again repeat, dividing\nand dividing and conquering
680
00:40:11,101 --> 00:40:14,550
until finally, presumably, I end\n
681
00:40:14,550 --> 00:40:18,090
book on which John Harvard's\nname either is or is not
682
00:40:18,090 --> 00:40:21,400
but because of the algorithm\nyou proposed, step by step
683
00:40:21,400 --> 00:40:24,550
I know that he's not in\nanything I discarded.
684
00:40:24,550 --> 00:40:28,201
So traumatic as that\nmight have been made out
685
00:40:28,201 --> 00:40:31,715
to be, it's actually just harnessing\n
686
00:40:31,715 --> 00:40:33,840
Indeed, this is what\nprogramming is all about, too.
687
00:40:33,840 --> 00:40:36,690
It's not about learning\na completely new world
688
00:40:36,690 --> 00:40:40,350
but really just how to harness intuition\n
689
00:40:40,351 --> 00:40:43,381
have and take naturally\nbut learning how to express
690
00:40:43,380 --> 00:40:45,900
them now more succinctly,\nmore precisely
691
00:40:45,900 --> 00:40:49,150
using things called\nprogramming languages.
692
00:40:49,150 --> 00:40:54,330
Why is an algorithm like that if I found\n
693
00:40:54,331 --> 00:40:56,820
just doing the first\none or even the second
694
00:40:56,820 --> 00:40:59,940
and maybe doubling back\nto check those even pages?
695
00:40:59,940 --> 00:41:01,830
Well let's just look\nat little charts here.
696
00:41:01,831 --> 00:41:04,291
Again, we don't have to get\ninto the nuances of numbers
697
00:41:04,291 --> 00:41:07,501
but if we've got like a chart\nhere, xy plot, on the x-axis
698
00:41:07,501 --> 00:41:09,431
here I claim is the size of the problem.
699
00:41:09,431 --> 00:41:12,441
So measured in the numbers\nof pages in the phone book.
700
00:41:12,440 --> 00:41:16,050
So the farther you go out here, the\n
701
00:41:16,050 --> 00:41:18,641
And here we have time\nto solve on the y-axis.
702
00:41:18,641 --> 00:41:21,331
So the higher you go\nup, the more time it's
703
00:41:21,331 --> 00:41:24,101
going to be taking to solve\nthat particular problem.
704
00:41:24,101 --> 00:41:27,751
So let's just arbitrarily say that\n
705
00:41:27,751 --> 00:41:31,511
like n pages, might be\nrepresented graphically like this.
706
00:41:31,510 --> 00:41:33,931
No matter the slope,\nit's a straight line
707
00:41:33,931 --> 00:41:36,360
because there's presumably\na one to one relationship
708
00:41:36,360 --> 00:41:40,440
between numbers of pages and number\n
709
00:41:40,951 --> 00:41:43,291
If the phone company adds\nanother page next year
710
00:41:43,291 --> 00:41:45,301
because some new people\nmove to town, that's
711
00:41:45,300 --> 00:41:47,161
going to require one\nadditional page for me.
712
00:41:49,110 --> 00:41:52,800
If, though, we use the second\nalgorithm, flawed though it was
713
00:41:52,800 --> 00:41:56,431
unless we double back a little bit\n
714
00:41:56,431 --> 00:41:59,280
that's too going to be a\nstraight line, but it's
715
00:41:59,280 --> 00:42:02,820
going to be a different slope because\n
716
00:42:02,820 --> 00:42:06,250
relationship because I'm\ngoing two pages at a time.
717
00:42:06,251 --> 00:42:10,981
So if the phone company adds\nanother page or another two pages
718
00:42:10,981 --> 00:42:13,008
that's still only just one more step.
719
00:42:13,007 --> 00:42:15,090
You can see the difference\nif I kind of draw this.
720
00:42:15,090 --> 00:42:18,210
If this is the phone book in\nquestion, this number of pages
721
00:42:18,210 --> 00:42:20,910
it might take this many\nseconds on the yellow line
722
00:42:20,911 --> 00:42:24,661
to represent or to find\nsomeone like John Harvard.
723
00:42:24,661 --> 00:42:27,421
But of course on the first\nalgorithm, the red line
724
00:42:27,420 --> 00:42:29,760
it's literally going to\ntake twice as many steps.
725
00:42:29,760 --> 00:42:32,311
And what does the n here mean?\nn is the go-to variable
726
00:42:32,311 --> 00:42:36,251
for a computer scientist or programmer\n
727
00:42:36,251 --> 00:42:38,671
So if the number of pages\nin the phone book is n
728
00:42:38,670 --> 00:42:41,580
the number of steps the second\nalgorithm would have taken
729
00:42:41,581 --> 00:42:44,266
would be in the worst case n over 2.
730
00:42:44,266 --> 00:42:46,271
Half as many because\nyou're going twice as fast.
731
00:42:46,271 --> 00:42:50,611
But the third algorithm, actually\nif you recall your logarithms
732
00:42:50,610 --> 00:42:52,320
looks a little something like this.
733
00:42:52,320 --> 00:42:54,750
There's a fundamentally\ndifferent relationship
734
00:42:54,751 --> 00:42:58,261
between the size of the problem and\n
735
00:42:58,260 --> 00:43:00,150
that technically is\nlog base 2 of n
736
00:43:00,150 --> 00:43:02,911
but it's really the\nshape that's different.
737
00:43:02,911 --> 00:43:06,331
The implication there is that if,\n
738
00:43:06,331 --> 00:43:08,911
two different towns here in\nMassachusetts, merge next year
739
00:43:08,911 --> 00:43:11,971
and there's just one phone\nbook that's twice as big
740
00:43:11,971 --> 00:43:14,670
no big deal for that\nthird and final algorithm.
741
00:43:15,181 --> 00:43:17,941
You just tear the problem\none more time in half
742
00:43:17,940 --> 00:43:21,360
taking one more bite,\nthat's it, not another 1,000
743
00:43:21,360 --> 00:43:23,760
bites just to get to the solution.
744
00:43:23,760 --> 00:43:26,320
Put another way, you\ncan walk it way, way
745
00:43:26,320 --> 00:43:29,520
way out here to a much bigger\nphone book and ultimately
746
00:43:29,521 --> 00:43:32,551
that green line is barely\ngoing to have budged.
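To put rough numbers on those three lines, here is a sketch that counts worst-case steps for each approach. The function names are made up, and the count for two pages at a time ignores the occasional double-back. The 1,000-page book is the lecture's own example.

```python
def steps_one_at_a_time(n: int) -> int:
    """First algorithm: in the worst case, look at every one of the n pages."""
    return n


def steps_two_at_a_time(n: int) -> int:
    """Second algorithm: about n / 2 looks (ignoring a possible double-back)."""
    return (n + 1) // 2


def steps_halving(n: int) -> int:
    """Third algorithm: count how many times n pages can be torn in half."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


for pages in (1000, 2000, 1_000_000):
    print(pages, steps_one_at_a_time(pages),
          steps_two_at_a_time(pages), steps_halving(pages))
```

Doubling the book from 1,000 to 2,000 pages adds 1,000 steps to the first algorithm and 500 to the second, but only one more tear to the third.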
747
00:43:32,550 --> 00:43:35,880
So this then is just a way of\nnow formalizing and thinking
748
00:43:35,880 --> 00:43:42,400
about what the performance or\n
749
00:43:42,400 --> 00:43:46,070
Before we now make one more\n
750
00:43:46,070 --> 00:43:52,121
any questions then on this notion of\n
751
00:43:52,621 --> 00:43:56,193
AUDIENCE: How many phone\nbooks have you got?
752
00:43:56,193 --> 00:43:58,651
DAVID MALAN: (LAUGHING) A lot\nof phone books over the years
753
00:43:58,650 --> 00:44:02,040
and if you or your parents have any\n
754
00:44:02,041 --> 00:44:04,501
use them because they're hard to find.
755
00:44:13,007 --> 00:44:15,340
AUDIENCE: You could get Harry\nPotter as a guest speaker.
756
00:44:15,340 --> 00:44:16,473
DAVID MALAN: Sorry, say again.
757
00:44:16,474 --> 00:44:18,780
AUDIENCE: You could get Harry\nPotter as a guest speaker.
758
00:44:18,780 --> 00:44:19,800
DAVID MALAN: (LAUGHING) Oh, yeah.
759
00:44:20,300 --> 00:44:23,380
Then we'd have a little\nsomething more to use here.
760
00:44:23,380 --> 00:44:28,740
So now if we want to formalize\nfurther what it is we just did
761
00:44:28,740 --> 00:44:30,360
we can go ahead and introduce this.
762
00:44:30,360 --> 00:44:33,327
A form of code aka pseudocode.
763
00:44:33,327 --> 00:44:35,911
Pseudocode is not a specific\nlanguage, it's not like something
764
00:44:35,911 --> 00:44:38,851
we're about to start coding in, it's\n
765
00:44:38,851 --> 00:44:41,371
in English or any human\nlanguage succinctly
766
00:44:41,371 --> 00:44:45,331
correctly toward an end of getting\n
767
00:44:45,331 --> 00:44:47,701
So for instance, here\nmight be how we could
768
00:44:47,701 --> 00:44:51,331
formalize the code, the pseudocode\nfor that same algorithm.
769
00:44:51,331 --> 00:44:54,211
Step one was pick up the\nphone book, as I did.
770
00:44:54,210 --> 00:44:56,940
Step two might be open to\nthe middle of the phone book
771
00:44:56,940 --> 00:44:58,740
as you proposed that we do first.
772
00:44:58,740 --> 00:45:01,920
Step three was probably to\nlook down at the pages, as I did.
773
00:45:01,920 --> 00:45:04,530
And step four gets a\nlittle more interesting
774
00:45:04,530 --> 00:45:07,770
because I had to quickly make a\n
775
00:45:07,771 --> 00:45:11,521
If person is on page, then\nI should probably just
776
00:45:11,521 --> 00:45:13,251
go ahead and call that person.
777
00:45:13,251 --> 00:45:15,751
But that probably wasn't the\ncase at least for John Harvard
778
00:45:17,851 --> 00:45:19,951
So there's this other\nquestion I should now
779
00:45:19,951 --> 00:45:22,590
ask else if the person\nis earlier in the book
780
00:45:22,590 --> 00:45:26,010
then I should tear the problem\nin half as I did but go left, so
781
00:45:26,010 --> 00:45:30,121
to speak, and then not just open to the\n
782
00:45:30,121 --> 00:45:33,300
but really just go back to\nstep three, repeat myself.
783
00:45:33,900 --> 00:45:37,650
Because I can just repeat what I\n
784
00:45:39,010 --> 00:45:41,753
But, if the person\nwas later in the book
785
00:45:41,753 --> 00:45:44,461
as might have happened with a\n
786
00:45:44,460 --> 00:45:47,085
then I should open to the middle\nof the right half of the book
787
00:45:47,085 --> 00:45:49,075
again go back to line\nthree, but again, I'm
788
00:45:49,076 --> 00:45:51,451
not going to get stuck doing\nsomething forever like this
789
00:45:51,451 --> 00:45:54,521
because I keep shrinking\nthe size of the problem.
790
00:45:54,521 --> 00:45:56,431
Lastly, the only\npossible scenario that's
791
00:45:56,431 --> 00:45:59,905
left, if John Harvard is not on\n
792
00:45:59,905 --> 00:46:02,280
and he's not to the right,\nwhat should our conclusion be?
793
00:46:03,322 --> 00:46:04,489
DAVID MALAN: He's not there.
794
00:46:05,391 --> 00:46:08,240
So we need to quit in some other form.
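The pseudocode just described could be translated along these lines. This is one sketch, assuming the phone book is a sorted Python list with one name per "page". The function name `find_person` and the sample names are hypothetical, and returning an index stands in for "call person".

```python
from typing import List, Optional


def find_person(book: List[str], person: str) -> Optional[int]:
    """Return the page index where person appears, or None if not in the book."""
    left, right = 0, len(book) - 1
    while left <= right:
        middle = (left + right) // 2      # open to the middle of what is left
        if book[middle] == person:        # person is on page
            return middle                 # stands in for "call person"
        elif person < book[middle]:       # person is earlier in the book
            right = middle - 1            # keep only the left half
        else:                             # person is later in the book
            left = middle + 1             # keep only the right half
    return None                           # nothing left to search: quit


book = ["Adams", "Baker", "Chen", "Harvard", "Malan", "Zhou"]
print(find_person(book, "Harvard"))
print(find_person(book, "Potter"))
```

The final `return None` plays the role of the last "else quit" branch: leave it out and the function still happens to return `None` in Python, but in other languages an unhandled case like this is where undefined behavior creeps in.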
795
00:46:08,240 --> 00:46:11,751
Now as an aside, it's kind of deliberate\n
796
00:46:11,751 --> 00:46:16,021
at the end because this is what\n
797
00:46:16,021 --> 00:46:17,990
whether you're new at\nit or professional
798
00:46:17,990 --> 00:46:22,201
just not considering all possible\n
799
00:46:22,201 --> 00:46:25,701
that might not happen that often,\n
800
00:46:25,701 --> 00:46:28,550
in your own code,\npseudocode or otherwise
801
00:46:28,550 --> 00:46:31,101
this is when and why\nprograms might crash
802
00:46:31,101 --> 00:46:34,111
or you might say stupid little\n
803
00:46:34,110 --> 00:46:35,360
or your computer might reboot.
804
00:46:35,931 --> 00:46:38,780
It's doing something\nsort of unpredictable
805
00:46:38,780 --> 00:46:42,411
if a human, maybe myself,\ndidn't anticipate this.
806
00:46:42,411 --> 00:46:45,501
Like what does this program do if\n
807
00:46:45,501 --> 00:46:48,109
if I had omitted lines 12 and 13?
808
00:46:48,650 --> 00:46:50,650
Maybe it would behave\ndifferently on a Mac or PC
809
00:46:50,650 --> 00:46:53,330
because it's sort of undefined behavior.
810
00:46:53,331 --> 00:46:56,510
These are the kinds of omissions\nthat frankly you're invariably
811
00:46:56,510 --> 00:46:58,641
going to make, bugs\nyou're going to introduce
812
00:46:58,641 --> 00:47:02,751
mistakes you're going to make early\n
813
00:47:02,751 --> 00:47:06,111
But you'll get better at\nthinking about those corner cases
814
00:47:06,110 --> 00:47:09,030
and handling anything that can\n
815
00:47:09,030 --> 00:47:11,931
your code will be all the better for it.
816
00:47:11,931 --> 00:47:15,262
Now the problem ultimately\nwith learning how to program
817
00:47:15,262 --> 00:47:16,971
especially if you've\nnever had experience
818
00:47:16,971 --> 00:47:21,411
or even if you do but you\nlearned one language only
819
00:47:21,411 --> 00:47:25,130
is that they all look a little\ncryptic at first glance.
820
00:47:25,130 --> 00:47:27,260
But they do share certain commonalities.
821
00:47:27,260 --> 00:47:30,141
In fact, we'll use this\npseudocode to define those first.
822
00:47:30,141 --> 00:47:32,360
Highlighted in yellow\nhere are what henceforth
823
00:47:32,360 --> 00:47:34,520
we're going to start calling functions.
824
00:47:34,521 --> 00:47:37,641
Lots of different programming\nlanguages exist, but most of them
825
00:47:37,641 --> 00:47:40,820
have what we might call\nfunctions, which are actions
826
00:47:40,820 --> 00:47:43,797
or verbs that solve\nsome smaller problem.
827
00:47:43,797 --> 00:47:46,130
That is to say, you might use\na whole bunch of functions
828
00:47:46,130 --> 00:47:50,240
to solve a bigger problem\nbecause each function tends to do
829
00:47:50,240 --> 00:47:52,851
something very specific or precise.
830
00:47:52,851 --> 00:47:57,141
These then in English might be\n
831
00:47:57,141 --> 00:47:59,360
code, to these things called functions.
832
00:47:59,360 --> 00:48:02,931
Highlighted in yellow now are\nwhat we might call conditionals.
833
00:48:02,931 --> 00:48:05,751
Conditionals are things\nthat you do conditionally
834
00:48:05,751 --> 00:48:07,443
based on the answer to some question.
835
00:48:07,443 --> 00:48:09,651
You can think of them kind\nof like forks in the road.
836
00:48:09,650 --> 00:48:12,350
Do you go left or go right\nor some other direction
837
00:48:12,351 --> 00:48:14,809
based on the answer to some question?
838
00:48:14,809 --> 00:48:16,101
Well, what are those questions?
839
00:48:16,101 --> 00:48:20,181
Highlighted now in yellow are what we\n
840
00:48:20,181 --> 00:48:25,070
after a mathematician, last name Boole,\n
841
00:48:25,070 --> 00:48:31,130
Or, if you prefer, true or false answers\n
842
00:48:31,130 --> 00:48:34,310
We just need to distinguish\none scenario from another.
843
00:48:34,311 --> 00:48:37,101
The last thing that manifests\nin this pseudocode
844
00:48:37,101 --> 00:48:39,740
is what I might highlight\nnow and call loops.
845
00:48:39,740 --> 00:48:43,521
Some kind of cycle, some kind of\n
846
00:48:43,521 --> 00:48:48,861
again and again so that I don't\n
847
00:48:48,860 --> 00:48:53,030
a 1,000-page phone book, I can get\n
848
00:48:53,030 --> 00:48:58,161
of repeat myself inherently in order to\n
849
00:48:59,221 --> 00:49:01,581
So this then is what we\nmight call pseudocode
850
00:49:01,581 --> 00:49:05,151
and indeed there are\nother characteristics
851
00:49:05,150 --> 00:49:08,540
of programs that we'll touch on before\n
852
00:49:08,541 --> 00:49:13,740
values, variables, and more, but\n
853
00:49:13,740 --> 00:49:16,371
including some we will very\ndeliberately use in this class
854
00:49:16,371 --> 00:49:19,760
and that everyone in the real\nworld these days still uses
855
00:49:19,760 --> 00:49:22,201
its programs tend to look like this.
856
00:49:22,201 --> 00:49:24,860
This for instance, is a distillation\nof that very first program
857
00:49:24,860 --> 00:49:29,810
I wrote in 1996 in CS50 itself just\n
858
00:49:29,811 --> 00:49:34,731
In fact, this version here just tries\n
859
00:49:34,731 --> 00:49:37,731
Which is, dare I say, the most\ncanonical first thing that most
860
00:49:37,731 --> 00:49:41,601
any programmer ever gets a\ncomputer to say just because
861
00:49:42,900 --> 00:49:45,920
I mean, there's a hash symbol,\n
862
00:49:45,920 --> 00:49:49,760
words like int, curly braces, quotes,\n
863
00:49:50,420 --> 00:49:54,020
I mean there's more overhead\nand more syntax and clutter
864
00:49:54,021 --> 00:49:55,941
than there is an actual idea.
865
00:49:55,940 --> 00:50:00,080
Now that's not to say that you won't\n
866
00:50:00,081 --> 00:50:03,441
because honestly there's not that many\n
867
00:50:03,440 --> 00:50:07,701
have typically a much smaller vocabulary\n
868
00:50:07,701 --> 00:50:10,581
but at first it might\nindeed look quite cryptic.
869
00:50:10,581 --> 00:50:14,361
But you can perhaps infer I have no\n
870
00:50:14,360 --> 00:50:18,170
but "Hello, world." is\npresumably quote unquote what
871
00:50:18,170 --> 00:50:19,860
will be printed on the screen.
872
00:50:19,860 --> 00:50:22,880
But what we'll do today,\nafter a short break
873
00:50:22,880 --> 00:50:24,980
and to set the stage for\nnext week, is introduce
874
00:50:24,981 --> 00:50:27,336
these exact same ideas in\njust a bit using Scratch
875
00:50:27,335 --> 00:50:29,210
something that you\nyourselves might have used
876
00:50:29,210 --> 00:50:32,630
when you were quite a bit younger but\n
877
00:50:34,041 --> 00:50:38,421
The upside of what we'll soon do using\n
878
00:50:38,420 --> 00:50:41,960
language from our friends down the\n
879
00:50:41,960 --> 00:50:46,490
start to drag and drop things that\n
880
00:50:46,490 --> 00:50:48,471
together if it makes\nlogical sense to do so
881
00:50:48,471 --> 00:50:51,710
but without the distraction\nof hashes, parentheses
882
00:50:51,710 --> 00:50:54,410
curly braces, angle brackets,\nsemicolons, and things
883
00:50:54,411 --> 00:50:56,311
that are quite beside the point.
884
00:50:56,311 --> 00:50:58,800
But for now, let's go ahead\nand take a 10 minute break here
885
00:50:58,800 --> 00:51:01,101
and when we resume, we\nwill start programming.
886
00:51:01,101 --> 00:51:04,561
So this on the screen\nis a language called
887
00:51:04,561 --> 00:51:07,921
C, something that we'll dive\ninto next week, and thankfully
888
00:51:07,920 --> 00:51:10,980
this now on the screen is\nanother language called Python
889
00:51:10,981 --> 00:51:13,921
that we'll also take a look at\nin a few weeks before long along
890
00:51:13,920 --> 00:51:15,690
with other languages along the way.
891
00:51:15,690 --> 00:51:19,640
Today though, and for this first\nweek, week zero, so to speak
892
00:51:19,641 --> 00:51:21,391
we use Scratch because\nagain it will allow
893
00:51:21,391 --> 00:51:23,971
us to explore some of those\nprogramming fundamentals
894
00:51:23,971 --> 00:51:28,471
that will be in C and in Python and in\n
895
00:51:28,471 --> 00:51:32,521
but in a way where we don't have to\n
896
00:51:32,521 --> 00:51:35,351
So the world of Scratch looks like this.
897
00:51:35,351 --> 00:51:38,070
It's a web-based or downloadable\nprogramming environment
898
00:51:38,070 --> 00:51:40,920
that has this layout here\nby default. On the left here
899
00:51:40,920 --> 00:51:45,000
we'll soon see is a palette of puzzle\n
900
00:51:45,001 --> 00:51:47,521
represent all of those\nideas we just discussed.
901
00:51:47,521 --> 00:51:50,071
And by dragging and\ndropping these puzzle pieces
902
00:51:50,070 --> 00:51:54,181
or blocks over this big area\nand connecting them together
903
00:51:54,181 --> 00:51:56,251
if it makes logical\nsense to do so, we'll
904
00:51:56,251 --> 00:51:58,531
start programming in this environment.
905
00:51:58,530 --> 00:52:01,561
The environment allows you to have\n
906
00:52:01,561 --> 00:52:04,111
Multiple characters, things\nlike a cat or anything
907
00:52:04,110 --> 00:52:08,310
else, and those sprites exist\nin this rectangular world
908
00:52:08,311 --> 00:52:11,521
up here that you can full screen to\n
909
00:52:11,521 --> 00:52:16,201
is Scratch, who can move up, down, left,\n
910
00:52:16,201 --> 00:52:19,291
Within Scratch's\nworld, you can think of it
911
00:52:19,291 --> 00:52:23,251
as perhaps a familiar\ncoordinate system with Xs and Ys
912
00:52:23,251 --> 00:52:27,001
which is helpful only when it comes\n
913
00:52:27,001 --> 00:52:32,281
Right now Scratch is at the default,\n
914
00:52:32,280 --> 00:52:36,061
If you were to move the cat way\n
915
00:52:37,891 --> 00:52:40,651
If you move the cat all the way\n
916
00:52:40,650 --> 00:52:42,940
but y would now be negative 180.
917
00:52:42,940 --> 00:52:47,460
And if you went left, x would become\n
918
00:52:47,460 --> 00:52:51,690
or to the right x would be\n240 and y would stay zero.
919
00:52:51,690 --> 00:52:55,590
So those numbers generally don't\n
920
00:52:55,590 --> 00:52:58,170
move relatively in this\nworld up, down, left, right
921
00:52:58,170 --> 00:53:01,230
but when it comes time\nto precisely position
922
00:53:01,231 --> 00:53:03,601
some of these sprites\nor other imagery, it'll
923
00:53:03,601 --> 00:53:07,050
be helpful just to have that mental\n
924
00:53:07,050 --> 00:53:10,150
Well let's go ahead and make perhaps\n
925
00:53:10,150 --> 00:53:13,050
I'm going to switch over to the\nsame programming environment
926
00:53:13,050 --> 00:53:15,971
now for a tour of the left hand side.
927
00:53:15,971 --> 00:53:20,911
So by default, selected here is\nthe category in blue, Motion,
928
00:53:20,911 --> 00:53:24,451
which has a whole bunch of puzzle\n
929
00:53:24,451 --> 00:53:27,001
And whereas Scratch as\na graphical language
930
00:53:27,001 --> 00:53:30,688
categorizes things by the type\nof things that these pieces do
931
00:53:30,688 --> 00:53:32,521
we'll see that throughout\nthis whole palette
932
00:53:32,521 --> 00:53:35,461
we'll have functions and\nvariables and conditionals
933
00:53:35,460 --> 00:53:38,650
and Boolean expressions and more\n
934
00:53:38,650 --> 00:53:42,181
So for instance, moving 10 steps\nor turning one way or the other
935
00:53:42,181 --> 00:53:45,691
would be functions categorized\nhere as things like motion.
936
00:53:45,690 --> 00:53:49,257
Under looks in purple, you\nmight have speech bubbles
937
00:53:49,257 --> 00:53:51,090
that you can create by\ndragging and dropping
938
00:53:51,090 --> 00:53:54,360
these that might say "hello" or\n
939
00:53:54,360 --> 00:53:58,320
Or you could switch costumes, change\n
940
00:54:01,320 --> 00:54:05,010
You can play sounds like "meow" or\n
941
00:54:06,001 --> 00:54:09,271
Then there's these things Scratch calls\n
942
00:54:09,271 --> 00:54:11,191
is the first, when green flag clicked.
943
00:54:11,190 --> 00:54:14,221
Because if we look over to the\nright of Scratch's world here
944
00:54:14,221 --> 00:54:17,670
this rectangular region has\nthis green flag and red stop
945
00:54:17,670 --> 00:54:20,880
sign up above, one of which is\nfor Play one of which is for Stop
946
00:54:20,880 --> 00:54:24,510
and so that's going to allow us to\n
947
00:54:24,510 --> 00:54:27,840
when that green flag\nis initially clicked.
948
00:54:27,840 --> 00:54:31,860
But you can listen for other types of\n
949
00:54:31,860 --> 00:54:35,911
or something else, when this sprite\n
950
00:54:35,911 --> 00:54:39,570
Here you already see like a\nprogrammer's incarnation of things
951
00:54:39,570 --> 00:54:42,451
you and I take for granted like\nevery day now on our phones.
952
00:54:42,451 --> 00:54:46,391
Any time you tap an icon or drag your\n
953
00:54:46,391 --> 00:54:48,550
These are what a programmer\nwould call events
954
00:54:48,550 --> 00:54:51,451
things that happen and\nare often triggered by us
955
00:54:51,451 --> 00:54:55,471
humans and things that a program\nbe it in Scratch or Python
956
00:54:55,471 --> 00:54:59,280
or C or anything else can\nlisten for and respond to.
957
00:54:59,280 --> 00:55:01,980
Indeed, that's why when you tap\nthe phone icon on your phone
958
00:55:01,981 --> 00:55:04,441
the phone application\nstarts up because someone
959
00:55:04,440 --> 00:55:08,640
wrote software that's listening for a\n
960
00:55:08,641 --> 00:55:10,711
So Scratch has these same things, too.
961
00:55:10,710 --> 00:55:13,590
Under Control in orange,\nyou can see that we
962
00:55:13,590 --> 00:55:15,721
can wait for one second\nor repeat something
963
00:55:15,721 --> 00:55:17,670
some number of times,\n10 by default, but we
964
00:55:17,670 --> 00:55:20,800
can change anything in these\nwhite circles to anything else.
965
00:55:20,800 --> 00:55:22,710
There's another puzzle\npiece here forever
966
00:55:22,710 --> 00:55:25,590
which implies some kind of loop where\n
967
00:55:25,590 --> 00:55:27,060
Even though it seems a\nlittle tight, there's
968
00:55:27,061 --> 00:55:29,131
not much room to fit\nsomething there, Scratch
969
00:55:29,130 --> 00:55:31,005
is going to have these\nthings grow and shrink
970
00:55:31,005 --> 00:55:33,993
however we want to fill\nsimilarly shaped pieces.
971
00:55:33,994 --> 00:55:35,161
Here are those conditionals.
972
00:55:35,161 --> 00:55:40,061
If something is true or false,\nthen do this next thing.
973
00:55:40,061 --> 00:55:42,841
And that's how we can put in\nthis little trapezoid-like shape.
974
00:55:42,840 --> 00:55:46,770
Some form of Boolean expression, a\n
975
00:55:46,771 --> 00:55:50,131
or one/zero answer and decide\nwhether to do something or not.
976
00:55:50,130 --> 00:55:52,440
You can combine these things, too.
977
00:55:52,440 --> 00:55:55,905
If something is true, do this,\nelse do this other thing.
978
00:55:55,905 --> 00:55:57,780
And you can even tuck\none inside of the other
979
00:55:57,780 --> 00:56:01,320
if you want to ask three\nor four or more questions.
980
00:56:01,320 --> 00:56:03,010
Sensing, too, is going to be a thing.
981
00:56:03,010 --> 00:56:07,871
You can ask questions aka Boolean\n
982
00:56:07,871 --> 00:56:10,161
the mouse pointer, the\narrow on the screen?
983
00:56:10,161 --> 00:56:12,800
So that you can start to\ninteract with these programs.
984
00:56:12,800 --> 00:56:15,298
What is the distance between\na sprite and a mouse pointer?
985
00:56:15,298 --> 00:56:17,590
You can do simple calculations\njust to figure out maybe
986
00:56:17,590 --> 00:56:20,141
if the enemy is getting\nclose to the cat.
987
00:56:20,141 --> 00:56:23,651
Under Operators, some lower-level\n
988
00:56:23,650 --> 00:56:25,850
to pick random numbers,\nwhich for a game is great
989
00:56:25,851 --> 00:56:27,851
because then you can kind\nof vary the difficulty
990
00:56:27,851 --> 00:56:30,309
or what's happening in a game\nwithout the same game playing
991
00:56:33,550 --> 00:56:37,520
Something and something must be true\n
992
00:56:38,021 --> 00:56:40,240
Or we can even join two words together.
993
00:56:40,240 --> 00:56:43,181
Says apple and banana by default,\n
994
00:56:43,181 --> 00:56:46,990
whatever you want there to\ncombine multiple words into full
995
00:56:48,581 --> 00:56:52,361
Then lastly down here, there's in\n
996
00:56:52,360 --> 00:56:54,911
In math we've obviously\ngot x and y and whatnot.
997
00:56:54,911 --> 00:56:56,921
In programming we'll\nhave the same ability
998
00:56:56,920 --> 00:57:03,010
to store in these named symbols,\n
999
00:57:03,010 --> 00:57:06,701
Numbers or letters or words or\ncolors or anything, ultimately.
1000
00:57:06,701 --> 00:57:09,971
But in programming you'll see that\n
1001
00:57:09,971 --> 00:57:13,391
use simple letters like x\nand y and z, but to actually
1002
00:57:13,391 --> 00:57:20,291
give variables full singular or plural\n
1003
00:57:20,291 --> 00:57:24,131
Then lastly, if this isn't\nenough color blocks for you
1004
00:57:24,130 --> 00:57:25,960
you can create your own blocks.
1005
00:57:25,960 --> 00:57:29,050
Indeed, this is going to be a\n
1006
00:57:29,050 --> 00:57:32,920
and with the first problem set whereby\n
1007
00:57:32,920 --> 00:57:37,660
pieces and you realize, oh, would have\n
1008
00:57:37,661 --> 00:57:40,990
have just been replaced by one\nhad MIT thought to give me that
1009
00:57:40,990 --> 00:57:44,291
one puzzle piece, you yourself\ncan make your own blocks
1010
00:57:44,291 --> 00:57:47,111
by connecting these all together,\ngiving them a name, and boom
1011
00:57:47,110 --> 00:57:49,250
a new puzzle piece will exist.
1012
00:57:49,251 --> 00:57:51,581
So let's do the simplest,\nmost canonical programs
1013
00:57:51,581 --> 00:57:53,530
here, starting up with\ncontrol, and I'm going
1014
00:57:53,530 --> 00:57:57,880
to click and drag and drop this\n
1015
00:57:57,880 --> 00:58:01,107
Then I'm going to grab one\nmore, for instance under Looks
1016
00:58:01,108 --> 00:58:03,191
and under Looks I'm going\nto go ahead and just say
1017
00:58:03,190 --> 00:58:07,900
something like initially not\njust Hello but the more canonical
1018
00:58:09,550 --> 00:58:12,250
Now you might guess that in\nthis programming environment
1019
00:58:12,251 --> 00:58:15,611
I can go over here now and\nclick the green flag and voila
1020
00:58:17,090 --> 00:58:19,005
So that's my first\nprogram and obviously much
1021
00:58:19,005 --> 00:58:21,880
more user friendly than typing out\n
1022
00:58:21,880 --> 00:58:25,040
saw on the screen that you,\ntoo, will type out next week.
1023
00:58:25,041 --> 00:58:28,280
But for now, we'll just focus on\n
1024
00:58:28,280 --> 00:58:29,891
So what is it that just happened?
1025
00:58:29,891 --> 00:58:32,391
This purple block here is\nSay, that's the function
1026
00:58:32,391 --> 00:58:37,090
and it seems to take some form of input\n
1027
00:58:37,996 --> 00:58:40,121
Well this actually fits\nthe paradigm that we looked
1028
00:58:40,121 --> 00:58:42,445
at earlier of just inputs and outputs.
1029
00:58:42,445 --> 00:58:45,340
So if I may, if you consider\nwhat this puzzle piece is doing
1030
00:58:47,050 --> 00:58:51,010
The input in this case is going\n
1031
00:58:51,010 --> 00:58:55,346
The algorithm is going to be implemented\n
1032
00:58:55,346 --> 00:58:57,971
and the output of that is going\nto be some kind of side effect
1033
00:58:57,971 --> 00:59:01,271
like the cat and the speech\nbubble are saying Hello, world.
1034
00:59:01,271 --> 00:59:03,760
So already even that\nsimple drag and drop
1035
00:59:03,760 --> 00:59:07,340
mimics exactly this relatively\nsimple mental model.
1036
00:59:07,340 --> 00:59:08,750
So let's take things further.
1037
00:59:08,751 --> 00:59:11,923
Let's go ahead now and make the\n
1038
00:59:11,922 --> 00:59:14,380
that it says something like\nHello, David, or Hello, Carter
1039
00:59:14,380 --> 00:59:16,600
or Hello to you specifically.
1040
00:59:16,601 --> 00:59:18,556
And for this, I'm going\nto go under Sensing.
1041
00:59:18,556 --> 00:59:21,431
And you might have to poke around\n
1042
00:59:21,431 --> 00:59:24,472
around, but I've done this a few times\n
1043
00:59:26,860 --> 00:59:28,752
Ask what's your name,\nbut that's in white
1044
00:59:28,753 --> 00:59:30,461
so we can change the\nquestion to anything
1045
00:59:30,460 --> 00:59:34,930
we want, and it's going to wait for\n
1046
00:59:34,931 --> 00:59:37,510
This function called Ask\nis a little different
1047
00:59:37,510 --> 00:59:41,320
from the Say block, which just had\n
1048
00:59:42,460 --> 00:59:47,080
The ask function is even more powerful\n
1049
00:59:47,831 --> 00:59:50,831
This function is going\nto hand you back what
1050
00:59:50,831 --> 00:59:55,091
they typed in in the form of\nwhat's called a return value, which
1051
00:59:55,090 --> 00:59:57,940
is stored ultimately, and by\ndefault, in this thing called Answer.
1052
00:59:57,940 --> 01:00:00,612
This little blue oval here\ncalled Answer is again
1053
01:00:00,612 --> 01:00:02,320
one of these variables\nthat in math would
1054
01:00:02,320 --> 01:00:05,828
be called just x or y but in\n
1055
01:00:05,829 --> 01:00:07,371
So I'm going to go ahead and do this.
1056
01:00:07,371 --> 01:00:09,204
Let me go ahead and\ndrag and drop this block
1057
01:00:09,204 --> 01:00:11,631
and I want to ask the question\nbefore saying anything
1058
01:00:11,630 --> 01:00:13,630
but you'll notice that\nScratch is smart and it's
1059
01:00:13,630 --> 01:00:15,641
going to realize I want to\ninsert something in between
1060
01:00:15,641 --> 01:00:17,599
and it's just going to\nmove things up and down.
1061
01:00:17,599 --> 01:00:20,530
I'm going to let go and ask the\n
1062
01:00:20,530 --> 01:00:23,770
And now if I want to go ahead\nand say hello, David or Carter
1063
01:00:23,771 --> 01:00:26,141
let's just do Hello\ncomma, because I obviously
1064
01:00:26,141 --> 01:00:28,820
don't know when I'm writing the\nprogram who's going to use it.
1065
01:00:28,820 --> 01:00:35,141
So let me now grab another looks block\n
1066
01:00:35,141 --> 01:00:39,280
let me go back to Sensing and now\n
1067
01:00:39,280 --> 01:00:42,610
by this other puzzle piece, and\n
1068
01:00:42,610 --> 01:00:45,460
Notice it's the same shape, even\n
1069
01:00:45,460 --> 01:00:47,650
Things will grow or shrink as needed.
1070
01:00:47,650 --> 01:00:49,721
All right, so let's now zoom out.
1071
01:00:49,721 --> 01:00:52,630
Let me go and stop the old version\n
1072
01:00:53,291 --> 01:00:55,751
Let me hit the green\nflag and what's my name?
1073
01:00:59,338 --> 01:01:01,880
All right, maybe I just wasn't\npaying close enough attention.
1074
01:01:03,141 --> 01:01:06,360
Green flag, D-A-V-I-D, Enter.
1075
01:01:09,280 --> 01:01:13,300
What's the bug or\nmistake, might you think?
1076
01:01:14,150 --> 01:01:17,370
AUDIENCE: Do you need to somehow add\n
1077
01:01:17,371 --> 01:01:20,760
DAVID MALAN: Yeah, we kind of want\n
1078
01:01:20,760 --> 01:01:23,940
And it's technically a bug because\n
1079
01:01:23,940 --> 01:01:26,910
It's just saying David\nafter I asked for my name.
1080
01:01:26,911 --> 01:01:29,971
I'd like it to say\nmaybe Hello then David
1081
01:01:29,971 --> 01:01:32,460
but it's just blowing past\nthe Hello and printing David.
1082
01:01:32,460 --> 01:01:34,780
But let's put our finger\non why this is happening.
1083
01:01:34,780 --> 01:01:38,190
You're right for the solution, but\n
1084
01:01:39,228 --> 01:01:42,706
AUDIENCE: So it says hello,\nbut it gets to that last step
1085
01:01:42,706 --> 01:01:43,925
so quickly you can't see it.
1086
01:01:44,800 --> 01:01:47,440
I mean, computers are\nreally darn fast these days.
1087
01:01:47,440 --> 01:01:50,350
It is saying Hello, all of us\nare just too slow in this room
1088
01:01:50,351 --> 01:01:54,740
to even see it because it's then saying\n
1089
01:01:54,740 --> 01:01:57,443
So there's a couple of solutions\nhere, and yours is spot on
1090
01:01:57,443 --> 01:01:59,651
but just to poke around,\nyou'll see the first example
1091
01:01:59,650 --> 01:02:03,130
of how many ways in programming be\n
1092
01:02:03,130 --> 01:02:05,440
else, that there are going\nto be to solve problems.
1093
01:02:05,440 --> 01:02:07,420
We'll teach you over the\ncourse of these weeks
1094
01:02:07,420 --> 01:02:10,660
sometimes some ways are\nbetter relatively than others
1095
01:02:10,661 --> 01:02:13,811
but rarely is there a\nbest way necessarily
1096
01:02:13,811 --> 01:02:15,728
because again reasonable\npeople will disagree.
1097
01:02:15,728 --> 01:02:17,936
And what we'll try to teach\nyou over the coming weeks
1098
01:02:17,936 --> 01:02:20,111
is how to kind of think\nthrough those nuances.
1099
01:02:20,110 --> 01:02:22,257
And it's not going to be\nobvious at first glance
1100
01:02:22,257 --> 01:02:24,340
but the more programs you\nwrite, the more feedback
1101
01:02:24,340 --> 01:02:26,951
you get, the more bugs\nthat you introduce
1102
01:02:26,951 --> 01:02:30,860
the more you'll get your footing with\n
1103
01:02:30,860 --> 01:02:33,130
So let me try this in a couple of ways.
1104
01:02:33,130 --> 01:02:35,620
Up here would be one\nsolution to the problem.
1105
01:02:35,621 --> 01:02:40,286
MIT anticipated this kind of issue,\n
1106
01:02:40,286 --> 01:02:42,161
and I could just use a\npuzzle piece that says
1107
01:02:42,161 --> 01:02:44,771
say the following for\ntwo seconds or one second
1108
01:02:44,771 --> 01:02:47,381
or whatever, then do the\nsame with the next word
1109
01:02:47,380 --> 01:02:50,110
and it might be kind\nof a bit of a pause
1110
01:02:50,110 --> 01:02:54,610
Hello, one second, two seconds, David,\n
1111
01:02:54,610 --> 01:02:56,760
it would look a little\nmore grammatically correct.
1112
01:02:56,760 --> 01:02:59,260
But I can do it a little more\nelegantly, as you've proposed.
1113
01:02:59,260 --> 01:03:01,385
Let me go ahead and throw\naway one of these blocks
1114
01:03:01,385 --> 01:03:04,181
and you can just drag and let\ngo and it'll delete itself.
1115
01:03:04,181 --> 01:03:10,211
Let me go down to Operators because\n
1116
01:03:10,210 --> 01:03:13,721
So even if you're not sure what goes\n
1117
01:03:17,300 --> 01:03:20,170
Let me go ahead and\nsay hello comma space.
1118
01:03:20,170 --> 01:03:22,330
Now it could just say by\ndefault Hello, banana
1119
01:03:22,331 --> 01:03:27,701
but let me go back to\nSensing, Drag answer
1120
01:03:27,701 --> 01:03:29,501
and that's going to drag and drop there.
1121
01:03:29,501 --> 01:03:34,161
So now notice we're sort of stacking\n
1122
01:03:34,161 --> 01:03:38,021
so that the output of one becomes the\n
1123
01:03:38,021 --> 01:03:42,024
Let me go ahead and zoom\nout, hit Stop, and hit Play.
1124
01:03:42,023 --> 01:03:43,190
All right, what's your name?
1125
01:03:45,550 --> 01:03:48,770
Now it's presumably\nas we first intended.
1126
01:03:57,811 --> 01:04:02,728
So consider that even with\nthis additional example
1127
01:04:02,728 --> 01:04:05,811
it still fits the same mental model,\n
1128
01:04:05,811 --> 01:04:08,961
Here's that new function\nAsk something and wait.
1129
01:04:08,960 --> 01:04:12,471
And notice that in this case too\n
1130
01:04:12,471 --> 01:04:15,621
henceforth as an argument\nor a parameter, programming
1131
01:04:15,621 --> 01:04:18,681
speak for just an input in\nthe context of a function.
1132
01:04:18,681 --> 01:04:21,831
If we use our drawing as before\nto represent this thing here
1133
01:04:21,831 --> 01:04:26,061
we'll see that the input now is going\n
1134
01:04:26,061 --> 01:04:30,081
The algorithm is going to be implemented\n
1135
01:04:30,081 --> 01:04:33,381
the function called Ask, and the\noutput of that thing this time
1136
01:04:33,380 --> 01:04:36,110
is not going to be the\ncat saying anything yet
1137
01:04:36,110 --> 01:04:39,721
but rather it's going\nto be the actual answer.
1138
01:04:39,721 --> 01:04:43,130
So instead of the visual side effect\n
1139
01:04:43,130 --> 01:04:45,620
now nothing visible is happening yet.
1140
01:04:45,621 --> 01:04:49,671
Thanks to this function it's sort of\n
1141
01:04:49,670 --> 01:04:55,701
with whatever I typed in written on it\n
1142
01:04:57,951 --> 01:05:00,050
Now what did I then do with that value?
1143
01:05:00,050 --> 01:05:04,820
Well consider that with\nthe subsequent function
1144
01:05:04,820 --> 01:05:08,280
we had this Say block,\ntoo, combined with a join.
1145
01:05:08,280 --> 01:05:11,780
So we have this variable\ncalled Answer, we're joining it
1146
01:05:11,780 --> 01:05:14,202
with that first argument, Hello.
1147
01:05:14,202 --> 01:05:16,161
So already we see that\nsome functions like Join
1148
01:05:16,161 --> 01:05:20,001
can take not one but two arguments,\nor inputs, and that's fine.
1149
01:05:20,001 --> 01:05:24,891
The output of Join is presumably going\n
1150
01:05:24,891 --> 01:05:27,171
or whatever the human typed in.
1151
01:05:27,170 --> 01:05:31,580
That output notice is essentially\n
1152
01:05:31,581 --> 01:05:33,951
Say, just because we've\nkind of stacked things
1153
01:05:33,951 --> 01:05:35,851
or nested them on top of one another.
1154
01:05:35,851 --> 01:05:40,280
But methodically, it's\nreally the same idea.
1155
01:05:40,280 --> 01:05:44,900
The inputs now are two things,\nHello comma and the return value
1156
01:05:44,900 --> 01:05:47,300
from the previous Ask function.
1157
01:05:47,300 --> 01:05:51,230
The function now is going to be Join,\n
1158
01:05:51,231 --> 01:05:53,061
But that Hello, David\noutput is now going
1159
01:05:53,061 --> 01:05:57,501
to become the input to another function,\n
1160
01:05:57,501 --> 01:06:01,611
and that's then going to have the side\n
1161
01:06:02,670 --> 01:06:06,410
So again as sort of sophisticated\n
1162
01:06:06,411 --> 01:06:09,231
are going to get, they really do\n
1163
01:06:09,231 --> 01:06:12,741
of inputs and outputs and you just have\n
1164
01:06:12,740 --> 01:06:17,030
and to know what kinds of puzzle\n
1165
01:06:17,030 --> 01:06:19,650
But you can ultimately really\nkind of spice these things up.
1166
01:06:19,650 --> 01:06:21,320
Let me go back to my\nprogram here that just is
1167
01:06:21,320 --> 01:06:22,911
using the speech bubble at the moment.
1168
01:06:22,911 --> 01:06:26,090
Scratch's inside has some pretty\n
1169
01:06:26,090 --> 01:06:28,971
I click the Extensions button\nin the bottom left corner.
1170
01:06:28,971 --> 01:06:33,248
And let me go ahead and choose\nthe Text to Speech extension.
1171
01:06:33,248 --> 01:06:36,081
This is using a Cloud service, so\n
1172
01:06:36,081 --> 01:06:39,141
it can actually talk to the\nCloud or a third party service
1173
01:06:39,141 --> 01:06:42,411
and this one is going to give me a\n
1174
01:06:42,411 --> 01:06:45,171
the ability to speak\nsomething from my speakers
1175
01:06:45,170 --> 01:06:47,090
instead of just saying it textually.
1176
01:06:47,090 --> 01:06:48,574
So let me go ahead and drag this.
1177
01:06:48,574 --> 01:06:51,740
Now notice I don't have to interlock\n
1178
01:06:51,740 --> 01:06:52,971
and I want to move some things around.
1179
01:06:52,971 --> 01:06:55,280
I just want to use this as\nlike a canvas temporarily.
1180
01:06:55,280 --> 01:06:58,141
Let me go ahead and\nsteal the Join from here
1181
01:06:58,141 --> 01:07:01,581
put it there, let me throw away\nthe Say block by just moving it
1182
01:07:01,581 --> 01:07:04,641
left and letting go, and\nnow let me join this in
1183
01:07:04,641 --> 01:07:08,581
so I've now changed my program\nto be a little more interesting.
1184
01:07:08,581 --> 01:07:10,341
So now let me stop the old version.
1185
01:07:18,365 --> 01:07:20,240
DAVID MALAN: (LAUGHING)\nOK, minus 2 for real.
1186
01:07:20,240 --> 01:07:27,260
All right, so what I accidentally\n
1187
01:07:27,260 --> 01:07:30,170
for instructional purposes,\nwas the actual answer
1188
01:07:30,170 --> 01:07:32,660
that came back from the ask block.
1189
01:07:33,621 --> 01:07:37,593
So now if I play this again,\nlet's click the green icon.
1190
01:07:50,271 --> 01:07:54,481
OK, so we have these functions then\n
1191
01:07:54,481 --> 01:07:57,441
Well what about those conditionals\n
1192
01:07:57,440 --> 01:07:59,810
How can we bring these programs\nto life so it's not just
1193
01:07:59,811 --> 01:08:01,951
clicking a button and voila,\nsomething's happening?
1194
01:08:01,951 --> 01:08:04,251
Let's go ahead and make this\nnow even more interactive.
1195
01:08:04,251 --> 01:08:06,501
Let me go ahead and throw\naway most of these pieces
1196
01:08:06,501 --> 01:08:09,471
and let me just spice things up\n
1197
01:08:09,471 --> 01:08:12,170
I'm going to go to Play\nSound Meow until done.
1198
01:08:15,251 --> 01:08:18,521
OK, it's a little loud, but it\ndid exactly do what it said.
1199
01:08:21,791 --> 01:08:23,490
It's kind of an underwhelming\nprogram eventually
1200
01:08:23,490 --> 01:08:26,551
since you'd like to think that the\n
1201
01:08:26,940 --> 01:08:28,358
I have to keep hitting the button.
1202
01:08:28,358 --> 01:08:31,386
Well this seems like an opportunity\n
1203
01:08:31,386 --> 01:08:33,511
So all right, well if I\nwanted to meow, meow, meow
1204
01:08:33,511 --> 01:08:37,171
let me just grab a few of these, or you\n
1205
01:08:37,171 --> 01:08:39,360
and you can Copy Paste\neven in code here.
1206
01:08:43,230 --> 01:08:45,228
All right, so now like\nit's not really emoting
1207
01:08:45,228 --> 01:08:46,560
happiness in quite the same way.
1208
01:08:49,421 --> 01:08:52,481
Let me go to Control, wait\none second in between
1209
01:08:52,480 --> 01:08:55,860
which might be a little less worrisome.
1210
01:09:01,451 --> 01:09:06,310
OK, so if my goal was to make\nthe cat meow three times
1211
01:09:06,310 --> 01:09:10,000
I dare say this code or\nalgorithm is correct.
1212
01:09:10,001 --> 01:09:12,940
But let's now critique its design.
1213
01:09:21,501 --> 01:09:27,052
AUDIENCE: You could use the forever\n
1214
01:09:27,052 --> 01:09:28,510
DAVID MALAN: Yeah, so yeah, agreed.
1215
01:09:28,511 --> 01:09:31,301
I could use forever or repeat,\nbut let me push a little harder.
1216
01:09:31,990 --> 01:09:36,070
Like this works, I'm kind of done with\n
1217
01:09:36,070 --> 01:09:37,881
AUDIENCE: There's too much repetition.
1218
01:09:37,881 --> 01:09:40,131
DAVID MALAN: Yeah, there's\ntoo much repetition, right?
1219
01:09:40,131 --> 01:09:42,591
If I wanted to change the\nsound that the cat is making
1220
01:09:42,591 --> 01:09:46,041
to a different variant of meow or\n
1221
01:09:46,041 --> 01:09:48,351
I could change it from the\ndropdown here apparently
1222
01:09:48,350 --> 01:09:51,350
but then I'd have to change it here\n
1223
01:09:51,350 --> 01:09:54,110
and God, if this were even longer\nthat just gets tedious quickly
1224
01:09:54,110 --> 01:09:56,030
and you're probably\nincreasing the probability
1225
01:09:56,030 --> 01:09:57,081
that you're going to\nscrew up and you're going
1226
01:09:57,081 --> 01:10:00,072
to miss one of the dropdowns or\n
1227
01:10:00,072 --> 01:10:02,779
Or, if you wanted to change the\n
1228
01:10:02,779 --> 01:10:05,150
you've got to change it in\ntwo, maybe even more places.
1229
01:10:05,150 --> 01:10:07,581
Again, you're just\ncreating risk for yourself
1230
01:10:07,581 --> 01:10:09,150
and potential bugs in the program.
1231
01:10:09,150 --> 01:10:13,041
So I do like the repeat or the forever\n
1232
01:10:13,041 --> 01:10:15,801
And indeed, what I\nalluded to being possible
1233
01:10:15,801 --> 01:10:18,651
copy pasting earlier, doesn't\nmean it's a good thing.
1234
01:10:18,650 --> 01:10:20,421
And in code, generally\nspeaking, when you
1235
01:10:20,421 --> 01:10:23,841
start to copy and paste puzzle\npieces or text next week
1236
01:10:23,841 --> 01:10:26,910
you're probably not doing\nsomething quite well.
1237
01:10:26,909 --> 01:10:30,680
So let me go ahead and throw away most\n
1238
01:10:30,680 --> 01:10:33,680
keeping just two of the\nblocks that I care about.
1239
01:10:33,680 --> 01:10:37,880
Let me grab the Repeat block for now,\n
1240
01:10:37,881 --> 01:10:40,941
block, it's going to grow to fit\nit, let me reconnect all this
1241
01:10:40,940 --> 01:10:44,449
and change the 10 just\nto a 3, and now, Play.
1242
01:10:51,621 --> 01:10:53,538
It's still correct, but\nnow I've set the stage
1243
01:10:53,537 --> 01:10:57,040
to let the cat meow, for instance,\n
1244
01:10:57,041 --> 01:11:00,431
40 times by changing one thing, or\n
1245
01:11:00,430 --> 01:11:03,159
and just walk away and it\nwill meow forever instead.
1246
01:11:03,159 --> 01:11:05,230
If that's your goal,\nthat would be better.
1247
01:11:05,230 --> 01:11:07,569
A better design but still correct.
1248
01:11:08,319 --> 01:11:10,270
Now that I have a\nprogram that's designed
1249
01:11:10,270 --> 01:11:13,630
to have a cat meow, wow like why?
1250
01:11:13,631 --> 01:11:16,360
I mean, MIT invented\nScratch, Scratch is a cat
1251
01:11:16,359 --> 01:11:18,371
why is there no puzzle\npiece called Meow?
1252
01:11:18,371 --> 01:11:20,081
This feels like a missed opportunity.
1253
01:11:20,081 --> 01:11:22,751
Now to be fair, they gave\nus all the building blocks
1254
01:11:22,751 --> 01:11:26,032
with which we could implement that\n
1255
01:11:26,032 --> 01:11:28,239
and really computer science\nis to leverage what we're
1256
01:11:28,239 --> 01:11:30,489
going to now start calling Abstraction.
1257
01:11:30,489 --> 01:11:34,480
We have step-by-step instructions\nhere, the Repeat, the Play
1258
01:11:34,480 --> 01:11:37,001
and the Wait that collectively\nimplement this idea
1259
01:11:37,001 --> 01:11:38,771
that we humans would call meowing.
1260
01:11:38,770 --> 01:11:41,831
Wouldn't it be nice to abstract\naway those several puzzle
1261
01:11:41,831 --> 01:11:45,701
pieces into just one that literally\n
1262
01:11:45,701 --> 01:11:48,280
Well here's where we\ncan make our own blocks.
1263
01:11:48,280 --> 01:11:52,331
Let me go over here to Scratch\nunder the pink block category
1264
01:11:52,331 --> 01:11:55,932
here and let me click Make a Block.
1265
01:11:55,932 --> 01:11:57,640
Here I see a slightly\ndifferent interface
1266
01:11:57,640 --> 01:12:00,490
where I can choose a name for it\nand I'm going to call it Meow.
1267
01:12:05,180 --> 01:12:08,201
Now I'm just going to\nclean this up a bit here.
1268
01:12:08,201 --> 01:12:12,020
Let me drag and drop Play\nSound and Wait over here.
1269
01:12:13,030 --> 01:12:15,850
I'm just going to drag this\nway down here, way down
1270
01:12:15,850 --> 01:12:18,190
here because now that I'm\ndone implementing Meow
1271
01:12:18,190 --> 01:12:21,050
I'm going to literally abstract\nit away, sort of out of sight
1272
01:12:21,051 --> 01:12:25,301
out of mind, because now notice at\n
1273
01:12:26,711 --> 01:12:31,541
So at this point, I'd argue it doesn't\n
1274
01:12:31,541 --> 01:12:35,020
Frankly, I don't know how Ask\nor Say was implemented by MIT.
1275
01:12:35,020 --> 01:12:37,210
They abstracted those\nthings away for us.
1276
01:12:37,211 --> 01:12:40,121
Now I have a brand new puzzle\npiece that just says what it is.
1277
01:12:40,121 --> 01:12:44,440
And this is now still correct,\nbut arguably better design.
1278
01:12:45,041 --> 01:12:47,440
Because it's just more\nreadable to me, to you
1279
01:12:47,440 --> 01:12:49,690
it's more maintainable\nwhen you look at your code
1280
01:12:49,690 --> 01:12:52,240
a year from now for the first time\n
1281
01:12:52,240 --> 01:12:53,948
back at the very first\nprogram you wrote.
1282
01:12:55,541 --> 01:12:59,230
The function itself has semantics,\n
1283
01:12:59,230 --> 01:13:01,960
If you really care about\nhow Meow is implemented
1284
01:13:01,961 --> 01:13:04,931
you could scroll down and start\nto tinker with the underlying
1285
01:13:04,930 --> 01:13:09,701
implementation details, but otherwise\n
1286
01:13:09,701 --> 01:13:13,211
Now I feel like there's an\neven additional opportunity
1287
01:13:13,211 --> 01:13:17,711
here for abstraction and to factor\n
1288
01:13:17,711 --> 01:13:21,161
It's kind of lame that I\nhave this Repeat block that
1289
01:13:21,161 --> 01:13:24,511
lets me call the Meow function,\n
1290
01:13:25,570 --> 01:13:28,480
Wouldn't it be nice if I could\njust call the Meow function
1291
01:13:28,480 --> 01:13:32,860
aka use the Meow function, and pass\n
1292
01:13:32,860 --> 01:13:35,110
piece how many times I want it to meow?
1293
01:13:35,110 --> 01:13:37,970
Well let me go ahead and\nzoom out and scroll down.
1294
01:13:37,970 --> 01:13:41,560
Let me right click or Control click on\n
1295
01:13:41,560 --> 01:13:44,951
or I could just start from scratch,\n
1296
01:13:44,951 --> 01:13:48,791
Now here, rather than just give this\n
1297
01:13:50,801 --> 01:13:53,081
I'm going to go ahead and\ntype in, for instance, n
1298
01:13:53,081 --> 01:13:56,141
for number of times to\nmeow, and just to make
1299
01:13:56,140 --> 01:13:58,505
this even more user friendly\nand self descriptive
1300
01:13:58,506 --> 01:14:00,881
I'm going to add a label,\nwhich has no functional impact
1301
01:14:00,881 --> 01:14:03,070
it's just an aesthetic,\nand I'm just going
1302
01:14:03,070 --> 01:14:05,740
to say Times, just to make\nit read more like English
1303
01:14:05,740 --> 01:14:08,201
in this case that tells me\nwhat the puzzle piece does.
1304
01:14:09,680 --> 01:14:12,201
And now I need to refine\nthis a little bit.
1305
01:14:12,201 --> 01:14:18,480
Let me go ahead and grab\nunder Control a repeat block
1306
01:14:18,480 --> 01:14:22,180
let me move the Play, Sound,\nand Wait, into the repeat block.
1307
01:14:22,180 --> 01:14:24,720
I don't want 10 and I\nalso don't want 3 here.
1308
01:14:24,720 --> 01:14:29,880
What I want now is this n that is\n
1309
01:14:29,881 --> 01:14:33,271
is creating for me that represents\n
1310
01:14:34,025 --> 01:14:35,400
Notice that snaps right in place.
1311
01:14:35,400 --> 01:14:39,451
Let me connect this and now voila, I\n
1312
01:14:40,951 --> 01:14:44,878
It takes input that affects\nits behavior accordingly.
1313
01:14:44,878 --> 01:14:47,671
Now I'm going to scroll back up,\n
1314
01:14:47,671 --> 01:14:49,021
I just care that Meow exists.
1315
01:14:49,020 --> 01:14:52,560
Now I can tighten up my code, so\nto speak, use even fewer lines
1316
01:14:52,560 --> 01:14:55,740
to do the same thing by\nthrowing away the Repeat block
1317
01:14:55,740 --> 01:15:00,091
reconnecting this new puzzle piece here\n
1318
01:15:00,091 --> 01:15:02,221
now we're really programming, right?
1319
01:15:02,220 --> 01:15:04,560
We've not made any forward\nprogress functionally.
1320
01:15:04,560 --> 01:15:06,871
The thing just meows three times.
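A minimal sketch of that custom block as a function that takes an input, assuming sounds are simply collected in a list rather than played. The function name meow and the parameter n mirror the block described above; everything else is illustrative.

```python
def meow(n):
    """Custom block 'meow n times': returns the sounds instead of playing audio."""
    sounds = []
    for _ in range(n):
        sounds.append("Meow")  # stand-in for "play sound Meow until done"
        # a fuller version would also wait one second here
    return sounds


# The main script shrinks to a single call; the details live inside meow.
print(meow(3))
```

The caller no longer needs to know about the loop at all, only that meow exists and takes a count.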
1321
01:15:09,060 --> 01:15:11,371
As you program more and\nmore, these are the kinds
1322
01:15:11,371 --> 01:15:13,860
of instincts you'll start\nto acquire so that one
1323
01:15:13,860 --> 01:15:16,711
you can start to take a big assignment,\n
1324
01:15:16,711 --> 01:15:20,371
for homework even, that feels kind of\n
1325
01:15:21,541 --> 01:15:25,440
But if you start to identify what are\n
1326
01:15:25,440 --> 01:15:27,390
Then you can start making progress.
1327
01:15:27,390 --> 01:15:32,070
I do this to this day where if I have to\n
1328
01:15:32,070 --> 01:15:36,121
it's so easy to drag my feet and ugh,\n
1329
01:15:36,121 --> 01:15:38,820
until I just start writing\ndown like a to do list
1330
01:15:38,820 --> 01:15:41,490
and I start to modularize the\nprogram and say, all right, well
1331
01:15:41,490 --> 01:15:42,600
what do I want this thing to do?
1332
01:15:43,680 --> 01:15:45,680
I've got to have it say\nsomething on the screen.
1333
01:15:45,680 --> 01:15:48,150
All right, I need to have it\nsay something on the screen
1334
01:15:49,081 --> 01:15:52,421
Like literally a mental or written\n
1335
01:15:52,421 --> 01:15:55,621
if you will, in English on a\npiece of paper or text file
1336
01:15:55,621 --> 01:15:57,810
and then you can decide,\nOK, the first thing I
1337
01:15:57,810 --> 01:16:00,790
need to do for homework to\nsolve this real world problem
1338
01:16:02,161 --> 01:16:04,261
I need to use a bunch\nof other code, too
1339
01:16:04,261 --> 01:16:06,661
but I need to create a\nMeow function and boom
1340
01:16:06,661 --> 01:16:10,440
now you have a piece of the problem\n
1341
01:16:10,440 --> 01:16:14,850
book there, but in this case, we'll\n
1342
01:16:14,850 --> 01:16:16,510
All right, so what more can we do?
1343
01:16:16,511 --> 01:16:18,993
Let's add a few more\npieces to the puzzle here.
1344
01:16:18,993 --> 01:16:20,701
Let's actually interact\nwith the cat now.
1345
01:16:20,701 --> 01:16:24,030
Let me go ahead and now when the\n
1346
01:16:24,030 --> 01:16:26,820
and ask a question using an event here.
1347
01:16:26,820 --> 01:16:31,081
Let me go ahead and\nsay, let's see, I want
1348
01:16:31,081 --> 01:16:34,331
to do something like implement\nthe notion of petting the cat.
1349
01:16:34,331 --> 01:16:39,841
So if the cursor is touching the\n
1350
01:16:39,841 --> 01:16:42,472
it'd be cute if the cat meows\nlike you're petting a cat.
1351
01:16:42,472 --> 01:16:45,180
So I'm going to ask the question,\n
1352
01:16:45,180 --> 01:16:48,370
if let's see I think I need Sensing.
1353
01:16:48,371 --> 01:16:50,761
So if touching mouse\npointer, this is way too big
1354
01:16:50,761 --> 01:16:52,891
but again the shape is\nfine, so there goes.
1355
01:16:53,730 --> 01:16:56,280
And then if it's touching\nthe mouse pointer
1356
01:16:56,280 --> 01:16:59,760
that is if the cat to whom\nthis script or this program
1357
01:16:59,761 --> 01:17:03,031
any time I attach puzzle\npieces MIT calls them a script
1358
01:17:03,030 --> 01:17:07,230
or like a program, if you will, let\n
1359
01:17:07,230 --> 01:17:10,219
and say play sound meow until done.
1360
01:17:10,219 --> 01:17:11,761
All right, so here it is to be clear.
1361
01:17:11,761 --> 01:17:13,891
When the green flag is\nclicked, ask the question
1362
01:17:13,890 --> 01:17:18,180
if the cat is touching the mouse\npointer then play sound meow.
1363
01:17:29,581 --> 01:17:34,351
I'm worried it's not Scratch's\nfault. Feels like mine.
1364
01:17:39,381 --> 01:17:42,231
Yeah, in back, who just turned.
1365
01:17:47,935 --> 01:17:50,810
DAVID MALAN: Yeah, the problem is\n
1366
01:17:50,810 --> 01:17:54,260
Scratch asks the question, is the\n
1367
01:17:54,261 --> 01:17:57,261
And obviously it's not because the\n
1368
01:17:58,430 --> 01:18:01,370
It's fine if I move the cursor\ndown there, but too late.
1369
01:18:01,371 --> 01:18:03,141
The program already asked the question.
1370
01:18:03,140 --> 01:18:06,810
The answer was no or false or zero,\n
1371
01:18:08,190 --> 01:18:10,340
So what might the solution here be?
1372
01:18:10,341 --> 01:18:12,891
I could move my cursor\nquickly, but that feels
1373
01:18:12,890 --> 01:18:14,900
like never going to work out right.
1374
01:18:18,051 --> 01:18:20,521
Could you use the forever loop?
1375
01:18:21,341 --> 01:18:24,601
So I could indeed use this Forever\n
1376
01:18:24,600 --> 01:18:28,080
to just constantly listen to me, well\n
1377
01:18:28,081 --> 01:18:30,150
or at least forever as\nlong as the program is
1378
01:18:30,150 --> 01:18:32,318
running until I explicitly hit Stop.
1379
01:18:33,150 --> 01:18:36,480
Let me go to Control, let\nme grab the Forever block
1380
01:18:36,480 --> 01:18:40,260
let me move the If inside of this\nForever block, reconnect this
1381
01:18:40,261 --> 01:18:43,951
go back up here, click the green\n
1382
01:18:43,951 --> 01:18:45,721
but let me try moving my cursor now.
1383
01:18:50,381 --> 01:18:52,298
So now the cat is actually\nresponding and it's
1384
01:18:52,297 --> 01:18:54,840
going to keep doing\nthis again and again.
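To see why the single If missed the cursor while the Forever version catches it, here is a small simulation. It assumes the cursor's state is given as a list of true/false values, one per frame; that list and both function names are hypothetical.

```python
def check_once(touching_by_frame):
    """Ask the question a single time, right when the green flag is clicked."""
    return ["meow"] if touching_by_frame[0] else []


def check_forever(touching_by_frame):
    """Ask the same question on every frame (bounded here so the sketch ends)."""
    sounds = []
    for touching in touching_by_frame:
        if touching:
            sounds.append("meow")
    return sounds


# The pointer only reaches the cat on the third and fifth frames.
frames = [False, False, True, False, True]
print(check_once(frames))
print(check_forever(frames))
```

The one-shot check has already answered no before the cursor arrives, while the loop keeps asking.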
1385
01:18:54,841 --> 01:18:58,651
So now we have this idea of taking these\n
1386
01:18:58,650 --> 01:19:01,501
pieces, assembling them into\nsomething more complicated.
1387
01:19:01,501 --> 01:19:03,621
I could definitely put a name to this.
1388
01:19:03,621 --> 01:19:05,371
I could create a custom\nblock, but for now
1389
01:19:05,371 --> 01:19:08,074
let's just consider what kind\nof more interactivity we can do.
1390
01:19:08,073 --> 01:19:09,240
Let me go ahead and do this.
1391
01:19:09,240 --> 01:19:12,930
By again grabbing a,\nwhen green flag clicked
1392
01:19:12,930 --> 01:19:16,003
let me go ahead and\nclick the video sensing
1393
01:19:16,003 --> 01:19:18,421
and I'm going to rotate the\nlaptop because otherwise we're
1394
01:19:18,421 --> 01:19:21,463
going to get a little inception thing\n
1395
01:19:22,470 --> 01:19:25,980
So I'm going to go reveal to\nyou what's inside the lectern
1396
01:19:29,911 --> 01:19:34,980
Now that we have a non video\nbackdrop, I'm going to say this.
1397
01:19:34,980 --> 01:19:37,110
Instead of the green flag\nclicked, actually, I'm
1398
01:19:37,110 --> 01:19:40,770
going to say when the video motion\n
1399
01:19:40,770 --> 01:19:47,233
measurement of motion, I'm going to go\n
1400
01:19:47,233 --> 01:19:48,940
And then I'm going to\nget out of the way.
1401
01:19:50,511 --> 01:19:53,341
We'll put them on top of there.
1402
01:20:00,451 --> 01:20:03,171
So my hand is moving faster\nthan 50 something or other
1403
01:20:03,171 --> 01:20:05,021
whatever the unit of measure is.
1404
01:20:06,881 --> 01:20:08,298
DAVID MALAN: (LAUGHING) Thank you.
1405
01:20:08,297 --> 01:20:10,960
So now we have an even\nmore interactive version.
1406
01:20:12,081 --> 01:20:15,721
But I think if I sort of slowly.
1407
01:20:18,780 --> 01:20:23,570
It's completely creepy, but I'm\n
1408
01:20:24,360 --> 01:20:27,161
Until finally my hand\nmoves as fast as that.
1409
01:20:27,161 --> 01:20:29,341
And so here actually is\nan opportunity to show you
1410
01:20:29,341 --> 01:20:31,261
something a former student did.
1411
01:20:35,501 --> 01:20:38,391
Let me go ahead and zoom out\nof this in just a moment.
1412
01:20:40,610 --> 01:20:42,360
(LAUGHING) If someone\nwould be comfortable
1413
01:20:42,360 --> 01:20:44,871
coming up not only masked but\nalso on camera on the internet
1414
01:20:44,871 --> 01:20:48,751
I thought we'd play one of your former\n
1415
01:20:48,751 --> 01:20:51,400
Would anyone like to volunteer\nhere and be up on stage?
1416
01:20:57,011 --> 01:20:58,981
Let me get it set up for you here.
1417
01:21:10,180 --> 01:21:13,240
All right, let me go ahead\nand full screen this here.
1418
01:21:13,240 --> 01:21:17,640
So this is whack-a-mole by one\nof your former predecessors.
1419
01:21:17,640 --> 01:21:20,670
It's going to use the camera focusing\n
1420
01:21:20,671 --> 01:21:22,328
to position inside of this rectangle.
1421
01:21:22,328 --> 01:21:24,661
Have you ever played the\nwhack-a-mole game at an arcade?
1422
01:21:25,470 --> 01:21:27,617
So for those who haven't,\nthese little moles pop up
1423
01:21:27,618 --> 01:21:29,701
and with a very fuzzy\nhammer you sort of hit down.
1424
01:21:29,701 --> 01:21:31,493
You though, if you\ndon't mind, you're going
1425
01:21:31,493 --> 01:21:34,121
to use your head to do this virtually.
1426
01:21:34,121 --> 01:21:39,121
So let's line up your head with\n
1427
01:21:48,211 --> 01:21:50,400
And now hit the moles with your head.
1428
01:22:14,930 --> 01:22:16,850
All right, a round of\napplause for Sahar.
1429
01:22:24,600 --> 01:22:26,732
So beyond having a\nlittle bit of fun here
1430
01:22:26,733 --> 01:22:28,440
the goal was to\ndemonstrate that by using
1431
01:22:28,440 --> 01:22:31,890
some fairly simple, primitive,\nsome basic building blocks
1432
01:22:31,890 --> 01:22:34,260
but assembling them in a fun\nway with some music, maybe
1433
01:22:34,261 --> 01:22:37,781
some new costumes or artwork, you\n
1434
01:22:37,780 --> 01:22:40,841
But at the end of the day, the\n
1435
01:22:40,841 --> 01:22:43,591
were ones like the ones I just\n
1436
01:22:43,591 --> 01:22:45,311
because there were\nclearly lots of moles.
1437
01:22:45,310 --> 01:22:49,290
So the student probably created a few\n
1438
01:22:49,291 --> 01:22:50,708
but at least four different moles.
1439
01:22:50,707 --> 01:22:53,457
They had like some kind of graphic\n
1440
01:22:54,600 --> 01:22:57,090
There was some kind of\ntimer, maybe a variable
1441
01:22:57,091 --> 01:22:59,676
that every second was counting down.
1442
01:22:59,676 --> 01:23:02,551
So you can imagine taking what looks\n
1443
01:23:02,551 --> 01:23:04,561
at first glance, and\nperhaps overwhelming
1444
01:23:04,560 --> 01:23:07,980
to solve yourself, but just think about\n
1445
01:23:07,980 --> 01:23:12,360
And pluck off one piece of the\npuzzle, so to speak, at a time.
1446
01:23:12,360 --> 01:23:15,100
So indeed if we rewind a little bit.
1447
01:23:15,100 --> 01:23:17,880
Let me go ahead here\nand introduce a program
1448
01:23:17,881 --> 01:23:20,521
that I myself made\nback in graduate school
1449
01:23:20,520 --> 01:23:23,081
when Scratch was first\nbeing developed by MIT.
1450
01:23:23,081 --> 01:23:26,400
Let me go ahead and open\nhere, give me just one second
1451
01:23:26,400 --> 01:23:30,630
something that I called back\nin the day Oscar Time that
1452
01:23:30,631 --> 01:23:32,761
looks a little something like this.
1453
01:23:32,761 --> 01:23:34,591
If I fullscreen it and hit Play.
1454
01:23:34,591 --> 01:23:38,166
[MUSIC - SESAME STREET, "I LOVE TRASH"]
1455
01:23:38,166 --> 01:23:40,041
OSCAR THE GROUCH:\n(SINGING) Oh, I love trash.
1456
01:23:40,041 --> 01:23:42,458
DAVID MALAN: So you'll notice\na piece of trash is falling.
1457
01:23:42,457 --> 01:23:45,930
I can click on it and drag and as I get\n
1458
01:23:45,930 --> 01:23:47,930
OSCAR THE GROUCH: (SINGING)\nAnything ragged or--
1459
01:23:47,930 --> 01:23:49,820
DAVID MALAN: It wants\nto go in, it seems.
1460
01:23:50,900 --> 01:23:52,400
OSCAR THE GROUCH: (SINGING) Yes, I--
1461
01:23:54,935 --> 01:23:56,600
OSCAR THE GROUCH: (SINGING) If you\n
1462
01:23:56,600 --> 01:23:57,740
DAVID MALAN: I'll do\nthe same, two points.
1463
01:23:57,740 --> 01:24:00,140
OSCAR THE GROUCH: (SINGING) I have here\n
1464
01:24:00,140 --> 01:24:01,820
DAVID MALAN: There's a\nsneaker falling from the sky
1465
01:24:01,820 --> 01:24:03,288
so another sprite of some sort.
1466
01:24:03,288 --> 01:24:05,246
OSCAR THE GROUCH: (SINGING)\nThe laces are torn.
1467
01:24:07,070 --> 01:24:09,291
DAVID MALAN: I can also\nget just a little lazy
1468
01:24:09,291 --> 01:24:13,234
and just let them fall into the\ntrash themselves if I want to.
1469
01:24:13,234 --> 01:24:15,651
So you can see it doesn't have\nto do with my mouse cursor
1470
01:24:15,650 --> 01:24:18,270
it has to do apparently\nwith the distance here.
1471
01:24:18,270 --> 01:24:19,730
Let's listen a little further.
1472
01:24:19,730 --> 01:24:23,300
I think some additional trash\nis about to make its appearance.
1473
01:24:23,301 --> 01:24:26,871
Presumably there's some kind of variable\n
1474
01:24:26,871 --> 01:24:28,371
OSCAR THE GROUCH: (SINGING) I love--
1475
01:24:28,371 --> 01:24:30,703
DAVID MALAN: OK, let's see\nwhat the last chorus here is.
1476
01:24:30,703 --> 01:24:32,720
OSCAR THE GROUCH:\n(SINGING) Rotten stuff.
1477
01:24:32,720 --> 01:24:35,810
I have here some newspaper, crusty and
1478
01:24:35,810 --> 01:24:37,790
DAVID MALAN: OK, and thus he continues.
1479
01:24:37,791 --> 01:24:40,701
And the song actually\ngoes on and on and on
1480
01:24:40,701 --> 01:24:43,371
and I do not have fond memories\nof implementing this and hearing
1481
01:24:43,371 --> 01:24:46,400
this song for like 10\nstraight hours, but it's
1482
01:24:46,400 --> 01:24:50,055
a good example to just consider\nhow was this program composed?
1483
01:24:50,055 --> 01:24:52,430
How did I go about implementing\nit the first time around?
1484
01:24:52,430 --> 01:24:54,755
And let me go ahead and\nopen up some programs now
1485
01:24:54,756 --> 01:24:56,631
that I wrote in advance\njust so that we could
1486
01:24:56,631 --> 01:24:58,911
see how these things are assembled.
1487
01:24:58,911 --> 01:25:02,331
Honestly, the first thing\nI probably did was probably
1488
01:25:02,331 --> 01:25:04,730
to do something a little like this.
1489
01:25:04,730 --> 01:25:07,161
Here is just a version\nof the program where
1490
01:25:07,161 --> 01:25:09,860
I set out to solve\njust one problem first
1491
01:25:09,860 --> 01:25:12,121
of planting a lamp post in the program.
1492
01:25:12,621 --> 01:25:14,391
I kind of had a vision of what I wanted.
1493
01:25:14,390 --> 01:25:15,980
You know, it evolved\nover time, certainly
1494
01:25:15,980 --> 01:25:17,420
but I knew I wanted\ntrash to fall, I wanted
1495
01:25:17,421 --> 01:25:19,128
a cute little Oscar\nthe Grouch to pop out
1496
01:25:19,128 --> 01:25:22,041
of the trashcan, and some other\nstuff, but wow that's a lot
1497
01:25:23,690 --> 01:25:26,570
I'm going to start easy, download\na picture of a lamp post
1498
01:25:26,570 --> 01:25:30,951
and then drag and drop it into the\n
1499
01:25:31,970 --> 01:25:33,740
It doesn't functionally do anything.
1500
01:25:33,740 --> 01:25:36,831
I mean, literally that's the\ncode that I wrote to do this.
1501
01:25:36,831 --> 01:25:38,990
All I did was use like\nthe Backdrops feature
1502
01:25:38,990 --> 01:25:41,041
and drag and drop and\nmove things around
1503
01:25:41,041 --> 01:25:44,181
but it got me to version\none of my program.
1504
01:25:44,180 --> 01:25:46,010
Then what might version two be?
1505
01:25:46,011 --> 01:25:48,591
Well I considered what\npiece of functionality
1506
01:25:48,591 --> 01:25:52,083
frankly might be the easiest to\n
1507
01:25:52,082 --> 01:25:54,290
That seems like a pretty\ncore piece of functionality.
1508
01:25:54,291 --> 01:25:56,551
It just needs to sit\nthere most of the time.
1509
01:25:56,551 --> 01:25:59,931
So the next thing I\nprobably did was to open up
1510
01:25:59,930 --> 01:26:05,600
for instance, the trash can version\n
1511
01:26:06,113 --> 01:26:08,030
So this time I'll show\nyou what's inside here.
1512
01:26:08,030 --> 01:26:10,400
There is some code, but not much.
1513
01:26:10,400 --> 01:26:14,421
Notice at bottom right I change the\n
1514
01:26:14,421 --> 01:26:17,451
instead, but it's the same\nprinciple that I can control.
1515
01:26:17,451 --> 01:26:20,121
And then over here I added this code.
1516
01:26:20,121 --> 01:26:23,030
When the green flag is\nclicked, switch the costume
1517
01:26:23,030 --> 01:26:25,110
to something I arbitrarily\ncalled Oscar 1.
1518
01:26:25,110 --> 01:26:26,871
So I found a couple\nof different pictures
1519
01:26:26,871 --> 01:26:29,990
of a trash can, one that looks\n
1520
01:26:29,990 --> 01:26:32,150
and eventually one that\nhas Oscar coming out
1521
01:26:32,150 --> 01:26:33,810
and I just gave them different names.
1522
01:26:33,810 --> 01:26:36,980
So I said Switch to Oscar 1, which\nis the closed one by default
1523
01:26:36,980 --> 01:26:40,791
then forever do the following:\nif touching the mouse pointer
1524
01:26:40,791 --> 01:26:45,711
then switch the costume to\nOscar 2, else switch to Oscar 1.
1525
01:26:45,711 --> 01:26:49,311
That is to say, I just wanted to\n
1526
01:26:49,310 --> 01:26:52,130
and closing, even if it's not\nexactly what I wanted ultimately
1527
01:26:52,131 --> 01:26:54,181
I just wanted to make\nsome forward progress.
1528
01:26:54,180 --> 01:26:59,940
So here, when I run this program by\n
1529
01:26:59,940 --> 01:27:03,440
Nothing yet, but if I get\ncloser to the trash can
1530
01:27:03,440 --> 01:27:07,310
it indeed pops open because\nit's forever listening
1531
01:27:07,310 --> 01:27:10,500
for whether the sprite,\nthe trash can in this case
1532
01:27:10,501 --> 01:27:11,751
is touching the mouse pointer.
1533
01:27:12,440 --> 01:27:15,501
That was version 2, if you will.
1534
01:27:15,501 --> 01:27:18,650
If I went in now and added the lamp\n
1535
01:27:18,650 --> 01:27:20,150
now we're starting to make progress.
1536
01:27:20,360 --> 01:27:22,940
Now it would look a little\nsomething more like the program
1537
01:27:22,940 --> 01:27:25,130
I intended ultimately to create.
1538
01:27:25,131 --> 01:27:27,961
What piece did I probably\nbite off after that?
1539
01:27:27,961 --> 01:27:30,231
Well, I think what I did\nis I probably decided
1540
01:27:30,230 --> 01:27:33,740
let me implement one of the pieces of\n
1541
01:27:34,350 --> 01:27:37,880
Let's just get one piece of\ntrash working correctly first.
1542
01:27:37,881 --> 01:27:40,280
So let me go ahead and open this one.
1543
01:27:40,280 --> 01:27:43,850
And again, all of these examples will\n
1544
01:27:43,850 --> 01:27:46,040
so you can see all of\nthese examples, too.
1545
01:27:46,041 --> 01:27:48,440
It's not terribly long, I\njust implemented it in advance
1546
01:27:48,440 --> 01:27:50,810
so we could flip\nthrough kind of quickly.
1547
01:27:52,341 --> 01:27:56,151
On the right hand side, I turned\nmy sprite into a piece of trash
1548
01:27:56,150 --> 01:27:58,770
this time instead of a cat,\ninstead of a trash can
1549
01:27:58,770 --> 01:28:04,130
and I also created, with Carter's help,\n
1550
01:28:04,131 --> 01:28:07,191
It's literally just a black line\nbecause I just wanted initially
1551
01:28:07,190 --> 01:28:09,621
to have some notion of a\nfloor so I could detect
1552
01:28:09,621 --> 01:28:12,141
if the trash is touching the floor.
1553
01:28:12,140 --> 01:28:15,411
Now without seeing the code yet,\njust hearing that description
1554
01:28:15,411 --> 01:28:20,301
why might I have wanted the second\n
1555
01:28:20,301 --> 01:28:22,748
with the trash intending\nto fall from the sky?
1556
01:28:22,747 --> 01:28:24,080
What might I have been thinking?
1557
01:28:24,081 --> 01:28:26,041
Like what problem might\nI be trying to solve?
1558
01:28:26,270 --> 01:28:28,730
AUDIENCE: You don't want the\nfirst sprite to go through it.
1559
01:28:28,730 --> 01:28:31,688
DAVID MALAN: Yeah, you don't want\n
1560
01:28:31,689 --> 01:28:34,280
go through, and then boom,\nyou completely lose it.
1561
01:28:34,280 --> 01:28:36,932
That would not be a very useful thing.
1562
01:28:36,932 --> 01:28:39,890
Or it would seem to maybe eat up more\n
1563
01:28:39,890 --> 01:28:42,560
if the trash is just endlessly\nfalling and I can't grab it.
1564
01:28:42,560 --> 01:28:44,331
It might be a little traumatic\nif you tried to get it
1565
01:28:44,331 --> 01:28:46,873
and you can't pull it back out\nand you can't fix the program.
1566
01:28:46,872 --> 01:28:48,548
So I just wanted the thing to stop.
1567
01:28:48,548 --> 01:28:50,091
So how might I have implemented this?
1568
01:28:50,091 --> 01:28:51,740
Let's look at the code at left.
1569
01:28:51,740 --> 01:28:56,360
Here I have a bit of randomness,\nlike I proposed earlier exists.
1570
01:28:56,360 --> 01:28:59,240
There's this blue\nfunction called Go To x
1571
01:28:59,240 --> 01:29:03,051
y that lets me move a\nsprite to any position
1572
01:29:03,051 --> 01:29:07,731
up, down, left, right, I picked a random\n
1573
01:29:07,730 --> 01:29:12,470
negative 240 to positive 240, and then\n
1574
01:29:12,470 --> 01:29:14,220
This just makes the\ngame more interesting.
1575
01:29:14,220 --> 01:29:18,410
It's kind of lame pretty quickly if the\n
1576
01:29:18,411 --> 01:29:21,650
Here's just a little bit of randomness,\n
1577
01:29:23,121 --> 01:29:26,694
So now if I click the green flag,\nyou'll see that it just falls
1578
01:29:26,694 --> 01:29:28,611
nothing interesting is\ngoing to happen, but it
1579
01:29:28,610 --> 01:29:33,050
does stop when it touches the black\n
1580
01:29:33,051 --> 01:29:37,821
I'm forever asking the question if\n
1581
01:29:37,820 --> 01:29:41,550
is to the floor is greater\nthan zero, that's fine.
1582
01:29:41,551 --> 01:29:44,361
Change the y location by negative 3.
1583
01:29:44,360 --> 01:29:48,860
So move it down 3 pixels, down 3\n
1584
01:29:48,860 --> 01:29:52,701
is not greater than zero, it is zero\n
1585
01:29:52,701 --> 01:29:54,427
it should just stop moving altogether.
1586
01:29:54,427 --> 01:29:56,510
There's other ways we could\nhave implemented this
1587
01:29:56,511 --> 01:29:58,731
but this felt like a nice,\nclean way that logically, just
1588
01:29:59,523 --> 01:30:03,711
OK, now I got some trash falling, I\n
1589
01:30:03,711 --> 01:30:08,211
I have a lamp post, now I'm a\ngood three steps into the program.
1590
01:30:09,411 --> 01:30:12,411
If we consider one or two\nfinal pieces, something
1591
01:30:12,411 --> 01:30:17,570
like the dragging of the trash, let me\n
1592
01:30:17,570 --> 01:30:21,171
Dragging the trash requires\na different type of question.
1593
01:30:24,121 --> 01:30:26,900
I only need one sprite, no\nfloor here because I just
1594
01:30:26,900 --> 01:30:29,720
want the human to move it up,\ndown, left, right and the human's
1595
01:30:29,720 --> 01:30:32,750
not going to physically be able\nto move it outside of the world.
1596
01:30:32,751 --> 01:30:36,480
If we zoom in on this code, the way\n
1597
01:30:36,480 --> 01:30:40,670
We're using that And conjunction\n
1598
01:30:40,671 --> 01:30:44,301
the green flag is clicked, we're\n
1599
01:30:44,301 --> 01:30:47,571
these questions, plural,\nif the mouse is down
1600
01:30:47,570 --> 01:30:53,211
and the trash is touching the\nmouse pointer, that's equivalent
1601
01:30:53,211 --> 01:30:55,521
logically to clicking on the trash.
1602
01:30:55,520 --> 01:30:58,831
Go ahead and move the\ntrash to the mouse pointer.
1603
01:30:58,831 --> 01:31:00,891
So again it takes this\nvery familiar idea
1604
01:31:00,890 --> 01:31:04,010
that you and I take for granted\n
1605
01:31:06,621 --> 01:31:10,820
Well Mac OS or Windows are\nprobably asking a question.
1606
01:31:10,820 --> 01:31:15,320
For every icon, is the mouse down\n
1607
01:31:15,320 --> 01:31:19,461
If so, go to the location\nof the mouse forever
1608
01:31:19,461 --> 01:31:21,561
while the mouse button is clicked down.
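The dragging idea (mouse down AND touching the mouse pointer, then go to the mouse pointer) might look roughly like this in C. The struct, the square hit box, and the function names are assumptions made for this sketch; Scratch's own "touching" check works on the sprite's actual shape.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical sprite: a position plus half the width of a square hit box.
struct sprite
{
    int x;
    int y;
    int half_size;
};

// Assumption: "touching mouse pointer" is approximated by a square hit box.
bool touching_pointer(const struct sprite *s, int mouse_x, int mouse_y)
{
    return abs(mouse_x - s->x) <= s->half_size &&
           abs(mouse_y - s->y) <= s->half_size;
}

// One pass through the forever loop: both questions must be true
// before the sprite jumps to the pointer.
void drag_step(struct sprite *s, bool mouse_down, int mouse_x, int mouse_y)
{
    if (mouse_down && touching_pointer(s, mouse_x, mouse_y))
    {
        s->x = mouse_x;
        s->y = mouse_y;
    }
}
```

Running drag_step over and over while the button is held is what makes the sprite appear to follow the cursor.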
1609
01:31:21,560 --> 01:31:23,720
So how does this work in reality now?
1610
01:31:23,720 --> 01:31:25,970
Let me go ahead and click on the Play.
1611
01:31:25,970 --> 01:31:30,030
Nothing happens at first, but if\n
1612
01:31:33,121 --> 01:31:36,871
So I now need to kind of combine\n
1613
01:31:36,871 --> 01:31:39,604
but I bet I could just start\nto use just one single program.
1614
01:31:39,604 --> 01:31:42,021
Right now I'm using separate\nones to show different ideas
1615
01:31:42,020 --> 01:31:45,170
but now that's another\nbite out of the problem.
1616
01:31:45,171 --> 01:31:48,028
If we do one last one,\nsomething like the scorekeeping
1617
01:31:48,028 --> 01:31:51,110
is interesting, because recall that\n
1618
01:31:51,110 --> 01:31:54,661
into the can, Oscar popped out\nand told us the current score.
1619
01:31:54,661 --> 01:31:59,240
So let me go ahead and find\nthis one, Oscar variables
1620
01:31:59,240 --> 01:32:01,220
and let me zoom in on this one.
1621
01:32:01,220 --> 01:32:04,423
This one is longer because we\ncombined all of these elements.
1622
01:32:04,423 --> 01:32:07,341
So this is the kind of thing that\n
1623
01:32:07,341 --> 01:32:09,861
I have no idea how I would\nhave implemented this
1624
01:32:09,860 --> 01:32:12,140
from nothing, from scratch literally.
1625
01:32:12,140 --> 01:32:16,490
But again, if you take your\nvision and componentize it
1626
01:32:16,490 --> 01:32:18,591
into these smaller,\nbite-sized problems, you
1627
01:32:18,591 --> 01:32:21,021
could take these baby\nsteps, so to speak, and then
1628
01:32:21,020 --> 01:32:22,590
solve everything collectively.
1629
01:32:22,591 --> 01:32:25,761
So what's new here is this bottom one.
1630
01:32:25,761 --> 01:32:30,411
Forever do the following:\nif the trash is touching
1631
01:32:30,411 --> 01:32:33,860
Oscar, the other sprite that\nwe've now added to the program
1632
01:32:35,841 --> 01:32:37,851
This is in orange, and\nindeed if we poke around
1633
01:32:37,850 --> 01:32:42,411
we'll see that orange is a variable,\n
1634
01:32:42,411 --> 01:32:46,400
changing it means to add 1 or\nif it's negative subtract 1.
1635
01:32:46,400 --> 01:32:51,470
Then go ahead and have the\ntrash go to pick random.
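A rough C equivalent of the scorekeeping script, for illustration only. The x range of -240 to 240 comes from the earlier description; the y value of 180 (the top of the stage) and all of the names are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical position of the trash sprite.
struct trash
{
    int x;
    int y;
};

// One pass through the forever loop. The score is the variable that
// changes; touching_oscar stands in for Scratch's "touching" question.
void check_oscar(struct trash *t, bool touching_oscar, int *score)
{
    if (touching_oscar)
    {
        *score += 1;               // change score by 1
        t->x = rand() % 481 - 240; // pick random -240 to 240
        t->y = 180;                // assumed: back to the top of the stage
    }
}
```

The same trash sprite is reused each time; only its position and the score change.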
1636
01:32:53,091 --> 01:32:56,941
Well, let me show you what it's doing\n
1637
01:32:58,341 --> 01:33:01,490
All right, it's falling, I'm clicking\n
1638
01:33:02,774 --> 01:33:04,191
All right, let me do it once more.
1639
01:33:06,831 --> 01:33:13,371
Why do I have this function at the\n
1640
01:33:13,371 --> 01:33:16,041
Like what problem is this solving here?
1641
01:33:18,132 --> 01:33:21,569
AUDIENCE: Just the same\ntrash teleported to the top
1642
01:33:21,569 --> 01:33:23,385
after you put it in the trash can.
1643
01:33:24,511 --> 01:33:27,303
Even though the human perceives\n
1644
01:33:27,302 --> 01:33:29,270
from the sky, it's\nactually the same piece
1645
01:33:29,270 --> 01:33:32,150
of trash, just kind of being\nmagically moved back to the top
1646
01:33:33,650 --> 01:33:36,530
There, too, you have this\nidea of reusable code.
1647
01:33:36,530 --> 01:33:40,161
If you were constantly copying\nand pasting your pieces of trash
1648
01:33:40,161 --> 01:33:43,190
and creating 20 pieces of trash, 30\n
1649
01:33:43,190 --> 01:33:46,820
want the game to have that many\n
1650
01:33:46,820 --> 01:33:49,911
Reuse the code that you wrote,\nreuse the sprites that you wrote
1651
01:33:49,911 --> 01:33:54,681
and that would give you not just\n
1652
01:33:54,680 --> 01:33:57,665
Well let's take a look at one\nfinal set of building blocks
1653
01:33:57,666 --> 01:33:59,541
that we can compose\nultimately into something
1654
01:33:59,541 --> 01:34:02,011
particularly interactive as follows.
1655
01:34:02,011 --> 01:34:03,921
Let me go ahead and\nzoom out here and let
1656
01:34:03,921 --> 01:34:08,570
me propose that we implement something\n
1657
01:34:09,990 --> 01:34:13,310
So I want to implement\nsome maze-based game that
1658
01:34:13,310 --> 01:34:15,060
looks at first glance like this.
1659
01:34:15,735 --> 01:34:18,110
It's not a very fun game yet,\nbut here's a little Harvard
1660
01:34:18,110 --> 01:34:22,190
shield, a couple of black lines, this\n
1661
01:34:22,190 --> 01:34:24,230
but notice you can't\nquite see my hand here
1662
01:34:24,230 --> 01:34:28,948
but I'm using my arrow keys to go down,\n
1663
01:34:28,948 --> 01:34:31,490
but if I keep going right, right,\nright, right, right, right
1664
01:34:31,490 --> 01:34:32,841
right it's not going anywhere.
1665
01:34:32,841 --> 01:34:35,320
And left, left, left, left, left, left,\n
1666
01:34:36,740 --> 01:34:41,240
So before we look at the code,\nhow might this be working?
1667
01:34:41,240 --> 01:34:44,990
What kinds of scripts,\ncollections of puzzle pieces
1668
01:34:44,990 --> 01:34:47,490
might collectively\nhelp us implement this?
1669
01:34:57,671 --> 01:35:00,838
There's probably some question being\n
1670
01:35:00,838 --> 01:35:03,131
and it happens to be a couple\nof sprites, each of which
1671
01:35:03,131 --> 01:35:06,381
is just literally a vertical black line\n
1672
01:35:07,451 --> 01:35:10,451
Is the distance to it\nzero or close to zero?
1673
01:35:10,451 --> 01:35:16,131
And if so, we just ignore the left\n
1674
01:35:16,871 --> 01:35:18,940
But otherwise, if we're\nnot touching a wall
1675
01:35:18,940 --> 01:35:22,480
what are we probably doing\ninstead forever here?
1676
01:35:22,480 --> 01:35:24,820
How is the movement working presumably?
1677
01:35:33,690 --> 01:35:35,564
DAVID MALAN: Sorry, say a little louder.
1678
01:35:35,564 --> 01:35:38,798
AUDIENCE: Presumably it's continually\n
1679
01:35:38,798 --> 01:35:40,190
and then moving when you do.
1680
01:35:41,065 --> 01:35:44,930
It's continually, forever listening for\n
1681
01:35:44,930 --> 01:35:47,270
and if the up arrow is\npressed, we're probably
1682
01:35:47,270 --> 01:35:49,800
changing the y by a positive value.
1683
01:35:49,801 --> 01:35:52,401
If the down arrow is pressed,\nwe're going down by y
1684
01:35:52,400 --> 01:35:54,090
and left and right accordingly.
1685
01:35:54,091 --> 01:35:55,671
So let's actually take a quick look.
1686
01:35:55,671 --> 01:35:59,001
If I zoom out here and take a look\n
1687
01:35:59,001 --> 01:36:01,556
there's a lot going on at\nfirst glance, but let's see.
1688
01:36:01,555 --> 01:36:03,680
First of all, let me drag\nsome stuff out of the way
1689
01:36:03,680 --> 01:36:05,840
because it's kind of\noverwhelming at first glance
1690
01:36:05,841 --> 01:36:09,291
especially if you, for instance, were\n
1691
01:36:09,291 --> 01:36:11,934
0 just to get inspiration,\nmost projects out there
1692
01:36:11,934 --> 01:36:13,851
are going to look\noverwhelming at first glance
1693
01:36:13,850 --> 01:36:16,320
until you start to wrap your\nmind around what's going on.
1694
01:36:16,320 --> 01:36:19,581
But in this case, we've\nimplemented some abstractions
1695
01:36:19,581 --> 01:36:22,730
from the get go to explain to\n
1696
01:36:24,680 --> 01:36:29,100
This is that program with the two black\n
1697
01:36:30,081 --> 01:36:33,801
It initially puts the shield\nin the middle, 0,0, then
1698
01:36:33,801 --> 01:36:37,251
forever listens for keyboard,\nas I think you were describing
1699
01:36:37,251 --> 01:36:40,461
and it feels for the walls, as\nI think you were describing.
1700
01:36:42,860 --> 01:36:46,400
These are custom blocks we created\n
1701
01:36:46,400 --> 01:36:48,650
those implementation details\nbecause honestly that's
1702
01:36:48,650 --> 01:36:50,130
all I need to know right now.
1703
01:36:50,131 --> 01:36:52,851
But, as aspiring programmers,\nif we're curious now
1704
01:36:52,850 --> 01:36:55,460
let's scroll down to the\nactual implementation
1705
01:36:57,261 --> 01:36:59,701
This is the one on the left\nand it is a little long
1706
01:36:59,701 --> 01:37:02,461
but it's a lot of similar structure.
1707
01:37:02,461 --> 01:37:07,371
We're doing the following, if the up\n
1708
01:37:08,030 --> 01:37:11,600
If the down arrow is pressed,\nthen change y by negative 1.
1709
01:37:12,341 --> 01:37:15,361
Right arrow, left arrow, and that's it.
1710
01:37:15,360 --> 01:37:17,990
So it just assembles all\nof those ideas, combines it
1711
01:37:17,990 --> 01:37:20,451
into one new block just because\nit's kind of overwhelming
1712
01:37:20,451 --> 01:37:22,680
let's just implement it\nonce and tuck it away.
1713
01:37:22,680 --> 01:37:26,720
And if we scroll now over to\nthe Feel for Walls function
1714
01:37:26,720 --> 01:37:30,080
this now is asking the\nquestion as hypothesized
1715
01:37:30,081 --> 01:37:35,001
if I'm touching the left wall, change my\n
1716
01:37:35,720 --> 01:37:38,660
If I'm touching the right\nwall, then move x by negative 1
1717
01:37:38,661 --> 01:37:40,592
to move a little bit away from it.
1718
01:37:40,592 --> 01:37:42,050
So it kind of bounces off the wall.
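The two custom blocks could be written as two small C functions. This is a sketch under assumptions: the struct names, the wall positions passed as x coordinates, and the step of 1 for the left and right arrows are illustrative choices rather than details taken from the project.

```c
#include <stdbool.h>

// Hypothetical sprite position.
struct sprite
{
    int x;
    int y;
};

// Hypothetical snapshot of which arrow keys are held down.
struct keys
{
    bool up, down, left, right;
};

// "Listen for keyboard": each pressed arrow nudges the sprite by 1.
void listen_for_keyboard(struct sprite *s, struct keys k)
{
    if (k.up)    s->y += 1;
    if (k.down)  s->y -= 1;
    if (k.right) s->x += 1;
    if (k.left)  s->x -= 1;
}

// "Feel for walls": if the sprite is on a wall, push it back by 1.
void feel_for_walls(struct sprite *s, int left_wall_x, int right_wall_x)
{
    if (s->x <= left_wall_x)  s->x += 1;
    if (s->x >= right_wall_x) s->x -= 1;
}
```

Calling the two functions one after the other inside a loop reproduces the effect on screen: holding the right arrow moves the sprite until it reaches the wall, and then it stays put.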
1719
01:37:42,051 --> 01:37:47,451
Just in case it slightly went over, we\n
1720
01:37:47,451 --> 01:37:51,230
All right, then a couple of\nmore pieces here to introduce.
1721
01:37:51,230 --> 01:37:54,440
What if we want to actually add\n
1722
01:37:55,310 --> 01:38:02,030
Well, let me go ahead to maybe this one\n
1723
01:38:02,030 --> 01:38:05,458
might, for instance, be designed to\n
1724
01:38:05,458 --> 01:38:08,751
This is like a maze and you're trying to\n
1725
01:38:09,810 --> 01:38:14,360
Uh oh, Yale is in the way and it\n
1726
01:38:15,541 --> 01:38:16,791
Well, let me ask someone else.
1727
01:38:19,251 --> 01:38:21,381
This is an idea you have,\nthis is an idea you see.
1728
01:38:21,381 --> 01:38:26,490
Let's reverse engineer in\nyour head how it works.
1729
01:38:28,935 --> 01:38:32,655
AUDIENCE: If the Yale symbol is\n
1730
01:38:34,639 --> 01:38:36,431
DAVID MALAN: Yeah, so\nif the Yale symbol is
1731
01:38:36,430 --> 01:38:39,130
touching the left wall or the right\n
1732
01:38:39,131 --> 01:38:41,756
And indeed we'll see there's a\npuzzle piece that can do exactly
1733
01:38:41,756 --> 01:38:44,066
that technically off\nthe edge, as we'll see
1734
01:38:44,065 --> 01:38:45,690
but there's another way we can do this.
1735
01:38:46,751 --> 01:38:49,271
The way we ourselves\ncan implement exactly
1736
01:38:49,270 --> 01:38:52,010
that idea, bounce, is just\nwith a little bit of logic.
1737
01:38:52,011 --> 01:38:54,581
So here's what this version\nof the program is doing.
1738
01:38:54,581 --> 01:38:58,721
It's moving Yale by default to 0,0\n
1739
01:38:58,720 --> 01:39:03,250
pointing it in direction 90 degrees, which\n
1740
01:39:03,251 --> 01:39:06,490
and then it's forever doing\nthis: if touching the left wall
1741
01:39:06,490 --> 01:39:10,451
or touching the right wall,\nhere's our translation of bounce.
1742
01:39:10,451 --> 01:39:11,953
We're just turning 180 degrees.
1743
01:39:11,953 --> 01:39:13,661
And the nice thing\nabout that is we don't
1744
01:39:13,661 --> 01:39:16,368
have to worry if we're going from\n
1745
01:39:16,368 --> 01:39:19,480
180 degrees is going to\nwork on both of the walls.
1746
01:39:21,310 --> 01:39:24,670
After we do that, we just move\none step, one pixel, at a time
1747
01:39:24,671 --> 01:39:28,150
but we're doing it forever so\nsomething is happening continually
1748
01:39:28,150 --> 01:39:30,640
and the Yale icon is\nbouncing back and forth.
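The bounce logic can be sketched in C as follows. It assumes horizontal movement only, with direction 90 meaning right and 270 meaning left, and walls given as x coordinates; the names are invented for this example.

```c
// Hypothetical sprite that only moves left and right.
struct mover
{
    int x;
    int direction; // 90 = right, 270 = left (horizontal-only assumption)
};

// One pass through the forever loop: turn around on either wall,
// then move one step in whatever direction we now face.
void bounce_step(struct mover *m, int left_wall_x, int right_wall_x)
{
    if (m->x <= left_wall_x || m->x >= right_wall_x)
    {
        m->direction = (m->direction + 180) % 360; // turn 180 degrees
    }
    m->x += (m->direction == 90) ? 1 : -1; // move 1 step
}
```

Because the turn is always 180 degrees, the same line of code handles both walls.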
1749
01:39:30,640 --> 01:39:33,550
Well one final piece\nhere, what if now we
1750
01:39:33,551 --> 01:39:39,711
want another adversary, a more advanced\n
1751
01:39:39,711 --> 01:39:44,831
to go and follow us wherever\nwe are such that this time
1752
01:39:44,831 --> 01:39:51,161
we want the other sprite to\nnot just bounce back and forth
1753
01:39:51,161 --> 01:39:55,600
but literally follow us\nno matter where we go.
1754
01:39:55,600 --> 01:39:59,110
How might this be\nimplemented on the screen?
1755
01:39:59,110 --> 01:40:01,541
I bet it's another forever\nblock, but what's inside?
1756
01:40:01,541 --> 01:40:05,621
AUDIENCE: So forever get the\n
1757
01:40:05,621 --> 01:40:06,844
and move one step towards it.
1758
01:40:06,844 --> 01:40:09,761
DAVID MALAN: Yeah, forever point at\n
1759
01:40:11,140 --> 01:40:15,340
This is just going to go on forever if I\n
1760
01:40:15,341 --> 01:40:18,551
Notice it's sort of twitching\nback and forth because it goes one
1761
01:40:18,551 --> 01:40:20,201
pixel then one pixel then one pixel.
1762
01:40:20,201 --> 01:40:21,911
It's sort of in a frantic state here.
1763
01:40:21,911 --> 01:40:25,421
We haven't finished the game yet, but if\n
1764
01:40:25,421 --> 01:40:28,121
It didn't take much to\nimplement this simple idea.
1765
01:40:28,121 --> 01:40:30,911
Go to a random position just\nto make it kind of fair
1766
01:40:30,911 --> 01:40:33,740
initially, then forever\npoint towards Harvard
1767
01:40:33,740 --> 01:40:37,060
which is what we called the Harvard\ncrest sprite, move one step.
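A simplified C sketch of the chasing behavior. It is not a faithful port: instead of pointing at an exact angle, it steps along each axis toward the target, and it never overshoots, so it does not twitch the way the Scratch version does. The names approach, follow_step, and the steps parameter are assumptions for illustration.

```c
// Moves value toward target by at most steps, without overshooting.
static int approach(int value, int target, int steps)
{
    if (value < target)
    {
        return (target - value < steps) ? target : value + steps;
    }
    if (value > target)
    {
        return (value - target < steps) ? target : value - steps;
    }
    return value;
}

// Hypothetical position on the stage.
struct point
{
    int x;
    int y;
};

// One pass through the forever loop: head toward the target.
void follow_step(struct point *chaser, struct point target, int steps)
{
    chaser->x = approach(chaser->x, target.x, steps);
    chaser->y = approach(chaser->y, target.y, steps);
}
```

A larger steps value makes the chaser close the gap in fewer passes through the loop.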
1768
01:40:37,060 --> 01:40:39,701
Suppose we now wanted to\nmake a more advanced level.
1769
01:40:39,701 --> 01:40:42,610
What's a minor change I could\nlogically make to this code just
1770
01:40:42,610 --> 01:40:45,037
to make MIT even better at this?
1771
01:40:45,037 --> 01:40:46,870
AUDIENCE: Change the\nnumber of steps to two.
1772
01:40:46,871 --> 01:40:48,310
DAVID MALAN: All right, change\nthe number of steps to two.
1773
01:40:49,390 --> 01:40:51,370
So now they got twice as fast.
1774
01:40:51,371 --> 01:40:53,751
Let me go ahead and just\nget this out of the way.
1775
01:40:53,751 --> 01:40:56,841
Oops, let me make it a fair fight.
1776
01:40:58,310 --> 01:41:01,530
All right, I unfortunately am\nstill moving one pixel at a time
1777
01:41:01,530 --> 01:41:03,081
so this isn't going to end well.
1778
01:41:04,220 --> 01:41:10,230
And if we're really aggressive and\n
1779
01:41:11,930 --> 01:41:16,370
Jesus, OK, so that's how you might\n
1780
01:41:17,761 --> 01:41:20,694
So it's not an accident that we\nchose these particular examples
1781
01:41:20,694 --> 01:41:23,361
here involving these particular\nschools because we have one more
1782
01:41:23,360 --> 01:41:25,680
demonstration we thought\nwe'd introduce today
1783
01:41:25,680 --> 01:41:30,020
if we could get one other\nvolunteer to come up and play
1784
01:41:30,020 --> 01:41:34,400
what was called by one of your\npredecessors Ivy's Hardest Game.
1785
01:41:34,400 --> 01:41:35,661
Let's see, you in the middle.
1786
01:41:40,161 --> 01:41:41,993
DAVID MALAN: Come a\nlittle closer, actually.
1787
01:41:44,440 --> 01:41:47,067
All right, round of applause\nhere if we could, too.
1788
01:41:53,237 --> 01:41:54,487
OK, sorry, what was your name?
1789
01:41:59,020 --> 01:42:02,680
So here we have on this other\nscreen Ivy's Hardest Game
1790
01:42:02,680 --> 01:42:04,750
written by a former CS50 student.
1791
01:42:04,751 --> 01:42:07,451
I think you'll see that it\ncombines these same principles.
1792
01:42:07,451 --> 01:42:09,940
The maze is clearly a\nlittle more advanced.
1793
01:42:09,940 --> 01:42:14,320
The goal at hand is to initially move\n
1794
01:42:14,320 --> 01:42:17,108
the way on the right so that you\ncatch up to him in this case
1795
01:42:17,108 --> 01:42:18,940
but you'll see that\nthere's different levels
1796
01:42:18,940 --> 01:42:21,350
and different levels of sophistication.
1797
01:42:21,350 --> 01:42:24,820
So if you're up for it, you can use just\n
1798
01:42:24,820 --> 01:42:27,520
You'll be controlling the\nHarvard sprite and if we
1799
01:42:27,520 --> 01:42:32,291
could raise the volume just a little\n
1800
01:42:32,291 --> 01:42:34,421
Here we go, clicking the green flag.
1801
01:42:41,444 --> 01:42:43,194
[MUSIC - MC HAMMER, "U CAN'T TOUCH\nTHIS"]
1802
01:42:43,194 --> 01:42:45,100
MC HAMMER: (SINGING) Can't touch this.
1803
01:42:57,591 --> 01:42:58,801
MC HAMMER: (SINGING) so hard.
1804
01:42:58,801 --> 01:43:00,551
Makes me want to say, oh my Lord.
1805
01:43:03,520 --> 01:43:06,162
MC HAMMER: (SINGING) Feels\ngood when you know you're down.
1806
01:43:10,350 --> 01:43:12,520
MC HAMMER: (SINGING)\nYou can't touch this.
1807
01:43:17,859 --> 01:43:19,791
MC HAMMER: (SINGING) Can't touch this.
1808
01:43:23,618 --> 01:43:25,201
You let me bust the funky lyrics.
1809
01:43:27,314 --> 01:43:29,481
You got it like that and\nyou know you want to dance.
1810
01:43:29,480 --> 01:43:33,302
So move out of your seat and get\na fly girl and catch this beat.
1811
01:43:34,934 --> 01:43:38,283
Pump a little bit and let them know\n
1812
01:43:38,283 --> 01:43:39,740
Cold on a mission, so fall on back.
1813
01:43:39,740 --> 01:43:41,033
Let them know that you're too--
1814
01:43:46,378 --> 01:43:48,798
MC HAMMER: (SINGING) Can't touch this.
1815
01:43:48,798 --> 01:43:50,251
Why you standing there, man?
1816
01:43:54,761 --> 01:43:58,316
Give me a song or rhythm, making\n
1817
01:44:00,220 --> 01:44:02,553
You talking the Hammer when\nyou're talking about a show.
1818
01:44:03,751 --> 01:44:06,601
Singers are sweating so them\na wipe or a tame to learn.
1819
01:44:06,600 --> 01:44:08,110
DAVID MALAN: Second to last level.
1820
01:44:08,610 --> 01:44:10,451
MC HAMMER: (SINGING) That chart's legit.
1821
01:44:10,451 --> 01:44:13,201
Either work hard or\nyou might as well quit.
1822
01:44:18,091 --> 01:44:20,391
MC HAMMER: (SINGING)\nYou can't touch this.
1823
01:44:20,390 --> 01:44:22,412
DAVID MALAN: You're almost there.
1824
01:44:22,412 --> 01:44:23,870
MC HAMMER: (SINGING) Break it down.
1825
01:44:36,161 --> 01:44:38,266
MC HAMMER: (SINGING) Stop, Hammer time.
1826
01:44:38,266 --> 01:44:39,878
"Go with the flow," it is said.
1827
01:44:39,878 --> 01:44:42,171
If you can't groove to this,\nthen you're probably dead.
1828
01:44:42,171 --> 01:44:44,240
So wave your hands in the\nair, bust a few moves
1829
01:44:44,240 --> 01:44:45,140
run your fingers through your hair.
1830
01:44:46,621 --> 01:44:49,051
Dance to this and you're\ngoing to get thinner.
1831
01:44:50,650 --> 01:44:53,305
Just for a minute let's all do the bump.
1832
01:45:03,400 --> 01:45:05,405
All right, that's it for CS50.
1833
01:46:29,360 --> 01:46:33,470
And this is week 1, the one in which\n
1834
01:46:33,470 --> 01:46:36,350
is something we technically said\n
1835
01:46:36,350 --> 01:46:39,650
played with this graphical language\n
1836
01:46:41,331 --> 01:46:43,461
But today, as promised,\nwe transition to something
1837
01:46:43,461 --> 01:46:45,921
a little more traditional,\na little more text-based
1838
01:46:45,921 --> 01:46:48,501
not puzzle piece- or\nblock-based, known as C.
1839
01:46:49,770 --> 01:46:50,978
It's been around for decades.
1840
01:46:50,979 --> 01:46:54,501
But it's a language that underlies so\n
1841
01:46:54,501 --> 01:46:57,261
among them something called\nPython that we'll also
1842
01:46:57,261 --> 01:46:58,821
come to in a few weeks' time.
1843
01:46:58,820 --> 01:47:00,921
Indeed, at the end of\nthe semester, the goal
1844
01:47:00,921 --> 01:47:02,661
is for you to feel that\nyou've not learned Scratch
1845
01:47:02,661 --> 01:47:04,911
you've not learned C, or\neven Python, for that matter
1846
01:47:04,911 --> 01:47:07,371
but fundamentally that you've\nlearned how to program.
1847
01:47:07,371 --> 01:47:09,801
Unfortunately, when you\nlearn how to program
1848
01:47:09,801 --> 01:47:13,681
with a more traditional language like\n
1849
01:47:13,680 --> 01:47:17,090
Last week I described all of the\n
1850
01:47:17,091 --> 01:47:19,881
that you see in this, like the\n
1851
01:47:19,881 --> 01:47:22,761
parentheses, curly braces,\nbackslash n, and more.
1852
01:47:22,761 --> 01:47:26,791
Well, today we're not going to reveal\n
1853
01:47:27,291 --> 01:47:31,461
But by next week, will this no\n
1854
01:47:31,461 --> 01:47:35,391
to you, a language that, presumably,\n
1855
01:47:36,320 --> 01:47:40,201
But to do that, we'll explore some\n
1856
01:47:40,201 --> 01:47:43,221
So recall that, via Scratch-- and\npresumably via problem set 1--
1857
01:47:43,220 --> 01:47:46,580
we took a look at things called\n
1858
01:47:46,581 --> 01:47:49,310
And related to functions\nwere arguments like inputs.
1859
01:47:49,310 --> 01:47:52,760
And related to some functions\nwere return values like outputs.
1860
01:47:52,761 --> 01:47:56,151
Then we talked a bit about conditionals,\n
1861
01:47:56,150 --> 01:47:59,541
Boolean expressions, which are\n
1862
01:47:59,541 --> 01:48:03,141
questions, loops, which let you do\n
1863
01:48:03,140 --> 01:48:05,990
like in math, that let you\nstore values temporarily
1864
01:48:05,990 --> 01:48:07,671
and then even other topics still.
1865
01:48:07,671 --> 01:48:11,421
So if you were comfortable on the\n
1866
01:48:11,421 --> 01:48:14,490
realize that all of these topics\nare going to remain with us.
1867
01:48:14,490 --> 01:48:18,230
So really, today is just about acquiring\n
1868
01:48:18,230 --> 01:48:22,911
you translate those ideas into,\n
1869
01:48:22,911 --> 01:48:25,698
a new syntax, frankly,\nthat's actually more
1870
01:48:25,698 --> 01:48:27,740
simple in some ways than\nyour own human language
1871
01:48:27,740 --> 01:48:31,422
be it English or something else, because\n
1872
01:48:31,422 --> 01:48:33,380
There's actually far less\nsyntax that you might
1873
01:48:33,381 --> 01:48:35,451
have in, say, a typical human language.
1874
01:48:35,451 --> 01:48:39,600
But you need to be with these computer\n
1875
01:48:39,600 --> 01:48:41,850
so that you're most,\nultimately, correct
1876
01:48:41,850 --> 01:48:46,040
and ultimately will see to your code\n
1877
01:48:46,621 --> 01:48:49,791
So if you think about the last time\n
1878
01:48:49,791 --> 01:48:51,980
knowing what you were doing\nor encountered something new--
1879
01:48:51,980 --> 01:48:54,855
might not have been that long ago,\n
1880
01:48:54,855 --> 01:48:58,460
first time, or Old Campus or the like,\n
1881
01:48:58,461 --> 01:49:01,823
you didn't really need to know how\n
1882
01:49:01,823 --> 01:49:03,530
You didn't need to\nknow who everyone was
1883
01:49:03,530 --> 01:49:07,230
where everything was, how Harvard or\n
1884
01:49:07,730 --> 01:49:10,855
You sort of got by day to day by just\n
1885
01:49:10,855 --> 01:49:12,605
And anything you didn't\nreally understand
1886
01:49:12,605 --> 01:49:14,960
you sort of turned a blind\neye to until it's important.
1887
01:49:14,961 --> 01:49:16,761
And that's, indeed, what\nwe're going to do today.
1888
01:49:16,761 --> 01:49:18,636
And really, for the next\nseveral weeks, we'll
1889
01:49:18,636 --> 01:49:21,171
focus on details that\nare initially important
1890
01:49:21,171 --> 01:49:24,342
and try to wave our hands, so to speak,\n
1891
01:49:24,342 --> 01:49:25,800
we'll get to, might be interesting.
1892
01:49:25,801 --> 01:49:27,468
But for now, they might be distractions.
1893
01:49:27,467 --> 01:49:29,900
And by distractions, I really\nmean some of that syntax
1894
01:49:31,501 --> 01:49:34,671
So by the end of today-- and\n
1895
01:49:34,671 --> 01:49:37,400
your first foray, presumably,\ninto this language called C--
1896
01:49:37,400 --> 01:49:39,171
you'll have written some code.
1897
01:49:39,171 --> 01:49:41,780
And you'll be asking yourself--\nwe'll be asking yourselves--
1898
01:49:43,161 --> 01:49:46,041
Well, first and foremost, per\nlast week, be it in Scratch
1899
01:49:46,041 --> 01:49:51,591
or phone book form, code ultimately\n
1900
01:49:51,591 --> 01:49:53,541
You want the problem\nto be solved correctly.
1901
01:49:53,541 --> 01:49:55,490
So that one sort of goes without saying.
1902
01:49:55,490 --> 01:49:59,511
And along the way this term, we'll\n
1903
01:49:59,511 --> 01:50:02,746
so you don't have to just sit there\n
1904
01:50:02,746 --> 01:50:05,371
checking the output, trying\nanother input, checking the output.
1905
01:50:05,371 --> 01:50:07,161
There's a lot of automation\ntools in the real world--
1906
01:50:07,161 --> 01:50:09,201
and in this class and\nothers like it-- that
1907
01:50:09,201 --> 01:50:13,011
will help facilitate you answering\n
1908
01:50:13,011 --> 01:50:15,901
correct, according to our\nspecifications or the like.
1909
01:50:15,900 --> 01:50:18,440
But then something that's\ngoing to take more time
1910
01:50:18,440 --> 01:50:21,860
and you're probably not going to feel\n
1911
01:50:21,860 --> 01:50:24,890
the first weeks, is just how\nwell designed your code is.
1912
01:50:24,890 --> 01:50:27,613
It's one thing to speak\nEnglish or write English
1913
01:50:27,613 --> 01:50:30,030
but it's another thing-- or\nany language, for that matter.
1914
01:50:30,030 --> 01:50:32,190
But it's another thing to\nspeak it or write it well.
1915
01:50:32,190 --> 01:50:35,148
And we spend all these years in middle\n
1916
01:50:35,149 --> 01:50:38,601
writing papers and other documents,\n
1917
01:50:38,600 --> 01:50:41,810
as to how well formulated your\n
1918
01:50:41,810 --> 01:50:43,019
your paper was, and the like.
1919
01:50:43,020 --> 01:50:44,971
And there's that same\nidea in programming.
1920
01:50:44,970 --> 01:50:49,590
It doesn't matter necessarily that\n
1921
01:50:49,591 --> 01:50:53,615
If your code is a complete visual\nmess, or if it's crazy long
1922
01:50:53,615 --> 01:50:55,490
it's going to be really\nhard for someone else
1923
01:50:55,490 --> 01:50:59,121
to wrap their mind around what your code\n
1924
01:51:00,201 --> 01:51:04,011
And honestly, you-- the\nnext morning, the next year
1925
01:51:04,011 --> 01:51:07,551
the next time you look at\nthat code-- might have no idea
1926
01:51:07,551 --> 01:51:09,381
what you yourself were even thinking.
1927
01:51:09,381 --> 01:51:13,731
But you will if you focus,\ntoo, on designing good code
1928
01:51:13,730 --> 01:51:16,640
getting your algorithms efficient,\n
1929
01:51:16,640 --> 01:51:19,161
and even making sure your\ncode looks pretty, which
1930
01:51:19,161 --> 01:51:20,881
we'd describe as a matter of style.
1931
01:51:20,881 --> 01:51:24,514
So in the written human world, having\n
1932
01:51:24,514 --> 01:51:27,181
capitalization and the like-- the\nsort of way you write an essay
1933
01:51:27,180 --> 01:51:29,220
but not necessarily\nsend a text message--
1934
01:51:29,220 --> 01:51:31,420
relates to style, for instance.
1935
01:51:31,421 --> 01:51:33,360
And so good style in\ncode is going to have
1936
01:51:33,360 --> 01:51:36,961
a few of these characteristics that are\n
1937
01:51:36,961 --> 01:51:41,981
But you just have to start to get in the\n
1938
01:51:41,980 --> 01:51:44,940
So these three axes, so to speak,\n
1939
01:51:44,940 --> 01:51:48,121
are really the overarching\ngoals when writing code that
1940
01:51:48,121 --> 01:51:50,201
ultimately is going to look like this.
1941
01:51:50,201 --> 01:51:52,201
So this program we\nconjectured last week does
1942
01:51:52,201 --> 01:51:56,863
what if you run it on a Mac or\nPC or somewhere else, presumably?
1943
01:52:00,261 --> 01:52:02,136
DAVID J. MALAN: It just\nprints, Hello, world.
1944
01:52:02,136 --> 01:52:04,311
And honestly, that's kind\nof atrocious that you
1945
01:52:04,310 --> 01:52:08,060
need to hit your keyboard keys this\n
1946
01:52:08,060 --> 01:52:09,740
to get a program to say, Hello, world.
1947
01:52:09,740 --> 01:52:11,990
So a spoiler-- in a\nfew weeks' time when we
1948
01:52:11,990 --> 01:52:14,451
introduce other, more modern\nlanguages, like Python
1949
01:52:14,451 --> 01:52:18,631
you can distill this same logic\ninto literally one line of code.
1950
01:52:18,631 --> 01:52:20,301
And so we're getting there, ultimately.
1951
01:52:20,301 --> 01:52:23,294
But it's helpful to understand\nwhat it is that's going on here
1952
01:52:23,293 --> 01:52:25,461
because even though this\nis a pretty cryptic syntax
1953
01:52:25,461 --> 01:52:28,101
there's nothing after this week and,\n
1954
01:52:28,100 --> 01:52:30,920
be able to understand even about\nsomething that right now looks
1955
01:52:30,921 --> 01:52:32,311
a little something like this.
1956
01:52:33,644 --> 01:52:35,811
Well, I've given us sort\nof the answer to a problem.
1957
01:52:35,810 --> 01:52:37,728
How do you print, Hello,\nworld, on the screen?
1958
01:52:37,728 --> 01:52:39,177
So what do I do with this code?
1959
01:52:39,177 --> 01:52:42,260
Well, we're in the habit of typically\n
1960
01:52:43,911 --> 01:52:48,030
And yeah, I could open up Word or\n
1961
01:52:48,030 --> 01:52:50,780
and just literally transcribe\nthat character for character
1962
01:52:50,780 --> 01:52:53,390
save it, and boom, I've got a program.
1963
01:52:53,390 --> 01:52:56,630
But the problem, per last week, is\n
1964
01:52:56,631 --> 01:52:59,164
what other language, so to speak?
1965
01:53:00,081 --> 01:53:02,310
DAVID J. MALAN: Yeah, so\nbinary, zeros and ones.
1966
01:53:02,310 --> 01:53:04,310
And so this, obviously,\nis not zeros and ones.
1967
01:53:04,310 --> 01:53:07,040
So it doesn't matter if I put it in\n
1968
01:53:07,632 --> 01:53:10,050
The computer is not going to\nunderstand it until I somehow
1969
01:53:10,050 --> 01:53:11,570
translate it to zeros and ones.
1970
01:53:11,570 --> 01:53:14,121
And honestly, none of those\ntools that I rattled off
1971
01:53:14,121 --> 01:53:15,980
are really appropriate for programming.
1972
01:53:16,551 --> 01:53:19,011
Well, they come with features\nlike boldfacing and italics
1973
01:53:19,011 --> 01:53:22,881
and sort of fluffy, aesthetic stuff\n
1974
01:53:22,881 --> 01:53:24,631
you're trying to do with your code.
1975
01:53:24,631 --> 01:53:26,661
And they don't have\nthe ability, it would
1976
01:53:26,661 --> 01:53:29,661
seem, to convert that code\nultimately to zeros and ones.
1977
01:53:29,661 --> 01:53:32,451
But tools that do have\nthis capability might
1978
01:53:32,451 --> 01:53:36,350
be called Integrated Development\nEnvironments, or IDEs
1979
01:53:36,350 --> 01:53:38,180
or, more simply, text editors.
1980
01:53:38,180 --> 01:53:42,560
A text editor is a tool that a\nprogrammer uses perhaps every day
1981
01:53:44,030 --> 01:53:46,850
And it's a simple program-- here,\n
1982
01:53:46,850 --> 01:53:49,542
called Visual Studio Code, or VS Code.
1983
01:53:49,542 --> 01:53:51,500
And at the top here, you\nsee that I've actually
1984
01:53:51,501 --> 01:53:56,391
created in advance before class a very\n
1985
01:53:57,020 --> 01:54:00,740
Well, .c indicates by convention that\n
1986
01:54:01,640 --> 01:54:06,050
It's not .docx, which would mean in\n
1987
01:54:08,060 --> 01:54:12,621
This is .c, which means in this file is\n
1988
01:54:12,621 --> 01:54:16,070
C. This number 1 here is just an\n
1989
01:54:16,070 --> 01:54:18,440
me keep track of how long\nor short this program is.
1990
01:54:18,440 --> 01:54:21,050
And the cursor is just\nblinking there, waiting
1991
01:54:21,051 --> 01:54:23,030
for me to start typing some code.
1992
01:54:23,030 --> 01:54:26,030
Well, let me go ahead and type\nout exactly the same code.
1993
01:54:26,030 --> 01:54:28,260
For me, it comes pretty\ncomfortably from memory.
1994
01:54:28,261 --> 01:54:31,791
So I'm going to go ahead and include\n
1995
01:54:32,961 --> 01:54:37,221
I'm going to magically type int\n
1996
01:54:37,220 --> 01:54:38,610
we'll come back to that later--
1997
01:54:38,610 --> 01:54:43,560
one of these curly braces and then a\n
1998
01:54:43,560 --> 01:54:46,610
Then I'm going to hit Tab\nto indent a few spaces.
1999
01:54:46,610 --> 01:54:52,850
And then I'm going to type not print,\n
2000
01:54:52,850 --> 01:54:55,850
close quote, close\nparenthesis, semicolon.
2001
01:54:55,850 --> 01:55:00,050
And I dare say this was essentially\n
2002
01:55:01,220 --> 01:55:03,291
I wrote it to say, "Hi, CS50."
2003
01:55:03,291 --> 01:55:06,201
Now it just says the more canonical,\n
2004
01:55:08,390 --> 01:55:11,240
And all I need to now do is\nmaybe hit Command-S or Control-S
2005
01:55:12,110 --> 01:55:15,081
And voila, I am a programmer.
2006
01:55:15,081 --> 01:55:17,828
The catch, though, is,\nOK, how do I run this?
2007
01:55:17,828 --> 01:55:19,911
Like, on your Mac or PC,\nhow do you run a program?
2008
01:55:19,911 --> 01:55:21,411
Well, usually double-click an icon.
2009
01:55:21,411 --> 01:55:23,121
On your phone, you tap an icon.
2010
01:55:23,121 --> 01:55:26,810
In this environment that we're using\n
2011
01:55:26,810 --> 01:55:31,670
say most programmers-- use, you don't\n
2012
01:55:32,511 --> 01:55:35,734
That's very user friendly,\nbut it's not very necessary.
2013
01:55:35,734 --> 01:55:38,151
Especially when you get more\ncomfortable with programming
2014
01:55:38,150 --> 01:55:41,030
you're going to want to type commands\n
2015
01:55:42,020 --> 01:55:43,853
And you're going to\nwant to automate things
2016
01:55:43,854 --> 01:55:46,791
which is a lot easier if it's all\n
2017
01:55:46,791 --> 01:55:49,131
to mouse and muscular movements.
2018
01:55:49,131 --> 01:55:52,011
And so here I have my program.
2019
01:55:52,011 --> 01:55:54,591
It lives in this file called "hello.c."
2020
01:55:54,591 --> 01:55:58,131
I need to now convert it,\nthough, to zeros and ones.
2021
01:55:58,131 --> 01:56:02,331
Well, how do I go about doing this,\n
2022
01:56:03,680 --> 01:56:06,470
or source code, as it's\nconventionally called--
2023
01:56:06,470 --> 01:56:11,040
to this, these zeros and ones that\n
2024
01:56:11,041 --> 01:56:13,490
The zeros and ones from last\nweek can be used not only
2025
01:56:13,490 --> 01:56:18,751
to represent numbers and letters,\n
2026
01:56:18,751 --> 01:56:24,051
It can also represent instructions to a\n
2027
01:56:24,051 --> 01:56:26,091
or delete a file, or save a file.
2028
01:56:26,091 --> 01:56:28,761
All the sort of basics\nof a computer somehow
2029
01:56:28,761 --> 01:56:32,361
can be represented by other\npatterns of zeros and ones.
2030
01:56:32,360 --> 01:56:34,934
And just like last week,\nit depends on the context
2031
01:56:34,934 --> 01:56:36,351
in which these numbers are stored.
2032
01:56:36,350 --> 01:56:39,680
Sometimes they're interpreted as\nnumbers, like in a spreadsheet.
2033
01:56:39,680 --> 01:56:41,390
Sometimes they're interpreted as colors.
2034
01:56:41,390 --> 01:56:46,040
Sometimes they're interpreted as\n
2035
01:56:46,041 --> 01:56:50,820
to do very low-level operations,\n
2036
01:56:50,820 --> 01:56:55,610
So fortunately, last week's definition\n
2037
01:56:55,610 --> 01:56:58,020
is a nice mental model for\nexactly the goal at hand.
2038
01:56:58,020 --> 01:57:00,800
I have some input, AKA source code.
2039
01:57:00,801 --> 01:57:05,001
I want to output ultimately\nmachine code, those zeros and ones.
2040
01:57:05,001 --> 01:57:07,581
I certainly don't want to do\nthis kind of process by hand.
2041
01:57:07,581 --> 01:57:11,181
So hopefully there's an algorithm\n
2042
01:57:12,626 --> 01:57:14,751
And those of you who do\nhave some prior experience
2043
01:57:14,751 --> 01:57:16,947
this program might be called a?
2044
01:57:18,150 --> 01:57:20,150
So a few of you have,\nindeed, programmed before.
2045
01:57:20,150 --> 01:57:21,831
Not all languages use compilers.
2046
01:57:21,831 --> 01:57:24,351
C, in fact, is a language\nthat does use a compiler.
2047
01:57:24,350 --> 01:57:27,680
And so I just need to find myself--
2048
01:57:27,680 --> 01:57:31,280
on my computer somewhere,\npresumably-- a so-called compiler
2049
01:57:31,280 --> 01:57:35,730
a program whose purpose in life is\n
2050
01:57:35,730 --> 01:57:40,460
And source code written textually\n
2051
01:57:41,661 --> 01:57:44,490
The machine code is the\ncorresponding zeros and ones.
2052
01:57:44,490 --> 01:57:48,261
So let me go back to the same\nprogramming environment called
2053
01:57:48,261 --> 01:57:50,421
Visual Studio Code or VS Code.
2054
01:57:50,421 --> 01:57:53,541
This is typically a program you\n
2055
01:57:53,541 --> 01:57:57,351
can download onto their own Mac or\n
2056
01:57:57,350 --> 01:57:59,750
computer you own writing some code.
2057
01:57:59,751 --> 01:58:02,181
A downside, though, of that\napproach is that all of us
2058
01:58:02,180 --> 01:58:04,790
have slightly different\nversions of Macs or PCs.
2059
01:58:04,791 --> 01:58:07,371
We have slightly different\nversions of operating systems.
2060
01:58:07,371 --> 01:58:08,931
They may or may not be up to date.
2061
01:58:08,930 --> 01:58:13,020
It's just a technical support nightmare\n
2062
01:58:13,020 --> 01:58:15,831
especially for an introductory\n
2063
01:58:15,831 --> 01:58:19,201
be on the same page so we can\nget you up and running quickly.
2064
01:58:19,201 --> 01:58:23,180
And so I'm actually using a cloud-based\n
2065
01:58:23,180 --> 01:58:25,970
that you only need a browser to access.
2066
01:58:25,970 --> 01:58:28,760
And then you can be on any\ncomputer, today or tomorrow.
2067
01:58:28,761 --> 01:58:32,341
By the end of the semester, we're\n
2068
01:58:32,341 --> 01:58:36,021
so to speak, as best we can and\nget you onto your own Mac or PC
2069
01:58:36,020 --> 01:58:39,530
so that after this class, especially if\n
2070
01:58:39,530 --> 01:58:43,131
you feel like you can continue\n
2071
01:58:45,270 --> 01:58:47,960
But for now, wonderfully, the\nbrowser version of VS Code
2072
01:58:47,961 --> 01:58:51,793
should pretty much be identical\n
2073
01:58:51,792 --> 01:58:53,000
version of the same would be.
2074
01:58:53,001 --> 01:58:55,521
And you'll see in problem\nset 1 how to access this
2075
01:58:55,520 --> 01:58:58,791
and how to get going yourself\nwith your first programs.
2076
01:58:58,791 --> 01:59:01,641
But I haven't mentioned this\nbottom part of the screen
2077
01:59:01,640 --> 01:59:03,140
this bottom part of the screen.
2078
01:59:03,140 --> 01:59:06,570
And this is an area where we have\n
2079
01:59:06,570 --> 01:59:10,400
So this is sort of old-school technology\n
2080
01:59:10,400 --> 01:59:15,380
to interact with a computer, wherever\n
2081
01:59:15,381 --> 01:59:17,601
or even, in this case, in the cloud.
2082
01:59:17,600 --> 01:59:20,150
So on the top\nportion of this screen
2083
01:59:20,150 --> 01:59:24,740
is my text editor, like tabbed\n
2084
01:59:24,740 --> 01:59:26,930
I can just create files and write code.
2085
01:59:26,930 --> 01:59:30,050
The bottom of the screen here,\nmy so-called terminal window
2086
01:59:30,051 --> 01:59:33,141
gives me the ability to\nrun commands on a server
2087
01:59:33,140 --> 01:59:35,600
that currently I have\nexclusive access to.
2088
01:59:35,600 --> 01:59:39,680
So because I logged into VS\nCode with my account online
2089
01:59:39,680 --> 01:59:44,451
I have my own sort of virtual\n
2090
01:59:44,451 --> 01:59:46,701
otherwise known as, in\nthis context, a container.
2091
01:59:46,701 --> 01:59:49,770
This has its own operating system\nfor me, its own hard drive
2092
01:59:49,770 --> 01:59:52,640
if you will, where I can save\nand create files of my own
2093
01:59:52,640 --> 01:59:54,990
separate from yours and vice versa.
2094
01:59:54,990 --> 01:59:57,381
And it's at this very\nsimple prompt, which
2095
01:59:57,381 --> 02:00:00,218
is conventionally-- but not always--\n
2096
02:00:00,217 --> 02:00:01,550
has nothing to do with currency.
2097
02:00:01,551 --> 02:00:03,644
It just means, type your commands here.
2098
02:00:03,644 --> 02:00:05,811
This is where I'm going to\nbe able to type commands
2099
02:00:05,810 --> 02:00:09,871
like compile my source\ncode into machine code.
2100
02:00:09,871 --> 02:00:15,523
So it's a Command Line Interface, or\n
2101
02:00:15,523 --> 02:00:18,230
that you might not have ever used\n
2102
02:00:19,220 --> 02:00:23,730
Odds are almost all of us in this room\n
2103
02:00:23,730 --> 02:00:26,990
but we're all going to start using an\n
2104
02:00:26,990 --> 02:00:30,140
is in a family of operating systems\n
2105
02:00:30,140 --> 02:00:33,230
line interface, but are used not\n
2106
02:00:33,230 --> 02:00:35,850
websites and developing\napplications and the like.
2107
02:00:35,850 --> 02:00:40,290
And it's, indeed, a familiar and very\n
2108
02:00:40,291 --> 02:00:45,801
So how do I go about making this\nfile, hello.c, into a program?
2109
02:00:45,801 --> 02:00:48,740
There's no icon to double-click,\nbut there is a command.
2110
02:00:48,740 --> 02:00:54,140
I can type, make hello, at this dollar\n
2111
02:00:54,140 --> 02:00:56,300
and nothing appears to happen.
2112
02:00:57,533 --> 02:00:59,490
And as we'll see in\nprogramming, almost always
2113
02:00:59,490 --> 02:01:02,661
if you don't see anything go wrong,\n
2114
02:01:02,661 --> 02:01:04,400
So this is going to\nbe a rarity at first
2115
02:01:04,400 --> 02:01:07,490
but this is a good thing that\nit just seems to do nothing.
2116
02:01:07,490 --> 02:01:11,661
But now there is in the\nfolder in my account
2117
02:01:11,661 --> 02:01:15,261
in the cloud\na file called "hello."
2118
02:01:15,261 --> 02:01:18,771
And it's a bit of a weird command,\n
2119
02:01:19,581 --> 02:01:22,581
. just means go into my current folder.
2120
02:01:22,581 --> 02:01:28,371
/hello means run the program called\n
2121
02:01:28,371 --> 02:01:32,871
So ./hello, and then Enter, and voila,\n
2122
02:01:40,110 --> 02:01:42,770
I'm going to go ahead and open\nup the sidebar of this program
2123
02:01:42,770 --> 02:01:44,911
and you'll see in problem\nset 1 how to do this.
2124
02:01:44,911 --> 02:01:48,051
And this might look a little different\n
2125
02:01:48,051 --> 02:01:51,191
Even the color scheme I'm using might\n
2126
02:01:51,190 --> 02:01:52,940
because it supports a\nnice colorful theme.
2127
02:01:52,940 --> 02:01:56,720
So you can have different colors\nand brightnesses depending
2128
02:01:56,720 --> 02:01:58,250
on your mood or the time of day.
2129
02:01:58,251 --> 02:02:01,641
What I've opened here, though, is\n
2130
02:02:01,640 --> 02:02:04,278
and this is just all of the\nfiles in my cloud account.
2131
02:02:04,279 --> 02:02:05,571
And there's not many right now.
2132
02:02:06,576 --> 02:02:09,081
One is the file called\nhello.c, and it's highlighted
2133
02:02:09,081 --> 02:02:10,851
because I've got it open right there.
2134
02:02:10,850 --> 02:02:14,330
And the other is a file called\n"hello," which is brand new
2135
02:02:14,331 --> 02:02:17,331
and was created when I ran that command.
2136
02:02:17,331 --> 02:02:21,431
And what's now worth noting is that\n
2137
02:02:22,780 --> 02:02:26,140
Like on the left-hand side, you have\n
2138
02:02:26,140 --> 02:02:30,275
But on the bottom here, again, you\n
2139
02:02:30,275 --> 02:02:32,650
These are just different ways\nto interact with computers
2140
02:02:32,650 --> 02:02:34,193
and you'll get comfortable with both.
2141
02:02:34,193 --> 02:02:37,451
And honestly, you're certainly familiar\n
2142
02:02:37,451 --> 02:02:40,911
so it's the command line one\nwith which we'll spend some time.
2143
02:02:40,911 --> 02:02:43,600
Now suppose that I just\nwanted to do something
2144
02:02:43,600 --> 02:02:45,438
more than compile this program.
2145
02:02:45,439 --> 02:02:47,231
Suppose I wanted to go\nahead and remove it.
2146
02:02:47,230 --> 02:02:48,701
Like, uh-uh, no, I made a mistake.
2147
02:02:48,701 --> 02:02:51,070
I want to say, "Hello,\nCS50," not "Hello, world."
2148
02:02:51,070 --> 02:02:55,121
I could just hover up here, like in\n
2149
02:02:55,121 --> 02:02:57,881
and I could poke around, and\nthere, delete permanently.
2150
02:02:57,881 --> 02:03:00,371
So most of us might have\nthat instinct on a Mac or PC.
2151
02:03:00,371 --> 02:03:02,801
You right-click or Control-click,\nand you poke around.
2152
02:03:02,801 --> 02:03:05,951
But in a command line interface,\nlet me do this instead.
2153
02:03:05,951 --> 02:03:08,440
The command for removing\nor deleting a file
2154
02:03:08,440 --> 02:03:12,100
in the world of Linux, this\nother operating system
2155
02:03:12,100 --> 02:03:16,330
is just to type rm for remove,\nand then "hello," Enter.
2156
02:03:16,331 --> 02:03:19,091
It's a somewhat cryptic confirmation\n
2157
02:03:19,961 --> 02:03:21,761
I'm going to go ahead\nand type Y for Yes.
2158
02:03:21,761 --> 02:03:24,191
And now when I hit\nEnter, watch what happens
2159
02:03:24,190 --> 02:03:28,240
at top left in the Explorer, the\nGUI, the graphical interface.
2160
02:03:30,789 --> 02:03:32,980
Not terribly exciting,\nbut this just means
2161
02:03:32,980 --> 02:03:35,539
this is a graphical version\nof what we're seeing here.
2162
02:03:35,539 --> 02:03:38,800
And in fact, if you want to\nnever use the GUI again--
2163
02:03:38,800 --> 02:03:41,690
I'll go ahead and close it\nwith a keyboard shortcut here--
2164
02:03:41,690 --> 02:03:45,430
you can forever just type\nls for list and hit Enter.
2165
02:03:45,430 --> 02:03:48,100
And you will see in the\ncommand line interface
2166
02:03:48,100 --> 02:03:50,617
all of the files in your current folder.
2167
02:03:50,618 --> 02:03:52,451
So anything you can do\nwith a mouse, you can
2168
02:03:52,451 --> 02:03:54,039
do with this command line interface.
2169
02:03:54,039 --> 02:03:57,411
And indeed, we'll see many more\nthings that you can do as well.
2170
02:03:57,411 --> 02:04:01,091
But the inventors of this, this\n
2171
02:04:02,291 --> 02:04:04,360
Like, the command is rm for remove.
2172
02:04:07,390 --> 02:04:10,480
Because it's just faster to type.
2173
02:04:10,480 --> 02:04:14,260
So before we forge ahead with making\n
2174
02:04:14,261 --> 02:04:16,181
just "Hello, world,"\nlet me pause here to see
2175
02:04:16,180 --> 02:04:19,420
if there's questions on\nsource code or machine
2176
02:04:19,421 --> 02:04:24,221
code or compiler or this\ncommand line interface.
2177
02:04:26,628 --> 02:04:28,921
DAVID J. MALAN: Really good\nquestion, and let me recap.
2178
02:04:28,921 --> 02:04:30,751
If I were to make\nchanges to the program
2179
02:04:30,751 --> 02:04:33,871
run it, and then maybe make other\nchanges and try to rerun it
2180
02:04:33,871 --> 02:04:37,028
would those changes be reflected,\n
2181
02:04:37,860 --> 02:04:39,961
I already removed the old version.
2182
02:04:39,961 --> 02:04:43,530
So let me go ahead and point\nout that if I do ./hello now
2183
02:04:43,530 --> 02:04:47,760
I'm going to see some kind of error\n
2184
02:04:47,761 --> 02:04:50,431
No such file or directory, so\nit's not terribly user friendly
2185
02:04:50,430 --> 02:04:51,960
but it's saying what the problem is.
2186
02:04:51,961 --> 02:04:55,231
Let me go ahead and remake\nit by typing make hello.
2187
02:04:55,230 --> 02:04:59,911
Now if I type ls, I'll see not one\n
2188
02:04:59,911 --> 02:05:03,615
is even green with a little asterisk\n
2189
02:05:03,615 --> 02:05:05,490
It's sort of the textual\nversion of something
2190
02:05:05,490 --> 02:05:07,251
you could double-click\nin our human world.
2191
02:05:07,251 --> 02:05:10,501
So now, of course, if I run hello, we're\n
2192
02:05:10,501 --> 02:05:15,811
But now suppose I change it to\n
2193
02:05:15,810 --> 02:05:21,000
Let me go ahead and save the file with\n
2194
02:05:21,001 --> 02:05:23,681
let me run ./hello again, and voila.
2195
02:05:25,171 --> 02:05:27,431
So let me ask someone else\nto answer that question.
2196
02:05:29,011 --> 02:05:31,051
Why did it not say, "Hello, CS50"?
2197
02:05:33,171 --> 02:05:35,240
DAVID J. MALAN: Yeah, so\nI didn't compile it again.
2198
02:05:35,240 --> 02:05:38,421
So sort of newbie mistake, you're going\n
2199
02:05:38,990 --> 02:05:42,051
But now let me go ahead\nand remake hello, enter.
2200
02:05:42,051 --> 02:05:45,171
It's going to seemingly\nmake the same program.
2201
02:05:45,171 --> 02:05:49,431
But this time when I run\nit, it's, "Hello, CS50."
2202
02:05:49,430 --> 02:05:53,180
Any other questions on some\nof these building blocks?
2203
02:05:53,180 --> 02:05:56,580
And we'll come back to all the\ncrazy syntax I typed before long.
2204
02:05:56,581 --> 02:05:58,881
But for now, we're focusing\non just the output.
2205
02:06:00,853 --> 02:06:02,560
DAVID J. MALAN: When\nI keep running make
2206
02:06:02,560 --> 02:06:05,510
it creates a new version\nof the machine code.
2207
02:06:05,511 --> 02:06:09,490
So it keeps changing the hello program\n
2208
02:06:09,490 --> 02:06:12,599
There's no make file, per se.
2209
02:06:13,634 --> 02:06:15,051
DAVID J. MALAN: Good question, no.
2210
02:06:15,051 --> 02:06:18,141
If I open up that directory, you'll\n
2211
02:06:18,140 --> 02:06:21,230
And it doesn't matter how\nmany times I run make hello--
2212
02:06:21,230 --> 02:06:25,251
three, four, five-- it just\nkeeps overwriting the original.
2213
02:06:25,251 --> 02:06:28,753
So it's kind of like just saving in\n
2214
02:06:29,461 --> 02:06:31,003
But there's an additional step today.
2215
02:06:31,002 --> 02:06:36,280
We have to then convert my words to\n
2216
02:06:38,541 --> 02:06:40,730
DAVID J. MALAN: Oh, what\nhappens if I run hello.c?
2217
02:06:40,730 --> 02:06:45,081
So let me go ahead and do ./hello.c,\n
2218
02:06:48,801 --> 02:06:50,961
This is where the error\nmessages mean something
2219
02:06:50,961 --> 02:06:53,661
to the people who designed the operating\n
2220
02:06:53,661 --> 02:06:55,701
It's not that you don't\nhave access to the file.
2221
02:06:55,701 --> 02:06:57,411
It means that it's not executable.
2222
02:06:57,411 --> 02:07:00,201
This is not something you\nhave permission to run
2223
02:07:00,201 --> 02:07:04,743
but you do have permission to read\n
2224
02:07:06,520 --> 02:07:08,228
DAVID J. MALAN: Oh,\nreally good question.
2225
02:07:08,229 --> 02:07:11,801
So if I have named my file, hello\n
2226
02:07:11,801 --> 02:07:15,761
dot C, one of the things that\nMake does is it automatically
2227
02:07:18,911 --> 02:07:20,711
we'll discuss this a bit more next week.
2228
02:07:20,711 --> 02:07:23,711
Make itself-- is kind of the\nfirst of white lies today--
2229
02:07:25,480 --> 02:07:30,820
It's a program that knows how to find\n
2230
02:07:30,820 --> 02:07:33,161
and automatically create the program.
2231
02:07:33,161 --> 02:07:37,091
If I use, as we'll discuss next\n
2232
02:07:37,091 --> 02:07:41,081
I have to type a much longer sequence\n
2233
02:07:41,081 --> 02:07:43,270
what do I want the name\nof my program to be.
2234
02:07:43,270 --> 02:07:45,310
Make is a nice program,\nespecially in week 1
2235
02:07:45,310 --> 02:07:47,661
because it just automates\nall of that for us.
2236
02:07:47,661 --> 02:07:51,221
And so here, we have now a\nprogram that very simply prints
2237
02:07:52,220 --> 02:07:54,790
So let's now put this\ninto the context of where
2238
02:07:54,791 --> 02:07:58,871
we left off last time in the context\n
2239
02:07:58,871 --> 02:08:01,961
So we discussed last time, of\ncourse, functions and arguments.
2240
02:08:01,961 --> 02:08:07,011
Functions, again, are those actions and\n
2241
02:08:07,011 --> 02:08:09,403
And the arguments were the\ninputs to those functions
2242
02:08:09,403 --> 02:08:11,860
generally in those little white\novals that, in Scratch, you
2243
02:08:11,860 --> 02:08:13,961
could type words or numbers into.
2244
02:08:13,961 --> 02:08:16,961
We'll see, in all of the languages\nwe're going to see this term
2245
02:08:18,283 --> 02:08:20,990
And let's just start to translate\n
2246
02:08:20,990 --> 02:08:24,911
So for instance, let's put\nthis same program in C
2247
02:08:26,110 --> 02:08:29,621
This is what Hello, World looked like\n
2248
02:08:29,621 --> 02:08:32,530
This week, of course,\nit looks like print.
2249
02:08:32,530 --> 02:08:34,990
And then the parentheses,\nnotice, are kind of
2250
02:08:34,990 --> 02:08:38,679
deliberately designed in the world of\n
2251
02:08:38,679 --> 02:08:40,721
Even though this is a\nwhite oval, you kind of get
2252
02:08:40,720 --> 02:08:45,400
that it's kind of evoking that\nsame idea with the parentheses.
2253
02:08:45,400 --> 02:08:49,180
Technically the function\nin C, it's not called say.
2254
02:08:51,671 --> 02:08:55,061
The F stands for formatted, but we'll\n
2255
02:08:55,060 --> 02:08:57,610
But printf is the closest\nanalogous function
2256
02:08:57,610 --> 02:09:00,881
for say in the world of\nC. Notice if, though, you
2257
02:09:00,881 --> 02:09:05,501
want to print something like\nHello, World or Hello CS50 in C
2258
02:09:05,501 --> 02:09:08,980
you don't just write the\nwords as we did last week.
2259
02:09:08,980 --> 02:09:11,411
You also have to add what,\nif you notice already
2260
02:09:11,411 --> 02:09:13,360
what's missing from this version.
2261
02:09:13,360 --> 02:09:15,680
Yeah, so the double quotes\non the left and the right.
2262
02:09:15,680 --> 02:09:20,998
So, that's necessary in C whenever\nyou have a string of words.
2263
02:09:20,998 --> 02:09:22,541
And I'm using that word deliberately.
2264
02:09:22,541 --> 02:09:26,831
Whenever you have multiple words\n
2265
02:09:27,461 --> 02:09:30,971
And you have to put it in double\nquotes, not single quotes.
2266
02:09:30,970 --> 02:09:32,680
You have to put it in double quotes.
2267
02:09:32,680 --> 02:09:36,880
There's one other stupid thing\nthat we need to have in my C code
2268
02:09:36,881 --> 02:09:40,971
in order to get this function to do\n
2269
02:09:41,841 --> 02:09:44,240
So just like in our human\nworld, you eventually
2270
02:09:44,240 --> 02:09:47,270
got into the habit of using, at\n
2271
02:09:47,270 --> 02:09:50,690
Semicolon is generally what\nyou use to finish your thought
2272
02:09:50,690 --> 02:09:53,780
in the world of programming with C.
2273
02:09:53,780 --> 02:09:55,911
All right, so we have\nthat function in place.
2274
02:09:55,911 --> 02:09:59,661
Now, what does this really fit\n
2275
02:09:59,661 --> 02:10:01,070
Well, functions take arguments.
2276
02:10:01,070 --> 02:10:04,760
And it turns out functions can\nhave different types of outputs.
2277
02:10:04,761 --> 02:10:07,071
And we've actually seen\nboth already last week.
2278
02:10:07,070 --> 02:10:11,360
One type of output from a function\n
2279
02:10:11,360 --> 02:10:13,430
And it generally refers\nto something visual
2280
02:10:13,430 --> 02:10:17,570
like something appearing on the screen\n
2281
02:10:17,570 --> 02:10:20,211
It's sort of a side effect of\nthe function doing its thing.
2282
02:10:20,211 --> 02:10:24,231
And indeed, last week we saw this in\n
2283
02:10:24,230 --> 02:10:27,180
like Hello, World as\ninput to the say function.
2284
02:10:27,180 --> 02:10:31,040
And we saw on the screen Hello,\n
2285
02:10:32,930 --> 02:10:35,840
You can't actually do anything\nwith that visual output
2286
02:10:35,841 --> 02:10:38,581
other than consume it,\nvisually, with your human eyes.
2287
02:10:38,581 --> 02:10:43,251
But sometimes, recall last week, we\n
2288
02:10:43,251 --> 02:10:44,961
actually returned me some value.
2289
02:10:44,961 --> 02:10:47,001
Remember the ask, what's your name.
2290
02:10:47,001 --> 02:10:50,684
It handed me back whatever\nanswer the human typed in.
2291
02:10:50,684 --> 02:10:52,851
It didn't just arbitrarily\ndisplay it on the screen.
2292
02:10:52,850 --> 02:10:55,190
The cat didn't necessarily\nsay it on the screen.
2293
02:10:55,190 --> 02:11:01,640
It was stored, instead, in that special\n
2294
02:11:01,640 --> 02:11:05,510
Because some functions have not\nside effects but return values.
2295
02:11:05,511 --> 02:11:09,321
They hand you back an output\nthat you can use and reuse
2296
02:11:09,320 --> 02:11:11,961
unlike the side effect, which,\nagain, displays and that's it.
2297
02:11:11,961 --> 02:11:14,461
You can't sort of catch\nit and hold on to it.
2298
02:11:14,461 --> 02:11:17,601
So, in the context of last\nweek, we had the ask block.
2299
02:11:17,600 --> 02:11:20,270
And that had this special\nanswer return value.
2300
02:11:20,270 --> 02:11:22,970
In C, we're going to\nsee in just a moment
2301
02:11:22,970 --> 02:11:25,400
we could translate this as follows.
2302
02:11:25,400 --> 02:11:29,001
The closest match I can\npropose for the ask block
2303
02:11:29,001 --> 02:11:31,581
is a function that we're going\nto start calling get_string.
2304
02:11:31,581 --> 02:11:34,730
String is, again, a word, a\nset of words, like a phrase
2305
02:11:34,730 --> 02:11:36,710
or a sentence in programming.
2306
02:11:36,711 --> 02:11:41,331
It, too, is a function insofar as\n
2307
02:11:41,331 --> 02:11:43,551
this isn't always true--\nbut very often when
2308
02:11:43,551 --> 02:11:48,261
you have a word in C followed by an open\n
2309
02:11:48,261 --> 02:11:50,961
it's most likely the name of a function.
2310
02:11:50,961 --> 02:11:53,461
And we're going to see that\nthere's some exceptions to that.
2311
02:11:53,461 --> 02:11:55,791
But for now this indeed\nlooks like a function
2312
02:11:55,791 --> 02:11:57,201
because it matches that pattern.
2313
02:11:57,201 --> 02:12:00,668
If I want to ask the question,\nwhat's your name, question mark--
2314
02:12:00,668 --> 02:12:03,711
and I'm even going to deliberately\n
2315
02:12:03,711 --> 02:12:07,161
the cursor a little bit over so that\n
2316
02:12:08,190 --> 02:12:10,201
So that's just the nitpicky aesthetic.
2317
02:12:10,201 --> 02:12:14,480
This is perhaps the closest analog\nto just asking that question.
2318
02:12:14,480 --> 02:12:18,680
But because the ask\nblock returns a value
2319
02:12:18,680 --> 02:12:22,020
the analog here for get_string is\nthat it, too, returns a value.
2320
02:12:22,020 --> 02:12:23,960
It doesn't just print the human's input.
2321
02:12:23,961 --> 02:12:28,131
It hands it back to you in the form\n
2322
02:12:28,131 --> 02:12:30,501
that I can then use and reuse.
2323
02:12:30,501 --> 02:12:33,471
Now ideally it would be as\nsimple as this literally
2324
02:12:33,470 --> 02:12:36,890
saying answer on the left equals.
2325
02:12:36,890 --> 02:12:38,780
And this is where\nthings start to diverge
2326
02:12:38,780 --> 02:12:40,640
from math and sort of our human world.
2327
02:12:40,640 --> 02:12:44,210
This equal sign, henceforth,\nis not the equal sign.
2328
02:12:44,211 --> 02:12:46,521
It is the assignment operator.
2329
02:12:46,520 --> 02:12:50,030
To assign a value means to\nstore a value in some variable.
2330
02:12:50,030 --> 02:12:53,640
And you read these things,\nweirdly, right to left.
2331
02:12:53,640 --> 02:12:55,820
So here is a function called get_string.
2332
02:12:55,820 --> 02:12:58,310
I claim that it's going\nto return to you whatever
2333
02:12:58,310 --> 02:13:00,411
the human types in as their name.
2334
02:13:00,411 --> 02:13:03,320
It's going to get stored\nover here on the left because
2335
02:13:03,320 --> 02:13:06,331
of this so-called assignment\n
2336
02:13:06,331 --> 02:13:08,581
But it doesn't mean\nequality in this context.
2337
02:13:10,171 --> 02:13:14,961
But it does so by copying the value on\n
2338
02:13:14,961 --> 02:13:16,940
Unfortunately, we're not\nquite done yet with C.
2339
02:13:16,940 --> 02:13:19,670
And this is where, again, it\ngets a little annoying at first
2340
02:13:19,671 --> 02:13:23,780
where Scratch just let us express\n
2341
02:13:23,780 --> 02:13:27,110
In C when you have a\nvariable you don't just
2342
02:13:27,110 --> 02:13:29,001
give it a name like you did in Scratch.
2343
02:13:29,001 --> 02:13:33,051
You also have to tell the computer\nin advance what type of value
2344
02:13:34,310 --> 02:13:37,340
String is one such type of value.
2345
02:13:37,341 --> 02:13:40,070
Int, for integer, is\ngoing to be another.
2346
02:13:40,070 --> 02:13:42,860
And there's even more than that\nwe'll see today and beyond.
2347
02:13:42,860 --> 02:13:46,070
And this is partly an answer to the\n
2348
02:13:46,070 --> 02:13:49,970
last week, which was how does a computer\n
2349
02:13:51,320 --> 02:13:55,518
Like is this a letter, a number,\na color, a piece of video.
2350
02:13:55,518 --> 02:13:58,350
And I just claimed last week that\n
2351
02:14:00,871 --> 02:14:04,190
But within those\nprograms, it often depends
2352
02:14:04,190 --> 02:14:08,810
on what the human programmer\nsaid the type of the value is.
2353
02:14:08,810 --> 02:14:10,850
If this specifies\nstring, that means
2354
02:14:10,850 --> 02:14:12,890
interpret the following\nzeros and ones that
2355
02:14:12,890 --> 02:14:16,640
are stored in my program as\nwords or letters, more generally.
2356
02:14:16,640 --> 02:14:20,960
If it's an int for integer, it would\n
2357
02:14:20,961 --> 02:14:24,981
treat the following zeros and\nones in my program as a number
2358
02:14:26,791 --> 02:14:29,661
So here's where this\nweek, unlike with Scratch
2359
02:14:29,661 --> 02:14:33,081
which kind of figures out what you\n
2360
02:14:33,081 --> 02:14:36,201
you have to be this pedantic\nand tell it what you mean.
2361
02:14:36,201 --> 02:14:39,721
There's still one stupid thing\nmissing from my code here.
2362
02:14:42,150 --> 02:14:44,230
DAVID J. MALAN: And we still\nneed the stupid semicolon.
2363
02:14:44,230 --> 02:14:45,648
And I'm sort of impugning it here.
2364
02:14:45,648 --> 02:14:48,060
Because honestly, these are\nthe kinds of stupid mistakes
2365
02:14:48,060 --> 02:14:50,701
you're going to make today,\ntomorrow, this weekend, next week
2366
02:14:50,701 --> 02:14:54,480
a few weeks from now, until you\n
2367
02:14:54,480 --> 02:14:58,440
as well as you do English or\nwhatever your spoken language is.
2368
02:15:00,511 --> 02:15:03,031
Suppose I mix apples and\noranges, so to speak
2369
02:15:03,030 --> 02:15:06,421
and I try to put a string in\nan int or an int in a string
2370
02:15:06,421 --> 02:15:08,581
the compiler is going to complain.
2371
02:15:08,581 --> 02:15:10,841
So when I run that make\ncommand as I did earlier
2372
02:15:10,841 --> 02:15:14,893
it's not going to be nice and blissfully\n
2373
02:15:14,893 --> 02:15:17,100
It's going to yell at me\nwith honestly a very cryptic
2374
02:15:17,100 --> 02:15:20,350
looking error message until we get\n
2375
02:15:23,100 --> 02:15:24,990
Ah, what happened to the backslash n.
2376
02:15:24,990 --> 02:15:26,820
So, we'll come back to that\nin just a moment, if we may.
2377
02:15:26,820 --> 02:15:29,778
Because I have deliberately omitted\n
2378
02:15:29,779 --> 02:15:32,311
And we'll see the different\nbehavior in a sec.
2379
02:15:37,470 --> 02:15:39,720
These are the kinds of\nthings that just matter.
2380
02:15:39,720 --> 02:15:43,710
And it's going to take time to recognize\n
2381
02:15:43,711 --> 02:15:48,306
Everything I've typed here except,\n
2382
02:15:48,305 --> 02:15:50,430
And the W is capitalized\njust because it's English.
2383
02:15:50,430 --> 02:15:51,900
Everything else is lowercase.
2384
02:15:51,900 --> 02:15:54,510
And this kind of varies by\nlanguage and also context.
2385
02:15:54,511 --> 02:15:58,621
So, in many languages the convention\n
2386
02:16:00,060 --> 02:16:02,440
Other languages might use\nsome capitals, as well.
2387
02:16:02,440 --> 02:16:04,050
But we'll talk about that before long.
2388
02:16:04,051 --> 02:16:05,951
But this is the kind\nof thing that matters
2389
02:16:05,951 --> 02:16:09,480
and is hard to see at first, especially\n
2390
02:16:09,480 --> 02:16:12,930
different when it's on your tiny\nlaptop screen from a capital S.
2391
02:16:12,930 --> 02:16:15,850
But you'll start to\ndevelop these instincts.
2392
02:16:15,850 --> 02:16:18,060
All right, so besides\nthis particular block
2393
02:16:18,060 --> 02:16:22,180
let's go ahead and consider how we can\n
2394
02:16:22,180 --> 02:16:24,300
So let me switch back to VS Code here.
2395
02:16:24,301 --> 02:16:26,081
This was the program I had earlier.
2396
02:16:26,081 --> 02:16:28,891
And let me go ahead and\nundo my CS50 change.
2397
02:16:30,720 --> 02:16:35,251
Rerun Make on Hello with the original\n
2398
02:16:35,251 --> 02:16:37,230
Enter, nothing bad\nseems to have happened.
2399
02:16:37,230 --> 02:16:40,740
So ./hello, Enter: hello, world.
2400
02:16:40,740 --> 02:16:43,140
Now, if you're curious,\nthis is a good instinct
2401
02:16:43,140 --> 02:16:45,558
to start to acquire: what\nhappens if I get rid of this?
2402
02:16:45,558 --> 02:16:47,850
Well, I'm probably not going\nto break things too badly.
2403
02:16:48,781 --> 02:16:51,240
Let me go ahead now and do make hello.
2404
02:16:52,331 --> 02:16:53,980
So it's not a really bad mistake.
2405
02:16:53,980 --> 02:16:56,700
So let me go ahead and\nrun ./hello.
2406
02:16:58,921 --> 02:17:01,560
Yeah, what do you see that's different?
2407
02:17:01,560 --> 02:17:05,091
Yeah, the dollar sign, my so-called\n
2408
02:17:05,630 --> 02:17:08,780
Well, we can presumably\ninfer now that the backslash
2409
02:17:08,781 --> 02:17:12,531
n is some fancy notation for\nsaying create a new line
2410
02:17:12,531 --> 02:17:14,881
move the cursor, so to\nspeak, to the next line.
2411
02:17:14,880 --> 02:17:18,770
Notice that the cursor will move to\n
2412
02:17:18,771 --> 02:17:20,781
If I keep hitting it,\nit just automatically
2413
02:17:20,781 --> 02:17:22,371
by nature of hitting enter, does it.
2414
02:17:22,370 --> 02:17:25,310
But it'd be kind of stupid if when\n
2415
02:17:25,310 --> 02:17:28,310
simple as it is, if\nthe next command is now
2416
02:17:28,310 --> 02:17:31,770
weirdly spaced in the middle of\n
2417
02:17:33,111 --> 02:17:35,061
It's really just an aesthetic argument.
2418
02:17:35,060 --> 02:17:40,940
And notice that it's not acceptable or\n
2419
02:17:40,941 --> 02:17:43,441
Let me go ahead and save that,\nthough, and see what happens.
2420
02:17:43,441 --> 02:17:47,001
Let me go ahead now and\nrun make hello, Enter.
2421
02:17:49,161 --> 02:17:53,390
This is like, what, 10 lines of\nerrors for a one line program.
2422
02:17:53,390 --> 02:17:56,390
And this is where, again, you'll start\n
2423
02:17:57,380 --> 02:18:00,530
These kinds of tools, like\nthe compiler tool we're using
2424
02:18:00,531 --> 02:18:03,861
were not designed necessarily\nwith user friendliness in mind.
2425
02:18:03,861 --> 02:18:06,081
That's changed over the\ndecades, but certainly early
2426
02:18:06,081 --> 02:18:09,810
on it's really just meant to be\n
2427
02:18:11,150 --> 02:18:13,941
Missing terminating\nclose quote character
2428
02:18:13,941 --> 02:18:17,061
long story short, when\nyou have a string in C
2429
02:18:17,060 --> 02:18:20,511
your double quotes just have to\n
2430
02:18:20,511 --> 02:18:22,049
Now, there's the slight white lie.
2431
02:18:23,091 --> 02:18:29,631
But the best way around it is to\n
2432
02:18:29,630 --> 02:18:32,390
To escape something means generally\nto put a backslash, and then
2433
02:18:32,390 --> 02:18:34,710
a special symbol like n for new line.
2434
02:18:34,710 --> 02:18:39,510
And this is just the agreed upon way\n
2435
02:18:39,511 --> 02:18:42,650
OK, you don't just hit your Enter key.
2436
02:18:42,650 --> 02:18:46,941
You instead put backslash n\nand that tells the computer
2437
02:18:46,941 --> 02:18:49,471
to move the cursor to the new line.
2438
02:18:50,977 --> 02:18:52,310
But once you know it, that's it.
2439
02:18:52,310 --> 02:18:55,470
It's just another word\nin our vocabulary.
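[Editor's sketch] A minimal illustration of the point above, assuming standard C only. The function names are hypothetical, not from the lecture. Because printf returns the number of characters it wrote, the return values show that the two-keystroke sequence \n counts as a single newline character.

```c
#include <stdio.h>

// Prints the greeting with a trailing "\n" escape sequence and returns
// the number of characters printf reports having written.
int print_with_newline(void)
{
    return printf("hello, world\n");
}

// Prints the same greeting without "\n", so whatever is printed next
// (such as the shell prompt) continues on the same line.
int print_without_newline(void)
{
    return printf("hello, world");
}
```

Called from a program's main function, the first returns one more than the second, the difference being the newline itself.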
2440
02:18:55,470 --> 02:18:58,550
So now let me transition to making\n
2441
02:18:58,550 --> 02:19:00,140
Instead of just saying\nHello, world, let me
2442
02:19:00,140 --> 02:19:02,015
change it like last week\nto say Hello, David
2443
02:19:02,015 --> 02:19:04,320
or whoever is interacting\nwith the program.
2444
02:19:04,320 --> 02:19:07,851
So I'm going to do string\nanswer gets, get string
2445
02:19:07,851 --> 02:19:11,256
quote unquote, what's your name.
2446
02:19:11,255 --> 02:19:13,130
I'm not going to bother\nwith a new line here.
2447
02:19:13,730 --> 02:19:15,181
This is now just a judgment call.
2448
02:19:15,181 --> 02:19:17,889
I deliberately want the human to\n
2449
02:19:20,790 --> 02:19:22,970
Well, last week, recall, we used say.
2450
02:19:22,970 --> 02:19:25,251
And then we used the\nother block called join.
2451
02:19:25,251 --> 02:19:27,681
So the idea here is the same.
2452
02:19:27,681 --> 02:19:30,181
But the syntax this week is\ngoing to be a little different.
2453
02:19:30,181 --> 02:19:33,871
It's going to be printf, which\nprints something on the screen.
2454
02:19:33,870 --> 02:19:38,090
I'm going to go ahead\nand say Hello comma.
2455
02:19:38,091 --> 02:19:43,070
And let me just go with this initially\n
2456
02:19:43,070 --> 02:19:46,700
Let me go ahead and recompile my code.
2457
02:19:46,700 --> 02:19:50,292
Whoops, damn, doesn't work still.
2458
02:19:50,292 --> 02:19:51,501
And look at all these errors.
2459
02:19:51,501 --> 02:19:54,441
There's more errors than code I wrote.
2460
02:19:56,431 --> 02:19:58,881
Well, this is actually\nsomething, a mistake you'll see
2461
02:19:58,880 --> 02:20:01,130
somewhat often, at least initially.
2462
02:20:01,130 --> 02:20:03,270
And let's start to glean\nwhat's going on here.
2463
02:20:03,271 --> 02:20:06,921
So here, if I look at the very first\n
2464
02:20:06,921 --> 02:20:09,921
so even though it jumped\ndown the screen pretty fast
2465
02:20:09,921 --> 02:20:12,831
I wrote make hello at\nthe dollar sign prompt.
2466
02:20:12,831 --> 02:20:14,361
And then here's the first error.
2467
02:20:17,120 --> 02:20:21,020
technically character 5, but generally\n
2468
02:20:21,021 --> 02:20:24,951
there's an error, use of\nundeclared identifier string.
2469
02:20:28,101 --> 02:20:30,361
And this is not an\nobvious solution at first.
2470
02:20:30,361 --> 02:20:34,161
But you'll start to recognize\nthese patterns in error messages.
2471
02:20:34,161 --> 02:20:39,740
It turns out that if I want to use\n
2472
02:20:39,740 --> 02:20:43,879
I have to include another library\nup here, another line of code
2473
02:20:43,879 --> 02:20:46,460
rather, called\ncs50.h. We'll come back
2474
02:20:46,460 --> 02:20:48,300
to what this means in just a moment.
2475
02:20:48,300 --> 02:20:54,560
But if I now retroactively say,\n
2476
02:20:55,670 --> 02:20:59,720
Before I added that new line,\nwhat is standard I/O doing?
2477
02:20:59,720 --> 02:21:01,670
Well, if you think\nback to Scratch, there
2478
02:21:01,670 --> 02:21:07,761
were a few examples with the camera and\n
2479
02:21:07,761 --> 02:21:10,281
Remember I had to poke around\nin the extensions button.
2480
02:21:10,281 --> 02:21:12,110
And then I had to load it into Scratch.
2481
02:21:12,110 --> 02:21:14,240
It didn't come natively with Scratch.
2482
02:21:15,920 --> 02:21:18,170
Some functions come with the language.
2483
02:21:18,170 --> 02:21:22,161
But for the most part, if you want to\n
2484
02:21:22,161 --> 02:21:26,570
like printf, you have to load\nthat extension, so to speak
2485
02:21:26,570 --> 02:21:29,361
that more traditionally\nis called a library.
2486
02:21:29,361 --> 02:21:35,121
So there is a standard I/O\nlibrary, STD I/O, standard I/O
2487
02:21:35,120 --> 02:21:37,350
where I/O just means input and output.
2488
02:21:37,351 --> 02:21:39,441
Which means, just like\nin MIT's World, there
2489
02:21:39,441 --> 02:21:43,161
was an extension for doing text\n
2490
02:21:43,161 --> 02:21:45,470
In C, there's an extension, a.k.a.
2491
02:21:45,470 --> 02:21:49,230
a library, for doing\nstandard input and output.
2492
02:21:49,230 --> 02:21:53,450
And so if you want to use any functions\n
2493
02:21:53,450 --> 02:21:58,130
like text from a keyboard, you\nhave to include standard I/O dot
2494
02:21:58,130 --> 02:22:02,360
H. And then you can use printf.
2495
02:22:03,890 --> 02:22:08,690
Get string, it turns out, is a\n
2496
02:22:08,691 --> 02:22:10,971
And as we'll see over\nthe coming weeks, it just
2497
02:22:10,970 --> 02:22:14,780
makes it way easier to\nget input from a user.
2498
02:22:14,781 --> 02:22:18,111
C is very good with printf at\nprinting output on the screen.
2499
02:22:18,111 --> 02:22:21,411
C makes it really annoying and\n
2500
02:22:21,411 --> 02:22:23,191
to just get input from the user.
2501
02:22:23,191 --> 02:22:25,820
So we wrote a function\ncalled get_string
2502
02:22:25,820 --> 02:22:29,240
but the only way you can use\nthat is to load the extension
2503
02:22:29,240 --> 02:22:32,271
a.k.a. load the library called CS50.
2504
02:22:32,271 --> 02:22:35,841
And we'll come back in time, like,\n
2505
02:22:35,841 --> 02:22:39,291
But for now, standard\nI/O is a library that
2506
02:22:39,290 --> 02:22:42,540
gives you access to printf and\ninput- and output-related stuff.
2507
02:22:42,540 --> 02:22:45,110
CS50 is a second library\nthat provides you
2508
02:22:45,111 --> 02:22:48,441
with access to functions\nthat don't come with C
2509
02:22:48,441 --> 02:22:51,501
that include something like get_string.
2510
02:22:51,501 --> 02:22:55,191
So with that said, we've\nnow kind of teased apart
2511
02:22:55,191 --> 02:22:58,171
at a high level what lines\n2 and now 1 are doing.
2512
02:22:58,171 --> 02:23:00,380
Let me go ahead and rerun make hello.
2513
02:23:01,611 --> 02:23:05,099
So all those crazy error messages\nwere resolved by just one fix
2514
02:23:05,099 --> 02:23:07,640
so key takeaway is not to get\noverwhelmed by the sheer number
2515
02:23:08,581 --> 02:23:13,521
Let me now do ./hello and if I type\n
2516
02:23:19,470 --> 02:23:24,091
Yeah, hello answer, because the\n
2517
02:23:24,091 --> 02:23:26,671
And it turns out that if\nyou just write "hello
2518
02:23:26,671 --> 02:23:29,611
answer" all in the double quotes,\nyou're really just passing
2519
02:23:29,611 --> 02:23:32,551
English as the input\nto the printf function
2520
02:23:32,550 --> 02:23:34,470
you're not actually\npassing in the variable.
2521
02:23:34,470 --> 02:23:37,320
And unfortunately in\nC, it's not quite as
2522
02:23:37,320 --> 02:23:40,861
easy to plug things in to\nother things that you've typed.
2523
02:23:40,861 --> 02:23:43,111
Remember in Scratch, there\nwas not just the Say block
2524
02:23:43,111 --> 02:23:45,661
but the Join block,\nwhich was kind of pretty
2525
02:23:45,661 --> 02:23:47,460
you can combine apples and oranges--
2526
02:23:48,900 --> 02:23:52,640
Then we changed it to hello and then\n
2527
02:23:52,640 --> 02:23:54,765
In C, the syntax is going\nto be a little different.
2528
02:23:54,765 --> 02:23:59,850
You tell the computer inside of your\n
2529
02:23:59,851 --> 02:24:05,191
a placeholder there, a so-called\n
2530
02:24:05,191 --> 02:24:08,041
put a string here eventually.
2531
02:24:08,040 --> 02:24:12,661
Then outside of your quotes, you\n
2532
02:24:12,661 --> 02:24:18,421
in whatever variable you want the\n
2533
02:24:19,230 --> 02:24:23,671
So %s is a format code which\nserves as a placeholder.
2534
02:24:23,671 --> 02:24:26,431
And now the printf function\nwas designed by humans years
2535
02:24:26,431 --> 02:24:28,650
ago to figure out how to\ndo the apple and banana
2536
02:24:28,650 --> 02:24:30,972
thing of joining two words together.
2537
02:24:30,972 --> 02:24:33,181
It's not nearly as user-friendly\nas it is in Scratch
2538
02:24:33,181 --> 02:24:35,701
but it's a very common paradigm.
2539
02:24:35,700 --> 02:24:38,880
So let me try and rerun\nthis now. make hello.
2540
02:24:42,630 --> 02:24:45,780
If I type Enter now, now it's hello.
2541
02:24:46,351 --> 02:24:48,810
And the printf, here's the F in printf.
2542
02:24:48,810 --> 02:24:53,581
It formats its input for you by using\n
2543
02:24:53,581 --> 02:24:58,201
strings, represented again by %s.
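[Editor's sketch] The %s placeholder can be tried without the CS50 library by passing the name in directly. This sketch uses snprintf, the standard sibling of printf that writes into a buffer instead of the screen, so the result can be inspected. The function names are hypothetical, and the name parameter stands in for whatever get_string would have returned.

```c
#include <stdio.h>

// Fills buffer with "hello, <name>\n" by substituting name for the %s
// placeholder. snprintf writes at most size bytes, including the
// terminating null character.
void format_greeting(char *buffer, size_t size, const char *name)
{
    snprintf(buffer, size, "hello, %s\n", name);
}

// The mistake from the lecture: the word answer inside the quotes is
// plain text, not a variable, so it is copied out literally.
void format_literal(char *buffer, size_t size)
{
    snprintf(buffer, size, "hello, answer\n");
}
```

With the name David, the first function produces hello, David, while the second always produces hello, answer, whatever the user typed.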
2544
02:24:58,200 --> 02:25:02,650
So a quick question then, if I focus\n
2545
02:25:02,650 --> 02:25:09,480
and even zoom in here, how many\n
2546
02:25:09,480 --> 02:25:13,501
A moment ago, I'll admit that it was\n
2547
02:25:14,790 --> 02:25:19,980
How many inputs might you\ninfer printf is taking now?
2548
02:25:20,521 --> 02:25:25,201
And it's implied by this comma here,\n
2549
02:25:25,200 --> 02:25:29,310
quote, unquote, "hello, %s"\nfrom the second one, answer.
2550
02:25:29,310 --> 02:25:32,730
And then just as a quick safety\ncheck here, why is it not 3?
2551
02:25:32,730 --> 02:25:35,320
Because there's obviously\ntwo commas here.
2552
02:25:35,320 --> 02:25:38,280
Why is it not actually\n3 arguments or inputs?
2553
02:25:43,001 --> 02:25:46,210
The comma to the left is actually\npart of my English grammar
2554
02:25:47,693 --> 02:25:50,110
And, again, here's where\nprogramming can just be confusing
2555
02:25:50,111 --> 02:25:52,901
early on because we're using the\n
2556
02:25:52,900 --> 02:25:55,730
different things, it just\ndepends on the context.
2557
02:25:55,730 --> 02:25:57,970
And so now is actually\na good time to point out
2558
02:25:57,970 --> 02:26:01,751
all of the somewhat pretty colors that\n
2559
02:26:02,290 --> 02:26:06,040
even though I wasn't going to a format\n
2560
02:26:06,040 --> 02:26:08,990
I certainly wasn't changing\nthings to red or blue or whatnot--
2561
02:26:08,990 --> 02:26:13,900
that's because a text editor like\n
2562
02:26:13,900 --> 02:26:17,050
This is a feature of so many different\n
2563
02:26:18,640 --> 02:26:23,020
If your text editor understands the\n
2564
02:26:24,320 --> 02:26:28,700
it highlights in different colors the\n
2565
02:26:28,700 --> 02:26:31,870
So, for instance, string and\nanswer here are in black
2566
02:26:31,870 --> 02:26:35,827
but get_string, a function, is in\nthis sort of nasty brown-yellow
2567
02:26:35,827 --> 02:26:38,411
here right now, but that's just\nhow it displays on the screen.
2568
02:26:38,411 --> 02:26:41,111
The string, though, here in red\nis kind of jumping out at me
2569
02:26:41,111 --> 02:26:42,791
and that's marginally useful.
2570
02:26:44,113 --> 02:26:46,280
That's kind of nice, because\nit's jumping out at me.
2571
02:26:46,281 --> 02:26:49,601
And so it's just using different colors\n
2572
02:26:49,601 --> 02:26:53,021
pop so you can focus on\nhow these ideas interrelate
2573
02:26:53,021 --> 02:26:55,121
and, honestly, when you\nmight make a mistake.
2574
02:26:55,120 --> 02:26:58,030
For instance, let me accidentally\nleave off this quote here.
2575
02:26:58,031 --> 02:27:03,011
And now all of a sudden,\nnotice if I delete the quote
2576
02:27:03,011 --> 02:27:05,711
the colors start to get a little awry.
2577
02:27:05,710 --> 02:27:09,640
But if I go back there and put it\n
2578
02:27:09,640 --> 02:27:11,470
What's another feature\nof this text editor?
2579
02:27:11,470 --> 02:27:15,251
Notice when my cursor is next\nto this parenthesis, which
2580
02:27:15,251 --> 02:27:18,111
demarcates the end of the\ninputs to the function
2581
02:27:18,111 --> 02:27:21,611
notice that highlighted in green\n
2582
02:27:22,150 --> 02:27:24,237
It's just a visually useful\nthing, especially when
2583
02:27:24,237 --> 02:27:26,320
you start writing more and\nmore code, just to make
2584
02:27:26,320 --> 02:27:28,240
sure your parentheses are lining up.
2585
02:27:28,240 --> 02:27:31,390
And that's true for these curly braces\n
2586
02:27:31,390 --> 02:27:33,011
We'll come back to those in a moment.
2587
02:27:33,011 --> 02:27:36,820
If I put my cursor there, you can\n
2588
02:27:37,700 --> 02:27:40,600
So it's nothing in your code\nfundamentally, it's just the editor
2589
02:27:40,601 --> 02:27:42,911
trying to help you, the human, program.
2590
02:27:42,911 --> 02:27:45,251
And you can even see it,\nthough it's a little subtle--
2591
02:27:45,251 --> 02:27:48,130
see these four dots here\nand these four dots here?
2592
02:27:49,691 --> 02:27:53,531
I configured VS Code to\nindent by four spaces, which
2593
02:27:54,761 --> 02:27:58,091
Any time I hit the Tab key, this\ntoo can help you make sure--
2594
02:27:58,091 --> 02:28:00,941
once we have more interesting\nand longer programs--
2595
02:28:00,941 --> 02:28:04,121
that everything lines\nup nice and neatly.
2596
02:28:05,380 --> 02:28:07,390
All right, any questions\nthen on printf or more?
2597
02:28:08,653 --> 02:28:11,122
AUDIENCE: [? Would ?]\nthe printf [INAUDIBLE]??
2598
02:28:11,122 --> 02:28:12,831
DAVID J. MALAN: Short\nanswer, yes. printf
2599
02:28:12,831 --> 02:28:16,130
can handle more than one\ntype of variable or value.
2600
02:28:17,240 --> 02:28:20,671
We're going to see %i is another\nfor plugging in an integer.
2601
02:28:20,671 --> 02:28:24,261
You can have multiple i's, multiple\n
2602
02:28:24,261 --> 02:28:26,150
We'll come back to that\nin just a little bit.
2603
02:28:26,150 --> 02:28:29,841
printf can take many more\narguments than just these two.
2604
02:28:29,841 --> 02:28:32,150
This is just meant to be representative.
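[Editor's sketch] A rough example of mixing format codes, assuming standard C only. It has one %s and one %i in the same format string, filled in left to right by the arguments that follow it. The function name and the sentence it builds are made up for illustration.

```c
#include <stdio.h>

// Builds "<name> scored <points> points\n". The first extra argument
// fills %s (a string), the second fills %i (an integer).
void format_score(char *buffer, size_t size, const char *name, int points)
{
    snprintf(buffer, size, "%s scored %i points\n", name, points);
}
```

Here the formatting function takes a format string plus two values to plug in, and the order of those values must match the order of the placeholders.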
2605
02:28:34,191 --> 02:28:36,201
Can you declare variables\nwithin the printf?
2606
02:28:37,611 --> 02:28:39,651
The only variable I'm\nusing right now is answer
2607
02:28:39,650 --> 02:28:43,251
and it's got to be done outside\n
2608
02:28:43,251 --> 02:28:46,310
Good question, we'll see\nmore of that before long.
2609
02:28:49,744 --> 02:28:51,911
DAVID J. MALAN: How do we\ndownload the CS50 library?
2610
02:28:51,911 --> 02:28:55,091
So we will show you in problem\nset 1 exactly how to do that.
2611
02:28:55,091 --> 02:28:58,421
It's automatically done for you in\n
2612
02:28:58,421 --> 02:29:01,810
If, ultimately, you program on your own\n
2613
02:29:01,810 --> 02:29:03,700
on, it's also installable online.
2614
02:29:03,700 --> 02:29:07,073
But if you want to ask that\nonline or afterward
2615
02:29:07,074 --> 02:29:08,740
we can point you in the right direction.
2616
02:29:13,341 --> 02:29:16,970
DAVID J. MALAN: String is the type\n
2617
02:29:16,970 --> 02:29:19,100
the data type of the variable.
2618
02:29:19,101 --> 02:29:22,281
int is another keyword I alluded\n
2619
02:29:22,281 --> 02:29:26,191
int, for integer, is going to be\n
2620
02:29:26,191 --> 02:29:27,441
AUDIENCE: OK. [? Thank you. ?]
2621
02:29:29,691 --> 02:29:31,108
DAVID J. MALAN: Oh, good question.
2622
02:29:31,108 --> 02:29:34,970
Could I go ahead and just\nplug in this function
2623
02:29:34,970 --> 02:29:39,261
kind of like we did in Scratch,\n
2624
02:29:39,261 --> 02:29:42,470
and just do this, which\nrecall, is reminiscent of what
2625
02:29:42,470 --> 02:29:45,921
I did in Scratch by plopping\nblock on top of block on block?
2626
02:29:48,271 --> 02:29:50,230
Can I put string in front of get_string?
2627
02:29:50,730 --> 02:29:53,700
You only put the word string\nin front of a variable
2628
02:29:55,200 --> 02:29:57,870
And even though I'm apparently\nanswering the wrong question
2629
02:29:57,870 --> 02:30:01,710
let me go ahead and zoom out,\nsave this, do make hello again.
2630
02:30:03,450 --> 02:30:06,180
If I run ./hello, type in David, voila.
2631
02:30:07,450 --> 02:30:10,390
And so, actually, let's go down\n
2632
02:30:10,390 --> 02:30:12,210
Clearly, it's still correct--
2633
02:30:12,210 --> 02:30:14,610
at least, based on my limited testing.
2634
02:30:14,611 --> 02:30:18,001
Is this better designed\nor worse designed?
2635
02:30:18,001 --> 02:30:20,130
Let's open that question\nlike we did last week.
2636
02:30:21,480 --> 02:30:23,820
Yeah, I kind of agree with that.
2637
02:30:23,820 --> 02:30:26,011
Reasonable people could\ndisagree, but I do
2638
02:30:26,011 --> 02:30:29,970
agree that this seems harder to\n
2639
02:30:29,970 --> 02:30:32,438
but wait a minute. get_string\nis going to get used first
2640
02:30:32,438 --> 02:30:34,271
and then it's going to\ngive me back a value.
2641
02:30:34,271 --> 02:30:37,501
So, yeah, it just feels like it\nwas nicer to read top to bottom
2642
02:30:40,615 --> 02:30:44,611
And so this is useful if I only want\n
2643
02:30:44,611 --> 02:30:47,774
If I want to use it later in a\nlonger program, I'm out of luck
2644
02:30:47,773 --> 02:30:49,440
and so I haven't saved it in a variable.
2645
02:30:49,441 --> 02:30:53,041
So I think, long story short, we\ncould debate this all day long.
2646
02:30:53,040 --> 02:30:56,150
But in this case, eh, if you can\nmake a reasonable argument one
2647
02:30:56,150 --> 02:30:59,240
way or the other, that's a\npretty solid ground to stand on.
2648
02:30:59,240 --> 02:31:01,101
But, invariably,\nreasonable people are going
2649
02:31:01,101 --> 02:31:05,641
to disagree, whether first-time\n
2650
02:31:05,640 --> 02:31:09,740
So let's frame this one last example\n
2651
02:31:09,740 --> 02:31:11,380
of taking inputs and outputs.
2652
02:31:11,380 --> 02:31:13,130
The functions we've\nbeen talking about all
2653
02:31:13,130 --> 02:31:17,720
take inputs, otherwise now known as\n
2654
02:31:18,531 --> 02:31:21,740
That's just the fancy word\nfor an input to a function.
2655
02:31:21,740 --> 02:31:25,046
And some functions have either\nside effects, like we saw--
2656
02:31:25,046 --> 02:31:27,171
printing something, saying\nsomething on the screen
2657
02:31:27,171 --> 02:31:28,911
sort of visually or audibly--
2658
02:31:28,911 --> 02:31:33,921
or they return a value, which is a\n
2659
02:31:35,251 --> 02:31:39,261
If we look then at what we did last\n
2660
02:31:39,261 --> 02:31:41,810
the input was what's your\nname, the function was ask
2661
02:31:41,810 --> 02:31:44,780
and the return value was answer.
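[Editor's sketch] The distinction between a side effect and a return value might look like this in C. Both functions are hypothetical illustrations, not part of any library mentioned in the lecture.

```c
#include <stdio.h>

// Side effect only: prints to the screen and hands nothing back,
// which is what the void return type says.
void say(const char *message)
{
    printf("%s\n", message);
}

// Return value only: prints nothing, but hands a result back to the
// caller, who can store it in a variable or pass it along.
int add(int a, int b)
{
    return a + b;
}
```

A caller could write int sum = add(2, 3); to keep the result, whereas say("hello, world") leaves nothing to store.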
2662
02:31:44,781 --> 02:31:49,521
And now let's take a look at this block,\n
2663
02:31:49,521 --> 02:31:51,266
version of what we just did with the %s.
2664
02:31:51,266 --> 02:31:54,921
Last week we said say, then\njoin, then hello and answer.
2665
02:31:54,921 --> 02:31:58,581
But the interesting takeaway there\n
2666
02:31:58,581 --> 02:32:03,021
It was the fact that in Scratch,\ntoo, the output of one function
2667
02:32:03,021 --> 02:32:08,150
like the green join, could become\nthe input to another function
2668
02:32:09,501 --> 02:32:12,470
The syntax in C is\nadmittedly pretty different
2669
02:32:12,470 --> 02:32:14,480
but the idea is essentially the same.
2670
02:32:14,480 --> 02:32:18,380
Here, though, we have\nhello, a placeholder
2671
02:32:18,380 --> 02:32:21,890
but we have to, in this\nworld of C, tell printf
2672
02:32:21,890 --> 02:32:25,110
what we want to plug in\nfor that placeholder.
2673
02:32:25,980 --> 02:32:27,147
But that's the way to do it.
2674
02:32:27,147 --> 02:32:29,690
When we get to Python and other\nlanguages later in the term
2675
02:32:29,691 --> 02:32:31,521
there's actually easier ways to do this.
2676
02:32:31,521 --> 02:32:34,311
But this is a very common\nparadigm, particularly when
2677
02:32:34,310 --> 02:32:37,490
you want to format\nyour data in some way.
2678
02:32:37,490 --> 02:32:40,050
All right, let's then take a\nstep back to where we began
2679
02:32:40,050 --> 02:32:43,700
which was with that whole\nprogram, which had the include
2680
02:32:43,700 --> 02:32:47,630
and it had int main(void) and\nall of this other cryptic syntax.
2681
02:32:47,630 --> 02:32:51,350
This Scratch piece last week\nwas kind of like the go-to
2682
02:32:51,351 --> 02:32:53,869
whenever you want to have a\nmain part of your program.
2683
02:32:53,869 --> 02:32:55,911
It's not the only way to\nstart a Scratch program.
2684
02:32:55,911 --> 02:32:59,181
You could listen for clicks or other\n
2685
02:32:59,181 --> 02:33:03,831
But this was probably the most popular\n
2686
02:33:03,831 --> 02:33:07,478
In C, the closest analog is\nto literally write this out.
2687
02:33:07,478 --> 02:33:10,520
So just like last week, if you were\n
2688
02:33:10,521 --> 02:33:13,101
when green flag clicked,\nas a C programmer
2689
02:33:13,101 --> 02:33:15,771
the first thing you would do is\nafter creating an empty file
2690
02:33:15,771 --> 02:33:18,621
like I did with hello.c,\nyou'd probably type int
2691
02:33:18,620 --> 02:33:22,040
main(void) open curly\nbrace, closed curly brace
2692
02:33:22,040 --> 02:33:26,100
and then you can put all of your\n
2693
02:33:26,101 --> 02:33:29,211
So just like Scratch had\nthis sort of magnetic nature
2694
02:33:29,210 --> 02:33:33,530
to it where the puzzle pieces would snap\n
2695
02:33:33,531 --> 02:33:38,240
tends to use these curly braces, one\n
2696
02:33:38,240 --> 02:33:41,480
And anything inside of\nthose braces, so to speak
2697
02:33:41,480 --> 02:33:44,570
is part of this puzzle piece, a.k.a.
2698
02:33:47,450 --> 02:33:50,840
We went down this rabbit hole a moment ago\n
2699
02:33:50,841 --> 02:33:52,674
even though I didn't\ncall them by this name.
2700
02:33:52,674 --> 02:33:56,856
But, indeed, when we have a whole\n
2701
02:33:56,855 --> 02:33:59,480
Just have the one green flag\nclicked and then say hello, world.
2702
02:34:00,681 --> 02:34:03,351
After all, it's meant to be very\nuser-friendly and graphical.
2703
02:34:03,351 --> 02:34:09,531
In C, though, you technically can't just\n
2704
02:34:10,700 --> 02:34:16,340
Because, again, you need to tell\n
2705
02:34:16,341 --> 02:34:22,041
code that someone else wrote-- so that\n
2706
02:34:22,040 --> 02:34:24,980
You have to load the\nCS50 library whenever
2707
02:34:24,980 --> 02:34:28,280
you want to use get_string or\nother functions, like get_int
2708
02:34:29,630 --> 02:34:32,690
Otherwise, the compiler won't\nknow what get_string is.
2709
02:34:32,691 --> 02:34:35,011
You just have to do it this way.
2710
02:34:35,011 --> 02:34:37,341
The specific file name\nI'm mentioning here
2711
02:34:37,341 --> 02:34:44,310
stdio.h, cs50.h, is what C programmers\n
2712
02:34:44,310 --> 02:34:46,831
We'll see eventually what's\ninside of those files.
2713
02:34:46,831 --> 02:34:51,060
But long story short, it's like a menu\n
2714
02:34:51,060 --> 02:34:54,560
So in cs50.h, there's a menu\nmentioning get_string, get_int
2715
02:34:56,150 --> 02:35:01,790
And in stdio.h, there's a menu of\n
2716
02:35:01,790 --> 02:35:04,640
And that menu is what\nprepares the compiler
2717
02:35:04,640 --> 02:35:08,945
to know how to implement\nthose same functions.
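[Editor's sketch] A small approximation of the "menu" idea, assuming nothing beyond standard C. A header file mostly contains declarations like the first line below, which tell the compiler a function's name, inputs, and return type before it sees the function's actual code. The function names are hypothetical.

```c
// Declaration (prototype): the kind of line a header file contains.
int square(int n);

// Uses square before its definition appears. The prototype above is
// enough for the compiler to check this call.
int area_of_square(int side)
{
    return square(side);
}

// Definition: the actual code, which for a library lives elsewhere.
int square(int n)
{
    return n * n;
}
```

Removing the prototype would leave the compiler unaware of square at the point where it is called, much like calling get_string without including cs50.h.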
2718
02:35:08,945 --> 02:35:10,310
All right, let me pause here.
2719
02:35:16,960 --> 02:35:20,560
A library provides all of the\nfunctionality we're talking about.
2720
02:35:20,560 --> 02:35:25,720
A header file is the very specific\n
2721
02:35:25,720 --> 02:35:27,770
And we'll discuss this more next week.
2722
02:35:27,771 --> 02:35:30,161
For now, they're essentially\nthe same, but we'll discuss
2723
02:35:30,161 --> 02:35:32,800
nuances between the two next week.
2724
02:35:32,800 --> 02:35:36,460
Yeah, the library would be standard\nI/O. The library would be CS50.
2725
02:35:36,460 --> 02:35:41,720
The corresponding header\nfile is stdio.h, cs50.h.
2726
02:35:50,480 --> 02:35:54,230
incredibly common in the world of\n
2727
02:35:54,230 --> 02:35:59,300
but in C, there's technically no\n
2728
02:35:59,300 --> 02:36:02,216
We have sort of conjured it up\nto simplify the first few weeks.
2729
02:36:02,216 --> 02:36:05,091
That's a training wheel that we'll\n
2730
02:36:05,091 --> 02:36:09,411
take away, and we'll see why we've\n
2731
02:36:09,411 --> 02:36:14,161
Because C otherwise makes things\n
2732
02:36:14,161 --> 02:36:16,191
which then gets beside\nthe point for us.
2733
02:36:20,570 --> 02:36:23,480
Early on, you will have to use whatever\n
2734
02:36:23,480 --> 02:36:24,950
That will include CS50's functions.
2735
02:36:24,950 --> 02:36:28,190
Long story short, you referred, I\n
2736
02:36:28,191 --> 02:36:30,981
called scanf, we won't\ntalk about for a few weeks.
2737
02:36:30,980 --> 02:36:36,710
Long story short, in C, it's pretty easy\n
2738
02:36:36,710 --> 02:36:40,310
The catch is that it's really\neasy to do it dangerously.
2739
02:36:40,310 --> 02:36:45,110
And C, because it's an older,\nlower-level language, so to speak
2740
02:36:45,111 --> 02:36:49,431
that gives you pretty much ultimate\n
2741
02:36:49,431 --> 02:36:51,621
It's very easy to make mistakes.
2742
02:36:51,620 --> 02:36:55,100
And, indeed, that too is\nwhy we use the library
2743
02:36:55,101 --> 02:36:58,541
so your code won't crash unintentionally.
2744
02:36:58,540 --> 02:37:01,620
All right, so with this in\nmind, we have this now mapping
2745
02:37:01,620 --> 02:37:03,370
between the Scratch\nversion and the other.
2746
02:37:03,370 --> 02:37:06,537
Let me just give you a quick tour of\n
2747
02:37:06,538 --> 02:37:10,240
types that students will start seeing as\n
2748
02:37:10,240 --> 02:37:13,511
In the world of Linux, here\nis a non-exhaustive list
2749
02:37:13,511 --> 02:37:16,810
of commands with which you'll get\n
2750
02:37:16,810 --> 02:37:18,040
by playing with problem sets.
2751
02:37:18,040 --> 02:37:22,783
We've only seen two of these so\nfar, ls for list, rm for remove.
2752
02:37:22,783 --> 02:37:24,700
But I mention them now\njust so that it doesn't
2753
02:37:24,700 --> 02:37:30,340
feel too foreign when you see them\n
2754
02:37:30,341 --> 02:37:32,411
cp is going to stand for copy.
2755
02:37:32,411 --> 02:37:35,710
mkdir is going to stand\nfor make directory. mv is
2756
02:37:35,710 --> 02:37:38,470
going to stand for move or rename.
2757
02:37:38,470 --> 02:37:44,470
rmdir is going to be remove directory,\n
2758
02:37:44,470 --> 02:37:46,960
and let me show you this\nlast one here first
2759
02:37:46,960 --> 02:37:49,510
only because it's something\nyou'll use so commonly.
2760
02:37:49,511 --> 02:37:53,681
If I go back to my code here on\n
2761
02:37:53,681 --> 02:37:58,091
and re-open the little GUI on the\n
2762
02:37:58,091 --> 02:38:01,451
revealing that I've got two\nfiles, hello and hello.c
2763
02:38:01,450 --> 02:38:02,950
so nothing has changed since then.
2764
02:38:02,950 --> 02:38:05,800
Suppose now that it's\na few weeks into class
2765
02:38:05,800 --> 02:38:07,661
and I want to start\norganizing the code I'm
2766
02:38:07,661 --> 02:38:10,511
writing so that I have a folder\nfor this week or next week
2767
02:38:10,511 --> 02:38:13,421
or maybe a folder for\nproblem set 1, problem set 2.
2768
02:38:15,130 --> 02:38:18,190
In the GUI, I can go up\nhere and do what most of you
2769
02:38:18,191 --> 02:38:19,961
would do instinctively on a Mac or PC.
2770
02:38:19,960 --> 02:38:22,450
You look for like a\nfolder icon, you click it
2771
02:38:22,450 --> 02:38:25,960
and then you name a\nfolder like PSet1, Enter.
2772
02:38:25,960 --> 02:38:28,690
Voila, you've got a folder called PSet1.
2773
02:38:28,691 --> 02:38:34,701
I can confirm as much with my command\n
2774
02:38:34,700 --> 02:38:36,890
How can I list what's in my folder?
2775
02:38:39,504 --> 02:38:41,421
and it's green with an\nasterisk because that's
2776
02:38:41,421 --> 02:38:43,581
my executable, my runnable program--
2777
02:38:43,581 --> 02:38:46,701
hello.c, which is my source\ncode, and now PSet1 with a slash
2778
02:38:46,700 --> 02:38:50,030
at the end, which just implies\nthat it's indeed a folder.
2779
02:38:50,031 --> 02:38:52,321
All right, I didn't really\nwant to do it that way.
2780
02:38:52,320 --> 02:38:53,870
I'd like to do it more advanced.
2781
02:38:53,870 --> 02:38:57,530
So let me go ahead and right-click\non PSet1, delete permanently.
2782
02:38:57,531 --> 02:38:59,711
I get a scary irreversible\nerror message.
2783
02:38:59,710 --> 02:39:01,460
But there's nothing\nin it, so that's fine.
2784
02:39:01,460 --> 02:39:03,560
Now I've deleted it using the GUI.
2785
02:39:03,560 --> 02:39:08,851
But now let me go ahead and start doing\n
2786
02:39:08,851 --> 02:39:11,060
And if you're wondering how\nthings keep disappearing
2787
02:39:11,060 --> 02:39:15,290
if you hit Control-L in your terminal\n
2788
02:39:15,290 --> 02:39:18,470
it will delete everything you previously\n
2789
02:39:18,470 --> 02:39:20,595
In practice, you don't need\nto be doing this often.
2790
02:39:20,595 --> 02:39:23,480
I'm doing it just to keep our\nfocus on my latest commands.
2791
02:39:23,480 --> 02:39:26,962
If I do-- what was the command\nto make a new directory?
2792
02:39:28,320 --> 02:39:30,540
DAVID J. MALAN: Yeah, so\nmkdir, make directory.
2793
02:39:32,251 --> 02:39:34,621
And notice at left, there's my PSet1.
2794
02:39:34,620 --> 02:39:37,210
If I want to get a little\noverzealous, plan for next week
2795
02:39:39,120 --> 02:39:44,722
Suppose now I want to open those\n
2796
02:39:44,722 --> 02:39:46,681
I could double-click on\nit like this, and you'd
2797
02:39:46,681 --> 02:39:48,408
see this little arrow is moving.
2798
02:39:48,407 --> 02:39:51,490
It's not doing anything because there's\n
2799
02:39:51,490 --> 02:39:55,050
But suppose again I want to get more\n
2800
02:39:55,050 --> 02:39:59,130
Notice if I type ls now, I\nsee all four same things.
2801
02:39:59,130 --> 02:40:05,820
Let me change directories\nwith cd space PSet1 Enter.
2802
02:40:05,820 --> 02:40:08,460
And now notice two things\nwill have happened.
2803
02:40:08,460 --> 02:40:14,070
One, my prompt has changed\nslightly to remind me where I am
2804
02:40:14,070 --> 02:40:17,730
just to keep me sane so that I don't\n
2805
02:40:17,730 --> 02:40:21,610
So here is just a visual reminder\n
2806
02:40:21,611 --> 02:40:26,911
If I type ls now, what should\nI see after hitting Enter?
2807
02:40:26,911 --> 02:40:29,591
Nothing, because I've only\ncreated empty folders so far.
2808
02:40:31,050 --> 02:40:35,490
If I wanted to create a folder called\n
2809
02:40:35,490 --> 02:40:37,890
called Mario this week, I can do that.
2810
02:40:37,890 --> 02:40:40,530
Now if I type ls, there is Mario.
2811
02:40:40,531 --> 02:40:42,871
Now if I do cd Mario,\nnotice my prompt's going
2812
02:40:42,870 --> 02:40:44,520
to change to be a little more precise.
2813
02:40:47,191 --> 02:40:49,051
And notice what's happening at top left.
2814
02:40:49,050 --> 02:40:51,150
Nothing now, because these\nfolders are collapsed.
2815
02:40:51,150 --> 02:40:54,390
But if I click the little\ntriangle, there I see Mario.
2816
02:40:54,390 --> 02:40:56,911
Nothing's going on in there\nbecause there's no files yet.
2817
02:40:56,911 --> 02:41:00,810
But suppose now I want to\ncreate a file called mario.c.
2818
02:41:00,810 --> 02:41:04,770
I could go up here, I could click the\n
2819
02:41:04,771 --> 02:41:07,591
Or I can just type code mario.c.
2820
02:41:08,220 --> 02:41:10,082
That creates a new tab for me.
2821
02:41:10,083 --> 02:41:13,291
I'm not going to write any code in here\n
2822
02:41:13,290 --> 02:41:16,950
And now at top left, you'll\nsee that mario.c appears.
2823
02:41:16,950 --> 02:41:19,492
So at some point, you can\neventually just close the Explorer.
2824
02:41:19,493 --> 02:41:22,158
Because, again, it's not providing\nyou with any new information.
2825
02:41:22,158 --> 02:41:23,970
It's maybe more\nuser-friendly, but there's
2826
02:41:23,970 --> 02:41:27,960
nothing you can't do at the command\n
2827
02:41:27,960 --> 02:41:29,880
All right, but now I'm kind of stuck.
2828
02:41:29,880 --> 02:41:32,220
How do I get out of this folder?
2829
02:41:32,220 --> 02:41:34,020
In my Mac or PC world,\nI'd probably click
2830
02:41:34,021 --> 02:41:37,320
the Back button or something like that\n
2831
02:41:37,320 --> 02:41:42,181
In the terminal window,\nI can do cd dot dot.
2832
02:41:42,181 --> 02:41:46,650
Dot dot is a nickname, if you\nwill, for the parent directory.
2833
02:41:46,650 --> 02:41:48,101
That is, the previous directory.
2834
02:41:48,101 --> 02:41:52,601
So if I hit Enter now, notice I'm\n
2835
02:41:52,601 --> 02:41:56,131
a.k.a. directory, and\nnow I'm back in PSet1.
2836
02:41:56,130 --> 02:42:00,240
Or, if I want to be fancy, let me\n
2837
02:42:00,240 --> 02:42:03,300
If I type ls, there's\nmario.c, just to orient us.
2838
02:42:03,300 --> 02:42:06,458
If I want to do multiple things\nat a time, I could do cd ../..
2839
02:42:09,271 --> 02:42:13,021
which goes to my parent to my\ngrandparent all in one breath.
2840
02:42:13,021 --> 02:42:16,541
And voila, now I'm back in my\ndefault folder, if you will.
2841
02:42:16,540 --> 02:42:20,911
And one last little trick of the\n
2842
02:42:20,911 --> 02:42:24,181
was a moment ago, and you're\njust tired of all the navigation
2843
02:42:24,181 --> 02:42:26,881
if you just type cd and\nhit Enter, it'll whisk you
2844
02:42:26,880 --> 02:42:29,070
away back to your default\nfolder, and you don't have
2845
02:42:29,070 --> 02:42:31,320
to worry about getting there manually.
2846
02:42:31,320 --> 02:42:38,521
Recall a bit ago, though, that I\n
2847
02:42:38,521 --> 02:42:42,751
If dot dot refers to my parent,\nperhaps infer here syntactically
2848
02:42:42,751 --> 02:42:46,720
what does a single dot mean instead?
2849
02:42:46,720 --> 02:42:49,480
It means this directory,\nyour current directory.
2850
02:42:50,900 --> 02:42:52,993
It just makes super\nexplicit to the computer
2851
02:42:52,993 --> 02:42:55,451
that I want the program called\nhello that's installed here
2852
02:42:55,450 --> 02:42:59,500
not in some random other folder\non my hard drive, so to speak.
2853
02:42:59,501 --> 02:43:02,425
I want the one that's\nright here instead.
2854
02:43:02,425 --> 02:43:04,300
All right, so besides\nthese commands, there's
2855
02:43:04,300 --> 02:43:06,452
going to be others that\nwe encounter over time.
2856
02:43:06,452 --> 02:43:07,661
Those are kind of the basics.
2857
02:43:07,661 --> 02:43:11,710
That allows you to wean yourself off\n
2858
02:43:11,710 --> 02:43:14,470
and start using more comfortably,\nwith practice and time
2859
02:43:14,470 --> 02:43:16,360
a command line interface instead.
2860
02:43:16,361 --> 02:43:19,631
Well, what about those other\ntypes, now back in the world of C?
2861
02:43:19,630 --> 02:43:23,950
Those commands were not C. Those are\n
2862
02:43:23,950 --> 02:43:28,300
interface, like in Linux, which,\n
2863
02:43:28,300 --> 02:43:30,310
It's an alternative\nto Mac OS and Windows.
2864
02:43:30,310 --> 02:43:34,810
Back in the world of C now, we've\nseen strings, which are words.
2865
02:43:34,810 --> 02:43:38,351
I mentioned int or integer,\nbut there's others as well.
2866
02:43:38,351 --> 02:43:42,581
In the world of C, we've\nseen string, we will see int.
2867
02:43:42,581 --> 02:43:46,031
If you want a bigger integer, there's\n
2868
02:43:46,031 --> 02:43:49,451
If you want a single character,\nthere's something called a char.
2869
02:43:49,450 --> 02:43:53,680
If you want a Boolean value,\ntrue or false, there is a bool.
2870
02:43:53,681 --> 02:43:56,111
And if you want a floating-point value--
2871
02:43:56,111 --> 02:43:59,781
a fancy way of saying a real number,\n
2872
02:43:59,781 --> 02:44:03,201
that is what C and other\nlanguages call a float.
2873
02:44:03,200 --> 02:44:06,610
And if you want even more numbers\nafter the decimal point that
2874
02:44:06,611 --> 02:44:10,211
is more precision, you can\nuse something called a double.
2875
02:44:10,210 --> 02:44:14,020
That is to say, here is, again,\nan example in programming
2876
02:44:14,021 --> 02:44:17,681
where it's up to you now to provide\n
2877
02:44:17,681 --> 02:44:21,640
that it will rely on to know what\n
2878
02:44:23,331 --> 02:44:26,621
Is it a sound, an image,\na color, or the like?
2879
02:44:26,620 --> 02:44:30,970
These are the types of data types\n
2880
02:44:30,970 --> 02:44:35,800
What are the functions that come in\n
2881
02:44:35,800 --> 02:44:39,251
We talked about standard I/O, and\n
2882
02:44:39,251 --> 02:44:42,911
In the CS50 library, you can\nsee that it follows a pattern.
2883
02:44:42,911 --> 02:44:45,581
The CS50 library exists largely\nfor the first few weeks
2884
02:44:45,581 --> 02:44:50,997
of the class to make our lives easier\n
2885
02:44:50,997 --> 02:44:53,831
So if you want to get a string,\n
2886
02:44:54,700 --> 02:44:57,742
If you want to get an integer from\n
2887
02:44:57,743 --> 02:45:01,091
When you want to get any of those\n
2888
02:45:03,861 --> 02:45:06,221
And they're indeed all\nlowercase by convention.
2889
02:45:07,331 --> 02:45:10,630
If we have the ability now to\nstore different types of data
2890
02:45:10,630 --> 02:45:13,751
and we have functions with which\nto get different types of data
2891
02:45:13,751 --> 02:45:16,841
how might you go about printing\ndifferent types of data?
2892
02:45:16,841 --> 02:45:23,800
Well, we've seen %s for string,\n%i for integer, %c for char
2893
02:45:23,800 --> 02:45:30,130
%f for a float or a double, those\n
2894
02:45:30,130 --> 02:45:33,760
and then %li for a long integer.
2895
02:45:33,761 --> 02:45:36,281
So here's the first\nexample of inconsistencies.
2896
02:45:36,281 --> 02:45:38,981
In an ideal world, that would\njust be %l and we'd move on.
2897
02:45:38,980 --> 02:45:42,940
It's %li instead in this case.
2898
02:45:42,941 --> 02:45:45,701
That's printf and some\nof its format codes.
2899
02:45:49,511 --> 02:45:51,858
no pun intended-- there is\na whole bunch of operators.
2900
02:45:51,858 --> 02:45:54,191
And, indeed, computers, one\nof the first things they did
2901
02:45:54,191 --> 02:45:57,941
was a lot of math and calculations, so\n
2902
02:45:57,941 --> 02:46:01,161
Computers, and in turn, C, really\ngood at addition, subtraction
2903
02:46:01,161 --> 02:46:04,191
multiplication, division,\nand even the percent sign
2904
02:46:04,191 --> 02:46:05,740
which is the remainder operator.
2905
02:46:05,740 --> 02:46:08,650
There's a special symbol\nin C and other languages
2906
02:46:08,650 --> 02:46:13,161
just for getting the remainder, when\n
2907
02:46:13,161 --> 02:46:18,911
There are other features in the world\n
2908
02:46:18,911 --> 02:46:22,931
And there's also what is of\n
2909
02:46:22,931 --> 02:46:27,371
makes it easier over time\nto write fewer characters
2910
02:46:27,370 --> 02:46:29,090
but express your thoughts the same.
2911
02:46:29,091 --> 02:46:33,470
So just as a single example\nof this, as a single example
2912
02:46:33,470 --> 02:46:37,331
consider this use of\na variable last week.
2913
02:46:37,331 --> 02:46:41,740
Here in Scratch is how you might\n
2914
02:46:41,740 --> 02:46:43,960
In C, it's going to be similar.
2915
02:46:43,960 --> 02:46:46,240
If you want the variable\nto be called counter
2916
02:46:46,240 --> 02:46:49,540
you literally write the word counter,\n
2917
02:46:49,540 --> 02:46:53,350
You then use the assignment\noperator, a.k.a. the equals sign
2918
02:46:53,351 --> 02:46:56,931
and you assign it whatever its initial\n
2919
02:46:56,931 --> 02:47:01,001
So, again, the 0 is going to get copied\n
2920
02:47:01,001 --> 02:47:02,990
because of that single equal sign.
2921
02:47:02,990 --> 02:47:05,831
But this isn't sufficient\nin C. What else
2922
02:47:05,831 --> 02:47:08,890
is missing on the right-hand\nside, instinctively now?
2923
02:47:08,890 --> 02:47:11,485
Even if you've never\nprogrammed in this before.
2924
02:47:13,150 --> 02:47:14,775
DAVID J. MALAN: A semicolon at the end.
2925
02:47:14,775 --> 02:47:16,941
And one other thing, I\nthink, is probably missing.
2926
02:47:18,414 --> 02:47:19,581
DAVID J. MALAN: A data type.
2927
02:47:19,581 --> 02:47:22,671
So if we can keep going\nback and forth here
2928
02:47:22,671 --> 02:47:26,320
what data type seems appropriate\nintuitively for counter?
2929
02:47:27,431 --> 02:47:29,861
So, indeed, we need to\ntell the computer when
2930
02:47:29,861 --> 02:47:32,531
creating a variable what\ntype of data we want
2931
02:47:32,531 --> 02:47:35,721
and we need to finish our\nthought with the semicolon.
2932
02:47:35,720 --> 02:47:38,140
So there might be a counterpart there.
2933
02:47:38,140 --> 02:47:43,270
What about in Scratch if we wanted\n
2934
02:47:43,271 --> 02:47:45,671
We had this very user-friendly\npuzzle piece last time
2935
02:47:45,671 --> 02:47:49,740
that was change counter\nby 1, or add 1 to counter.
2936
02:47:49,740 --> 02:47:54,120
In C, here's where things get\na little more interesting.
2937
02:47:54,120 --> 02:47:58,110
And pretty commonly done, you might\n
2938
02:47:59,310 --> 02:48:02,220
And this is where, again, it's\n
2939
02:48:03,361 --> 02:48:05,341
Otherwise, this makes no sense.
2940
02:48:05,341 --> 02:48:08,251
counter cannot equal\ncounter plus 1, right?
2941
02:48:08,251 --> 02:48:11,171
That just doesn't work if we're\ntalking about integers here.
2942
02:48:11,171 --> 02:48:13,411
That's because the equal\nsign is assignment.
2943
02:48:13,411 --> 02:48:15,751
So it can certainly be the\ncase that you calculate
2944
02:48:15,751 --> 02:48:19,921
counter plus 1, whatever that is,\n
2945
02:48:19,921 --> 02:48:22,871
from right to left to be that new value.
2946
02:48:22,870 --> 02:48:25,200
This, as we'll see,\nis a very common thing
2947
02:48:25,200 --> 02:48:29,100
to do in programming just to kind of\n
2948
02:48:29,101 --> 02:48:30,931
You can write this more succinctly.
2949
02:48:30,931 --> 02:48:34,711
This code here is what we'll\ncall syntactic sugar, sort
2950
02:48:34,710 --> 02:48:39,540
of a fancy way of saying the same thing\n
2951
02:48:40,320 --> 02:48:44,220
This also adds 1, or whatever\nnumber you type over here
2952
02:48:44,220 --> 02:48:46,081
to the variable on the left.
2953
02:48:46,081 --> 02:48:49,771
And there's one other form of syntactic\n
2954
02:48:49,771 --> 02:48:51,601
and it's even more terse than this.
2955
02:48:51,601 --> 02:48:56,851
That too will increment counter by 1\n
2956
02:48:56,851 --> 02:48:59,701
Or if you change it to minus\nminus, subtracting 1 from it.
2957
02:48:59,700 --> 02:49:02,310
You can't do that with\n2 and 3 and 4, but you
2958
02:49:02,310 --> 02:49:07,860
can do it by default with just plus plus\n
2959
02:49:12,611 --> 02:49:15,641
DAVID J. MALAN: Ah, so when you are\n
2960
02:49:15,640 --> 02:49:20,200
has been created, as we did with\nthe code that looked like this
2961
02:49:20,200 --> 02:49:23,590
you no longer need to remind the\ncomputer what the data type is.
2962
02:49:23,591 --> 02:49:27,220
Thankfully, the computer is\nat least as smart as that.
2963
02:49:27,220 --> 02:49:31,990
It will remember the type of\nthe data that you intended.
2964
02:49:31,990 --> 02:49:34,941
Other questions or comments on this?
2965
02:49:34,941 --> 02:49:36,191
All right, that's quite a lot.
2966
02:49:36,191 --> 02:49:38,570
Why don't we go ahead and\nhere take a 10-minute break?
2967
02:49:38,570 --> 02:49:41,771
And we'll be back, we'll\nstart writing some code.
2968
02:49:44,980 --> 02:49:48,581
We've just looked at some\nof the basics of compiling
2969
02:49:48,581 --> 02:49:50,335
even if it doesn't\nquite feel that basic.
2970
02:49:50,335 --> 02:49:52,210
But now, let's actually\nstart focusing really
2971
02:49:52,210 --> 02:49:55,510
on writing more and more code,\nmore and more interesting
2972
02:49:55,511 --> 02:49:58,701
code, kind of like we dove\ninto Scratch last week.
2973
02:49:58,700 --> 02:50:00,610
So here I have VS Code open.
2974
02:50:01,570 --> 02:50:04,361
I'm going to focus more on my\n
2975
02:50:04,361 --> 02:50:07,444
Many different ways I can create new\n
2976
02:50:08,511 --> 02:50:11,501
So, again, within this\nenvironment of VS Code
2977
02:50:11,501 --> 02:50:15,581
I can literally write the code\n
2978
02:50:15,581 --> 02:50:18,371
and it just creates a new\nfile for me automatically.
2979
02:50:18,370 --> 02:50:20,411
Or I could do that in the GUI.
2980
02:50:20,411 --> 02:50:23,382
I'm going to go ahead and create\nthis file called calculator.c
2981
02:50:23,382 --> 02:50:25,841
and I'm going to go ahead and\ninclude some familiar things.
2982
02:50:25,841 --> 02:50:30,851
So I'm just going to go ahead and\n
2983
02:50:30,851 --> 02:50:33,911
I'm going to go ahead from\nmemory and do the int main void--
2984
02:50:33,911 --> 02:50:38,248
more on that next week, why it's\n
2985
02:50:38,248 --> 02:50:40,540
And now let me just implement\na very simple calculator.
2986
02:50:40,540 --> 02:50:44,360
We saw some mathematical\noperators, like plus and the like.
2987
02:50:45,861 --> 02:50:48,671
So let me go ahead and\nfirst give myself a variable
2988
02:50:48,671 --> 02:50:52,480
called x, sort of like grade\nschool math or algebra.
2989
02:50:52,480 --> 02:50:55,570
Let me go ahead then and\nget an int, which is new
2990
02:50:55,570 --> 02:50:57,251
but I mentioned this exists.
2991
02:50:57,251 --> 02:51:00,640
And then let me just ask the user\nfor whatever their x value is.
2992
02:51:00,640 --> 02:51:03,984
The thing in the quotes is\njust the English, or the string
2993
02:51:03,984 --> 02:51:06,650
that I'm printing on the screen.\nso I could say anything I want.
2994
02:51:06,650 --> 02:51:09,931
I'm just going to say x colon\nto prompt the user accordingly.
2995
02:51:09,931 --> 02:51:12,431
Now I'm going to go ahead and\nget another variable called y.
2996
02:51:13,630 --> 02:51:16,270
And now, I'm going to\nprompt the user for y.
2997
02:51:16,271 --> 02:51:18,551
And I'm just very nitpickily\nusing a space just
2998
02:51:18,550 --> 02:51:22,030
to move the cursor so it doesn't\nlook too messy on the screen.
2999
02:51:22,031 --> 02:51:27,221
And then lastly, let me go ahead and\n
3000
02:51:27,220 --> 02:51:31,720
In an ideal world, I would just\nsay something like printf x + y.
3001
02:51:31,720 --> 02:51:36,640
But that is not valid in C. The\n
3002
02:51:36,640 --> 02:51:39,261
has to be a string in double quotes.
3003
02:51:39,261 --> 02:51:43,060
So if I want to print out\nthe value of an integer
3004
02:51:43,060 --> 02:51:47,320
I need to put something in quotes\n
3005
02:51:47,320 --> 02:51:49,220
if I want to move the cursor as well.
3006
02:51:49,220 --> 02:51:51,640
So, again, we only glimpsed\nit briefly, but what
3007
02:51:51,640 --> 02:51:55,480
do I replace these question marks with\n
3008
02:51:56,570 --> 02:51:57,980
DAVID J. MALAN: Yeah, so %i.
3009
02:51:57,980 --> 02:52:00,591
Just like %s was string, %i is integer.
3010
02:52:02,361 --> 02:52:06,351
And now if I want to add x and y, for\n
3011
02:52:06,351 --> 02:52:09,771
doesn't do much of anything other\n
3012
02:52:11,341 --> 02:52:14,091
And, again, it looks definitely\ncryptic at first glance.
3013
02:52:14,091 --> 02:52:16,490
It would be nice if programming\nweren't this cryptic.
3014
02:52:16,490 --> 02:52:18,511
Other languages will\nclean this up for us.
3015
02:52:18,511 --> 02:52:22,191
But, again, if you focus on the\n
3016
02:52:22,191 --> 02:52:26,150
which is a format string with\nEnglish or whatever language
3017
02:52:27,591 --> 02:52:31,550
then it takes potentially more\narguments after the comma
3018
02:52:33,921 --> 02:52:36,800
All right, let me go ahead\nnow and make calculator
3019
02:52:36,800 --> 02:52:41,300
which, again, compiles\nmy source code in C
3020
02:52:41,300 --> 02:52:44,480
pictured above, and converts\nit into corresponding machine
3021
02:52:50,720 --> 02:52:53,841
Let's do 1 plus 1 and Enter.
3022
02:52:54,681 --> 02:52:57,531
Now I have the makings of a calculator.
3023
02:52:57,531 --> 02:53:00,421
Now let's start to tinker\nwith this a little bit.
3024
02:53:00,421 --> 02:53:02,271
What if I instead had done this?
3025
02:53:02,271 --> 02:53:08,150
int z = x + y and then plug-in z here.
3026
02:53:08,150 --> 02:53:14,841
If I rerun make calculator, Enter,\n
3027
02:53:14,841 --> 02:53:20,791
still equals 2, and let me claim that\n
3028
02:53:20,790 --> 02:53:22,820
which of these versions\nis better designed?
3029
02:53:22,820 --> 02:53:27,650
If both seem to be correct at very\n
3030
02:53:27,650 --> 02:53:30,710
or is the previous one without the z?
3031
02:53:30,710 --> 02:53:34,190
OK, so this one is arguably better\n
3032
02:53:34,191 --> 02:53:36,501
variable called z that I\ncan not only print but, heck
3033
02:53:36,501 --> 02:53:39,300
if my program is longer,\nI can use it elsewhere.
3034
02:53:42,272 --> 02:53:44,680
Debatable, like before, because\nit depends on my intent.
3035
02:53:44,681 --> 02:53:46,556
And, honestly, I think\na pretty good argument
3036
02:53:46,556 --> 02:53:48,001
can be made for the first version.
3037
02:53:48,001 --> 02:53:50,851
Because if I have no\nintention of-- as you note--
3038
02:53:50,851 --> 02:53:53,801
using that variable\nagain, you know what?
3039
02:53:53,800 --> 02:53:55,800
Maybe I might as well do\nthis, just because it's
3040
02:53:55,800 --> 02:53:57,331
one less thing to think about.
3041
02:53:58,441 --> 02:54:00,900
It's one less line of code\nto have to understand.
3042
02:54:02,040 --> 02:54:04,770
So here, again, it does\ndepend on your intention.
3043
02:54:04,771 --> 02:54:07,021
But this feels pretty reasonable.
3044
02:54:07,021 --> 02:54:09,301
And I think, as someone\nnoted earlier, when
3045
02:54:09,300 --> 02:54:13,411
I did the same thing with get_string,\n
3046
02:54:13,411 --> 02:54:16,335
s line because get_string and the\nwhat's your name inside of it
3047
02:54:16,335 --> 02:54:17,460
it was just so much longer.
3048
02:54:17,460 --> 02:54:20,850
But x + y, eh, it's not that hard\nto wrap our mind around what's
3049
02:54:20,851 --> 02:54:23,060
going on inside of the printf argument.
3050
02:54:23,060 --> 02:54:25,831
So, again, these are the kinds\nof thoughts that hopefully you'll
3051
02:54:25,831 --> 02:54:28,531
acquire the instinct for\non not necessarily reaching
3052
02:54:28,531 --> 02:54:30,990
the same answer as someone\nelse, but, again, the thought
3053
02:54:30,990 --> 02:54:33,220
process is what matters here.
3054
02:54:33,220 --> 02:54:36,180
All right, so how might I enhance\nthis program a little bit?
3055
02:54:36,181 --> 02:54:38,471
Let's just talk about\nstyle for just a moment.
3056
02:54:38,470 --> 02:54:42,990
So x and y, at least in this case,\n
3057
02:54:43,500 --> 02:54:45,750
Because that's the go-to\nvariable names in math
3058
02:54:45,750 --> 02:54:47,375
when you're adding two things together.
3059
02:54:47,375 --> 02:54:48,909
So x and y seem pretty reasonable.
3060
02:54:48,909 --> 02:54:53,159
I could have done something like,\nwell, maybe my first variable
3061
02:54:53,159 --> 02:54:56,730
should be called first\nnumber and my next variable
3062
02:54:56,730 --> 02:54:58,552
should be called second number.
3063
02:54:58,552 --> 02:55:00,511
And then down here, I\nwould have to change this
3064
02:55:00,511 --> 02:55:04,329
to first number plus second number.
3065
02:55:04,329 --> 02:55:07,050
Like, eh, this isn't really\nadding anything semantically
3066
02:55:08,341 --> 02:55:11,230
But that would be one other\n
3067
02:55:11,230 --> 02:55:14,761
So if you have very simple\nideas that are conventionally
3068
02:55:14,761 --> 02:55:18,790
expressed with common variable names\n
3069
02:55:18,790 --> 02:55:22,591
What if I want to annotate this program\n
3070
02:55:22,591 --> 02:55:25,381
Well, I can add in C\nwhat are called comments.
3071
02:55:25,380 --> 02:55:30,030
With a slash slash, two forward slashes,\n
3072
02:55:32,579 --> 02:55:35,400
And then down here, I could\ndo something like prompt user
3073
02:55:35,400 --> 02:55:37,619
for y, just to remind\nmyself what I'm doing there.
3074
02:55:37,620 --> 02:55:40,171
And down here, perform addition.
3075
02:55:40,171 --> 02:55:42,990
Now, in this case, I'm not\nsure these comments are really
3076
02:55:44,101 --> 02:55:47,820
Because in the time it took me to write\n
3077
02:55:47,820 --> 02:55:49,950
I could have just read\nthe three lines of code.
3078
02:55:49,950 --> 02:55:52,890
But as our programs\nget more sophisticated
3079
02:55:52,890 --> 02:55:55,548
and you start to learn more syntax--
3080
02:55:55,549 --> 02:55:58,091
that, honestly, you might forget\nthe next day, the next week
3081
02:55:58,091 --> 02:56:01,890
the next month-- might be useful\n
3082
02:56:01,890 --> 02:56:04,681
reminds you of what your\ncode is doing or maybe even
3083
02:56:06,421 --> 02:56:09,329
With these early programs,\nnot really necessary
3084
02:56:09,329 --> 02:56:11,671
doesn't really add all that\nmuch to our comprehension
3085
02:56:11,671 --> 02:56:14,159
but it is a mechanism\nyou have in place that
3086
02:56:14,159 --> 02:56:18,060
can help you actually remind\nyourself or remind someone
3087
02:56:18,060 --> 02:56:20,279
else what it is that's going on.
3088
02:56:20,279 --> 02:56:23,070
Well, let me go ahead and rerun\n
3089
02:56:24,431 --> 02:56:27,001
And here, too, you might\nthink I'm typing crazy fast--
3090
02:56:28,771 --> 02:56:32,201
So it turns out that\nLinux, the operating system
3091
02:56:32,200 --> 02:56:33,510
we're using here in the cloud--
3092
02:56:33,511 --> 02:56:36,961
but, actually, Windows and Mac\nOS nowadays support this too--
3093
02:56:38,800 --> 02:56:42,300
So if you only have one\nprogram that starts with C-A-L
3094
02:56:42,300 --> 02:56:45,210
you don't have to finish writing\n
3095
02:56:45,210 --> 02:56:47,770
and the computer will\nfinish your thought for you.
3096
02:56:47,771 --> 02:56:51,931
The other thing you can do is\nif you hit Up and keep going up
3097
02:56:51,931 --> 02:56:54,424
you'll scroll through your\nentire history of commands.
3098
02:56:54,424 --> 02:56:56,341
So there too, I've been\nsaving some keystrokes
3099
02:56:56,341 --> 02:56:59,174
by hitting Up quickly rather than\n
3100
02:56:59,831 --> 02:57:02,130
So, again, just another\nlittle convenience
3101
02:57:02,130 --> 02:57:05,911
to make programming and interacting with\n
3102
02:57:05,911 --> 02:57:09,281
All right, let me go ahead and just make\n
3103
02:57:09,281 --> 02:57:11,131
The comments have no functional impact.
3104
02:57:11,130 --> 02:57:13,230
These green things are\njust notes to self.
3105
02:57:13,230 --> 02:57:15,421
Let me run calculator with\nmaybe-- how about this?
3106
02:57:15,421 --> 02:57:19,441
Instead of 1 plus 1,\nhow about 1 billion--
3107
02:57:22,004 --> 02:57:23,171
whoops, let's do that again.
3108
02:57:24,460 --> 02:57:30,175
1 million, 1 billion, and another 1\n
3109
02:57:30,175 --> 02:57:31,550
All right, so that seems correct.
3110
02:57:31,550 --> 02:57:33,092
Let's run this program one more time.
3111
02:57:33,093 --> 02:57:39,281
How about 2 billion\nplus another 2 billion?
3112
02:57:42,120 --> 02:57:45,120
So, apparently, it's not so correct.
3113
02:57:45,120 --> 02:57:49,590
And, clearly, running 1 plus 1 was\n
3114
02:57:50,550 --> 02:57:53,235
What might have gone wrong?
3115
02:57:53,236 --> 02:57:54,361
What might have gone wrong?
3116
02:57:58,341 --> 02:58:00,511
The computer probably ran\nout of space with bits.
3117
02:58:00,511 --> 02:58:05,271
So it turns out with these data types--\n
3118
02:58:05,271 --> 02:58:09,650
and also float and char and those\n
3119
02:58:09,650 --> 02:58:13,280
and, most importantly, finite\nnumber of bits to represent them.
3120
02:58:14,960 --> 02:58:18,890
Newer computers use more bits, older\n
3121
02:58:18,890 --> 02:58:21,841
It's not necessarily standardized\nfor all of these data types.
3122
02:58:21,841 --> 02:58:28,161
But in this case, in this environment,\n
3123
02:58:28,835 --> 02:58:30,980
So with 32 bits, you\ncan count pretty high.
3124
02:58:30,980 --> 02:58:34,161
This is 64 light bulbs on the\nstage and could count even higher.
3125
02:58:34,161 --> 02:58:38,331
An int is only using half of these, or\n
3126
02:58:38,331 --> 02:58:42,351
Now, if you think back to last week,\n
3127
02:58:42,351 --> 02:58:47,371
And if you have 8 bits, 8 zeros and\n
3128
02:58:47,370 --> 02:58:49,700
just a good number to\ngenerally remember as trivia.
3129
02:58:49,700 --> 02:58:53,540
8 bits gives you 256\npermutations of zeros and ones.
3130
02:58:53,540 --> 02:58:57,440
32 gives you roughly how\nmany, if anyone knows?
3131
02:58:59,841 --> 02:59:02,511
So it's roughly 4 billion, 2 to the 32.
3132
02:59:02,511 --> 02:59:04,310
If you don't know that, it's fine.
3133
02:59:04,310 --> 02:59:07,310
Most programmers, though, eventually\n
3134
02:59:10,790 --> 02:59:13,880
2 billion plus 2 billion\nis exactly 4 billion.
3135
02:59:13,880 --> 02:59:17,960
And that actually should\nfit in a 32-bit integer.
3136
02:59:17,960 --> 02:59:20,751
The catch is that my Mac,\nyour PC, and the like
3137
02:59:20,751 --> 02:59:22,791
also like to support negative numbers.
3138
02:59:22,790 --> 02:59:26,150
And if you want to support both positive\n
3139
02:59:26,150 --> 02:59:29,271
means with 32-bit integers,\nyou can count as high
3140
02:59:29,271 --> 02:59:33,543
as roughly 2 billion positive\nor 2 billion negative
3141
02:59:34,501 --> 02:59:38,060
That's still 4 billion, give or\ntake, but it's only half as many
3142
02:59:38,060 --> 02:59:39,661
in one direction or the other.
3143
02:59:39,661 --> 02:59:44,210
So how could I go about implementing\na correct calculator here?
3144
02:59:47,601 --> 02:59:50,181
Yeah, so not just li,\nwhich was for long integer.
3145
02:59:50,181 --> 02:59:54,631
I have to make one more change,\n
3146
02:59:54,630 --> 02:59:59,090
So let me go back up here and change\n
3147
03:00:00,230 --> 03:00:02,400
And then let me change y as well.
3148
03:00:02,400 --> 03:00:05,900
And then let me change the format code\n
3149
03:00:07,970 --> 03:00:10,581
Let me recompile the calculator--
3150
03:00:14,060 --> 03:00:15,601
That's should obviously be the same.
3151
03:00:15,601 --> 03:00:20,701
Now let's do 2 billion\nand another 2 billion
3152
03:00:20,700 --> 03:00:22,330
and cross our fingers this time.
3153
03:00:22,331 --> 03:00:24,720
Now we're counting as high as 4 billion.
3154
03:00:24,720 --> 03:00:27,870
And we can go way higher than\n4 billion, but we're only
3155
03:00:27,870 --> 03:00:29,760
kicking the can down the street a bit.
3156
03:00:29,761 --> 03:00:31,261
Even though we're now using--
3157
03:00:32,431 --> 03:00:37,781
64 bits, which is as long as this\n
3158
03:00:37,781 --> 03:00:40,381
It might be a really big\nvalue, but it's still finite.
3159
03:00:40,380 --> 03:00:42,810
And we'll come back at the\nend of today to these kinds
3160
03:00:44,040 --> 03:00:48,310
Because arguably now, my calculator\n
3161
03:00:48,310 --> 03:00:51,280
billions of possible inputs but not all.
3162
03:00:51,281 --> 03:00:53,401
And that's problematic\nif you actually want
3163
03:00:53,400 --> 03:00:58,740
to use my calculator for any\npossible inputs, not just
3164
03:00:58,740 --> 03:01:03,300
ones that are roughly less than,\n
3165
03:01:03,300 --> 03:01:05,292
All right, any questions then on that?
3166
03:01:05,292 --> 03:01:07,501
But it's really just a\nprecursor for all the problems
3167
03:01:07,501 --> 03:01:10,505
that we're going to have to\neventually deal with later on.
3168
03:01:15,688 --> 03:01:17,021
DAVID J. MALAN: A good question.
3169
03:01:17,261 --> 03:01:20,351
If we were still using z, we would\n
3170
03:01:20,351 --> 03:01:23,471
Otherwise, we'd be ignoring\n32 of the bits that
3171
03:01:23,470 --> 03:01:25,990
had been added together via the longs.
3172
03:01:27,831 --> 03:01:32,351
All right, so how about we spice things\n
3173
03:01:32,351 --> 03:01:35,711
how about something\nwith some conditions?
3174
03:01:35,710 --> 03:01:37,630
Let's start to ask\nsome actual questions.
3175
03:01:37,630 --> 03:01:44,090
So a moment ago, recall that we had\n
3176
03:01:44,091 --> 03:01:46,091
Now let's look back at\nsomething in Scratch that
3177
03:01:46,091 --> 03:01:48,632
looked a little something like\nthis, a bunch of puzzle pieces
3178
03:01:48,632 --> 03:01:50,710
asking questions by way\nof these conditionals
3179
03:01:50,710 --> 03:01:53,710
and then these Boolean expressions\n
3180
03:01:55,240 --> 03:01:58,300
In C, this actually maps pretty cleanly.
3181
03:01:58,300 --> 03:02:02,290
It's much cleaner from left to right\n
3182
03:02:02,290 --> 03:02:04,990
Here, we have just code\nthat looks like this.
3183
03:02:04,990 --> 03:02:09,251
If, a space, two parentheses\nand then x less than y
3184
03:02:09,251 --> 03:02:12,224
and then we have something like\nprintf there in the middle.
3185
03:02:12,224 --> 03:02:14,140
So here, it's actually\nkind of a nice mapping.
3186
03:02:14,140 --> 03:02:16,511
Notice that, just as the\nyellow puzzle piece in Scratch
3187
03:02:16,511 --> 03:02:18,644
is kind of hugging the\npurple puzzle piece
3188
03:02:18,644 --> 03:02:21,310
that's effectively the role that\nthese curly braces are playing.
3189
03:02:21,310 --> 03:02:24,700
They're sort of encapsulating\nall of the code on the inside.
3190
03:02:24,700 --> 03:02:27,850
The parentheses represent\nthe Boolean expression
3191
03:02:27,851 --> 03:02:32,411
that needs to be asked and answered to\n
3192
03:02:32,411 --> 03:02:34,841
And here's an exception to\nwhat I alluded to earlier.
3193
03:02:34,841 --> 03:02:38,890
Usually, when you see a word and\nthen a parenthesis, something
3194
03:02:38,890 --> 03:02:42,130
and then closed parenthesis, I\n
3195
03:02:42,130 --> 03:02:44,320
And I'm still feeling pretty\ngood about that claim.
3196
03:02:45,761 --> 03:02:48,281
And the word if is not a function.
3197
03:02:48,281 --> 03:02:50,501
It's just a programming construct.
3198
03:02:50,501 --> 03:02:55,421
It's a feature of the C language\n
3199
03:02:55,421 --> 03:02:58,001
for different purposes\nfor a Boolean expression.
3200
03:02:58,001 --> 03:02:59,411
How about something like this?
3201
03:02:59,411 --> 03:03:02,501
Last week, if you wanted to\nhave a two-way fork in the road
3202
03:03:02,501 --> 03:03:05,501
go this way or that way,\nyou can have if and else.
3203
03:03:05,501 --> 03:03:08,091
In C, that would look a\nlittle something like this.
3204
03:03:08,091 --> 03:03:11,111
And if we add in the printf's,\nit now looks quite like the same
3205
03:03:11,111 --> 03:03:15,041
but it adds, of course, the word else\n
3206
03:03:15,040 --> 03:03:21,190
As an aside, in C, it's not strictly\n
3207
03:03:21,191 --> 03:03:25,781
if you have only one line\nof code indented underneath.
3208
03:03:25,781 --> 03:03:30,068
For best practice, though, do so anyway,\n
3209
03:03:30,067 --> 03:03:31,900
and ultimately anyone\nelse reading your code
3210
03:03:31,900 --> 03:03:35,411
that you intend for just that one\n
3211
03:03:35,411 --> 03:03:37,060
How about this from last week?
3212
03:03:37,060 --> 03:03:38,681
Here was a three-way fork in the road.
3213
03:03:38,681 --> 03:03:45,073
If x is less than y, else if x is\n
3214
03:03:45,073 --> 03:03:47,531
Now, here's where you have some\ndisparities between Scratch
3215
03:03:47,531 --> 03:03:52,751
and C. Scratch uses an equals sign\n
3216
03:03:52,751 --> 03:03:55,841
C uses a single equals sign\nfor assignment from right
3217
03:03:55,841 --> 03:03:58,611
to left, minor difference\nbetween the two worlds.
3218
03:03:58,611 --> 03:04:02,831
In C, we could implement the same\n
3219
03:04:02,831 --> 03:04:04,331
just this additional else if.
3220
03:04:04,331 --> 03:04:08,695
And if we add in the printf's, it\n
3221
03:04:08,695 --> 03:04:12,460
This is correct both in the\nScratch world and in the C world.
3222
03:04:12,460 --> 03:04:16,420
But could someone make a claim that\n
3223
03:04:18,581 --> 03:04:21,730
We need the else, at least,\nbut we don't need the last if.
3224
03:04:21,730 --> 03:04:24,220
Because, at least in the\nworld of comparing integers
3225
03:04:24,220 --> 03:04:27,220
it's either going to be less\nthan, greater than, or equal to.
3226
03:04:28,730 --> 03:04:31,331
So you can save a few\nseconds, if you will
3227
03:04:31,331 --> 03:04:35,380
of your program running-- a blink of\n
3228
03:04:35,380 --> 03:04:37,870
and then inferring what\nthe answer to the third
3229
03:04:37,870 --> 03:04:41,480
must be just by nature of\nyour own human logic here.
3230
03:04:41,480 --> 03:04:42,820
Now, why is that a good thing?
3231
03:04:42,820 --> 03:04:46,570
If, for instance, x and y\nhappen to equal each other--
3232
03:04:46,570 --> 03:04:51,341
I type in 1 and 1 for both values,\n
3233
03:04:51,341 --> 03:04:55,570
in the case of this version,\nyou're sort of stupidly
3234
03:04:55,570 --> 03:04:59,021
asking three questions, all of\nwhich are going to get asked
3235
03:04:59,021 --> 03:05:02,320
even though the answer is no, no, yes.
3236
03:05:05,800 --> 03:05:09,640
That seems to be unnecessary because\n
3237
03:05:09,640 --> 03:05:13,450
get rid of the unnecessary if and\n
3238
03:05:13,450 --> 03:05:16,480
else print that x is equal to y--
3239
03:05:16,480 --> 03:05:20,960
now if x indeed equals y because\n
3240
03:05:20,960 --> 03:05:26,200
now you're only going to ask two\n
3241
03:05:26,200 --> 03:05:29,480
and then you're going to get\nyour same correct result.
3242
03:05:29,480 --> 03:05:32,140
So, again, a minor detail,\nbut, again, the kinds of things
3243
03:05:32,140 --> 03:05:33,849
you should be thinking\nabout, not only as
3244
03:05:33,849 --> 03:05:36,431
you write your code to\nbe correct but also write
3245
03:05:36,431 --> 03:05:38,831
it to be well-designed as well.
3246
03:05:38,831 --> 03:05:41,081
All right, so why don't we\ngo ahead and translate this
3247
03:05:41,081 --> 03:05:44,650
into the context of an\nactual program here?
3248
03:05:44,650 --> 03:05:46,570
I'll create a blank window here.
3249
03:05:46,570 --> 03:05:50,441
And let's do something with points,\n
3250
03:05:51,521 --> 03:05:54,641
Let me go ahead and\nrun code of points.c.
3251
03:05:54,640 --> 03:05:56,710
That's just going to\ngive me a new text file.
3252
03:05:56,710 --> 03:06:00,760
And then up here, I'm going to\ndo my usual, include cs50.h.
3253
03:06:04,781 --> 03:06:08,231
So a lot of boilerplate, so to\nspeak, in these early programs.
3254
03:06:09,521 --> 03:06:12,820
Let's ask the user, how\nmany points did they
3255
03:06:12,820 --> 03:06:15,060
lose on their most recent CS50 PSet?
3256
03:06:15,060 --> 03:06:19,040
So sort of evoke my photograph of\n
3257
03:06:19,040 --> 03:06:20,690
where I lost a couple of points myself.
3258
03:06:23,800 --> 03:06:27,740
Then I'll ask a question in English\n
3259
03:06:29,300 --> 03:06:33,600
And then once I have this answer,\n
3260
03:06:33,601 --> 03:06:36,861
So if points is less than 2--
3261
03:06:36,861 --> 03:06:40,521
borrowing the syntax that we\nsaw on the screen a moment ago--
3262
03:06:40,521 --> 03:06:42,921
let's go ahead and print\nout something explanatory
3263
03:06:42,921 --> 03:06:49,191
like you lost fewer points\nthan me, backslash n.
3264
03:06:49,191 --> 03:06:52,221
else if points greater than 2--
3265
03:06:52,220 --> 03:06:53,990
which is, again how many I lost--
3266
03:06:53,990 --> 03:06:59,661
I'm going to go ahead and print out you\n
3267
03:06:59,661 --> 03:07:03,847
else if-- wait a minute, else seems\n
3268
03:07:03,847 --> 03:07:05,931
I'm just going to go ahead\nand print out something
3269
03:07:05,931 --> 03:07:12,621
like you lost the same number\nof points as me, backslash n.
3270
03:07:12,620 --> 03:07:16,790
So, really, just a straightforward\n
3271
03:07:16,790 --> 03:07:18,960
but to a concrete scenario here.
3272
03:07:18,960 --> 03:07:21,090
So let me go ahead and save this.
3273
03:07:21,091 --> 03:07:24,140
Let me go ahead and\nrun make points, Enter.
3274
03:07:27,021 --> 03:07:29,331
And then, how many points did you lose?
3275
03:07:31,102 --> 03:07:32,810
All right, you lost\nfewer points than me.
3276
03:07:37,081 --> 03:07:40,851
So, again, we have the ability to\n
3277
03:07:40,851 --> 03:07:44,841
from last week in reality, which is\n
3278
03:07:46,111 --> 03:07:50,811
There's something subtle here, though,\n
3279
03:07:50,810 --> 03:07:53,390
that someone might call a magic number.
3280
03:07:53,390 --> 03:07:56,581
This is programming speak\nfor something I've done here.
3281
03:07:56,581 --> 03:08:01,880
There's a bit of redundancy unrelated\n
3282
03:08:01,880 --> 03:08:07,470
But is there something I typed twice\n
3283
03:08:07,470 --> 03:08:11,430
Exactly, I've hard-coded, so to speak,\n
3284
03:08:11,431 --> 03:08:13,771
in two locations, in this case--
3285
03:08:13,771 --> 03:08:15,701
that did not come from the user.
3286
03:08:15,700 --> 03:08:18,600
So, apparently, once I\ncompile this, this is it.
3287
03:08:18,601 --> 03:08:21,671
You're always comparing\nyourself to me in like, 1996
3288
03:08:21,671 --> 03:08:24,630
which for better or for worse,\nis all the program can do.
3289
03:08:24,630 --> 03:08:27,930
But this is an example too of\na magic number in the sense
3290
03:08:27,931 --> 03:08:31,691
like, wait, where did that 2 come\n
3291
03:08:31,691 --> 03:08:35,251
It feels like we are setting the\n
3292
03:08:35,251 --> 03:08:36,480
of screwing up down the road.
3293
03:08:36,480 --> 03:08:39,931
Because the longer this code gets,\n
3294
03:08:42,900 --> 03:08:45,030
am I going to keep typing the number 2?
3295
03:08:47,911 --> 03:08:49,861
But, honestly, eventually,\nyou're going to screw up
3296
03:08:49,861 --> 03:08:52,411
and you're going to miss one of the\n
3297
03:08:52,411 --> 03:08:54,911
because maybe I did worse the\nnext year, or 1, I did better.
3298
03:08:54,911 --> 03:08:57,550
And you don't want these\nnumbers to get out of sync.
3299
03:08:57,550 --> 03:09:01,050
So what would be a logical\nimprovement to this design
3300
03:09:01,050 --> 03:09:04,110
rather than hard-coding the\nsame number sort of magically
3301
03:09:07,011 --> 03:09:09,689
Yeah, why don't I make a\nvariable that I can use in there?
3302
03:09:09,689 --> 03:09:11,480
So, for instance, I\ncould create a variable
3303
03:09:11,480 --> 03:09:14,540
like this, another integer called mine.
3304
03:09:14,540 --> 03:09:16,370
And I'm just going to\ninitialize it to 2.
3305
03:09:16,370 --> 03:09:19,370
And then I'm going to change\nmentions of 2 to this.
3306
03:09:19,370 --> 03:09:23,000
And mine is a pretty reasonable\nname for a variable insofar
3307
03:09:23,001 --> 03:09:27,261
as it refers to exactly\nwhose points are in question.
3308
03:09:27,261 --> 03:09:29,781
There's a risk here,\nthough, minor though it is.
3309
03:09:29,781 --> 03:09:32,361
I could accidentally\nchange mine at some point.
3310
03:09:32,361 --> 03:09:36,451
Maybe I forget what mine represents,\n
3311
03:09:36,450 --> 03:09:39,020
So there's a way to tell the\ncomputer "don't trust me
3312
03:09:39,021 --> 03:09:40,730
because I'm going to\nscrew up eventually"
3313
03:09:40,730 --> 03:09:43,310
by making a variable constant too.
3314
03:09:43,310 --> 03:09:45,780
So a constant in a\nprogramming language--
3315
03:09:45,781 --> 03:09:47,421
this did not exist in Scratch--
3316
03:09:47,421 --> 03:09:50,541
is just an additional hint to the\n
3317
03:09:50,540 --> 03:09:52,130
you to program more defensively.
3318
03:09:52,130 --> 03:09:55,340
If you don't trust\nyourself necessarily to not
3319
03:09:55,341 --> 03:09:57,650
screw up later, or\nhonestly, in practice
3320
03:09:57,650 --> 03:10:01,161
if you know that number should\nnever change, make it constant
3321
03:10:01,161 --> 03:10:02,820
and never think about it again.
3322
03:10:02,820 --> 03:10:08,091
This tells the compiler to make sure\n
3323
03:10:09,740 --> 03:10:14,240
And another convention in C and other\n
3324
03:10:14,240 --> 03:10:16,692
it's often common to just\ncapitalize the variable.
3325
03:10:16,692 --> 03:10:18,650
Kind of like you're\nyelling, but it really just
3326
03:10:18,650 --> 03:10:20,003
visually makes it stand out.
3327
03:10:20,003 --> 03:10:21,711
So it's kind of like\na nice rule of thumb
3328
03:10:21,710 --> 03:10:24,590
that helps you realize, oh,\nthat must be a constant.
3329
03:10:24,591 --> 03:10:27,291
Capitalization alone does\nnot make it constant.
3330
03:10:28,880 --> 03:10:31,130
But the capitalization\nis just a visual reminder
3331
03:10:31,130 --> 03:10:35,040
that this is somewhere,\nsomehow a constant.
3332
03:10:35,040 --> 03:10:37,490
So just a minor refinement,\nbut, again, we're
3333
03:10:37,490 --> 03:10:41,450
sort of getting better at\nprogramming just by instilling
3334
03:10:43,700 --> 03:10:48,020
Questions, then, on conditionals\nin C or these constants?
3335
03:10:48,756 --> 03:10:53,810
AUDIENCE: Why do you not use\n
3336
03:10:53,810 --> 03:10:58,761
DAVID J. MALAN: Yeah, why do you\n
3337
03:10:59,361 --> 03:11:01,401
This is the way the\nlanguage was designed.
3338
03:11:03,501 --> 03:11:07,220
Generally speaking, when you're\n
3339
03:11:08,181 --> 03:11:09,861
there's no semicolons involved.
3340
03:11:09,861 --> 03:11:14,781
For now, assume that semicolons usually\n
3341
03:11:14,781 --> 03:11:18,411
That's not 100% reliable of a heuristic,\n
3342
03:11:20,720 --> 03:11:23,720
Left hand was not talking to the right\n
3343
03:11:25,191 --> 03:11:27,391
All right, so let's do something else.
3344
03:11:28,681 --> 03:11:31,761
If I have the ability to ask\nsomething conditionally--
3345
03:11:31,761 --> 03:11:33,740
is this thing true or\nis this other thing--
3346
03:11:33,740 --> 03:11:36,650
could I write a very simple program\n
3347
03:11:36,650 --> 03:11:39,980
tells me if a number the\nhuman types is even or odd?
3348
03:11:39,980 --> 03:11:42,390
Well, let me just get the\nframework for that in place.
3349
03:11:42,390 --> 03:11:44,990
Let me go ahead and\nwrite code of parity.c-- parity
3350
03:11:44,990 --> 03:11:47,060
is a fancy way of saying even or odd.
3351
03:11:47,060 --> 03:11:53,511
And let me go ahead and include cs50.h,\n
3352
03:11:53,511 --> 03:11:55,521
again, more on those down the road.
3353
03:11:55,521 --> 03:11:59,781
But, for now, I'm going to go ahead\n
3354
03:11:59,781 --> 03:12:03,351
by calling get_int and asking\nthem for whatever n is.
3355
03:12:03,351 --> 03:12:07,171
And then now I'm going to\nintroduce some pseudocode.
3356
03:12:07,171 --> 03:12:08,931
So here's the first\nexample of a program
3357
03:12:08,931 --> 03:12:12,001
honestly, that I'm not\nreally sure how to proceed.
3358
03:12:12,001 --> 03:12:14,691
So let me just resort to some\npseudocode using comments.
3359
03:12:14,691 --> 03:12:17,221
Eventually, I'll get rid of\nthis and write actual code.
3360
03:12:17,220 --> 03:12:22,675
But if n is even, then print--
3361
03:12:22,675 --> 03:12:24,050
actually, let me just print that.
3362
03:12:24,050 --> 03:12:27,411
Let me just go ahead and say\nprintf, quote unquote, "even"
3363
03:12:27,411 --> 03:12:29,630
because I know how to use printf.
3364
03:12:29,630 --> 03:12:33,230
else-- all right, I\nknow how to printf odd
3365
03:12:33,230 --> 03:12:35,900
so let me just say printf,\nquote unquote, "odd".
3366
03:12:35,900 --> 03:12:39,630
So here, I've sort of taken a bite\n
3367
03:12:39,630 --> 03:12:42,680
And let me go ahead and put\nin my little placeholders.
3368
03:12:42,681 --> 03:12:44,581
I want to do some kind of conditions.
3369
03:12:44,581 --> 03:12:49,400
So if, question marks now, let me go\n
3370
03:12:53,540 --> 03:12:55,640
I'm getting closer to solving this.
3371
03:12:55,640 --> 03:12:59,370
But I still have this\nquestion mark here.
3372
03:12:59,370 --> 03:13:06,090
How, using syntax we've seen, might\n
3373
03:13:08,120 --> 03:13:12,050
There's this little operator I\n
3374
03:13:12,050 --> 03:13:14,100
operator, that will let\nyou do exactly that.
3375
03:13:14,101 --> 03:13:16,663
If you divide any number by\n2, that mathematical heuristic
3376
03:13:16,663 --> 03:13:19,371
is going to tell you if it's even\n
3377
03:13:21,290 --> 03:13:25,040
And that's nice because the alternative\n
3378
03:13:25,040 --> 03:13:33,440
like if n == 0 or if n\nequals 2 or n equals 4--
3379
03:13:33,441 --> 03:13:37,070
your code would be infinitely long if\n
3380
03:13:37,070 --> 03:13:43,911
But if I do n divided by 2\nand look at the remainder--
3381
03:13:43,911 --> 03:13:47,281
it's a little cryptic, but\nthis will indeed do the trick.
3382
03:13:47,281 --> 03:13:51,471
So the percent sign is\nthe remainder operator.
3383
03:13:51,470 --> 03:13:56,790
It does numerator divided by denominator\n
3384
03:13:56,790 --> 03:13:58,980
but, rather, the remainder of that.
3385
03:13:58,980 --> 03:14:02,511
So if you divide anything by 2,\n
3386
03:14:02,511 --> 03:14:06,681
And if, indeed, 2 divides\ninto n evenly, giving you 0
3387
03:14:06,681 --> 03:14:08,060
then you're going to print even.
3388
03:14:10,070 --> 03:14:14,751
But there is something odd-- pun\n
3389
03:14:14,751 --> 03:14:20,640
What is another new piece of syntax,\n
3390
03:14:25,540 --> 03:14:28,090
And I even caught myself\nverbally saying it a moment ago
3391
03:14:28,091 --> 03:14:29,470
just because it's so ingrained.
3392
03:14:34,581 --> 03:14:36,331
DAVID J. MALAN: Yeah, if\nsomething's equivalent to the other.
3393
03:14:36,331 --> 03:14:38,480
So now this is the equality operator.
3394
03:14:38,480 --> 03:14:40,280
It's not assignment from right to left.
3395
03:14:40,281 --> 03:14:42,561
And this one too is an\nexample of, literally
3396
03:14:42,560 --> 03:14:45,440
humans not really planning\nahead, perhaps, left hand
3397
03:14:45,441 --> 03:14:47,541
not talking to right hand\nin that someone decided
3398
03:14:47,540 --> 03:14:49,161
let's use the equals\nsign for assignment.
3399
03:14:49,161 --> 03:14:52,036
And then some number of minutes or\n
3400
03:14:52,036 --> 03:14:53,841
how do we now compare for equality?
3401
03:14:55,054 --> 03:14:58,220
And if you think this is a little weird,\n
3402
03:14:58,220 --> 03:15:01,050
there's a third version where\nyou use three equal signs.
3403
03:15:01,050 --> 03:15:03,331
So, again, it's humans that\ndesign these languages.
3404
03:15:03,331 --> 03:15:06,528
So if you're ever frustrated by them,\n
3405
03:15:06,528 --> 03:15:08,361
it might just not have\nbeen the best design.
3406
03:15:08,361 --> 03:15:11,191
But we just kind of have\nto live with it ever since.
3407
03:15:11,191 --> 03:15:12,891
So let me go ahead and zoom out here.
3408
03:15:12,890 --> 03:15:15,751
Let me go ahead and make parity here.
3409
03:15:15,751 --> 03:15:20,570
So make parity-- and, again, parity\n
3410
03:15:20,570 --> 03:15:23,300
./parity, type in a number like 2.
3411
03:15:26,421 --> 03:15:29,720
3, that's indeed odd, and so forth.
3412
03:15:29,720 --> 03:15:32,870
If we continue testing, presumably,\n
3413
03:15:34,171 --> 03:15:37,701
Let me go ahead now and let me start\n
3414
03:15:37,700 --> 03:15:40,760
because, admittedly, it's getting\n
3415
03:15:40,761 --> 03:15:42,681
all of that boilerplate at the top.
3416
03:15:42,681 --> 03:15:46,011
Let me create a program\ncalled agree.c that's
3417
03:15:46,011 --> 03:15:48,111
reminiscent of any of\nthose forms you have
3418
03:15:48,111 --> 03:15:52,291
to agree to online with a checkbox\n
3419
03:15:52,290 --> 03:15:55,460
So let me throw away all the\nguts of this main program
3420
03:15:55,460 --> 03:15:57,780
and now ask something like this.
3421
03:15:57,781 --> 03:16:01,621
Let me go ahead and prompt\nuser to agree to something.
3422
03:16:01,620 --> 03:16:06,800
I'm going to go ahead and say, how\n
3423
03:16:06,800 --> 03:16:12,110
whatever the question might be--\n
3424
03:16:12,111 --> 03:16:13,951
for yes or no, respectively.
3425
03:16:13,950 --> 03:16:16,970
So if it's only a single\ncharacter, actually, I
3426
03:16:16,970 --> 03:16:18,610
can actually get by with just get_char.
3427
03:16:18,611 --> 03:16:20,361
Not used it before,\nbut it was on our menu
3428
03:16:20,361 --> 03:16:22,341
of functions from the CS50 library.
3429
03:16:22,341 --> 03:16:25,431
And if I want to get\nthe user's response
3430
03:16:25,431 --> 03:16:28,501
the return value should be\na char also on the left.
3431
03:16:28,501 --> 03:16:31,281
So now we've seen strings,\nints, and now chars
3432
03:16:31,281 --> 03:16:33,171
if we only care about a single letter.
3433
03:16:33,171 --> 03:16:38,150
And now let's go ahead,\ncheck whether user agreed.
3434
03:16:38,150 --> 03:16:47,751
So how about if c == 'y', then let me\n
3435
03:16:47,751 --> 03:16:51,900
print out agreed or some\nsuch sentence like that.
3436
03:16:51,900 --> 03:16:54,560
else if they did not type\nc-- or you know what?
3437
03:16:54,560 --> 03:16:56,390
Let's be explicit here,\njust so they can't
3438
03:16:56,390 --> 03:16:59,060
type z or b or some random letter.
3439
03:16:59,060 --> 03:17:07,011
else if c == 'n', n for no, then let me\n
3440
03:17:07,978 --> 03:17:10,520
And I'm just going to ignore\nthe user if they don't cooperate
3441
03:17:10,521 --> 03:17:14,781
and they type z or b or\nsomething that's not y or n.
3442
03:17:14,781 --> 03:17:21,141
All right, let me go ahead now and\n
3443
03:17:22,570 --> 03:17:25,251
Let's go with the default.\nOK, so that seems to work.
3444
03:17:25,251 --> 03:17:27,201
No, I don't agree this time.
3445
03:17:28,411 --> 03:17:33,921
How about my caps lock key is on or\n
3446
03:17:37,972 --> 03:17:41,181
So, obviously, a bug, at least if I want\n
3447
03:17:41,181 --> 03:17:43,021
which is kind of reasonable.
3448
03:17:43,021 --> 03:17:48,480
So what would be the possible\nsolutions here, do you think?
3449
03:17:48,480 --> 03:17:51,620
How do I solve this and tolerate\nboth capital and lowercase?
3450
03:17:51,620 --> 03:17:54,335
Maybe what's the simplest,\nmost naive implementation?
3451
03:17:56,490 --> 03:17:58,990
DAVID J. MALAN: Yeah, so why\ndon't I just ask two questions?
3452
03:17:58,990 --> 03:18:03,550
Or you know what, even more simplistic\n
3453
03:18:03,550 --> 03:18:06,911
if you will, let me just copy\nand paste some of this code.
3454
03:18:06,911 --> 03:18:11,781
Change this to an else-- whoops,\nnot in caps-- else if 'Y'.
3455
03:18:11,781 --> 03:18:13,781
And then I bet I could\ndo the same thing with n.
3456
03:18:13,781 --> 03:18:15,641
But here too, just like\nwith Scratch, as soon
3457
03:18:15,640 --> 03:18:17,724
as you start to find\nyourself copying and pasting
3458
03:18:17,724 --> 03:18:19,480
you're probably doing something wrong.
3459
03:18:19,480 --> 03:18:22,810
And what you said verbally,\nif I may, was actually better.
3460
03:18:22,810 --> 03:18:29,501
Because you're implying that I could\n
3461
03:18:32,841 --> 03:18:39,711
The catch is, you can't use the word OR\n
3462
03:18:39,710 --> 03:18:43,180
So you can express one\nquestion or another.
3463
03:18:43,181 --> 03:18:46,301
You only need one of the\nanswers to be yes or true
3464
03:18:46,300 --> 03:18:48,220
and you use two vertical bars.
3465
03:18:48,220 --> 03:18:50,980
By contrast, just so\nyou've seen it, if you
3466
03:18:50,980 --> 03:18:54,341
wanted to check if something is equal\n
3467
03:18:54,341 --> 03:18:56,890
you could use two ampersands.
3468
03:18:56,890 --> 03:18:59,411
This logically would make\nno sense here, though.
3469
03:18:59,411 --> 03:19:02,650
Certainly, what the human typed can't\n
3470
03:19:03,921 --> 03:19:05,711
So in this case, we do want OR.
3471
03:19:05,710 --> 03:19:07,475
But that allows me to\ntighten my code up.
3472
03:19:07,476 --> 03:19:09,851
I don't have to start copying\nand pasting whole branches.
3473
03:19:09,851 --> 03:19:13,560
I can now ask two questions at once.
3474
03:19:13,560 --> 03:19:16,870
Questions, then, on this variation?
3475
03:19:17,869 --> 03:19:19,661
Can you convert the\ninput to all lowercase?
3476
03:19:20,831 --> 03:19:22,480
We don't have the capability yet.
3477
03:19:22,480 --> 03:19:24,940
It turns out that's going to require--
3478
03:19:24,941 --> 03:19:27,166
to be easy, another library,\nthough we could do it
3479
03:19:27,165 --> 03:19:30,040
ourselves knowing a little bit about\n
3480
03:19:30,040 --> 03:19:34,113
But, yes, that would be an alternative,\n
3481
03:19:37,140 --> 03:19:38,390
DAVID J. MALAN: Good question.
3482
03:19:38,390 --> 03:19:41,987
Unfortunately, you have to be explicit\n
3483
03:19:41,987 --> 03:19:44,320
even though that's kind of\nhow you might think about it.
3484
03:19:44,320 --> 03:19:49,240
You have to ask a complete question\n
3485
03:19:50,230 --> 03:19:51,940
Let me ask a question now too.
3486
03:19:53,050 --> 03:19:57,760
I deliberately used single quotes\n
3487
03:19:59,351 --> 03:20:03,521
Previously, we used double quotes\n
3488
03:20:05,911 --> 03:20:08,970
Correct, string is double quotes for\n
3489
03:20:10,050 --> 03:20:14,210
And single quotes for single characters.
3490
03:20:14,210 --> 03:20:15,800
Because my data type is different.
3491
03:20:15,800 --> 03:20:18,770
I chose the simple route of\njust using a single char.
3492
03:20:18,771 --> 03:20:21,351
In fact, this program\nwon't work with Y-E-S
3493
03:20:21,351 --> 03:20:25,041
or N-O. That's not supported at the\n
3494
03:20:25,040 --> 03:20:27,831
I had to use single quotes\nbecause that's how C does it.
3495
03:20:27,831 --> 03:20:29,790
If you're dealing with\nsingle characters
3496
03:20:29,790 --> 03:20:31,850
a.k.a. chars, use single quotes.
3497
03:20:32,781 --> 03:20:36,591
even if it's one single\ncharacter in a string
3498
03:20:36,591 --> 03:20:39,623
as though you're starting to write\n
3499
03:20:39,623 --> 03:20:40,790
that would be double quotes.
3500
03:20:40,790 --> 03:20:42,860
And we'll see why this\nis before long too.
3501
03:20:42,861 --> 03:20:46,551
But, again, just things to keep\nin mind whenever writing code
3502
03:20:46,550 --> 03:20:48,740
in this particular language.
3503
03:20:50,970 --> 03:20:56,060
So, short answer, if I'm understanding\n
3504
03:20:56,060 --> 03:20:58,270
And this would be even more incorrect.
3505
03:20:58,271 --> 03:21:00,771
But if you don't mind, let me\nkick the can a couple of weeks
3506
03:21:00,771 --> 03:21:02,480
on this as to why this doesn't work.
3507
03:21:02,480 --> 03:21:06,852
The most pleasant way to do this would\n
3508
03:21:06,852 --> 03:21:08,810
But even this is a slippery\nslope, because what
3509
03:21:08,810 --> 03:21:12,179
if the user does something weird,\n
3510
03:21:12,179 --> 03:21:13,970
You can imagine this\ngetting messy quickly.
3511
03:21:13,970 --> 03:21:16,310
I like your idea earlier\nabout just forcing everything
3512
03:21:16,310 --> 03:21:18,560
to lowercase just to standardize things.
3513
03:21:18,560 --> 03:21:22,700
Unfortunately, you cannot compare\nstrings for equality like this
3514
03:21:22,700 --> 03:21:24,780
for, again, reasons we'll\ncome to before long.
3515
03:21:24,781 --> 03:21:27,681
So for today, we're keeping it\n
3516
03:21:27,681 --> 03:21:31,640
not nearly as user-friendly to\nonly tolerate individual letters.
3517
03:21:31,640 --> 03:21:34,820
And there's a question over here.
3518
03:21:34,820 --> 03:21:36,980
On the US English keyboard\nit's shift and then
3519
03:21:36,980 --> 03:21:40,081
the backslash key above Return,\nbut depending on your keyboard
3520
03:21:42,320 --> 03:21:45,140
All right, so let's\nactually now look back
3521
03:21:45,140 --> 03:21:47,181
at something we did a\nlittle bit of last week.
3522
03:21:47,181 --> 03:21:49,971
Let me go ahead and open\na file called meow.c
3523
03:21:49,970 --> 03:21:52,460
because, recall, that's what\nwe had Scratch do initially.
3524
03:21:52,460 --> 03:21:55,251
Let me include not the\nCS50 library this time
3525
03:21:55,251 --> 03:21:59,150
but just stdio.h because I\nonly want printf for this demo.
3526
03:21:59,150 --> 03:22:02,720
Let me go ahead now and\njust print out meow.
3527
03:22:02,720 --> 03:22:06,411
And then if I want the cat to meow\n
3528
03:22:11,210 --> 03:22:14,120
The program is written--\ncorrect, I claim.
3529
03:22:15,591 --> 03:22:17,841
But, again, this was the\nbeginning of our conversation
3530
03:22:17,841 --> 03:22:20,541
last week of not being\nparticularly well-designed.
3531
03:22:20,540 --> 03:22:23,630
And if someone wants to maybe\npoint out the now obvious
3532
03:22:23,630 --> 03:22:28,380
why is this not\nwell-designed, necessarily?
3533
03:22:28,380 --> 03:22:29,820
Yeah, it's just repetition, right?
3534
03:22:29,820 --> 03:22:31,894
Again, I literally\nresorted to copy-paste.
3535
03:22:31,894 --> 03:22:33,810
That should be the signal\nthat you're probably
3536
03:22:33,810 --> 03:22:37,870
doing something wrong or, at best,\n
3537
03:22:37,870 --> 03:22:40,195
So the solution, as you\nmight glean from last week
3538
03:22:40,195 --> 03:22:42,570
is probably going to be one\nof those things called loops.
3539
03:22:42,570 --> 03:22:45,271
So let's just take a look at some\nof the syntax for loops in C.
3540
03:22:45,271 --> 03:22:48,084
But, again, no new ideas,\nit's just some new syntax
3541
03:22:48,084 --> 03:22:49,501
that'll take some getting used to.
3542
03:22:49,501 --> 03:22:53,130
In Scratch, if you wanted to meow\n
3543
03:22:53,130 --> 03:22:57,880
there's not a forever keyword in C, so\n
3544
03:22:57,880 --> 03:22:59,220
But this is the best we can do.
3545
03:22:59,220 --> 03:23:03,331
It turns out there is a\nkeyword called while in C.
3546
03:23:03,331 --> 03:23:05,550
And that kind of has\nthe right semantics
3547
03:23:05,550 --> 03:23:08,740
because it's like while I do\nsomething again and again
3548
03:23:10,230 --> 03:23:14,790
But just like an if condition\nor an else if condition
3549
03:23:14,790 --> 03:23:18,060
those took a Boolean\nexpression in parentheses
3550
03:23:18,060 --> 03:23:21,130
a while loop also takes a Boolean\nexpression in parentheses.
3551
03:23:22,531 --> 03:23:26,041
Now, if I want to do something\n
3552
03:23:26,040 --> 03:23:30,540
say while 2 is greater than\n1, while 3 is greater than 2
3553
03:23:30,540 --> 03:23:32,430
or just something completely arbitrary.
3554
03:23:32,431 --> 03:23:36,240
But that should rub you the wrong\n
3555
03:23:36,240 --> 03:23:41,130
Why 3-- if you want true, just say true.
3556
03:23:41,130 --> 03:23:45,780
So it turns out in C, there are\n
3557
03:23:45,781 --> 03:23:48,901
that are literally true\nand false, respectively.
3558
03:23:48,900 --> 03:23:53,341
I could also put the number 1 for\n
3559
03:23:53,341 --> 03:23:56,081
but most people would just\nsay true to be explicit.
3560
03:23:56,081 --> 03:23:59,431
So it's a little hackish, if\nyou will, but very conventional.
3561
03:23:59,431 --> 03:24:03,781
There's no forever keyword in C. If\n
3562
03:24:03,781 --> 03:24:06,191
I'm going to just use\nsomething like printf here.
3563
03:24:06,191 --> 03:24:08,731
So, again, not perfect\ntranslation from one
3564
03:24:08,730 --> 03:24:11,708
to the other, but absolutely\npossible in C. What about this?
3565
03:24:11,708 --> 03:24:14,041
This is a little more common\nif you want to do something
3566
03:24:14,040 --> 03:24:17,280
a finite number of times, like repeat 3.
3567
03:24:17,281 --> 03:24:21,901
There's a few different ways we can\n
3568
03:24:21,900 --> 03:24:25,470
And here's where C-- like a\nlot of text-based languages
3569
03:24:25,470 --> 03:24:28,921
you kind of have to whip out that\n
3570
03:24:28,921 --> 03:24:31,171
blocks and think about,\nall right, how can I
3571
03:24:31,171 --> 03:24:35,880
build a little machine in software that\n
3572
03:24:35,880 --> 03:24:40,380
Well, let me give myself a variable\n
3573
03:24:40,380 --> 03:24:46,740
Let me create a loop whose Boolean\n
3574
03:24:46,740 --> 03:24:50,310
the idea being here, why don't\nI just kind of count 1, 2, 3?
3575
03:24:50,310 --> 03:24:53,790
So how do I implement\nthis physicality in code?
3576
03:24:53,790 --> 03:24:57,180
I give myself a variable,\nset it to 0, 0 fingers up.
3577
03:24:57,181 --> 03:25:00,001
Now, I ask the question,\nis counter less than 3?
3578
03:25:00,001 --> 03:25:02,970
If so, go ahead and print out meow.
3579
03:25:02,970 --> 03:25:06,331
And just intuitively, even if\n
3580
03:25:06,331 --> 03:25:09,480
before Scratch, what\nmore do I need to do?
3581
03:25:09,480 --> 03:25:12,421
I've left room here for\none more line of logic.
3582
03:25:13,990 --> 03:25:15,220
We have to increase counter.
3583
03:25:15,220 --> 03:25:19,150
So I need code like I showed earlier,\n
3584
03:25:19,150 --> 03:25:21,400
And so here's where\nprogramming sometimes
3585
03:25:21,400 --> 03:25:22,921
becomes a bit more like plumbing.
3586
03:25:22,921 --> 03:25:25,421
You can't just say what you\nmean, like you could in Scratch.
3587
03:25:25,421 --> 03:25:28,001
You have to build a little\nsort of software machine
3588
03:25:28,001 --> 03:25:31,181
that initializes a value, does\n
3589
03:25:31,181 --> 03:25:34,091
And so it's kind of like\nthis software-based machine
3590
03:25:34,091 --> 03:25:37,211
but together, that's just using\nsome familiar building blocks.
3591
03:25:38,380 --> 03:25:40,672
Just like in Scratch, you\nmight have used loops a bunch
3592
03:25:40,673 --> 03:25:42,161
of times, pretty common in C.
3593
03:25:42,161 --> 03:25:44,050
So can we tighten this code up?
3594
03:25:44,050 --> 03:25:48,581
This is correct, but here are\nsome conventions that are popular.
3595
03:25:48,581 --> 03:25:51,025
If you're going to count, just say i.
3596
03:25:51,025 --> 03:25:52,900
A convention in\nprogramming-- with, at least
3597
03:25:52,900 --> 03:25:57,490
languages like C-- is just use i\n
3598
03:25:57,490 --> 03:25:59,650
is to count from like, 0 on up.
3599
03:26:01,900 --> 03:26:04,780
It's just more verbose\nthan you need to be.
3600
03:26:05,740 --> 03:26:07,532
You don't need more semantics than that.
3601
03:26:07,532 --> 03:26:08,990
All right, what else can I do here?
3602
03:26:08,990 --> 03:26:12,177
There's another opportunity\nto tighten up this code.
3603
03:26:14,720 --> 03:26:17,720
Yeah, that syntactic sugar\nthat does nothing new
3604
03:26:17,720 --> 03:26:19,520
but it does it more succinctly.
3605
03:26:19,521 --> 03:26:23,990
I can change this to either the\n
3606
03:26:25,700 --> 03:26:27,920
Now, this is pretty canonical.
3607
03:26:27,921 --> 03:26:31,941
This is how most people\nwould implement something
3608
03:26:31,941 --> 03:26:34,611
three times using a loop in C--
3609
03:26:34,611 --> 03:26:36,336
using a while loop, that is.
3610
03:26:36,335 --> 03:26:39,350
Turns out that it's so common\nin C and other languages
3611
03:26:39,351 --> 03:26:43,161
to do something finitely many times,\n
3612
03:26:43,161 --> 03:26:46,021
In this model, to be\nclear, the logic, though
3613
03:26:46,021 --> 03:26:49,221
is that we start by initializing the\n
3614
03:26:49,220 --> 03:26:52,310
We then ask the question,\nis i less than 3?
3615
03:26:52,310 --> 03:26:56,030
If so, everything that's\nindented inside the curly braces
3616
03:26:56,031 --> 03:26:59,240
gets executed-- namely,\nmeow then the update.
3617
03:26:59,240 --> 03:27:02,931
Then the computer is going to\nhave to recheck the condition
3618
03:27:02,931 --> 03:27:06,531
to make sure that i hasn't gotten\n
3619
03:27:06,531 --> 03:27:09,621
But if not, it then does this\nagain and it does this again.
3620
03:27:09,620 --> 03:27:12,045
And then it repeats, constantly\nchecking the condition
3621
03:27:12,046 --> 03:27:14,421
and executing what's in the\nblock, checking the condition
3622
03:27:14,421 --> 03:27:15,837
and executing what's in the block.
3623
03:27:15,837 --> 03:27:20,390
After three times of that, the condition\n
3624
03:27:21,620 --> 03:27:24,740
It just proceeds to whatever's\n
3625
03:27:24,740 --> 03:27:27,390
It jumps to the next blocks down below.
3626
03:27:27,390 --> 03:27:30,181
All right, what's another\nway, though, to do this?
3627
03:27:30,181 --> 03:27:32,181
Well, I've deliberately\nbeen counting from 0--
3628
03:27:32,181 --> 03:27:33,972
and that's a programming\nconvention, right?
3629
03:27:33,972 --> 03:27:36,688
We started last week with all\nthe light bulbs off, which was 0.
3630
03:27:36,688 --> 03:27:39,021
So it's pretty reasonable to\nstart counting at 0's, just
3631
03:27:39,853 --> 03:27:41,870
Like, no fingers are up, this is 0--
3632
03:27:43,560 --> 03:27:47,841
But if you prefer, you could\nstart counting at i equals 1.
3633
03:27:47,841 --> 03:27:51,140
But then you don't want to\ndo it while i is less than 3
3634
03:27:51,140 --> 03:27:54,200
you want to do i is\nless than or equal to 3.
3635
03:27:54,200 --> 03:27:58,640
On most keyboards, there's no symbol for\n
3636
03:27:58,640 --> 03:28:02,690
or equal to, so in C, you\nuse two characters, less than
3637
03:28:02,691 --> 03:28:05,811
and then an equals sign\nwith no spaces in between.
3638
03:28:05,810 --> 03:28:08,251
That just means less than or equal to.
3639
03:28:08,251 --> 03:28:12,771
We could change it to set i to 2\n
3640
03:28:13,820 --> 03:28:18,900
We could make this be a 10\nand less than or equal to 12.
3641
03:28:18,900 --> 03:28:20,810
But, again, just stick with the basics.
3642
03:28:20,810 --> 03:28:23,931
Start at 0 and count on up\nwould be the convention.
3643
03:28:23,931 --> 03:28:27,291
Or if you prefer to count\ndown, that's fine too.
3644
03:28:27,290 --> 03:28:31,700
Set i to 3 and then do this so\nlong as i is greater than 0
3645
03:28:31,700 --> 03:28:34,552
but you have to decrement\ninstead of increment.
3646
03:28:34,552 --> 03:28:36,261
So, again, we could\ndo this all day long.
3647
03:28:36,261 --> 03:28:39,396
There's literally an infinite number\n
3648
03:28:39,396 --> 03:28:41,271
And that's why I keep\nemphasizing convention.
3649
03:28:41,271 --> 03:28:43,761
Call the variable i for\nsomething like this
3650
03:28:43,761 --> 03:28:47,271
initialize it to 0 for something like\n
3651
03:28:47,271 --> 03:28:49,161
unless you really prefer to count down.
3652
03:28:49,161 --> 03:28:51,921
Again, just certain human conventions.
3653
03:28:51,921 --> 03:28:55,111
All right, how about\nanother way to do this?
3654
03:28:55,111 --> 03:28:59,070
This is what's called a for\nloop in C, also very common.
3655
03:28:59,070 --> 03:29:02,091
It's not quite as straightforward\n
3656
03:29:02,091 --> 03:29:04,011
to bottom in exactly the same way.
3657
03:29:04,011 --> 03:29:07,400
This kind of has a lot more\nlogic tucked into its first line.
3658
03:29:07,400 --> 03:29:09,921
But it does exactly the same thing.
3659
03:29:11,630 --> 03:29:15,350
notice that inside the\nparentheses, next to the word for
3660
03:29:15,351 --> 03:29:18,523
there's two semicolons-- which\nis another weird use of syntax.
3661
03:29:18,522 --> 03:29:20,480
They're not at the end\nof the line, now they're
3662
03:29:20,480 --> 03:29:21,855
in the middle of the parentheses.
3663
03:29:21,855 --> 03:29:24,030
But that's what the\nhumans chose years ago.
3664
03:29:24,031 --> 03:29:30,951
The first thing before the semicolons\n
3665
03:29:30,950 --> 03:29:34,040
The next thing is the condition\nthat's going to constantly get
3666
03:29:34,040 --> 03:29:36,560
checked every cycle through this loop.
3667
03:29:36,560 --> 03:29:41,360
And the last thing is going to be\n
3668
03:29:41,361 --> 03:29:42,990
in this case is going to be count up.
3669
03:29:42,990 --> 03:29:45,710
So, again, if I rewind\nwe initialize i to 0.
3670
03:29:45,710 --> 03:29:48,350
We then ask the question,\nis i less than 3?
3671
03:29:48,351 --> 03:29:52,611
If so, execute what's\ninside of the loop.
3672
03:29:52,611 --> 03:29:58,490
Then the computer does this, it does\n
3673
03:29:58,490 --> 03:30:01,220
And then it's not going\nto blindly meow again.
3674
03:30:01,220 --> 03:30:04,581
It's going to check again the\ncondition, is i less than 3?
3675
03:30:04,581 --> 03:30:06,351
Then it's going to meow if so.
3676
03:30:06,351 --> 03:30:10,761
Then it might go ahead and increment\n
3677
03:30:10,761 --> 03:30:14,449
So, again, this does not read quite\n
3678
03:30:14,449 --> 03:30:16,740
You kind of read it left to\nright and then jump around.
3679
03:30:16,740 --> 03:30:21,890
But, again, the initialization,\nthe constant Boolean expression
3680
03:30:21,890 --> 03:30:24,681
being checked, and the\nupdate after each time
3681
03:30:24,681 --> 03:30:32,551
does the exact same thing as what we saw\n
3682
03:30:35,011 --> 03:30:37,052
I think most people would\nprobably eventually use
3683
03:30:37,052 --> 03:30:42,570
a for loop once comfortable, but just\n
3684
03:30:42,570 --> 03:30:45,671
All right, any questions, then, on\n
3685
03:30:47,923 --> 03:30:49,631
DAVID J. MALAN: A for\nloop and while loop
3686
03:30:49,630 --> 03:30:53,140
can both be used to do\nexactly the same thing.
3687
03:30:53,140 --> 03:30:56,733
There are subtle differences\nwith issues of scope
3688
03:30:56,733 --> 03:30:58,691
which we'll discuss before\nlong, where when you
3689
03:30:58,691 --> 03:31:00,921
create a variable in a for loop--
3690
03:31:00,921 --> 03:31:04,390
notice that it was, again, inside\nof those parentheses, which
3691
03:31:04,390 --> 03:31:08,831
technically means it's only going to\n
3692
03:31:08,831 --> 03:31:12,581
By contrast, with the while loop,\nI declared my variable outside
3693
03:31:13,210 --> 03:31:17,030
That variable is going to continue\n
3694
03:31:17,031 --> 03:31:20,008
So that's one of the\nminor differences there.
3695
03:31:20,591 --> 03:31:22,871
But you'll see some others over time.
3696
03:31:22,870 --> 03:31:26,090
All right, so we claim then\nthat it's better in some form
3697
03:31:27,290 --> 03:31:29,331
So let's actually jump back to the code.
3698
03:31:29,331 --> 03:31:33,831
Let me go ahead and now re-implement\n
3699
03:31:33,831 --> 03:31:39,701
So how about for int i\n= 0, i less than 3, i++.
3700
03:31:39,700 --> 03:31:44,440
Then inside my curly braces, let me go\n
3701
03:31:44,441 --> 03:31:46,911
with a newline and a semicolon.
3702
03:31:46,911 --> 03:31:50,661
So I did it pretty quickly just because\n
3703
03:31:50,661 --> 03:31:53,591
But if I now make meow, no errors there.
3704
03:31:57,101 --> 03:31:59,230
Well, let's do now what\nwe did last week, which
3705
03:31:59,230 --> 03:32:03,501
was to begin to make our own\ncustom functions, if you will
3706
03:32:03,501 --> 03:32:09,431
by using our own in C. So here's\n
3707
03:32:09,431 --> 03:32:13,841
but we'll explain over time what\n
3708
03:32:13,841 --> 03:32:17,201
If I want to create a\nfunction called meow--
3709
03:32:17,200 --> 03:32:21,280
because the authors of C did not create\n
3710
03:32:21,281 --> 03:32:23,951
I need to give it a name, like meow.
3711
03:32:23,950 --> 03:32:26,530
I need to specify if\nit takes any inputs.
3712
03:32:26,531 --> 03:32:28,391
For now, I'm going to say no.
3713
03:32:28,390 --> 03:32:34,200
And I'm going to explicitly say no\n
3714
03:32:34,200 --> 03:32:37,080
It's also necessary when\nimplementing a function in C--
3715
03:32:37,081 --> 03:32:38,851
which was not necessary in Scratch--
3716
03:32:38,851 --> 03:32:41,411
to specify what its return type is.
3717
03:32:41,411 --> 03:32:45,060
But for now, I'm just going to say\n
3718
03:32:46,591 --> 03:32:49,531
and that's what the void\nin parentheses means--
3719
03:32:49,531 --> 03:32:54,091
and it does not return\nanything like ask did
3720
03:32:54,091 --> 03:32:56,220
or like get_string or get_int does.
3721
03:32:56,220 --> 03:32:59,940
meow's purpose in life is just to\n
3722
03:32:59,941 --> 03:33:01,990
by printing something on the screen.
3723
03:33:01,990 --> 03:33:04,261
So what is meow going to do?
3724
03:33:04,261 --> 03:33:06,480
I'm going to have it\nquite simply say printf
3725
03:33:06,480 --> 03:33:10,290
quote unquote, "meow", backslash n.
3726
03:33:10,290 --> 03:33:14,670
And now, just like in\nScratch, I can now just call
3727
03:33:14,671 --> 03:33:16,831
a brand new function called meow.
3728
03:33:16,831 --> 03:33:19,531
And here's where too, if you\nreally don't like the curly braces
3729
03:33:19,531 --> 03:33:22,591
technically speaking, you can\nget rid of them when there's
3730
03:33:22,591 --> 03:33:24,810
only one line of code inside your loop.
3731
03:33:24,810 --> 03:33:27,511
But, again, stylistically,\nI would encourage
3732
03:33:27,511 --> 03:33:30,511
you to preserve them to make\nsuper clear to yourself and others
3733
03:33:32,470 --> 03:33:35,100
Let me go ahead and save\nthis and do make meow.
3734
03:33:39,970 --> 03:33:42,140
DAVID J. MALAN: Yeah, so\n0 does not belong there.
3735
03:33:49,601 --> 03:33:51,621
All right, it's still working OK.
3736
03:33:51,620 --> 03:33:54,770
But recall what I did in Scratch,\n
3737
03:33:54,771 --> 03:33:57,820
And just to make a point, let me\njust highlight this and move it
3738
03:33:59,261 --> 03:34:02,291
Because, again, now that meow\nexists, it's an abstraction.
3739
03:34:02,290 --> 03:34:04,600
I just know a meow function exists.
3740
03:34:04,601 --> 03:34:06,111
I want to be able to use it.
3741
03:34:08,021 --> 03:34:09,730
My main function is the same.
3742
03:34:09,730 --> 03:34:12,280
Let me go ahead and make meow again.
3743
03:34:12,281 --> 03:34:17,531
And now, just by moving that function,\n
3744
03:34:17,531 --> 03:34:18,851
And let's look at the first.
3745
03:34:18,851 --> 03:34:21,018
Again, the rule of thumb\nhere-- it's a little small
3746
03:34:21,018 --> 03:34:25,060
but it says meow.c in bold-- which is\n
3747
03:34:25,060 --> 03:34:27,911
5 is the line number,\nand 20 is the character.
3748
03:34:27,911 --> 03:34:30,740
So line number is enough alone.
3749
03:34:32,501 --> 03:34:36,611
Oh, this is what happens\nwhen I scrolled up too far.
3750
03:34:37,210 --> 03:34:39,760
This is the error we're\nnow looking at, line 7.
3751
03:34:39,761 --> 03:34:43,691
I was looking at the old error message\n
3752
03:34:45,550 --> 03:34:49,780
All right, apparently, C does not\n
3753
03:34:49,781 --> 03:34:53,441
Implicit declaration of\nfunction meow is invalid in C99.
3754
03:34:54,581 --> 03:34:57,941
Declaration of function means\nyour creation of a function.
3755
03:34:57,941 --> 03:35:01,511
Like, I'm declaring that meow\nexists, but I haven't apparently
3756
03:35:02,501 --> 03:35:06,230
And then C99 is the version\nof C from the year 1999
3757
03:35:06,230 --> 03:35:09,190
which we generally use here, it's\n
3758
03:35:12,460 --> 03:35:16,030
Can you infer from the mere fact\n
3759
03:35:16,031 --> 03:35:19,041
of the file-- which was fine\nin Scratch but now is bad--
3760
03:35:21,865 --> 03:35:23,990
DAVID J. MALAN: Yeah, C is\njust kind of old school.
3761
03:35:23,990 --> 03:35:25,681
It reads your code top to bottom.
3762
03:35:25,681 --> 03:35:30,001
And if it does not know what meow\n
3763
03:35:30,001 --> 03:35:32,911
it just freaks out and prints\nout these error messages.
3764
03:35:32,911 --> 03:35:38,511
So the solution is, quite simply, don't\n
3765
03:35:38,511 --> 03:35:42,470
But you can imagine this getting a\n
3766
03:35:42,470 --> 03:35:46,761
because main is, by name, the\nmain part of your program.
3767
03:35:46,761 --> 03:35:49,970
And, honestly, it would just\nbe nice if main were always
3768
03:35:51,099 --> 03:35:53,390
Because if you want to\nunderstand what a file is doing
3769
03:35:53,390 --> 03:35:55,400
it makes sense to just\nread it top to bottom.
3770
03:35:55,400 --> 03:35:57,575
Well, there is a solution to this.
3771
03:35:57,575 --> 03:36:02,720
You can put functions in different\n
3772
03:36:02,720 --> 03:36:07,501
as you-- and this is perhaps the\n
3773
03:36:07,501 --> 03:36:10,671
so long as you leave a little\nbreadcrumb for the compiler
3774
03:36:10,671 --> 03:36:13,130
at the very top of your\nfile that literally
3775
03:36:13,130 --> 03:36:17,090
repeats the return value,\nthe name, and the arguments
3776
03:36:17,091 --> 03:36:19,681
to that function, semicolon.
3777
03:36:19,681 --> 03:36:22,791
This is, so to speak,\ndeclaring your function--
3778
03:36:22,790 --> 03:36:25,260
and the real fancy way\nis this is a prototype.
3779
03:36:25,261 --> 03:36:27,541
It's like, what is this\nthing going to look like?
3780
03:36:27,540 --> 03:36:30,480
But the semicolon means I'm not\ngoing to deal with this yet.
3781
03:36:30,480 --> 03:36:32,331
I'm going to actually\ndefine the function
3782
03:36:32,331 --> 03:36:34,701
or implement it down below here.
3783
03:36:34,700 --> 03:36:36,680
This is kind of a stupid detail.
3784
03:36:36,681 --> 03:36:40,221
More recent languages\nget rid of this need
3785
03:36:40,220 --> 03:36:41,993
you can put your functions in any order.
3786
03:36:41,994 --> 03:36:43,911
But, again, if you just\nthink about the basics
3787
03:36:43,911 --> 03:36:46,230
of programming languages\nlike this one here--
3788
03:36:47,210 --> 03:36:49,350
it must just be reading\nyour code top to bottom.
3789
03:36:49,351 --> 03:36:53,060
So annoying, yes, but\nexplained, yes too.
3790
03:36:53,060 --> 03:36:58,280
So let me go ahead and make meow one\n
3791
03:36:58,281 --> 03:37:02,331
And let me make one final enhancement\nto this meow program here.
3792
03:37:02,331 --> 03:37:05,220
Let me go ahead now and\nsay something like this.
3793
03:37:05,220 --> 03:37:07,190
Let me go ahead and say,\nall right, wouldn't it
3794
03:37:07,191 --> 03:37:12,941
be nice if my meow function could do\n
3795
03:37:12,941 --> 03:37:14,701
So suppose I want to do this.
3796
03:37:14,700 --> 03:37:17,870
This meow function at the moment\nis going to meow three times.
3797
03:37:17,870 --> 03:37:21,110
But suppose I want to meow\nn times, where n is just
3798
03:37:21,111 --> 03:37:23,181
some number provided by the user.
3799
03:37:23,181 --> 03:37:27,560
Well, just like in Scratch,\ncustom functions can take inputs
3800
03:37:27,560 --> 03:37:30,120
I just presently am saying void.
3801
03:37:30,120 --> 03:37:34,550
But if I change this to int n,\nthereby telling the compiler
3802
03:37:34,550 --> 03:37:38,150
hey, meow still doesn't\nreturn something
3803
03:37:38,150 --> 03:37:40,581
but it does take something as input.
3804
03:37:40,581 --> 03:37:43,581
It takes an integer,\nand I want to call it n.
3805
03:37:43,581 --> 03:37:46,070
So this is another way\nof declaring a variable
3806
03:37:46,070 --> 03:37:48,890
but a way of declaring a\nvariable that gets handed into
3807
03:37:50,431 --> 03:37:55,101
So now if I tighten up main here, now\n
3808
03:37:55,101 --> 03:37:58,551
just like in Scratch, which is this.
3809
03:37:58,550 --> 03:38:01,040
If I now look at this\ncode-- let me zoom in here--
3810
03:38:01,040 --> 03:38:04,310
now my main program is really\nwell-written in the sense
3811
03:38:04,310 --> 03:38:07,190
that it just says what it\ndoes, meow three times.
3812
03:38:07,191 --> 03:38:11,031
This works, though, because I\n
3813
03:38:11,031 --> 03:38:17,691
an integer called n, and then using\n
3814
03:38:18,771 --> 03:38:21,271
You might have caught my one mistake.
3815
03:38:21,271 --> 03:38:25,101
I also have to remind myself up\nhere to make that change too.
3816
03:38:25,101 --> 03:38:27,891
Again, this is one of the only\nredundancies or copy-paste
3817
03:38:29,511 --> 03:38:32,031
But there, I have now a better version.
3818
03:38:32,031 --> 03:38:36,381
So let me go ahead and rerun\nthis, make meow, ./meow.
3819
03:38:36,980 --> 03:38:39,530
So, again, no change\nin correctness but now
3820
03:38:39,531 --> 03:38:41,391
again, we're sort of\nmodularizing our code.
3821
03:38:41,390 --> 03:38:44,720
And, heck, what you could do now-- and\n
3822
03:38:45,501 --> 03:38:48,320
those header files we talked\nabout earlier, those libraries
3823
03:38:48,320 --> 03:38:51,050
this is the kind of modularization\nwe're talking about.
3824
03:38:51,050 --> 03:38:54,921
We, the staff, wrote a function called\n
3825
03:38:54,921 --> 03:39:01,191
we put it in a file called CS50, and we\n
3826
03:39:01,191 --> 03:39:03,381
these things called prototypes--
3827
03:39:05,570 --> 03:39:10,251
So that when you all, as aspiring\nprogrammers, include cs50.h
3828
03:39:10,251 --> 03:39:14,300
you are sort of secretly telling the\n
3829
03:39:14,300 --> 03:39:16,290
what the menu of available functions is.
3830
03:39:16,790 --> 03:39:21,560
Because in CS50 is lines like\nthese-- obviously, not for meow
3831
03:39:21,560 --> 03:39:24,140
but for get_string,\nget_int, and so forth.
3832
03:39:24,140 --> 03:39:29,540
And stdio.h is the same lines\nof code for things like printf.
3833
03:39:29,540 --> 03:39:31,740
So that's all that's going on there.
3834
03:39:31,740 --> 03:39:38,040
It's just a way of telling the computer\n
3835
03:39:38,040 --> 03:39:40,920
All right, any questions,\nthen, on these here?
3836
03:39:44,310 --> 03:39:47,130
So if you don't mind, I\nwant to continue to wave
3837
03:39:47,130 --> 03:39:49,050
my hand at that detail for today.
3838
03:39:49,050 --> 03:39:53,581
Indeed, int main void is a little weird,\n
3839
03:39:53,581 --> 03:39:55,667
We have no mechanism\nfor providing input yet.
3840
03:39:55,667 --> 03:39:57,751
And what does it mean for\nmain to return anything?
3841
03:39:57,751 --> 03:39:59,251
Like, who is it returning to?
3842
03:39:59,251 --> 03:40:00,378
For another day, if we may.
3843
03:40:00,378 --> 03:40:02,461
They're going to come into\nplay but that, for now
3844
03:40:02,460 --> 03:40:05,100
today is just something you\nshould take at face value
3845
03:40:05,101 --> 03:40:08,320
as necessary copy-paste\nto begin programs.
3846
03:40:08,320 --> 03:40:11,490
So meow is a function that takes an\n
3847
03:40:11,490 --> 03:40:14,970
but it didn't actually have a\nreturn value, hence the void.
3848
03:40:14,970 --> 03:40:17,520
But what if we actually want\nto create our own function that
3849
03:40:17,521 --> 03:40:20,431
not only takes 0 or\nmore inputs as arguments
3850
03:40:20,431 --> 03:40:24,060
but also returns some value, maybe an\n
3851
03:40:24,990 --> 03:40:27,640
Well, it turns out, in C,\nwe can do that as well.
3852
03:40:27,640 --> 03:40:31,081
Let me go ahead and create a\nnew file here called discount.
3853
03:40:31,081 --> 03:40:33,121
And let's implement a\nquick program via which
3854
03:40:33,120 --> 03:40:35,640
we can discount some regular\nprice by some percentage
3855
03:40:35,640 --> 03:40:37,740
as though there's a sale\ngoing on in a store.
3856
03:40:37,740 --> 03:40:44,490
Let me go ahead and include our usual\n
3857
03:40:44,490 --> 03:40:47,550
Let me give myself int\nmain void as before.
3858
03:40:47,550 --> 03:40:50,380
And inside of main, let's go\nahead and do something simple.
3859
03:40:50,380 --> 03:40:52,470
Let's give ourselves a\nfloat called regular
3860
03:40:52,470 --> 03:40:55,350
representing the regular\nprice of something in a store.
3861
03:40:55,351 --> 03:40:58,111
Let's go ahead and get a float\nfrom the user asking them
3862
03:41:00,480 --> 03:41:04,740
Then, next, let's go ahead and declare\n
3863
03:41:04,740 --> 03:41:07,950
called sale, ultimately\nrepresenting the sale price
3864
03:41:07,950 --> 03:41:09,810
after some percentage discount off.
3865
03:41:09,810 --> 03:41:13,020
And let's go ahead and simply\ncalculate whatever regular is.
3866
03:41:13,021 --> 03:41:15,851
And, say, 15% off is a\npretty good discount.
3867
03:41:15,851 --> 03:41:20,471
So let's go ahead and discount\nregular, whatever it is, by 15%
3868
03:41:20,470 --> 03:41:23,911
which is equivalent, of course, to\n
3869
03:41:25,681 --> 03:41:30,126
Of course, if we're taking off 15%,\n
3870
03:41:30,126 --> 03:41:32,251
Now, let's go ahead and\nprint out the results here.
3871
03:41:32,251 --> 03:41:36,271
Let me go ahead and say\nprintf sale price, colon--
3872
03:41:36,271 --> 03:41:38,941
let me go ahead and %f,\nbut, more specifically
3873
03:41:38,941 --> 03:41:43,951
%.2f because, at least in US currency\n
3874
03:41:45,751 --> 03:41:48,630
And then let me go ahead and\nplug in the value of sale.
3875
03:41:48,630 --> 03:41:52,230
All right, let's go down here\nand do make discount, Enter.
3876
03:41:52,230 --> 03:41:54,780
So far, so good-- ./discount.
3877
03:41:54,781 --> 03:41:57,001
And the regular price is maybe $100.
3878
03:41:57,001 --> 03:41:59,560
So the sale price should be $85.
3879
03:41:59,560 --> 03:42:01,483
So our arithmetic seems\nto be correct here.
3880
03:42:01,483 --> 03:42:02,941
But let's fast-forward now in time.
3881
03:42:02,941 --> 03:42:04,801
Suppose that we find\nourselves discounting
3882
03:42:04,800 --> 03:42:07,727
a lot of prices in an\napplication, maybe a website
3883
03:42:07,727 --> 03:42:10,560
like Amazon where they're offering\n
3884
03:42:10,560 --> 03:42:13,200
And it'd be nice to have\na reusable function that
3885
03:42:13,200 --> 03:42:16,420
just does this arithmetic for\nus, simple though it may be.
3886
03:42:16,421 --> 03:42:18,511
So let's go ahead and\nmodify discount this time
3887
03:42:18,511 --> 03:42:22,051
to give ourselves our own\nfunction called discount
3888
03:42:22,050 --> 03:42:23,778
for instance, that takes an input--
3889
03:42:23,778 --> 03:42:25,861
like the regular price\nthat you want to discount--
3890
03:42:25,861 --> 03:42:28,073
and then it also returns a value.
3891
03:42:28,073 --> 03:42:29,281
It doesn't just print it out.
3892
03:42:29,281 --> 03:42:34,081
It returns a value, namely, a float\n
3893
03:42:34,081 --> 03:42:37,230
So let me go down\nbelow main and go ahead
3894
03:42:37,230 --> 03:42:39,900
and define a function that's\ngoing to return a float
3895
03:42:39,900 --> 03:42:42,118
because we're dealing\nwith dollar amounts still.
3896
03:42:42,118 --> 03:42:43,951
The function is going\nto be called discount.
3897
03:42:43,950 --> 03:42:47,970
And it's going to take one input, like\n
3898
03:42:47,970 --> 03:42:50,171
In here, I'm going to do\nsomething very simple.
3899
03:42:50,171 --> 03:42:55,810
I'm going to say float sale equals\n
3900
03:42:55,810 --> 03:42:58,052
And then I'm going to go\nahead and return sale.
3901
03:42:58,052 --> 03:43:00,511
Now, for that matter, I can\nactually tighten this up a bit.
3902
03:43:00,511 --> 03:43:04,081
If I'm only declaring a variable\nto store a value that I'm then
3903
03:43:04,081 --> 03:43:09,191
returning with this keyword return, I\n
3904
03:43:09,191 --> 03:43:11,041
So I can delete the second line.
3905
03:43:11,040 --> 03:43:13,680
And I can actually just go ahead\nand get rid of that variable
3906
03:43:13,681 --> 03:43:16,291
altogether and immediately\nreturn whatever the arithmetic
3907
03:43:16,290 --> 03:43:20,161
result is of taking the price input,\n
3908
03:43:21,900 --> 03:43:25,681
So very simple function that\nsimply does the discounting for me.
3909
03:43:25,681 --> 03:43:29,011
As always, let me go\nahead and copy-paste--
3910
03:43:29,011 --> 03:43:32,251
the only time it's OK to copy-paste--\n
3911
03:43:32,251 --> 03:43:35,341
the top of the file, so that\nwhen compiling this code
3912
03:43:35,341 --> 03:43:38,611
main has already seen\nthe word discount before.
3913
03:43:38,611 --> 03:43:40,271
And now let me go into the code here.
3914
03:43:40,271 --> 03:43:43,411
And instead of doing\nthe math myself in main
3915
03:43:43,411 --> 03:43:46,470
let me presume that we\nhave some function already
3916
03:43:46,470 --> 03:43:50,790
in our toolkit called discount that\n
3917
03:43:52,845 --> 03:43:54,970
And then down here, my code\ndoesn't need to change.
3918
03:43:54,970 --> 03:43:58,200
I'm still going to print out\nsale the variable in which I'm
3919
03:43:58,200 --> 03:44:00,440
storing that result. But\nnotice what I've done here.
3920
03:44:00,441 --> 03:44:02,191
I've sort of abstracted\naway the notion
3921
03:44:02,191 --> 03:44:06,091
of taking a discount by creating my\n
3922
03:44:06,091 --> 03:44:07,951
price, or anything else as input.
3923
03:44:07,950 --> 03:44:10,440
It does a little bit of math,\nsimple though it is here
3924
03:44:10,441 --> 03:44:12,011
and then it returns a value.
3925
03:44:12,011 --> 03:44:14,941
But notice that discount\nis not printing that value.
3926
03:44:14,941 --> 03:44:17,191
It's literally using\nthis other keyword called
3927
03:44:17,191 --> 03:44:21,811
return so that I can hand back that\n
3928
03:44:21,810 --> 03:44:25,411
back a value, just like get_int\n
3929
03:44:25,411 --> 03:44:29,851
for you-- so that I up here on\nline 9 can go ahead and store
3930
03:44:29,851 --> 03:44:33,721
that value in a variable if I want\n
3931
03:44:33,720 --> 03:44:38,440
Let me go ahead now and recompile\nthis code with make discount.
3932
03:44:38,441 --> 03:44:40,381
Let me go ahead and do ./discount.
3933
03:44:43,040 --> 03:44:46,530
Sale price is going to be $85 as well.
3934
03:44:46,531 --> 03:44:50,721
Now, it turns out that functions don't\n
3935
03:44:51,261 --> 03:44:53,240
They can actually take 2 or 3 or more.
3936
03:44:53,240 --> 03:44:57,200
So, in fact, suppose we wanted to now\n
3937
03:44:57,200 --> 03:45:01,400
and take in as input to the discount\n
3938
03:45:01,400 --> 03:45:03,890
that I want to discount but\nalso the percentage off
3939
03:45:03,890 --> 03:45:08,150
thereby allowing us to support not just\n
3940
03:45:08,870 --> 03:45:13,610
Well, let me go up here and declare\n
3941
03:45:13,611 --> 03:45:15,951
And let me ask the user\nfor how many percentage
3942
03:45:15,950 --> 03:45:17,810
points they want to take off.
3943
03:45:17,810 --> 03:45:21,501
So I'm going to say percent_off\ninside of the prompt here
3944
03:45:21,501 --> 03:45:23,751
get that int called percent_off.
3945
03:45:23,751 --> 03:45:26,871
And now in addition to\npassing in regular as an input
3946
03:45:26,870 --> 03:45:30,800
to the discount function, I'm\nalso going to pass in percent_off.
3947
03:45:30,800 --> 03:45:34,820
But I need to tell the computer\n
3948
03:45:34,820 --> 03:45:37,130
and the way I do this\nis just with a comma
3949
03:45:37,130 --> 03:45:39,380
down here in the\nfunction's own definition.
3950
03:45:39,380 --> 03:45:43,610
Here is going to be a percentage\nargument, a second argument
3951
03:45:44,511 --> 03:45:50,091
And I'm now going to use that\n
3952
03:45:50,091 --> 03:45:53,480
I don't want to just do percentage\n
3953
03:45:53,480 --> 03:45:56,744
that's going to increase\nthe size of the total price.
3954
03:45:56,744 --> 03:45:59,661
I actually need to do a little bit\n
3955
03:45:59,661 --> 03:46:03,380
a percentage off, like the number\n15 for 15 percentage points
3956
03:46:03,380 --> 03:46:06,590
I need to do 100 minus that\nmany percentage points
3957
03:46:06,591 --> 03:46:08,871
thereby giving me 100 minus 15--
3958
03:46:09,710 --> 03:46:13,130
And then I need to divide\nthat by 100 in order now
3959
03:46:13,130 --> 03:46:18,030
to give myself 0.85 times\nthe price that was passed in.
3960
03:46:18,031 --> 03:46:22,911
But if I go ahead now and save this,\n
3961
03:46:22,911 --> 03:46:24,890
I notice that I've\nactually got an error here.
3962
03:46:26,300 --> 03:46:28,220
Well, I need to change\nthat prototype too.
3963
03:46:28,220 --> 03:46:30,718
And, again, this is admittedly\nan annoying aspect of C
3964
03:46:30,718 --> 03:46:32,510
that you have to maintain\nconsistency here.
3965
03:46:33,177 --> 03:46:35,390
I'm just going to go up\nhere, change this to int
3966
03:46:35,390 --> 03:46:37,640
percentage-- spelling it correctly.
3967
03:46:37,640 --> 03:46:40,581
And now let me retry\ncompilation, make discount
3968
03:46:40,581 --> 03:46:42,111
crossing my fingers this time.
3969
03:46:42,111 --> 03:46:46,701
Worked OK. ./discount, and voila, $100.
3970
03:46:46,700 --> 03:46:49,220
And percent off, say, 15 points.
3971
03:46:52,980 --> 03:46:55,310
Now, it's worth noting\nthat I've deliberately
3972
03:46:55,310 --> 03:46:58,400
returned the results of my\nmath from this function.
3973
03:46:58,400 --> 03:47:01,790
I haven't just done the math on the\n
3974
03:47:01,790 --> 03:47:03,950
In fact, if we take a look\nat this second version
3975
03:47:03,950 --> 03:47:07,770
where discount is now taking a price\n
3976
03:47:07,771 --> 03:47:10,041
notice that I'm not doing\nsomething like this.
3977
03:47:10,040 --> 03:47:14,420
I'm not just saying price\nequals price times 100
3978
03:47:14,421 --> 03:47:18,261
minus percentage divided\nby 100 and leaving it at that.
3979
03:47:18,261 --> 03:47:21,951
The problem there is that\nthis variable price is going
3980
03:47:21,950 --> 03:47:24,290
to be scoped to that discount function.
3981
03:47:24,290 --> 03:47:27,230
And we'll encounter this again\n
3982
03:47:27,230 --> 03:47:32,091
just refers to where in which a\n
3983
03:47:33,210 --> 03:47:36,200
So it turns out if I change price\n
3984
03:47:36,200 --> 03:47:38,600
function, that's not going\nto have a lasting effect.
3985
03:47:38,601 --> 03:47:40,310
If I actually want to\nget the result back
3986
03:47:40,310 --> 03:47:43,761
to the function that used the\ndiscount function, namely, main
3987
03:47:43,761 --> 03:47:47,001
I actually do need to take this\napproach of actually returning
3988
03:47:47,001 --> 03:47:51,720
the value explicitly so that ultimately\n
3989
03:47:52,220 --> 03:47:54,345
Well, let's go ahead and\nmaybe how about let's just
3990
03:47:54,345 --> 03:47:57,800
use these primitives in\njust a few different ways.
3991
03:47:57,800 --> 03:48:02,300
How about a little game of\nyesteryear, Super Mario Brothers?
3992
03:48:02,300 --> 03:48:05,661
And in the original Super Mario\n
3993
03:48:05,661 --> 03:48:08,060
so you have these\nside-scrolling worlds that
3994
03:48:08,060 --> 03:48:11,470
look like this where there's some coins\n
3995
03:48:11,970 --> 03:48:15,230
So let's just use this as a\nvisual to consider how in C could
3996
03:48:15,230 --> 03:48:17,060
I start to make\nsomething semi-graphical.
3997
03:48:17,060 --> 03:48:20,270
Like, not actual colors or fanciness,\n
3998
03:48:20,271 --> 03:48:23,011
just something like printing\nout some question marks.
3999
03:48:23,011 --> 03:48:26,031
Well, if I go back over here,\nlet me create that actual file
4000
03:48:30,290 --> 03:48:34,190
Let me go ahead and include\nstdio.h, int main void, again
4001
03:48:34,191 --> 03:48:36,320
which we'll continue to\ncopy-paste for today.
4002
03:48:36,320 --> 03:48:40,431
And then let me just go ahead and\n
4003
03:48:41,421 --> 03:48:44,609
All right, this is what we\nmight call ASCII art, which
4004
03:48:44,609 --> 03:48:47,400
just means graphics but really just\n
4005
03:48:47,400 --> 03:48:52,130
And if I make mario and do ./mario,\n
4006
03:48:52,130 --> 03:48:55,911
as this, but it's the beginning\nof this kind of map for a game.
4007
03:48:55,911 --> 03:49:00,050
Well, if I wanted to now print\nout four of those things dynamically
4008
03:49:00,050 --> 03:49:01,730
let me go back to my code here.
4009
03:49:01,730 --> 03:49:03,831
And instead of printing\nout four all at once
4010
03:49:03,831 --> 03:49:08,451
I could do something like for int i\n
4011
03:49:08,450 --> 03:49:13,230
And then inside here, I could just\n
4012
03:49:13,230 --> 03:49:15,890
Let me save that, make mario.
4013
03:49:15,890 --> 03:49:20,630
And, at the risk of\ndisappointing, so close
4014
03:49:20,630 --> 03:49:23,800
but I made a mistake,\njust a stupid aesthetic.
4015
03:49:23,800 --> 03:49:25,970
The prompt is not on the new line.
4016
03:49:29,060 --> 03:49:31,810
DAVID J. MALAN: Yeah, I need an\n
4017
03:49:35,540 --> 03:49:37,790
OK, no, because that's going\nto put it after every one
4018
03:49:37,790 --> 03:49:40,498
and it's going to make this thing\n
4019
03:49:40,498 --> 03:49:44,030
So, logically, just like in Scratch, put\n
4020
03:49:44,751 --> 03:49:48,300
And just print out, for instance,\nonly, quote unquote, new line.
4021
03:49:48,300 --> 03:49:51,350
And now if I do make\nmario again, ./mario, OK.
4022
03:49:52,337 --> 03:49:54,171
But a little better\ndesigned in that now I'm
4023
03:49:54,171 --> 03:49:57,390
not repeating myself multiple times,\n
4024
03:49:57,390 --> 03:50:00,710
But let's do one other\nthing here with mario.
4025
03:50:00,710 --> 03:50:05,960
Let me go ahead and ask the user how\n
4026
03:50:05,960 --> 03:50:09,710
The catch here is that there's another\n
4027
03:50:09,710 --> 03:50:12,170
and it's called a do\nwhile loop, generally.
4028
03:50:12,171 --> 03:50:15,711
A do while loop is\nsimilar to a while loop
4029
03:50:15,710 --> 03:50:19,220
but it checks the condition\nlast instead of first.
4030
03:50:19,220 --> 03:50:21,038
Recall earlier on the\nslide, we had while
4031
03:50:21,039 --> 03:50:22,581
open parenthesis, closed parenthesis.
4032
03:50:22,581 --> 03:50:25,941
And I kept claiming that we check\n
4033
03:50:25,941 --> 03:50:29,070
it was, 3 in advance again and again.
4034
03:50:29,070 --> 03:50:32,630
A do while loop just inverts the\nlogic so that you can actually
4035
03:50:34,040 --> 03:50:36,380
At the top of this program,\nI'm going to go ahead now
4036
03:50:36,380 --> 03:50:40,460
and give myself a variable\nn like this of type integer.
4037
03:50:40,460 --> 03:50:44,870
And then I'm going to do, literally,\n
4038
03:50:44,870 --> 03:50:48,590
n equals get_int-- and I'm going\nto ask the user for the width
4039
03:50:48,591 --> 03:50:51,441
like the number of\ndollar signs to print.
4040
03:50:51,441 --> 03:50:56,031
And I'm going to do this\nwhile n is less than, say, 1.
4041
03:50:56,031 --> 03:50:59,031
So this is a little cryptic,\nbut the salient differences
4042
03:50:59,031 --> 03:51:04,191
are the Boolean expression is now\n
4043
03:51:07,380 --> 03:51:11,390
Well, the difference\nhere if I make mario is--
4044
03:51:12,890 --> 03:51:16,620
I need to add cs50.h, because\nI'm now using get_int.
4045
03:51:16,620 --> 03:51:22,050
If I now compile this version\nof Mario and do ./mario
4046
03:51:22,050 --> 03:51:26,640
a do while loop is helpful when you want\n
4047
03:51:26,640 --> 03:51:30,931
and then check some condition or some\n
4048
03:51:30,931 --> 03:51:32,461
in this case, the user cooperated.
4049
03:51:32,460 --> 03:51:35,772
It would make no sense if\nthe user typed in, say, 0
4050
03:51:35,772 --> 03:51:37,230
because there's no work to be done.
4051
03:51:37,230 --> 03:51:39,450
It'd be really weird if\nthey said negative 100
4052
03:51:39,450 --> 03:51:41,220
because that makes no sense logically.
4053
03:51:41,220 --> 03:51:46,261
So with this simple construct\nhere, I am doing the following
4054
03:51:48,810 --> 03:51:53,077
The implication is that as soon\n
4055
03:51:53,077 --> 03:51:55,411
I'm going to break out of\nthis loop, and I've got myself
4056
03:51:55,411 --> 03:52:00,630
a variable called n containing,\nessentially, a positive value, 1
4057
03:52:03,210 --> 03:52:07,320
And I can now use this, for\ninstance, here, change the 4 to an n
4058
03:52:07,320 --> 03:52:09,810
so now my program is completely dynamic.
4059
03:52:09,810 --> 03:52:13,710
Let me go ahead and do\nmake mario, ./mario again.
4060
03:52:18,751 --> 03:52:22,140
And the difference here with the\n
4061
03:52:22,140 --> 03:52:25,320
involves getting user input,\nwell, there's no question to ask.
4062
03:52:25,320 --> 03:52:27,161
The user hasn't given you anything yet.
4063
03:52:27,161 --> 03:52:31,111
So you have to do something first,\n
4064
03:52:31,111 --> 03:52:35,808
if the human has, for instance,\ncooperated, in this case.
4065
03:52:35,808 --> 03:52:37,891
All right, well why don't\nwe escalate to something
4066
03:52:37,890 --> 03:52:41,940
more like this in the same game,\n
4067
03:52:41,941 --> 03:52:45,481
and this is like a two-dimensional\nwall that's popping up here?
4068
03:52:45,480 --> 03:52:48,480
It looks like a 3 by 3, for\n
4069
03:52:48,480 --> 03:52:51,931
And it's like, made of bricks, so\n
4070
03:52:51,931 --> 03:52:53,761
Well, it turns out that we can nest--
4071
03:52:53,761 --> 03:52:57,310
that is, combine-- some of\nthese same ideas as follows.
4072
03:52:57,310 --> 03:53:01,110
Let me go ahead now and\nchange back to this code.
4073
03:53:01,111 --> 03:53:05,501
And I'm going to keep the\ndo while loop from before.
4074
03:53:05,501 --> 03:53:07,441
And I'm going to ask,\nthough, this question
4075
03:53:07,441 --> 03:53:09,091
what's the size of this square?
4076
03:53:09,091 --> 03:53:13,841
I'm going to assume it's n by\nn, so 3 by 3, 4 by 4, whatever.
4077
03:53:13,841 --> 03:53:16,621
So I'm just going to ask for the\nsize of this square of bricks.
4078
03:53:18,373 --> 03:53:20,790
Well, I'm going to go ahead,\nfor instance, and print out--
4079
03:53:20,790 --> 03:53:25,907
how about for int i =\n0, i less than n, i++.
4080
03:53:25,907 --> 03:53:27,990
Let me just keep it simple\nand print out something
4081
03:53:27,990 --> 03:53:32,790
like this, just a single\nhash symbol that is a brick
4082
03:53:34,800 --> 03:53:36,270
All right, let's make mario.
4083
03:53:38,400 --> 03:53:40,320
OK, that's close to being it.
4084
03:53:41,431 --> 03:53:43,181
All right, but I need it to be wider.
4085
03:53:43,181 --> 03:53:45,751
So the solution last time\nwas to get rid of the newline
4086
03:53:45,751 --> 03:53:49,921
and then maybe put the\nnewline here, after the loop.
4087
03:53:49,921 --> 03:53:55,951
All right, so let's do make mario,\n
4088
03:53:55,950 --> 03:54:00,480
All right, so I kind of need to\ncombine these two ideas somehow.
4089
03:54:00,480 --> 03:54:04,351
So how might we solve this problem?
4090
03:54:04,351 --> 03:54:10,591
I want to print rows and\ncolumns, not row or column.
4091
03:54:12,630 --> 03:54:15,015
AUDIENCE: Add another\nloop in the for loop.
4092
03:54:15,890 --> 03:54:17,570
Add another loop in the for loop, right?
4093
03:54:17,570 --> 03:54:21,890
If you use one loop conceptually\n
4094
03:54:21,890 --> 03:54:24,470
to bottom, and then\nwithin each row, you then
4095
03:54:24,470 --> 03:54:26,870
sort of typewriter style--\nold school typewriter--
4096
03:54:26,870 --> 03:54:30,020
do like, character, character,\n
4097
03:54:30,021 --> 03:54:32,521
I think we could do exactly\nwhat we want to achieve here.
4098
03:54:33,421 --> 03:54:36,441
Let me get rid of this line and\nget rid of this line for now.
4099
03:54:36,441 --> 03:54:39,201
And let me just give myself\nanother loop on the inside.
4100
03:54:39,200 --> 03:54:42,920
And since I'm already using i,\nanother reasonable convention
4101
03:54:42,921 --> 03:54:45,021
here would be to say something like j.
4102
03:54:45,021 --> 03:54:49,311
So j also gets 0, j is less than n, j++.
4103
03:54:49,310 --> 03:54:51,620
And now, what's going to happen?
4104
03:54:51,620 --> 03:54:56,180
Let me go ahead and print out just\n
4105
03:54:56,181 --> 03:54:58,441
And let me save and let me run this.
4106
03:54:58,441 --> 03:54:59,990
Let me see how close we are.
4107
03:55:01,970 --> 03:55:06,421
OK, three, that's clearly wrong, but\n
4108
03:55:07,581 --> 03:55:12,890
What's the one fix I need now to\n
4109
03:55:12,890 --> 03:55:15,200
down to the next row when appropriate?
4110
03:55:17,140 --> 03:55:19,091
Yeah, I need one of these backslash n's.
4111
03:55:19,091 --> 03:55:23,921
And let me add some comments now to\n
4112
03:55:23,921 --> 03:55:30,970
For each row, for each column,\nhow about print a brick--
4113
03:55:30,970 --> 03:55:33,100
just to kind of explain the logic?
4114
03:55:33,101 --> 03:55:37,810
And so I add that because\nnow move to next row
4115
03:55:37,810 --> 03:55:40,390
I could do something like\nthis with a backslash n.
4116
03:55:40,390 --> 03:55:43,751
So here is where the comments,\nreally, my pseudocode
4117
03:55:43,751 --> 03:55:46,730
actually kind of illuminates\nthe situation a bit.
4118
03:55:46,730 --> 03:55:51,353
Let me go ahead and recompile\n
4119
03:55:51,353 --> 03:55:53,770
It's not a perfect square,\njust because these hash symbols
4120
03:55:53,771 --> 03:55:57,551
are a little taller than they are wide,\n
4121
03:55:57,550 --> 03:56:03,751
Now I've done something that's quite\n
4122
03:56:03,751 --> 03:56:08,244
All right, so let me pause here\n
4123
03:56:08,244 --> 03:56:10,411
Again, the code's getting\na little more complicated
4124
03:56:10,411 --> 03:56:14,115
but we're just building more\n
4125
03:56:14,115 --> 03:56:15,990
with familiar puzzle\npieces-- some variables
4126
03:56:15,990 --> 03:56:17,341
some loops, some conditionals.
4127
03:56:17,341 --> 03:56:19,411
It's all the same as before.
4128
03:56:20,880 --> 03:56:22,140
Can you multiply strings in C?
4129
03:56:22,890 --> 03:56:25,807
But ask that same question again in\n
4130
03:56:25,808 --> 03:56:27,774
and the answer will be yes.
4131
03:56:29,880 --> 03:56:32,520
In C, you must specify\nthe return type, the name
4132
03:56:32,521 --> 03:56:34,581
of the function, and the\ninputs, or arguments
4133
03:56:34,581 --> 03:56:35,831
to the function in that order.
4134
03:56:35,831 --> 03:56:39,181
And if none of them are applicable,\nyou write the word void.
4135
03:56:39,181 --> 03:56:42,060
So same question as earlier, let\nme kick that can a week or so
4136
03:56:42,060 --> 03:56:44,261
and we'll come back to\nthat and we'll see why.
4137
03:56:44,261 --> 03:56:47,281
But for now, just take on faith\n
4138
03:56:47,281 --> 03:56:49,651
Because main is a little\nspecial, similar to the
4139
03:56:51,031 --> 03:56:54,271
It too was a little special as well.
4140
03:57:02,411 --> 03:57:05,960
If you want to get out of a\nloop early, you could do this.
4141
03:57:05,960 --> 03:57:08,320
So let me answer this question this way.
4142
03:57:08,320 --> 03:57:14,390
An alternative to a do while loop\n
4143
03:57:15,880 --> 03:57:18,290
so do the following forever--
4144
03:57:18,290 --> 03:57:23,470
let me go ahead and get an int from\n
4145
03:57:26,771 --> 03:57:28,611
that is, a positive integer--
4146
03:57:28,611 --> 03:57:32,411
then go ahead and use a\nnew keyword called break.
4147
03:57:32,411 --> 03:57:35,691
This is identical to what we just did.
4148
03:57:37,130 --> 03:57:39,671
It's like a couple extra\nlines, a lot of them are blank.
4149
03:57:39,671 --> 03:57:41,111
And so it's just an alternative.
4150
03:57:41,111 --> 03:57:43,856
But a do while does the same\nthing but a little tighter--
4151
03:57:43,855 --> 03:57:47,001
if that's an answer to your question.
4152
03:57:47,001 --> 03:57:51,434
All right, so let's now introduce,\n
4153
03:57:51,433 --> 03:57:53,350
that I've kind of been\nbrushing under the rug
4154
03:57:53,351 --> 03:57:55,781
though we did see a little bit\nof evidence of this earlier
4155
03:57:55,781 --> 03:57:57,881
when we tried to add 2\nbillion and 2 billion
4156
03:57:57,880 --> 03:58:02,180
and it overflowed the number\nof bits in an int, so to speak.
4157
03:58:02,181 --> 03:58:06,431
Let me go ahead and code up a\nprogram called calculator again.
4158
03:58:06,431 --> 03:58:09,221
But I'm going to go ahead now\nand change this to floats.
4159
03:58:09,220 --> 03:58:12,820
So I'm going to change x to a float,\n
4160
03:58:12,820 --> 03:58:15,650
And a float, again, is just\na floating point value
4161
03:58:15,650 --> 03:58:19,070
which is a fancy way of saying a real\n
4162
03:58:19,070 --> 03:58:22,480
And down here, I'm going to\ngo ahead and use %f for float.
4163
03:58:22,480 --> 03:58:24,851
And I'm going to go ahead\nnow and do one more thing.
4164
03:58:24,851 --> 03:58:27,643
Instead of addition, I want to do\n
4165
03:58:29,501 --> 03:58:32,050
And I'm going to give myself\nanother third float called z
4166
03:58:32,050 --> 03:58:33,730
as we did at the beginning of today.
4167
03:58:33,730 --> 03:58:37,331
And I'm going to print out z\ninstead of x and y explicitly.
4168
03:58:37,331 --> 03:58:42,011
So I'm going to go ahead now and\n
4169
03:58:42,011 --> 03:58:44,320
And let's do something like, oh, 2/3.
4170
03:58:47,710 --> 03:58:49,600
So that's what you would rather expect.
4171
03:58:52,310 --> 03:58:54,501
All right, so 0.1, and a bunch of zeros.
4172
03:58:54,501 --> 03:58:56,521
That too is what you\nwould rather expect.
4173
03:58:56,521 --> 03:58:58,191
But now let me get a little curious.
4174
03:58:58,191 --> 03:59:02,271
It turns out that in C, you can\n
4175
03:59:03,230 --> 03:59:05,390
By default, you get 6 or so digits.
4176
03:59:05,390 --> 03:59:07,850
Suppose that you want\nto get exactly 2 digits.
4177
03:59:07,851 --> 03:59:11,816
You can more succinctly say 0.2\n
4178
03:59:11,816 --> 03:59:14,691
This is the kind of thing that's\n
4179
03:59:14,691 --> 03:59:16,911
and you find that, OK,\nformat code for floats
4180
03:59:16,911 --> 03:59:19,531
uses 0.2 to do two decimal places.
4181
03:59:19,531 --> 03:59:23,061
So let me do make calculator\nagain, ./calculator.
4182
03:59:25,501 --> 03:59:28,490
So it handles the display of\nsignificant digits for us here.
4183
03:59:28,490 --> 03:59:32,181
And now let me go ahead\nand do 1/10 and 0.10.
4184
03:59:33,494 --> 03:59:35,661
Well, maybe I really want\na lot of precision, right?
4185
03:59:35,661 --> 03:59:37,161
I've got a really powerful computer.
4186
03:59:37,161 --> 03:59:39,681
Let me see 50 numbers\nafter the decimal point.
4187
03:59:39,681 --> 03:59:41,551
That's a lot of significant digits.
4188
03:59:41,550 --> 03:59:44,341
Let me remake the\ncalculator-- whoops, typo.
4189
03:59:44,341 --> 03:59:48,980
Let me remake the calculator,\n./calculator.
4190
03:59:54,400 --> 03:59:58,390
Pretty sure it's supposed to be\n
4191
03:59:59,531 --> 04:00:01,239
All right, well, maybe\nthat's just a bug.
4192
04:00:02,290 --> 04:00:05,050
OK, that's really getting funky.
4193
04:00:06,400 --> 04:00:10,448
It seems that my program not\nonly cannot do addition very well--
4194
04:00:10,448 --> 04:00:12,281
we eventually hit\nproblems in the billions--
4195
04:00:12,281 --> 04:00:16,861
we can't even do very\nprecise numbers here.
4196
04:00:19,861 --> 04:00:22,201
In a nutshell, the computer's\napproximating the answer
4197
04:00:22,200 --> 04:00:25,142
using that many numbers\nafter the decimal point.
4198
04:00:25,143 --> 04:00:26,851
But the problem\nfundamentally is actually
4199
04:00:26,851 --> 04:00:30,101
very similar to that integer\noverflow from before.
4200
04:00:30,101 --> 04:00:31,801
And I'm using that now as a term of art.
4201
04:00:31,800 --> 04:00:36,210
Integers can overflow if you're trying\n
4202
04:00:37,128 --> 04:00:40,378
You sort of change them all to ones, and\n
4203
04:00:40,378 --> 04:00:42,761
Same thing here, but in the\ndifferent context of floats--
4204
04:00:42,761 --> 04:00:45,060
if you only have 32\nbits-- or, heck, if we
4205
04:00:45,060 --> 04:00:48,720
change to double and only have 64\n
4206
04:00:50,070 --> 04:00:54,070
And, yet, pretty sure there's an\n
4207
04:00:54,070 --> 04:00:59,040
In the world, which is to say a computer\n
4208
04:00:59,040 --> 04:01:01,413
represent all possible\nnumbers in the world.
4209
04:01:01,414 --> 04:01:03,331
Because, again, there's\nnot an infinite number
4210
04:01:03,331 --> 04:01:06,570
of permutations of 32 or 64 bits.
4211
04:01:06,570 --> 04:01:10,351
It might be a lot, in the billions\n
4212
04:01:10,351 --> 04:01:13,771
And so, indeed, this is the\ncomputer's closest approximation
4213
04:01:13,771 --> 04:01:15,971
to what's actually going on there.
4214
04:01:15,970 --> 04:01:18,990
And so this is an example of what\n
4215
04:01:21,001 --> 04:01:26,341
Floating-point imprecision refers to the\n
4216
04:01:26,341 --> 04:01:29,341
to represent all possible\nreal numbers 100%
4217
04:01:29,341 --> 04:01:33,118
precisely, at least by default\nin languages like C. Thankfully
4218
04:01:33,118 --> 04:01:35,201
in the world of scientific\ncomputing and so forth
4219
04:01:35,200 --> 04:01:38,820
there are solutions to this problem\n
4220
04:01:38,820 --> 04:01:42,070
But the problem fundamentally\nis still going to be there.
4221
04:01:42,070 --> 04:01:44,790
So there's a reason I\nchanged x and y to floats.
4222
04:01:44,790 --> 04:01:47,050
Let's see what would\nhappen if we rewound a bit.
4223
04:01:47,050 --> 04:01:52,501
And instead of using floats for x and y,\n
4224
04:01:52,501 --> 04:01:56,161
And let's go far back\nand use get_int as well
4225
04:01:56,161 --> 04:01:59,138
thereby giving us integers x and y.
4226
04:01:59,138 --> 04:02:01,721
Let's still leave z as a float,\nbecause at the end of the day
4227
04:02:01,720 --> 04:02:04,440
we want to be able to handle\nfractions or floating-point values.
4228
04:02:04,441 --> 04:02:07,381
But let's go ahead now and\nprint out this value of z
4229
04:02:07,380 --> 04:02:09,720
having changed x and y now to ints.
4230
04:02:09,720 --> 04:02:15,390
make calculator, ./calculator, and\n
4231
04:02:16,710 --> 04:02:21,960
And it's not 0.666, and it's\nnot even rounding oddly.
4232
04:02:21,960 --> 04:02:23,760
It's just all zeros this time.
4233
04:02:24,970 --> 04:02:28,770
Well, it turns out that C, when\n
4234
04:02:28,771 --> 04:02:31,771
is always going to give you\nback an integer, an int.
4235
04:02:31,771 --> 04:02:34,951
The problem is that floating-point\nvalues don't fit in ints.
4236
04:02:34,950 --> 04:02:37,650
Only the integral part to the\nleft of the decimal point does.
4237
04:02:37,650 --> 04:02:41,490
Everything at and beyond the decimal\n
4238
04:02:41,490 --> 04:02:44,281
known as a feature in\nC called truncation.
4239
04:02:44,281 --> 04:02:47,591
When dividing an integer by an\ninteger, you get back an integer.
4240
04:02:47,591 --> 04:02:51,181
But if you're trying to then store\n
4241
04:02:51,181 --> 04:02:54,631
result in that integer, C is just\ngoing to throw away everything
4242
04:02:54,630 --> 04:02:57,421
at and beyond the decimal point,\nleaving us in this case
4243
04:02:57,421 --> 04:03:03,521
with just the 0 from what should\nhave been 0.666666 and so forth.
4244
04:03:03,521 --> 04:03:05,230
So let's see one more example, in fact.
4245
04:03:05,229 --> 04:03:06,841
Let me go back to my terminal here.
4246
04:03:06,841 --> 04:03:08,550
Let me do ./calculator again.
4247
04:03:09,751 --> 04:03:13,751
This time, it should be\n1.33333 and so forth.
4248
04:03:13,751 --> 04:03:20,729
But let's see, 4 divided by 3, both as\n
4249
04:03:20,729 --> 04:03:23,761
but there too the\nanswer should be 1.333.
4250
04:03:23,761 --> 04:03:27,871
But the floating-point part is\ngetting truncated or thrown away
4251
04:03:30,370 --> 04:03:33,970
Well, certainly, we could just use\n
4252
04:03:33,970 --> 04:03:37,591
But if, by nature of your program,\n
4253
04:03:37,591 --> 04:03:40,800
or maybe even longs, for which\nthe same problem would occur--
4254
04:03:40,800 --> 04:03:44,069
what we can actually do\nis called type conversion.
4255
04:03:44,069 --> 04:03:47,040
And we can explicitly tell\nthe computer that we actually
4256
04:03:47,040 --> 04:03:50,229
want to treat this int as though\nit's a floating-point value.
4257
04:03:50,229 --> 04:03:52,020
And we can do that for both x and y.
4258
04:03:52,021 --> 04:03:55,871
So let me go back to my code here, and\n
4259
04:03:55,870 --> 04:04:01,260
I can convert y to a float by\n
4260
04:04:01,261 --> 04:04:04,261
by literally writing the type\nfloat inside of parentheses
4261
04:04:05,431 --> 04:04:08,881
And if I really want to be explicit,\n
4262
04:04:08,880 --> 04:04:12,790
But, strictly speaking, it suffices\n
4263
04:04:14,069 --> 04:04:19,110
Let me go ahead now and do make\ncalculator again, ./calculator
4264
04:04:19,111 --> 04:04:21,900
and let's try 2 divided by 3.
4265
04:04:21,899 --> 04:04:25,139
And now, we're back to an\nanswer that's closer to correct.
4266
04:04:25,139 --> 04:04:27,840
But, indeed, we're still having\nsome rounding issues there.
4267
04:04:27,841 --> 04:04:31,621
Let's run it one more\ntime for 4 divided by 3.
4268
04:04:31,620 --> 04:04:33,989
There too we're closer to\nthe right answer, at least.
4269
04:04:33,989 --> 04:04:36,450
But we still have that\nfloating-point imprecision
4270
04:04:36,450 --> 04:04:39,300
but that's going to be another\nproblem altogether to solve.
4271
04:04:39,300 --> 04:04:41,220
And here in a little\nmore detail is that issue
4272
04:04:41,220 --> 04:04:44,399
of integer overflow, which\nis in the context of ints.
4273
04:04:44,399 --> 04:04:48,389
Suppose that we think back to\nlast week when we had three bits
4274
04:04:48,389 --> 04:04:53,970
and we counted from 0 to\n7, 0, 1, 2, 3, 4, 5, 6, 7.
4275
04:04:53,970 --> 04:04:56,220
I think I asked the question,\nhow would we count to 8?
4276
04:04:56,220 --> 04:04:58,290
Someone proposed, well,\nwe need a fourth bit.
4277
04:04:58,290 --> 04:05:01,559
That's fine if you have a\nfourth bit, if you have access
4278
04:05:01,559 --> 04:05:03,479
to another light bulb or transistor.
4279
04:05:03,479 --> 04:05:09,239
If you don't, though, the next number\n
4280
04:05:09,239 --> 04:05:13,110
But if you don't have space for\nor hardware for that fourth bit
4281
04:05:13,111 --> 04:05:16,341
you might as well just be\nrepresenting the number 0.
4282
04:05:16,341 --> 04:05:19,640
So in the world of integers, if\nyou're only using three bits
4283
04:05:19,639 --> 04:05:23,360
those three bits eventually\noverflow when you count past 7.
4284
04:05:23,361 --> 04:05:28,011
Because what should be 8 can't fit, so\n
4285
04:05:28,011 --> 04:05:30,841
And as arcane as this\nproblem might seem
4286
04:05:30,841 --> 04:05:33,050
we humans have done\nthis a couple of times.
4287
04:05:33,050 --> 04:05:35,327
You might recall\nknowing about or reading
4288
04:05:35,327 --> 04:05:37,161
about the Y2K problem,\nwhere a lot of people
4289
04:05:37,161 --> 04:05:38,460
thought the world was going to end.
4290
04:05:38,960 --> 04:05:43,911
Because on January 1st of\n2000, a lot of computers
4291
04:05:43,911 --> 04:05:48,501
presumably, were going to update their\n
4292
04:05:48,501 --> 04:05:51,831
The problem is, though, for\ndecades, for efficiency, we humans
4293
04:05:51,831 --> 04:05:54,710
were honestly in the habit of\nnot storing years as four digits.
4294
04:05:55,233 --> 04:05:58,191
Because that's just a lot of space\n
4295
04:05:59,460 --> 04:06:02,540
So a lot of computer systems,\nespecially early on when
4296
04:06:02,540 --> 04:06:05,780
hardware was very expensive\nand memory was very tight
4297
04:06:05,781 --> 04:06:08,390
just stored the last\ntwo digits of any year.
4298
04:06:08,389 --> 04:06:14,270
The problem, of course, on January 1st\n
4299
04:06:14,271 --> 04:06:19,341
But if you don't have room for\nanother digit it's just 00.
4300
04:06:19,341 --> 04:06:23,720
And if your code assumes a prefix of\n
4301
04:06:23,720 --> 04:06:26,862
1999 back to the year 1900.
4302
04:06:26,862 --> 04:06:29,570
Thankfully, long story short, a\n
4303
04:06:29,570 --> 04:06:32,880
in a lot of old languages and\nmostly warded off this problem
4304
04:06:34,400 --> 04:06:39,980
The next time the world might end\n
4305
04:06:39,980 --> 04:06:42,560
Now, that might feel\nlike a long time away
4306
04:06:42,560 --> 04:06:44,931
but so did the year 2000, at one point.
4307
04:06:44,931 --> 04:06:51,351
Why might clocks again break in\n
4308
04:06:54,980 --> 04:06:57,150
So this refers to some\nnumber of seconds.
4309
04:06:57,150 --> 04:07:00,560
So it turns out that the way\n
4310
04:07:00,560 --> 04:07:04,130
is they count the total number\nof seconds since the epoch, which
4311
04:07:04,130 --> 04:07:06,620
is defined as January 1, 1970.
4312
04:07:07,220 --> 04:07:09,890
It was just a good year\nto start counting at
4313
04:07:09,890 --> 04:07:11,880
when computers really\ncame onto the scene.
4314
04:07:11,880 --> 04:07:16,400
Unfortunately, most computers used 32\n
4315
04:07:16,400 --> 04:07:20,331
since January 1, 1970, the\nimplication of which is we
4316
04:07:20,331 --> 04:07:23,210
can only count up to\nroughly 2 billion seconds.
4317
04:07:23,210 --> 04:07:29,810
2 billion seconds is going to\nhappen in 2038, at which point 31 1's
4318
04:07:29,810 --> 04:07:31,700
are going to roll over as follows.
4319
04:07:31,700 --> 04:07:34,898
That number 2 billion,\nwhich is the max--
4320
04:07:34,898 --> 04:07:37,440
because if you're representing\npositive and negative numbers
4321
04:07:37,441 --> 04:07:39,941
recall that you can only count\nas high as positive 2 billion
4322
04:07:42,380 --> 04:07:44,421
This is roughly the number\n2 billion in binary.
4323
04:07:44,421 --> 04:07:47,041
It's all ones with one\nzero way over here.
4324
04:07:47,040 --> 04:07:50,990
If I count one second past that\n2 billion number, give or take--
4325
04:07:50,990 --> 04:07:54,050
that means, all right,\nI add 1, I carry the 1--
4326
04:07:54,050 --> 04:07:57,290
it's just like 9's\nbecoming 0's in decimal.
4327
04:07:57,290 --> 04:08:01,161
If I keep this sort of simple\n
4328
04:08:01,161 --> 04:08:05,451
carrying the 1, carrying the 1, 1 second\n
4329
04:08:05,450 --> 04:08:08,220
I have this number in\nthe computer's memory.
4330
04:08:08,220 --> 04:08:11,780
So there's still 1 bit that's\na 1 all the way to the left.
4331
04:08:11,781 --> 04:08:16,041
Unfortunately, that bit\noften represents negativity
4332
04:08:16,040 --> 04:08:20,720
whereby if that first bit is negative,\n
4333
04:08:20,720 --> 04:08:22,280
somehow represents a negative number.
4334
04:08:23,331 --> 04:08:24,831
There's a fancier representation.
4335
04:08:24,831 --> 04:08:27,800
But a very big, positive\nnumber very suddenly
4336
04:08:27,800 --> 04:08:29,690
becomes a very big, negative number.
4337
04:08:29,691 --> 04:08:32,671
And that number is roughly\nnegative 2 billion.
4338
04:08:32,671 --> 04:08:35,541
That means computers\nin 2038 on that date
4339
04:08:35,540 --> 04:08:37,970
are going to accidentally\nthink that it's
4340
04:08:37,970 --> 04:08:43,610
been negative 2 billion seconds since\n
4341
04:08:43,611 --> 04:08:46,731
computers potentially think it's 1901.
4342
04:08:46,730 --> 04:08:50,990
So what is the solution to\nthe 2038 problem, perhaps?
4343
04:08:50,990 --> 04:08:53,601
Y2K was because we were\nusing two digits for years.
4344
04:08:55,970 --> 04:09:00,591
And, thankfully, we're getting a\n
4345
04:09:00,591 --> 04:09:03,320
and computers now are\nincreasingly using 64 bits.
4346
04:09:03,320 --> 04:09:05,716
And all of us will be long\ngone by the time we run out
4347
04:09:05,716 --> 04:09:08,091
of that number of seconds, so\nit's someone else's problem
4348
04:09:09,841 --> 04:09:11,774
But that's really the\nfundamental solution.
4349
04:09:11,773 --> 04:09:13,940
If you're running up against\nsomething finite, well
4350
04:09:13,941 --> 04:09:16,371
just kick the can further and\njust give yourself more bits.
4351
04:09:16,370 --> 04:09:18,710
And, frankly, because hardware\nis so much cheaper these days
4352
04:09:18,710 --> 04:09:21,200
computers are so much faster,\nit's not as big of a deal
4353
04:09:21,200 --> 04:09:22,730
as it might have been decades ago.
4354
04:09:22,730 --> 04:09:24,501
But that's indeed the solution.
4355
04:09:24,501 --> 04:09:27,291
But this arises in very common contexts.
4356
04:09:27,290 --> 04:09:32,120
In fact, let me go ahead and write a\n
4357
04:09:32,120 --> 04:09:35,150
You might think that just converting\n
4358
04:09:35,150 --> 04:09:37,730
might be simple, but let\nme go ahead and do this.
4359
04:09:37,730 --> 04:09:41,331
In pennies.c, I'm going to\ngo ahead and include cs50.h.
4360
04:09:41,331 --> 04:09:48,271
And I'm going to include stdio.h,\n
4361
04:09:48,271 --> 04:09:50,400
And now down here, I'm going to do this.
4362
04:09:50,400 --> 04:09:52,370
I'm going to get a float\ncalled amount, and I'm
4363
04:09:52,370 --> 04:09:56,540
going to ask the user for some amount\n
4364
04:09:56,540 --> 04:09:59,270
and I'm going to store that\nin a variable called amount.
4365
04:09:59,271 --> 04:10:07,461
Then I'm going to simply convert that\n
4366
04:10:10,111 --> 04:10:16,281
And then I'm going to go ahead and print\n
4367
04:10:16,281 --> 04:10:18,231
because that's just an\ninteger in pennies--
4368
04:10:18,230 --> 04:10:22,490
backslash n, quote\nunquote, comma, pennies.
4369
04:10:22,490 --> 04:10:26,181
All right, so if I didn't make any\n
4370
04:10:27,800 --> 04:10:31,700
And suppose I have, say, $0.99, so 0.99.
4371
04:10:41,310 --> 04:10:42,931
There's that imprecision issue.
4372
04:10:42,931 --> 04:10:45,240
And this isn't even\nthat big of an amount.
4373
04:10:45,240 --> 04:10:49,302
Now, not a big deal if the cashier gives\n
4374
04:10:49,302 --> 04:10:50,761
but you can imagine this adding up.
4375
04:10:50,761 --> 04:10:54,240
You can imagine this being worrisome\nfor financial implications
4376
04:10:54,240 --> 04:10:57,630
for financial transactions, for\n
4377
04:10:57,630 --> 04:10:59,860
My program can't even handle this.
4378
04:10:59,861 --> 04:11:01,981
Well, there are some solutions here.
4379
04:11:01,980 --> 04:11:04,171
And it looks like what's\nreally happening--
4380
04:11:04,171 --> 04:11:08,730
if I print it out using the %f with a\n
4381
04:11:09,450 --> 04:11:14,280
presumably, the computer is struggling\n
4382
04:11:14,281 --> 04:11:20,761
It's probably storing 4 dollars\nand 19.9999-something cents.
4383
04:11:20,761 --> 04:11:23,551
So it's close, but it's not quite there.
4384
04:11:23,550 --> 04:11:28,081
So I could at least solve this\nby rounding up, for instance.
4385
04:11:28,081 --> 04:11:31,081
And it turns out there is\na round function out there.
4386
04:11:31,081 --> 04:11:33,841
And it turns out that it's in a\nlibrary called the math library.
4387
04:11:33,841 --> 04:11:36,841
And you would know this by looking\n
4388
04:11:37,950 --> 04:11:43,800
And if I now make pennies again and\n
4389
04:11:46,171 --> 04:11:49,531
So at least in this context, it\nseems like a solvable problem.
4390
04:11:49,531 --> 04:11:53,461
But it's certainly something I\n
4391
04:11:53,460 --> 04:11:57,180
Unfortunately, even professional,\n
4392
04:11:57,181 --> 04:12:00,073
have not been particularly\nattentive to these kinds of details.
4393
04:12:00,073 --> 04:12:03,031
And in a class like this, the goal\n
4394
04:12:03,031 --> 04:12:06,551
but to really teach you what's going\n
4395
04:12:06,550 --> 04:12:09,600
so that you have a bottom-up\nunderstanding of how data
4396
04:12:09,601 --> 04:12:11,980
is represented, how computers\nare manipulating it
4397
04:12:11,980 --> 04:12:16,060
so that you are not on the failing\n
4398
04:12:16,060 --> 04:12:19,360
And so that we as a society are not\n
4399
04:12:19,861 --> 04:12:22,231
And this happens,\nunfortunately, all of the time.
4400
04:12:22,230 --> 04:12:26,011
This is a Boeing airplane\nthat a few years ago needed
4401
04:12:26,011 --> 04:12:29,621
to be rebooted after every 248 days.
4402
04:12:30,120 --> 04:12:34,740
Because this Boeing airplane software\n
4403
04:12:34,740 --> 04:12:36,990
hundredths of a second to keep\ntrack of something or other
4404
04:12:36,990 --> 04:12:38,761
related to its electrical power.
4405
04:12:38,761 --> 04:12:43,470
And, unfortunately, after 248 days of\n
4406
04:12:43,470 --> 04:12:45,690
which in the airline\nindustry is apparently not
4407
04:12:45,691 --> 04:12:49,381
uncommon to make every dollar count,\n
4408
04:12:50,581 --> 04:12:54,511
the 32-bit number would\nroll over and the power
4409
04:12:54,511 --> 04:12:57,601
would shut off on the airplane\nas a side effect because of sort
4410
04:12:57,601 --> 04:12:59,681
of undefined behavior in that case.
4411
04:12:59,681 --> 04:13:02,941
The temporary solution by Boeing at\n
4412
04:13:02,941 --> 04:13:06,150
sort of operating system style,\n
4413
04:13:06,150 --> 04:13:09,841
And that was indeed the fix until they\n
4414
04:13:11,351 --> 04:13:14,400
And the more hardware we carry\n
4415
04:13:14,400 --> 04:13:17,550
use these kinds of devices,\nthe more of these problems
4416
04:13:17,550 --> 04:13:20,430
we're going to run into down the road.
4417
04:14:42,261 --> 04:14:45,086
DAVID MALAN: This is\nCS50 and this is week 2.
4418
04:14:45,085 --> 04:14:47,710
Now that you have some programming\nexperience under your belts
4419
04:14:47,710 --> 04:14:50,170
in this more arcane language called C.
4420
04:14:50,171 --> 04:14:53,050
Among our goals today is to help\n
4421
04:14:53,050 --> 04:14:54,911
been doing these past several days.
4422
04:14:54,911 --> 04:14:58,216
Wrestling with your first programs in\n
4423
04:14:58,216 --> 04:15:00,341
up understanding of what\nsome of these commands do.
4424
04:15:00,341 --> 04:15:02,841
And, ultimately, what more\nwe can do with this language.
4425
04:15:02,841 --> 04:15:06,011
So this, recall, was the very\nfirst program you wrote
4426
04:15:06,011 --> 04:15:09,131
I wrote in this language\ncalled C, much more textual
4427
04:15:09,130 --> 04:15:11,230
certainly, than the Scratch equivalent.
4428
04:15:11,230 --> 04:15:15,460
But at the end of the day,\ncomputers, your Mac, your PC
4429
04:15:15,460 --> 04:15:18,815
VS Code doesn't understand\nthis actual code.
4430
04:15:18,816 --> 04:15:21,941
What's the format into which we need\n
4431
04:15:23,462 --> 04:15:26,050
DAVID MALAN: So binary,\notherwise known as machine code.
4432
04:15:26,550 --> 04:15:30,130
The 0s and 1s that your computer\nactually does understand.
4433
04:15:30,130 --> 04:15:32,290
So somehow we need to\nget to this format.
4434
04:15:32,290 --> 04:15:34,990
And up until now, we've been\nusing this command called make
4435
04:15:34,990 --> 04:15:37,931
which is aptly named, because\nit lets you make programs.
4436
04:15:37,931 --> 04:15:40,691
And the invocation of that\nhas been pretty simple.
4437
04:15:40,691 --> 04:15:44,711
Make hello looks in your current\n
4438
04:15:44,710 --> 04:15:49,360
hello.c, implicitly, and then it\n
4439
04:15:49,361 --> 04:15:51,911
which itself is executable,\nwhich just means runnable
4440
04:15:51,911 --> 04:15:54,161
so that you can then do ./hello.
4441
04:15:54,161 --> 04:15:58,451
But it turns out that make is\nactually not a compiler itself.
4442
04:15:58,450 --> 04:16:00,100
It does help you make programs.
4443
04:16:00,101 --> 04:16:04,781
But make is this utility that comes on\n
4444
04:16:04,781 --> 04:16:08,321
to actually compile code by\nusing an actual compiler
4445
04:16:08,320 --> 04:16:12,550
the program that converts source code\n
4446
04:16:12,550 --> 04:16:14,921
or whatever cloud environment\nyou might be using.
4447
04:16:14,921 --> 04:16:17,591
In fact, what make is\ndoing for us, is actually
4448
04:16:17,591 --> 04:16:21,490
running a command automatically\nknown as clang, for C language.
4449
04:16:21,490 --> 04:16:25,851
And, so here, for instance, in VS\n
4450
04:16:25,851 --> 04:16:27,730
this time in the context\nof a text editor
4451
04:16:27,730 --> 04:16:30,940
and I could compile\nthis with make hello.
4452
04:16:30,941 --> 04:16:33,828
Let me go ahead and use the\ncompiler itself manually.
4453
04:16:33,827 --> 04:16:36,911
And we'll see in a moment why we've\n
4454
04:16:36,911 --> 04:16:39,320
I'm going to run clang instead.
4455
04:16:39,320 --> 04:16:41,601
And then I'm going to run hello.c.
4456
04:16:41,601 --> 04:16:43,751
So it's a little different\nhow the compiler's used.
4457
04:16:43,751 --> 04:16:46,421
It needs to know, explicitly,\nwhat the file is called.
4458
04:16:46,421 --> 04:16:49,541
I'll go ahead and run\nclang, hello.c, Enter.
4459
04:16:49,540 --> 04:16:52,675
Nothing seems to happen, which,\n
4460
04:16:52,675 --> 04:16:54,050
Because no errors have popped up.
4461
04:16:54,050 --> 04:17:00,400
And if I do ls for list, you'll see\n
4462
04:17:00,400 --> 04:17:03,490
But there is a curiously-named\nfile called a.out.
4463
04:17:03,490 --> 04:17:06,880
This is a historical convention,\nstands for assembler output.
4464
04:17:06,880 --> 04:17:09,640
And this is, just, the default\nfile name for a program
4465
04:17:09,640 --> 04:17:13,661
that you might compile yourself,\nmanually, using clang itself.
4466
04:17:13,661 --> 04:17:16,091
Let me go ahead now and\npoint out that that's
4467
04:17:16,091 --> 04:17:17,601
kind of a stupid name for a program.
4468
04:17:17,601 --> 04:17:20,695
Even though it works,\n./a.out would work.
4469
04:17:20,695 --> 04:17:23,320
But if you actually want to\ncustomize the name of your program
4470
04:17:23,320 --> 04:17:26,980
we could just resort to make,\nor we could do explicitly
4471
04:17:28,181 --> 04:17:31,031
It turns out, some\nprograms, among them make
4472
04:17:31,031 --> 04:17:33,251
support what are called\ncommand line arguments
4473
04:17:33,251 --> 04:17:34,570
and more on those later today.
4474
04:17:34,570 --> 04:17:37,931
But these are literally words or\n
4475
04:17:37,931 --> 04:17:41,591
after the name of a program that just\n
4476
04:17:44,300 --> 04:17:47,200
And it turns out, if you read\nthe documentation for clang
4477
04:17:47,200 --> 04:17:52,300
you can actually pass a -o, for\n
4478
04:17:52,300 --> 04:17:54,520
lets you specify,\nexplicitly what do you want
4479
04:17:54,521 --> 04:17:56,056
your outputted program to be called?
4480
04:17:56,056 --> 04:17:58,931
And then you go ahead and type the\n
4481
04:17:58,931 --> 04:18:01,371
want to compile, from\nsource code to machine code.
4482
04:18:02,980 --> 04:18:06,251
Again, nothing seems to happen,\nand I type ls and voila.
4483
04:18:06,251 --> 04:18:09,271
Now we still have the old a.out,\nbecause I didn't delete it yet.
4484
04:18:10,271 --> 04:18:14,681
So ./hello, voila, runs\nhello, world again.
4485
04:18:14,681 --> 04:18:16,421
And let me go ahead\nand remove this file.
4486
04:18:16,421 --> 04:18:20,854
I could, of course, resort to using\n
4487
04:18:20,853 --> 04:18:23,770
Which, I am in the habit of closing,\n
4488
04:18:23,771 --> 04:18:26,501
But I could go ahead and right-click\nor control-click on a.out
4489
04:18:26,501 --> 04:18:27,626
if I want to get rid of it.
4490
04:18:27,626 --> 04:18:30,560
Or again, let me focus on\nthe command line interface.
4491
04:18:32,290 --> 04:18:35,260
We didn't really use it much,\nbut what command removes a file?
4492
04:18:36,925 --> 04:18:40,690
DAVID MALAN: So rm for\nremove. rm, a.out, Enter.
4493
04:18:40,691 --> 04:18:44,320
Remove regular file,\na.out, y for yes, enter.
4494
04:18:44,320 --> 04:18:46,900
And now, if I do ls\nagain, voila, it's gone.
4495
04:18:46,900 --> 04:18:48,911
All right, so, let's\nnow enhance this program
4496
04:18:48,911 --> 04:18:54,550
to do the second version we ever did,\n
4497
04:18:54,550 --> 04:18:57,409
so that we have access to functions\n
4498
04:18:57,409 --> 04:19:04,601
Let me do string, name, gets,\nget string, what's your name
4499
04:19:05,810 --> 04:19:10,270
And now, let me go ahead and say hello\n
4500
04:19:11,181 --> 04:19:13,421
So this was version 2 of\nour program last time
4501
04:19:13,421 --> 04:19:17,560
that very easily compiled with make\n
4502
04:19:17,560 --> 04:19:20,620
If I want to compile this\nthing myself with clang, using
4503
04:19:20,620 --> 04:19:22,780
that same lesson learned,\nall right, let's do it.
4504
04:19:22,781 --> 04:19:29,561
clang -o hello, just so I get a better\n
4505
04:19:29,560 --> 04:19:34,011
And a new error pops up that some of\n
4506
04:19:34,011 --> 04:19:37,841
So it's a bit arcane here, and there's\n
4507
04:19:37,841 --> 04:19:39,591
with temp for temporary there.
4508
04:19:39,591 --> 04:19:42,820
But somehow, my issue's in\nmain, as we can see here.
4509
04:19:42,820 --> 04:19:44,518
It somehow relates to hello.c.
4510
04:19:44,518 --> 04:19:47,351
Even though we might not have seen\n
4511
04:19:47,351 --> 04:19:50,230
but there's an undefined\nreference to get string.
4512
04:19:50,230 --> 04:19:52,060
As though get string doesn't exist.
4513
04:19:52,060 --> 04:19:55,601
Now, your first instinct might be, well\n
4514
04:19:56,441 --> 04:19:58,570
That's the very first\nline of my program.
4515
04:19:58,570 --> 04:20:02,171
But it turns out, make is doing\n
4516
04:20:02,171 --> 04:20:06,191
Just putting cs50.h, or any header\nfile at the top of your code
4517
04:20:06,191 --> 04:20:10,990
for that matter, just teaches the\n
4518
04:20:10,990 --> 04:20:13,570
It, sort of, asks the compiler\nto-- it asks the compiler
4519
04:20:13,570 --> 04:20:16,870
to trust that I will, eventually,\n
4520
04:20:16,870 --> 04:20:22,390
like get string in cs50.h,\nand in stdio.h, printf, therein.
4521
04:20:22,390 --> 04:20:28,091
But this error here, some kind of\n
4522
04:20:28,091 --> 04:20:30,220
that there's a separate\nprocess for actually
4523
04:20:30,220 --> 04:20:34,540
finding the 0s and 1s that\ncs50 compiled long ago for you.
4524
04:20:34,540 --> 04:20:38,110
That authors of this operating\n
4525
04:20:39,161 --> 04:20:42,101
We need to, somehow,\ntell the compiler that we
4526
04:20:42,101 --> 04:20:44,711
need to link in code\nthat someone else wrote
4527
04:20:44,710 --> 04:20:48,010
the actual machine code that someone\n
4528
04:20:48,011 --> 04:20:51,757
So to do that, you'd have to\ntype -lcs50, for instance
4529
04:20:52,841 --> 04:20:55,809
So additionally, telling clang\n
4530
04:20:55,808 --> 04:20:58,600
a file called hello, and you want\n
4531
04:20:58,601 --> 04:21:03,461
you also want to quote-unquote\nlink in a bunch of 0s and 1s
4532
04:21:03,460 --> 04:21:07,270
that collectively implement\nget string and printf.
4533
04:21:07,271 --> 04:21:11,480
So now, if I hit enter,\nthis time it compiled OK.
4534
04:21:11,480 --> 04:21:17,403
And now if I run ./hello, it works\n
4535
04:21:17,403 --> 04:21:20,361
But honestly, this is just going to\n
4536
04:21:20,361 --> 04:21:22,191
Notice, already, just\nto compile my code
4537
04:21:22,191 --> 04:21:25,677
I have to run clang -o\nhello hello.c -lcs50
4538
04:21:25,677 --> 04:21:27,761
and you're going to have\nto type more things, too.
4539
04:21:27,761 --> 04:21:31,150
If you wanted to use the math library,\n
4540
04:21:31,150 --> 04:21:33,700
you would also have\nto do -lm, typically
4541
04:21:33,700 --> 04:21:37,150
to specify give me the math\nbits that someone else compiled.
4542
04:21:37,150 --> 04:21:39,230
And the commands just\nget longer and longer.
4543
04:21:39,230 --> 04:21:43,780
So moving forward, we won't have\n
4544
04:21:43,781 --> 04:21:45,591
but clang is, indeed, the compiler.
4545
04:21:45,591 --> 04:21:48,640
That is the program that converts\n
4546
04:21:48,640 --> 04:21:52,699
But we'll continue to use make because\n
4547
04:21:52,699 --> 04:21:54,490
And the commands are\nonly going to get more
4548
04:21:54,490 --> 04:21:58,900
cryptic the more sophisticated and\n
4549
04:21:58,900 --> 04:22:03,880
And make, again, is just a tool\nthat makes all that happen.
4550
04:22:03,880 --> 04:22:08,560
Let me pause there to see if\n
4551
04:22:08,560 --> 04:22:10,150
take a look further under the hood.
4552
04:22:11,445 --> 04:22:14,445
AUDIENCE: Can you explain again what\n
4553
04:22:14,445 --> 04:22:16,779
DAVID MALAN: Sure, let me\ncome back to that in a moment.
4554
04:22:18,011 --> 04:22:20,177
We'll come back to that,\nvisually, in just a moment.
4555
04:22:20,177 --> 04:22:23,111
But it means to link in the\n0s and 1s that collectively
4556
04:22:23,111 --> 04:22:24,695
implement get string and printf.
4557
04:22:24,695 --> 04:22:26,320
But we'll see that, visually, in a sec.
4558
04:22:31,334 --> 04:22:32,751
DAVID MALAN: Really good question.
4559
04:22:32,751 --> 04:22:35,111
How come I didn't have\nto link in standard I/O?
4560
04:22:35,111 --> 04:22:37,211
Because I used printf in version 1.
4561
04:22:37,210 --> 04:22:40,540
Standard I/O is just, literally,\nso standard that it's built in
4562
04:22:43,060 --> 04:22:45,341
It did not come with the\nlanguage C or the compiler.
4563
04:22:46,511 --> 04:22:50,861
And other libraries, even though\n
4564
04:22:50,861 --> 04:22:54,861
they might not be enabled by default,\n
4565
04:22:54,861 --> 04:22:57,731
So you're not loading more 0s\nand 1s into the computer's memory
4566
04:22:58,540 --> 04:23:01,510
So standard I/O is special, if you will.
4567
04:23:05,681 --> 04:23:07,421
DAVID MALAN: Oh, what does the -o mean?
4568
04:23:07,421 --> 04:23:10,451
So -o is shorthand for\nthe English word output
4569
04:23:10,450 --> 04:23:15,520
and so -o is telling clang to\nplease output a file called hello
4570
04:23:15,521 --> 04:23:18,111
because the next thing I\nwrote after the command line
4571
04:23:18,111 --> 04:23:24,190
recall was clang -o hello, then\n
4572
04:23:24,190 --> 04:23:27,667
And this is where these commands\ndo get and stay fairly arcane.
4573
04:23:27,667 --> 04:23:29,501
It's just through muscle\nmemory and practice
4574
04:23:29,501 --> 04:23:31,871
that you'll start to remember, oh\n
4575
04:23:31,870 --> 04:23:34,537
what are the command line arguments\nyou can provide to programs?
4576
04:23:34,538 --> 04:23:35,831
But we've seen this before.
4577
04:23:35,831 --> 04:23:39,040
Technically, when you run make\n
4578
04:23:39,040 --> 04:23:41,240
hello is the command line argument.
4579
04:23:41,240 --> 04:23:43,300
It's an input to the\nmake function, albeit
4580
04:23:43,300 --> 04:23:46,510
typed at the prompt, that tells\nmake what you want to make.
4581
04:23:46,511 --> 04:23:50,441
Even when I used rm a moment\nago, and did rm of a.out
4582
04:23:50,441 --> 04:23:52,541
the command line argument\nthere was called a.out
4583
04:23:52,540 --> 04:23:55,000
and it's telling rm what to delete.
4584
04:23:55,001 --> 04:23:59,531
It is entirely dependent on the programs\n
4585
04:23:59,531 --> 04:24:02,351
whether you use dash this\nor dash that, but we'll
4586
04:24:02,351 --> 04:24:05,066
see over time, which ones\nactually matter in practice.
4587
04:24:05,066 --> 04:24:10,481
So to come back to the first question\n
4588
04:24:10,480 --> 04:24:12,823
let's consider the code more closely.
4589
04:24:12,823 --> 04:24:14,531
So here is that first\nversion of the code
4590
04:24:14,531 --> 04:24:18,851
again, with stdio.h and only\nprintf, so no cs50 stuff yet.
4591
04:24:18,851 --> 04:24:21,101
Until we add it back in\nand had the second version
4592
04:24:21,101 --> 04:24:23,891
where we actually get the human's name.
4593
04:24:23,890 --> 04:24:27,043
When you run this command,\nthere's a few things
4594
04:24:27,043 --> 04:24:28,960
that are happening\nunderneath the hood, and we
4595
04:24:28,960 --> 04:24:30,911
won't dwell on these\nkinds of details, indeed
4596
04:24:30,911 --> 04:24:33,130
we'll abstract it away by using make.
4597
04:24:33,130 --> 04:24:35,200
But it's worth understanding\nfrom the get-go
4598
04:24:35,200 --> 04:24:38,140
how much automation is going on, so\n
4599
04:24:39,111 --> 04:24:42,201
You have this bottom-up\nunderstanding of what's going on.
4600
04:24:42,200 --> 04:24:45,790
So when we say you've been\ncompiling your code with make
4601
04:24:45,790 --> 04:24:47,860
that's a bit of an oversimplification.
4602
04:24:47,861 --> 04:24:51,041
Technically, every time\nyou compile your code
4603
04:24:51,040 --> 04:24:53,831
you're having the computer do\nfour distinct things for you.
4604
04:24:53,831 --> 04:24:57,281
And this is not four distinct things\n
4605
04:24:57,281 --> 04:24:59,441
every time you run your\nprogram, what's happening
4606
04:24:59,441 --> 04:25:02,081
but it helps to break it\ndown into building blocks
4607
04:25:02,081 --> 04:25:06,371
as to how we're getting from source\n
4608
04:25:06,370 --> 04:25:10,900
It turns out, that when you compile,\n
4609
04:25:10,900 --> 04:25:14,771
speaking, you're doing four things\n
4610
04:25:14,771 --> 04:25:18,221
Preprocessing it, compiling it,\nassembling it, and linking it.
4611
04:25:18,220 --> 04:25:21,610
Just humans decided, let's just\n
4612
04:25:21,611 --> 04:25:24,490
But for a moment, let's\nconsider what these steps are.
4613
04:25:24,490 --> 04:25:26,950
So preprocessing refers to this.
4614
04:25:26,950 --> 04:25:30,970
If we look at our source code,\n
4615
04:25:30,970 --> 04:25:34,702
and therefore get string, notice that\n
4616
04:25:34,702 --> 04:25:36,911
And they're kind of special\nversus all the other code
4617
04:25:36,911 --> 04:25:39,970
we've written, because they start\n
4618
04:25:39,970 --> 04:25:41,921
And that's sort of a\nspecial syntax that means
4619
04:25:41,921 --> 04:25:44,861
that these are, technically,\ncalled preprocessor directives.
4620
04:25:44,861 --> 04:25:49,551
Fancy way of saying they're handled\n
4621
04:25:49,550 --> 04:25:54,130
In fact, if we focus on\ncs50.h, recall from last week
4622
04:25:54,130 --> 04:26:00,130
that I provided a hint as to what's\n
4623
04:26:00,130 --> 04:26:04,840
What was the one salient thing that\n
4624
04:26:04,841 --> 04:26:07,736
why we were including\nit in the first place?
4625
04:26:08,611 --> 04:26:11,111
DAVID MALAN: So get\nstring, specifically
4626
04:26:11,111 --> 04:26:13,421
the prototype for get string.
4627
04:26:13,421 --> 04:26:15,671
We haven't made many of\nour own functions yet
4628
04:26:15,671 --> 04:26:18,101
but recall that any time\nwe've made our own functions
4629
04:26:18,101 --> 04:26:20,591
and we've written them\nbelow main in a file
4630
04:26:20,591 --> 04:26:23,050
we've also had to, somewhat\nstupidly, copy paste
4631
04:26:23,050 --> 04:26:25,630
the prototype of the function\nat the top of the file
4632
04:26:25,630 --> 04:26:29,470
just to teach the compiler that\n
4633
04:26:29,470 --> 04:26:31,690
it does down there, but it will exist.
4634
04:26:32,560 --> 04:26:35,240
So again, that's what these\nprototypes are doing for us.
4635
04:26:35,240 --> 04:26:37,601
So therefore, in my\ncode, if I want to use
4636
04:26:37,601 --> 04:26:41,021
a function like get string,\nor printf, for that matter
4637
04:26:41,021 --> 04:26:43,411
they're not implemented\nclearly in the same file
4638
04:26:43,411 --> 04:26:44,661
they're implemented elsewhere.
4639
04:26:44,661 --> 04:26:46,952
So I need to tell the compiler\nto trust me that they're
4640
04:26:46,952 --> 04:26:48,261
implemented somewhere else.
4641
04:26:48,261 --> 04:26:51,070
And so technically,\ninside of cs50.h, which
4642
04:26:51,070 --> 04:26:54,671
is installed somewhere in the\ncloud's hard drive, so to speak
4643
04:26:54,671 --> 04:26:59,081
that you all are accessing via VS Code,\n
4644
04:26:59,081 --> 04:27:03,130
A prototype for the get string function\n
4645
04:27:03,130 --> 04:27:07,090
get string, it takes one input,\nor argument, called prompt
4646
04:27:07,091 --> 04:27:09,970
and the type of that\nprompt is a string.
4647
04:27:09,970 --> 04:27:15,411
Get string, not surprisingly, has a\n
4648
04:27:15,411 --> 04:27:19,060
So literally, that line and a\nbunch of others, are in cs50.h.
4649
04:27:19,060 --> 04:27:22,540
So rather than you all having\nto copy paste the prototype
4650
04:27:22,540 --> 04:27:25,420
you can just trust that\ncs50 figured out what it is.
4651
04:27:25,421 --> 04:27:29,230
You can include cs50.h\nand the compiler is going
4652
04:27:29,230 --> 04:27:31,681
to go find that prototype for you.
4653
04:27:31,681 --> 04:27:33,740
Same thing in standard\nI/O. Someone else-- what
4654
04:27:33,740 --> 04:27:37,880
must clearly be in stdio.h,\namong other stuff, that
4655
04:27:37,880 --> 04:27:41,850
motivates our including stdio.h, too?
4656
04:27:43,058 --> 04:27:45,290
DAVID MALAN: Printf, the\nprototype for printf
4657
04:27:45,290 --> 04:27:48,270
and I'll just change it here\nin yellow, to be the same.
4658
04:27:48,271 --> 04:27:49,671
And it turns out, the format--
4659
04:27:49,671 --> 04:27:52,851
the prototype for printf\nis, actually, pretty fancy
4660
04:27:52,851 --> 04:27:56,001
because, as you might have noticed,\n
4661
04:27:56,001 --> 04:28:00,171
something to print, 2, if you want\n
4662
04:28:00,171 --> 04:28:02,880
So the dot dot dot just\nrepresents exactly that.
4663
04:28:02,880 --> 04:28:06,590
It's not quite as simple a prototype\n
4664
04:28:07,376 --> 04:28:10,310
So what does it mean to\npreprocess your code?
4665
04:28:10,310 --> 04:28:14,120
The very first thing the\ncompiler, clang, in this case
4666
04:28:14,120 --> 04:28:18,530
is doing for you when it reads your\n
4667
04:28:18,531 --> 04:28:22,221
notices, oh, here is hash include,\n
4668
04:28:22,220 --> 04:28:27,350
And it, essentially, finds those files\n
4669
04:28:27,351 --> 04:28:31,251
and does the equivalent of copying\n
4670
04:28:31,251 --> 04:28:33,621
into your code at the very top.
4671
04:28:33,620 --> 04:28:36,710
Thereby teaching the compiler\nthat get string and printf
4672
04:28:36,710 --> 04:28:38,690
will eventually exist somewhere.
4673
04:28:38,691 --> 04:28:42,740
So that's the preprocessing\nstep, whereby, again, it's
4674
04:28:42,740 --> 04:28:46,341
just doing a find-and-replace of\n
4675
04:28:46,341 --> 04:28:48,771
It's plugging in the files\nthere so that you, essentially
4676
04:28:48,771 --> 04:28:52,041
get all the prototypes\nyou need automatically.
4677
04:28:53,091 --> 04:28:55,490
What does it mean, then,\nto compile the results?
4678
04:28:55,490 --> 04:28:57,710
Because at this point\nin the story, your code
4679
04:28:57,710 --> 04:28:59,938
now looks like this in\nthe computer's memory.
4680
04:28:59,939 --> 04:29:01,730
It doesn't change your\nfile, it's doing all
4681
04:29:01,730 --> 04:29:04,251
of this in the computer's\nmemory, or RAM, for you.
4682
04:29:04,251 --> 04:29:06,331
But it, essentially, looks like this.
4683
04:29:06,331 --> 04:29:09,861
Well the next step is what's,\ntechnically, really compiling.
4684
04:29:09,861 --> 04:29:12,681
Even though again, we use\ncompile as an umbrella term.
4685
04:29:12,681 --> 04:29:15,771
Compiling code in C\nmeans to take code that
4686
04:29:15,771 --> 04:29:18,001
now looks like this in\nthe computer's memory
4687
04:29:18,001 --> 04:29:21,150
and turn it into something\nthat looks like this.
4688
04:29:22,611 --> 04:29:25,251
But it was just a few\ndecades ago that, if you
4689
04:29:25,251 --> 04:29:28,191
were taking a class like\nCS50 in its earlier form
4690
04:29:28,191 --> 04:29:32,001
we wouldn't be using C; it didn't exist\n
4691
04:29:32,001 --> 04:29:33,951
something called assembly language.
4692
04:29:33,950 --> 04:29:37,490
And there's different types of,\n
4693
04:29:37,490 --> 04:29:41,271
But this is about as low level as\n
4694
04:29:41,271 --> 04:29:43,671
understands, be it a\nMac, or PC, or a phone
4695
04:29:43,671 --> 04:29:46,911
before you start getting\ninto actual 0s and 1s.
4696
04:29:46,911 --> 04:29:48,273
And most of this is cryptic.
4697
04:29:48,273 --> 04:29:51,440
I couldn't tell you what this is doing\n
4698
04:29:51,441 --> 04:29:54,561
and rewound mentally, years\nago, from having studied it
4699
04:29:54,560 --> 04:29:57,140
but let's highlight a\nfew key words in yellow.
4700
04:29:57,140 --> 04:30:01,640
Notice that this assembly language\n
4701
04:30:01,640 --> 04:30:04,790
for you automatically,\nstill has mention of main
4702
04:30:04,790 --> 04:30:07,550
and it has mention of get string,\nand it has mention of printf.
4703
04:30:07,550 --> 04:30:10,618
So there's some relationship to\nthe C code we saw a moment ago.
4704
04:30:10,619 --> 04:30:12,411
And then if I highlight\nthese other things
4705
04:30:12,411 --> 04:30:14,691
these are what are called\ncomputer instructions.
4706
04:30:14,691 --> 04:30:17,001
At the end of the day,\nyour Mac, your PC
4707
04:30:17,001 --> 04:30:20,601
your phone actually only\nunderstands very basic instructions
4708
04:30:20,601 --> 04:30:25,281
like addition, subtraction, division,\n
4709
04:30:25,281 --> 04:30:30,451
load from memory, print something to\n
4710
04:30:30,450 --> 04:30:32,015
And that's what you're seeing here.
4711
04:30:32,015 --> 04:30:37,011
These assembly instructions\nare what the computer actually
4712
04:30:37,011 --> 04:30:41,131
feeds into the brains of the computer,\n
4713
04:30:41,130 --> 04:30:44,030
And it's that Intel CPU,\nor whatever you have
4714
04:30:44,031 --> 04:30:47,481
that understands this instruction, and\n
4715
04:30:47,480 --> 04:30:50,120
And collectively, long\nstory short, all they do
4716
04:30:50,120 --> 04:30:52,880
is print hello, world on\nthe screen, but in a way
4717
04:30:52,880 --> 04:30:56,171
that the machine understands how to do.
4718
04:30:58,761 --> 04:31:01,271
Are there any questions on\nwhat we mean by preprocessing?
4719
04:31:01,271 --> 04:31:05,111
Which finds and replaces the hash\n
4720
04:31:05,111 --> 04:31:08,711
and compiling, which technically\ntakes your source code
4721
04:31:08,710 --> 04:31:12,430
once preprocessed, and converts it to\n
4722
04:31:12,431 --> 04:31:14,603
AUDIENCE: [INAUDIBLE] each CPU has--
4723
04:31:15,550 --> 04:31:18,970
Each type of CPU has\nits own instruction set.
4724
04:31:19,540 --> 04:31:23,230
And as a teaser, this is why,\nat least back in the day, when
4725
04:31:23,230 --> 04:31:27,161
we used to install software from\n
4726
04:31:27,161 --> 04:31:32,483
this is why you can't take a program\n
4727
04:31:32,483 --> 04:31:33,941
and run it on a Mac, or vice-versa.
4728
04:31:33,941 --> 04:31:38,681
Because the commands, the instructions\n
4729
04:31:39,761 --> 04:31:44,411
Now Microsoft, or any company, could\n
4730
04:31:44,411 --> 04:31:48,370
like C or another, and they can\n
4731
04:31:50,050 --> 04:31:54,369
It's twice as much work and sometimes\n
4732
04:31:54,370 --> 04:31:57,401
but that's why these steps\nare somewhat distinct.
4733
04:31:57,400 --> 04:32:00,970
You can now use the same code and\n
4734
04:32:03,911 --> 04:32:07,060
Thankfully, this part is fairly\n
4735
04:32:07,060 --> 04:32:10,511
To assemble code, which is step\nthree of four, that is just
4736
04:32:10,511 --> 04:32:14,621
happening for you every time\nyou run make or, in turn, clang
4737
04:32:14,620 --> 04:32:17,830
this assembly language, which the\n
4738
04:32:17,831 --> 04:32:21,341
for you from your source code,\nis turned into 0s and 1s.
4739
04:32:21,341 --> 04:32:25,043
So that's the step that, last\nweek, I simplified and said
4740
04:32:25,043 --> 04:32:28,210
when you compile your code, you convert\n
4741
04:32:29,230 --> 04:32:31,945
Technically, that happens\nwhen you assemble your code.
4742
04:32:31,945 --> 04:32:35,200
But no one in normal\nconversations says that, they just
4743
04:32:35,200 --> 04:32:37,540
say compile for all of these terms.
4744
04:32:43,331 --> 04:32:46,661
Even in this simple program\nof getting the user's name
4745
04:32:46,661 --> 04:32:51,380
and then plugging it into printf, I'm\n
4746
04:32:51,880 --> 04:32:54,460
My own, which is in hello.c.
4747
04:32:54,460 --> 04:32:59,860
Some of CS50's, which is\nin hello.c, sorry-- which
4748
04:32:59,861 --> 04:33:03,341
is in cs50.c, which is not\na file I've mentioned, yet
4749
04:33:03,341 --> 04:33:07,480
but it stands to reason, that if\n
4750
04:33:07,480 --> 04:33:09,640
turns out, the actual\nimplementation of get string
4751
04:33:09,640 --> 04:33:11,861
and other things are in cs50.c.
4752
04:33:11,861 --> 04:33:15,550
And there's a third file\nsomewhere on the hard drive
4753
04:33:15,550 --> 04:33:18,521
that's involved in compiling\neven this simple program.
4754
04:33:18,521 --> 04:33:24,231
hello.c, cs50.c, and by that\nlogic, what might the other be?
4755
04:33:27,861 --> 04:33:30,951
And that's a bit of a white lie,\n
4756
04:33:30,951 --> 04:33:34,011
that there's actually multiple files\n
4757
04:33:34,010 --> 04:33:35,640
and we'll take the simplification.
4758
04:33:35,640 --> 04:33:40,460
So when I have this code,\nand I compile my code
4759
04:33:40,460 --> 04:33:45,560
I get those 0s and 1s that end up taking\n
4760
04:33:45,561 --> 04:33:51,091
into 0s and 1s that are combined with\n
4761
04:33:52,100 --> 04:33:57,560
Here might be the 0s and 1s for my code,\n
4762
04:33:57,561 --> 04:34:02,181
Here might be the 0s and 1s for what\n
4763
04:34:02,181 --> 04:34:06,471
Here might be the 0s and 1s that someone\n
4764
04:34:06,471 --> 04:34:09,980
The last and final step\nis that linking command
4765
04:34:09,980 --> 04:34:12,591
that links all of these\n0s and 1s together
4766
04:34:12,591 --> 04:34:18,081
essentially stitches them together\n
4767
04:34:18,080 --> 04:34:20,645
or called a.out, whatever you name it.
4768
04:34:20,646 --> 04:34:25,911
That last step is what combines all of\n
4769
04:34:25,911 --> 04:34:28,311
And my God, now we're\nreally in the weeds.
4770
04:34:28,311 --> 04:34:31,281
Who wants to even think about\nrunning code at this level?
4771
04:34:33,440 --> 04:34:36,008
When you're running make,\nthere's some very concrete steps
4772
04:34:36,008 --> 04:34:38,550
that are happening that humans\nhave developed over the years
4773
04:34:38,550 --> 04:34:41,960
over the decades, that break down\n
4774
04:34:41,960 --> 04:34:46,670
to 0s and 1s, or machine code,\ninto these very specific steps.
4775
04:34:46,670 --> 04:34:50,360
But henceforth, you can\ncall all of this compiling.
4776
04:34:52,856 --> 04:34:55,064
AUDIENCE: Can you explain\nagain what a.out signifies?
4777
04:34:57,530 --> 04:35:02,150
a.out is just the conventional,\n
4778
04:35:02,151 --> 04:35:05,541
that you compile directly\nwith a compiler, like clang.
4779
04:35:05,541 --> 04:35:07,941
It's a meaningless name, though.
4780
04:35:07,940 --> 04:35:11,510
It stands for assembler output, and\n
4781
04:35:11,510 --> 04:35:12,950
from this assembling process.
4782
04:35:12,951 --> 04:35:15,411
It's a lame name for a\ncomputer program, and we
4783
04:35:15,411 --> 04:35:20,710
can override it by outputting\nsomething like hello, instead.
4784
04:35:27,687 --> 04:35:32,121
DAVID MALAN: To recap, there are\n
4785
04:35:32,120 --> 04:35:36,170
cs50.h, stdio.h, technically, they're\n
4786
04:35:36,170 --> 04:35:38,720
even though you, strictly\nspeaking, don't need most of them
4787
04:35:38,721 --> 04:35:42,451
but they are there, just in\ncase you might want them.
4788
04:35:42,451 --> 04:35:43,921
And finally, any other questions?
4789
04:35:48,138 --> 04:35:51,181
DAVID MALAN: Does it matter what order\n
4790
04:35:51,181 --> 04:35:53,401
Sometimes with libraries,\nyes, it matters
4791
04:35:53,401 --> 04:35:55,781
what order they are linked in together.
4792
04:35:55,780 --> 04:35:58,591
But for our purposes, it's\nreally not going to matter.
4793
04:35:58,591 --> 04:36:03,010
It's going to-- make is going to take\n
4794
04:36:03,510 --> 04:36:06,055
So with that said, henceforth,\ncompiling, technically
4795
04:36:06,931 --> 04:36:10,951
But we'll focus on it as a higher\nlevel concept, an abstraction
4796
04:36:14,140 --> 04:36:16,771
So another process that we'll\nnow begin to focus on all the
4797
04:36:16,771 --> 04:36:19,951
more this week because, invariably,\n
4798
04:36:19,951 --> 04:36:21,421
ran up against some challenges.
4799
04:36:21,420 --> 04:36:24,810
You probably created your very first\n
4800
04:36:24,811 --> 04:36:28,201
and so let's focus for a moment on\n
4801
04:36:28,201 --> 04:36:31,320
As you spend more time\nthis semester, in the years
4802
04:36:31,320 --> 04:36:34,530
to come, if you continue to program,\n
4803
04:36:34,530 --> 04:36:37,837
going to write\nbug-free code, ultimately.
4804
04:36:37,837 --> 04:36:40,920
Though your programs are going to get\n
4805
04:36:40,920 --> 04:36:44,490
and we're all going to start to\n
4806
04:36:44,491 --> 04:36:46,831
And to this day, I write\nbuggy code all the time.
4807
04:36:46,830 --> 04:36:48,780
And I'm always horrified\nwhen I do it up here.
4808
04:36:48,780 --> 04:36:50,880
But hopefully, that\nwon't happen too often.
4809
04:36:50,881 --> 04:36:54,361
But when it does, it's a process,\nnow, of debugging, trying
4810
04:36:54,361 --> 04:36:56,491
to find the mistakes in your program.
4811
04:36:56,491 --> 04:36:59,861
You don't have to stare at your code,\n
4812
04:36:59,861 --> 04:37:02,851
There are actual tools\nthat real world programmers
4813
04:37:02,850 --> 04:37:06,120
use to help debug their\ncode and find these faults.
4814
04:37:06,120 --> 04:37:08,716
So what are some of the techniques\nand tools that folks use?
4815
04:37:08,716 --> 04:37:13,701
Well as an aside, if you've ever--
4816
04:37:13,701 --> 04:37:17,101
a bug in a program is a mistake,\n
4817
04:37:17,100 --> 04:37:22,270
If you've ever heard this tale,\nsome 50 plus years ago, in 1947.
4818
04:37:22,271 --> 04:37:27,030
This is an entry in a log book written\n
4819
04:37:27,030 --> 04:37:29,490
as-- named Grace Hopper,\nwho happened to be the one
4820
04:37:29,491 --> 04:37:33,606
to record the very first discovery of a\n
4821
04:37:33,605 --> 04:37:36,120
This was like a moth\nthat had flown into
4822
04:37:36,120 --> 04:37:41,341
at the time, a very sophisticated system\n
4823
04:37:41,341 --> 04:37:44,311
very large, refrigerator-sized\ntype systems
4824
04:37:44,311 --> 04:37:48,421
in which an actual bug caused an issue.
4825
04:37:48,420 --> 04:37:51,450
The etymology of bug though,\npredates this particular instance
4826
04:37:51,451 --> 04:37:54,841
but here you have, as any computer\n
4827
04:37:54,841 --> 04:37:57,105
of a first physical bug in a computer.
4828
04:37:57,105 --> 04:37:59,582
How, though, do you go\nabout removing such a thing?
4829
04:37:59,582 --> 04:38:02,040
Well, let's consider a very\nsimple scenario from last time
4830
04:38:02,041 --> 04:38:05,041
for instance, when we were trying to\n
4831
04:38:05,041 --> 04:38:07,230
like this column of 3 bricks.
4832
04:38:07,230 --> 04:38:10,920
Let's consider how I might go about\n
4833
04:38:10,920 --> 04:38:15,390
Let me switch back over to VS\nCode here, and I'm going to run--
4834
04:38:17,010 --> 04:38:18,900
And I'm not going to\ntrust myself, so I'm
4835
04:38:18,901 --> 04:38:20,768
going to call it\nbuggy.c from the get-go
4836
04:38:20,768 --> 04:38:22,600
knowing that I'm going\nto mess something up.
4837
04:38:22,600 --> 04:38:25,411
But I'm going to go ahead\nand include stdio.h.
4838
04:38:25,411 --> 04:38:28,201
And I'm going to define main, as usual.
4839
04:38:28,201 --> 04:38:30,210
So hopefully, no mistakes just yet.
4840
04:38:30,210 --> 04:38:32,970
And now, I want to print those\n3 bricks on the screen using
4841
04:38:34,530 --> 04:38:40,681
So how about for int i gets 0, i less\n
4842
04:38:40,681 --> 04:38:42,541
Now, inside of my\ncurly braces, I'm going
4843
04:38:42,541 --> 04:38:48,221
to go ahead and print out a hash\n
4844
04:38:48,221 --> 04:38:52,236
All right, saving the file, doing\n
4845
04:38:52,236 --> 04:38:57,601
So there's no syntactical errors,\n
4846
04:38:57,600 --> 04:39:00,900
But some of you have probably\nseen the logical error already
4847
04:39:00,901 --> 04:39:03,631
because when I run this\nprogram I don't get
4848
04:39:03,631 --> 04:39:09,691
this picture, which was 3 bricks\n
4849
04:39:09,690 --> 04:39:12,190
Now, this might be jumping out\nat you, why it's happening
4850
04:39:12,190 --> 04:39:14,190
but I've kept the program\nsimple just so that we
4851
04:39:14,190 --> 04:39:18,271
don't have to find an actual bug, we can\n
4852
04:39:20,230 --> 04:39:23,311
What might be the first strategy\nfor finding a bug like this
4853
04:39:23,311 --> 04:39:27,552
rather than staring at your code,\n
4854
04:39:28,385 --> 04:39:31,950
Well, let's actually try to diagnose\n
4855
04:39:31,951 --> 04:39:34,681
And the simplest way to do\nthis now, and years from now
4856
04:39:34,681 --> 04:39:38,131
is, honestly, going to be to\nuse a function like printf.
4857
04:39:38,131 --> 04:39:40,050
Printf is a wonderfully\nuseful function, not just
4858
04:39:40,050 --> 04:39:42,810
for formatting-- printing\nformatted strings and all that, but for
4859
04:39:42,811 --> 04:39:45,691
just looking inside\nthe values of variables
4860
04:39:45,690 --> 04:39:48,612
that you might be curious\nabout to see what's going on.
4861
04:39:50,580 --> 04:39:53,370
I see that there's 4 coming\nout, but I intended 3.
4862
04:39:53,370 --> 04:39:56,001
So clearly, something's\nwrong with my i variable.
4863
04:39:56,001 --> 04:39:58,350
So let me be a little more pedantic.
4864
04:39:58,350 --> 04:40:01,560
Let me go inside of this\nloop and, temporarily
4865
04:40:01,561 --> 04:40:04,741
say something explicit, like, i is--
4866
04:40:04,741 --> 04:40:09,461
%i backslash n, and then just\nplug in the value of i.
4867
04:40:09,960 --> 04:40:13,230
This is not the program I want to\n
4868
04:40:13,230 --> 04:40:18,661
writing, because now I'm going\nto say make buggy, ./buggy.
4869
04:40:18,661 --> 04:40:20,760
And if I look, now,\nat the output, I have
4870
04:40:20,760 --> 04:40:25,350
some helpful diagnostic information.\n
4871
04:40:25,350 --> 04:40:27,870
and I get a hash, 2 and I\nget a hash, 3 and I get hash.
4872
04:40:28,788 --> 04:40:30,870
I'm clearly going too many\nsteps because, maybe, I
4873
04:40:30,870 --> 04:40:33,510
forgot that computers are,\nessentially, counting from 0
4874
04:40:33,510 --> 04:40:35,710
and now, oh, it's less than or equal to.
4875
04:40:37,291 --> 04:40:40,201
Again, trivial example,\nbut just by using printf
4876
04:40:40,201 --> 04:40:43,171
you can see inside of\nthe computer's memory
4877
04:40:43,170 --> 04:40:45,390
by just printing stuff out like this.
4878
04:40:45,390 --> 04:40:50,030
And now, once you've figured it out, oh,\n
4879
04:40:50,030 --> 04:40:52,400
or I should start\ncounting from 1, there's
4880
04:40:52,401 --> 04:40:53,901
any number of ways I could fix this.
4881
04:40:53,901 --> 04:40:56,916
But the most conventional is\nprobably just to say less than 3.
4882
04:40:56,916 --> 04:41:03,441
Now, I can delete my temporary print\n
4883
04:41:06,050 --> 04:41:08,091
All right, and to this day, I do this.
4884
04:41:08,091 --> 04:41:11,120
Whether it's making a command line\n
4885
04:41:11,120 --> 04:41:13,310
or mobile application,\nit's very common to use
4886
04:41:13,311 --> 04:41:15,531
printf, or some equivalent\nin any language
4887
04:41:15,530 --> 04:41:19,611
just to poke around and see what's\ninside the computer's memory.
4888
04:41:19,611 --> 04:41:22,831
Thankfully, there's more\nsophisticated tools than this.
4889
04:41:22,830 --> 04:41:25,190
Let me go ahead and\nreintroduce the bug here.
4890
04:41:25,190 --> 04:41:28,880
And let me reopen my\nsidebar at left here.
4891
04:41:28,881 --> 04:41:32,811
Let me now recompile the code\nto make sure it's current.
4892
04:41:32,811 --> 04:41:35,570
And I'm going to run a\ncommand called debug50.
4893
04:41:35,570 --> 04:41:39,350
Which is a command that's\nrepresentative of a type of program
4894
04:41:41,001 --> 04:41:43,940
And this debugger is\nactually built into VS Code.
4895
04:41:43,940 --> 04:41:47,960
And all debug50 is doing for us is\n
4896
04:41:47,960 --> 04:41:49,911
VS Code's built-in debugger.
4897
04:41:49,911 --> 04:41:52,521
So this isn't even a\nCS50-specific tool, we've
4898
04:41:52,521 --> 04:41:55,431
just given you a debug50\ncommand to make it easier
4899
04:41:55,431 --> 04:41:57,116
to start it up from the get-go.
4900
04:41:57,116 --> 04:42:01,821
And the way you run this debugger\n
4901
04:42:01,820 --> 04:42:04,381
the name of the program\nthat you want to debug.
4902
04:42:06,471 --> 04:42:08,271
So you don't mention your .c file.
4903
04:42:08,271 --> 04:42:10,911
You mention your already-compiled code.
4904
04:42:10,911 --> 04:42:16,491
And what this debugger is going\n
4905
04:42:16,491 --> 04:42:19,191
walk through my code step-by-step.
4906
04:42:19,190 --> 04:42:23,190
Because every program we've written\n
4907
04:42:23,190 --> 04:42:26,585
even if I'm not done thinking\nthrough each step at a time.
4908
04:42:26,585 --> 04:42:30,111
With a debugger, I can\nactually click on a line number
4909
04:42:30,111 --> 04:42:33,441
and say pause execution\nhere, and the debugger
4910
04:42:33,440 --> 04:42:38,390
will let me walk through my code one\n
4911
04:42:38,390 --> 04:42:41,001
one minute at a time,\nat my own human pace.
4912
04:42:41,001 --> 04:42:43,730
Which is super compelling when\nthe programs get more complicated
4913
04:42:43,730 --> 04:42:46,861
and they might, otherwise,\nfly by on the screen.
4914
04:42:46,861 --> 04:42:50,120
So I'm going to click\nto the left of line 5.
4915
04:42:50,120 --> 04:42:52,230
And notice that these\nlittle red dots appear.
4916
04:42:52,230 --> 04:42:55,550
And if I click on one it\nstays, and gets even redder.
4917
04:42:55,550 --> 04:42:58,490
And I'm going to run debug50 on ./buggy.
4918
04:42:58,491 --> 04:43:03,351
And in just a moment, you'll see that a\n
4919
04:43:03,350 --> 04:43:06,170
It's doing some\nconfiguration of the screen.
4920
04:43:06,170 --> 04:43:10,950
Let me zoom out a little bit here so\n
4921
04:43:10,951 --> 04:43:14,701
And sometimes, you'll see in VS\n
4922
04:43:14,701 --> 04:43:18,741
which looks very cryptic, just go back\n
4923
04:43:18,741 --> 04:43:22,136
Because at the terminal window is where\n
4924
04:43:22,135 --> 04:43:24,380
And let's now take a\nlook at what's going on.
4925
04:43:24,381 --> 04:43:28,911
If I zoom in on my\nbuggy.c code here, you'll
4926
04:43:28,911 --> 04:43:35,151
notice that we have the same program\n
4927
04:43:36,080 --> 04:43:39,920
Not a coincidence, that's the line\n
4928
04:43:39,920 --> 04:43:44,661
The little red dot means break\nhere, pause execution here.
4929
04:43:44,661 --> 04:43:47,976
And the yellow line has\nnot yet been executed.
4930
04:43:47,976 --> 04:43:51,860
But if I, now, at the top of my\n
4931
04:43:53,010 --> 04:43:55,010
There's one for this,\nwhich, if I hover over it
4932
04:43:55,010 --> 04:43:58,400
says Step Over, there's another\nthat's going to say Step Into
4933
04:43:58,401 --> 04:44:00,081
there's a third that says Step Out.
4934
04:44:00,080 --> 04:44:02,780
I'm just going to use the\nfirst of these, Step Over.
4935
04:44:02,780 --> 04:44:05,841
And I'm going to do this, and\n
4936
04:44:05,841 --> 04:44:09,921
moved from line 5 to line\n7 because now it's ready
4937
04:44:09,920 --> 04:44:12,215
but hasn't yet printed out that hash.
4938
04:44:12,216 --> 04:44:16,078
But the most powerful thing here,\nnotice, is that top left here.
4939
04:44:16,078 --> 04:44:18,411
It's a little cryptic, because\nthere's a bunch of things
4940
04:44:18,411 --> 04:44:21,170
going on that will make more\nsense over time, but at the top
4941
04:44:21,170 --> 04:44:22,730
there's a section called variables.
4942
04:44:22,730 --> 04:44:25,010
Below that, something\ncalled locals, which means
4943
04:44:25,010 --> 04:44:27,080
local to my current function, main.
4944
04:44:27,080 --> 04:44:31,670
And notice, there's my variable\n
4945
04:44:31,670 --> 04:44:37,070
So now, once I click Step Over\nagain, watch what happens.
4946
04:44:37,070 --> 04:44:39,920
We go from line 7 back to line 5.
4947
04:44:39,920 --> 04:44:43,715
But look in the terminal window,\none of the hashes has printed.
4948
04:44:43,716 --> 04:44:46,311
But now, it's printed at my own pace.
4949
04:44:46,311 --> 04:44:48,291
I can think through this step-by-step.
4950
04:44:48,291 --> 04:44:50,601
Notice that i has not changed, yet.
4951
04:44:50,600 --> 04:44:53,960
It's still 0 because the yellow\n
4952
04:44:53,960 --> 04:44:58,400
But the moment I click Step Over,\nit's going to execute line 5.
4953
04:44:58,401 --> 04:45:05,271
Now, notice at top left, i has become\n
4954
04:45:05,271 --> 04:45:07,550
because now, highlighted is line 7.
4955
04:45:07,550 --> 04:45:12,260
So if I click Step Over\nagain, we'll see the hash.
4956
04:45:12,260 --> 04:45:16,190
If I repeat this process at my\nown human, comfortable pace
4957
04:45:16,190 --> 04:45:21,300
I can see my variables changing, I\n
4958
04:45:21,300 --> 04:45:24,163
and I can just think about\nshould that have just happened.
4959
04:45:24,163 --> 04:45:26,120
I can pause and give\nthought to what's actually
4960
04:45:26,120 --> 04:45:30,501
going on without trying to race the\n
4961
04:45:30,501 --> 04:45:32,751
I'm going to go ahead and\nstop here because we already
4962
04:45:32,751 --> 04:45:35,690
know what this particular problem\nis, and that brings me back
4963
04:45:35,690 --> 04:45:36,980
to my default terminal window.
4964
04:45:36,980 --> 04:45:40,440
But this debugger, let me\ndisable the breakpoint now
4965
04:45:40,440 --> 04:45:42,830
so it doesn't keep\nbreaking, this debugger
4966
04:45:42,830 --> 04:45:45,020
will be your friend\nmoving forward in order
4967
04:45:45,021 --> 04:45:49,550
to step through your code step-by-step,\n
4968
04:45:49,550 --> 04:45:51,080
where something has gone wrong.
4969
04:45:51,080 --> 04:45:54,657
Printf is great, but it gets annoying if\n
4970
04:45:54,657 --> 04:45:57,740
print this, print this, print this,\n
4971
04:45:59,241 --> 04:46:04,041
The debugger lets you do the\nequivalent, but automatically.
4972
04:46:04,041 --> 04:46:10,221
Questions on this debugger, which you'll\n
4973
04:46:12,814 --> 04:46:14,820
AUDIENCE: You were using\na Step Over feature.
4974
04:46:14,820 --> 04:46:17,563
What do the other\nfeatures in the debugger--
4975
04:46:17,563 --> 04:46:18,980
DAVID MALAN: Really good question.
4976
04:46:18,980 --> 04:46:21,980
We'll see this before long, but those\n
4977
04:46:21,980 --> 04:46:26,721
step into and step out of, actually\n
4978
04:46:26,721 --> 04:46:28,461
if I had any more than main.
4979
04:46:28,460 --> 04:46:31,220
So if main called a\nfunction called something
4980
04:46:31,221 --> 04:46:34,640
and something called a function\n
4981
04:46:34,640 --> 04:46:38,990
stepping over the entire execution of\n
4982
04:46:38,991 --> 04:46:41,366
and walk through its\nlines of code one by one.
4983
04:46:41,366 --> 04:46:43,281
So any time you have\na problem set you're
4984
04:46:43,280 --> 04:46:46,400
working on that has multiple functions,\n
4985
04:46:46,401 --> 04:46:50,511
if you want, or you can set it inside\n
4986
04:46:50,510 --> 04:46:53,390
to focus your attention only on that.
4987
04:46:53,390 --> 04:46:56,900
And we'll see examples\nof that over time.
4988
04:46:58,041 --> 04:47:02,361
And what's the sort of, elephant\nin the room, so to speak
4989
04:47:02,361 --> 04:47:04,011
is actually a duck in this case.
4990
04:47:04,010 --> 04:47:06,420
Why is there this duck and\nall of these ducks here?
4991
04:47:06,420 --> 04:47:10,700
Well, it turns out, a third, genuinely\n
4992
04:47:10,701 --> 04:47:14,316
is talking through problems, talking\n
4993
04:47:14,315 --> 04:47:16,880
Now, in the absence of having\na family member, or a friend
4994
04:47:16,881 --> 04:47:20,781
or a roommate who actually wants to\n
4995
04:47:20,780 --> 04:47:25,580
generally, programmers turn to a\n
4996
04:47:25,580 --> 04:47:27,620
if something animate is not available.
4997
04:47:27,620 --> 04:47:31,021
The idea behind rubber duck\ndebugging, so to speak
4998
04:47:31,021 --> 04:47:37,011
is that simply by looking at your code\n
4999
04:47:37,010 --> 04:47:41,300
I'm starting a for loop and\nI'm initializing i to 0.
5000
04:47:41,300 --> 04:47:43,251
OK, then, I'm printing out a hash.
5001
04:47:43,251 --> 04:47:48,372
Just by talking through your\ncode, step-by-step, invariably
5002
04:47:48,372 --> 04:47:51,080
finds you having the proverbial\n
5003
04:47:51,080 --> 04:47:53,300
because you realize, wait a minute,\nI just said something stupid
5004
04:47:53,300 --> 04:47:54,771
or I just said something wrong.
5005
04:47:54,771 --> 04:47:58,761
And this is really just a proxy for any\n
5006
04:48:00,320 --> 04:48:02,701
But in the absence of any\nof those people in the room
5007
04:48:02,701 --> 04:48:04,618
you're welcome to take,\non your way out today,
5008
04:48:04,617 --> 04:48:08,540
one of these little rubber ducks and\n
5009
04:48:08,541 --> 04:48:12,081
you want to talk through one\nof your problems in CS50
5010
04:48:12,080 --> 04:48:13,400
or maybe life more generally.
5011
04:48:13,401 --> 04:48:15,741
But having it there on\nyour desk is just a way
5012
04:48:15,741 --> 04:48:19,401
to help you hear illogic\nin what you think
5013
04:48:19,401 --> 04:48:22,051
might, otherwise, be logical code.
5014
04:48:22,050 --> 04:48:26,661
So printf, debugging, rubber-duck\n
5015
04:48:26,661 --> 04:48:29,468
you'll see over time, to\nget to the source of code
5016
04:48:29,468 --> 04:48:31,050
that you will write that has mistakes.
5017
04:48:31,050 --> 04:48:33,140
Which is going to happen,\nbut it will empower you
5018
04:48:33,140 --> 04:48:36,260
all the more to solve those mistakes.
5019
04:48:36,260 --> 04:48:41,700
All right, any questions on debugging,\n
5020
04:48:44,001 --> 04:48:46,911
DAVID MALAN: What's the difference\n
5021
04:48:46,911 --> 04:48:50,241
At the moment, the only one that's\n
5022
04:48:50,241 --> 04:48:53,601
is Step Over, because it means\nstep over each line of code.
5023
04:48:53,600 --> 04:48:58,310
If, though, I had other functions\n
5024
04:48:58,311 --> 04:49:03,561
maybe lower down in the file, I\n
5025
04:49:03,561 --> 04:49:05,730
and walk through them one at a time.
5026
04:49:05,729 --> 04:49:07,911
So we'll come back to this\nwith an actual example
5027
04:49:07,911 --> 04:49:10,491
but step into will allow\nme to do exactly that.
5028
04:49:10,491 --> 04:49:13,471
In fact, this is a perfect segue to\n
5029
04:49:13,471 --> 04:49:15,893
Let me go ahead and open\nup another file here.
5030
04:49:15,893 --> 04:49:17,600
And, actually, we'll\nuse the same, buggy.
5031
04:49:17,600 --> 04:49:20,580
And we're going to write one\nother thing that's buggy, as well.
5032
04:49:20,580 --> 04:49:24,260
Let me go up here and\ninclude, as before, cs50.h.
5033
04:49:29,780 --> 04:49:32,310
So all of this, I think,\nis correct, so far.
5034
04:49:32,311 --> 04:49:35,541
And let's do this, let's\ngive myself an int called i
5035
04:49:35,541 --> 04:49:38,791
and let's ask the user\nfor a negative integer.
5036
04:49:38,791 --> 04:49:41,561
This is not a function that\nexists, technically, yet.
5037
04:49:41,561 --> 04:49:44,311
But I'm going to assume, for the\n
5038
04:49:44,311 --> 04:49:47,961
Then, I'm just going to print\nout, with %i and a new line
5039
04:49:47,960 --> 04:49:49,620
whatever the human typed in.
5040
04:49:49,620 --> 04:49:52,580
So at this point in the story,\nmy program, I think, is correct.
5041
04:49:52,580 --> 04:49:55,190
Except for the fact that\nget negative int is not
5042
04:49:55,190 --> 04:49:57,950
a function in the CS50\nlibrary or anywhere else.
5043
04:49:57,951 --> 04:49:59,721
I'm going to need to invent it myself.
5044
04:49:59,721 --> 04:50:05,570
So suppose, in this case, that I declare\n
5045
04:50:05,570 --> 04:50:09,890
Its return type, so to speak, should\n
5046
04:50:09,890 --> 04:50:12,620
I want to hand the user back\nan integer, and it's going
5047
04:50:12,620 --> 04:50:14,570
to take no input to keep it simple.
5048
04:50:14,570 --> 04:50:16,070
So I'm just going to say void there.
5049
04:50:16,070 --> 04:50:19,070
No inputs, no special\nprompts, nothing like that.
5050
04:50:19,070 --> 04:50:21,861
Let me, now, give myself\nsome curly braces.
5051
04:50:21,861 --> 04:50:24,771
And let me do something familiar,\nperhaps, from problem set 1.
5052
04:50:24,771 --> 04:50:29,811
Let me give myself a variable,\n
5053
04:50:31,580 --> 04:50:37,850
Assign n the value of get int, asking\n
5054
04:50:39,111 --> 04:50:43,011
And I want to do this while\nn is less than 0, because I
5055
04:50:43,010 --> 04:50:44,650
want to get a negative from the user.
5056
04:50:44,651 --> 04:50:48,401
And recall, from having\nused this block in the past
5057
04:50:48,401 --> 04:50:52,031
I can now return n as the\nvery last step to hand back
5058
04:50:52,030 --> 04:50:56,050
whatever the user has typed in, so\n
5059
04:50:58,010 --> 04:51:00,970
Now, I've deliberately\nmade a mistake here
5060
04:51:00,971 --> 04:51:03,341
and it's a subtle,\nsilly, mathematical one
5061
04:51:03,341 --> 04:51:08,171
but let me compile this program after\n
5062
04:51:08,170 --> 04:51:09,640
so I don't make that mistake again.
5063
04:51:09,640 --> 04:51:12,730
Let me do make buggy, Enter.
5064
04:51:14,980 --> 04:51:18,280
I'll give it a negative\ninteger, like negative 50.
5065
04:51:29,260 --> 04:51:33,341
So it's, clearly, working backwards,\n
5066
04:51:33,341 --> 04:51:35,061
So how could I go about debugging this?
5067
04:51:35,061 --> 04:51:36,686
Well, I could do what I've done before.
5068
04:51:36,686 --> 04:51:43,181
I could use my printf technique and\n
5069
04:51:43,181 --> 04:51:49,570
new line, comma n, just to print\nit out, let me recompile buggy
5070
04:51:49,570 --> 04:51:52,901
let me rerun buggy, let\nme type in negative 50.
5071
04:51:54,890 --> 04:51:57,434
So that didn't really\nhelp me at this point
5072
04:51:57,434 --> 04:51:58,851
because that's the same as before.
5073
04:51:58,850 --> 04:52:02,290
So let me do this, debug50, ./buggy.
5074
04:52:02,291 --> 04:52:04,131
Oh, but I've made a mistake.
5075
04:52:04,131 --> 04:52:05,961
So I didn't set my breakpoint, yet.
5076
04:52:05,960 --> 04:52:09,190
So let me do this, and I'll\nset a breakpoint this time.
5077
04:52:09,190 --> 04:52:11,591
I could set it here, on line 8.
5078
04:52:11,591 --> 04:52:13,600
Let's do it in main, as before.
5079
04:52:17,230 --> 04:52:19,451
That fancy user interface\nis going to pop up.
5080
04:52:19,451 --> 04:52:22,570
It's going to highlight the line\nthat I set the breakpoint on.
5081
04:52:22,570 --> 04:52:25,510
Notice that, on the left\nhand side of the screen
5082
04:52:25,510 --> 04:52:28,911
i is defaulting, at the moment to 0,\n
5083
04:52:29,411 --> 04:52:35,076
But let me, now, Step Over this\n
5084
04:52:35,076 --> 04:52:36,701
and you'll see that I'm being prompted.
5085
04:52:36,701 --> 04:52:40,480
So let's type in my negative 50, Enter.
5086
04:52:40,480 --> 04:52:45,730
Notice now that I'm\nstuck in that function.
5087
04:52:46,510 --> 04:52:50,780
So clearly, the issue seems to be\n
5088
04:52:50,780 --> 04:52:54,380
So, OK, let me stop this execution.
5089
04:52:54,381 --> 04:52:57,436
My problem doesn't seem to be in\n
5090
04:52:58,061 --> 04:53:00,251
Let me set my same breakpoint at line 8.
5091
04:53:00,251 --> 04:53:02,771
Let me rerun debug50 one more time.
5092
04:53:02,771 --> 04:53:07,370
But this time, instead of just stepping\n
5093
04:53:07,370 --> 04:53:09,670
So notice line 8 is, again,\nhighlighted in yellow.
5094
04:53:09,670 --> 04:53:11,950
In the past I've been\nclicking Step Over.
5095
04:53:14,440 --> 04:53:17,740
When I click Step Into,\nboom, now, the debugger
5096
04:53:17,741 --> 04:53:20,651
jumps into that specific function.
5097
04:53:20,651 --> 04:53:23,591
Now, I can step through these\nlines of code, again and again.
5098
04:53:23,591 --> 04:53:25,960
I can see what the value of\nn is as I'm typing it in.
5099
04:53:25,960 --> 04:53:27,760
I can think through my logic, and voila.
5100
04:53:27,760 --> 04:53:31,900
Hopefully, once I've solved the issue,\n
5101
04:53:33,440 --> 04:53:36,310
So Step Over just goes over\nthe line, but executes it
5102
04:53:36,311 --> 04:53:41,471
Step Into lets you go into\nother functions you've written.
5103
04:53:41,471 --> 04:53:43,661
So let's go ahead and do this.
5104
04:53:43,661 --> 04:53:47,811
We've got a bunch of\npossible approaches that we
5105
04:53:47,811 --> 04:53:49,811
can take to solving some\nproblems. Let's go ahead
5106
04:53:49,811 --> 04:53:50,991
and pace ourselves today, though.
5107
04:53:50,991 --> 04:53:52,161
Let's take a five-minute break, here.
5108
04:53:52,161 --> 04:53:54,949
And when we come back, we'll take\n
5109
04:54:05,260 --> 04:54:11,120
Up until now, both, by way of week 1\n
5110
04:54:11,120 --> 04:54:14,920
we've just translated from Scratch into\n
5111
04:54:14,920 --> 04:54:17,960
like loops and conditionals,\nBoolean expressions, variables.
5112
04:54:17,960 --> 04:54:19,210
So sort of, more of the same.
5113
04:54:19,210 --> 04:54:22,690
But there are features in C that\n
5114
04:54:22,690 --> 04:54:26,560
like data types, the types of variables\n
5115
04:54:26,561 --> 04:54:28,711
but that, in fact, does\nexist in other languages.
5116
04:54:28,710 --> 04:54:30,460
In fact, a few that\nwe'll see before long.
5117
04:54:30,460 --> 04:54:34,931
So to summarize the types we saw last\n
5118
04:54:34,931 --> 04:54:39,311
We had ints, and floats, and\nlongs, and doubles, and chars
5119
04:54:39,311 --> 04:54:42,771
there's also bools and also string,\n
5120
04:54:42,771 --> 04:54:46,091
But today, let's actually start to\n
5121
04:54:46,091 --> 04:54:50,021
and actually what your Mac and PC\n
5122
04:54:50,021 --> 04:54:53,431
as an int versus a char, versus\na string, versus something else.
5123
04:54:53,431 --> 04:54:56,181
And see if we can't put more tools\n
5124
04:54:56,181 --> 04:54:59,890
so we can start quickly writing\n
5125
04:55:01,061 --> 04:55:04,901
So it turns out, that on\nmost systems nowadays
5126
04:55:04,901 --> 04:55:07,271
though this can vary by\nactual computer, this
5127
04:55:07,271 --> 04:55:10,300
is how large each of the\ndata types, typically
5128
04:55:10,300 --> 04:55:15,850
is in C. When you store a Boolean value,\n
5129
04:55:17,111 --> 04:55:19,361
That's a little excessive,\nbecause, strictly speaking
5130
04:55:19,361 --> 04:55:22,841
you only need 1 bit,\nwhich is 1/8 of this size.
5131
04:55:22,841 --> 04:55:25,451
But for simplicity,\ncomputers use a whole byte
5132
04:55:25,451 --> 04:55:28,001
to represent a bool, true or false.
5133
04:55:28,001 --> 04:55:32,300
A char, we saw last week,\nis only 1 byte, or 8 bits.
5134
04:55:32,300 --> 04:55:37,210
And this is why ASCII, which uses 1\n
5135
04:55:37,210 --> 04:55:41,861
on, was confined to only 256\nmaximally possible characters.
5136
04:55:41,861 --> 04:55:46,201
Notice that an int is\n4 bytes, or 32 bits.
5137
04:55:46,201 --> 04:55:48,841
A float is also 4 bytes or 32 bits.
5138
04:55:48,841 --> 04:55:52,111
But the things that we call long,\n
5139
04:55:54,690 --> 04:55:58,161
A double is 64 bits of precision\nfor floating point values.
5140
04:55:58,161 --> 04:56:01,475
And a string, for today, we're\n
5141
04:56:01,475 --> 04:56:03,600
We'll come back to that,\nlater today and next week
5142
04:56:03,600 --> 04:56:06,780
as to how much space a string\ntakes up, but, suffice it to say
5143
04:56:06,780 --> 04:56:09,748
it's going to take up a\nvariable amount of space
5144
04:56:09,748 --> 04:56:11,791
depending on whether the\nstring is short or long.
5145
04:56:11,791 --> 04:56:14,730
But we'll see exactly what\nthat means, before long.
5146
04:56:14,730 --> 04:56:19,291
So here's a photograph of\na typical piece of memory
5147
04:56:19,291 --> 04:56:22,021
inside of your Mac, or PC, or phone.
5148
04:56:22,021 --> 04:56:24,421
Odds are, it might be a little\nsmaller in some devices.
5149
04:56:24,420 --> 04:56:27,210
This is known as RAM,\nor random access memory.
5150
04:56:27,210 --> 04:56:29,670
Each of these little black\nchips on this circuit
5151
04:56:29,670 --> 04:56:31,980
board, the green thing,\nthese little black chips
5152
04:56:31,980 --> 04:56:34,890
are where 0s and 1s are actually stored.
5153
04:56:34,890 --> 04:56:36,931
Each of those stores\nsome number of bytes.
5154
04:56:36,931 --> 04:56:39,390
Maybe megabytes, maybe\neven gigabytes, nowadays.
5155
04:56:39,390 --> 04:56:45,690
So let's focus on one of those chips,\n
5156
04:56:45,690 --> 04:56:49,650
Let's consider the fact that, even\n
5157
04:56:49,651 --> 04:56:53,731
how this kind of thing is made, if\n
5158
04:56:53,730 --> 04:56:56,190
for the sake of discussion,\nit stands to reason that
5159
04:56:56,190 --> 04:57:00,091
if this thing is storing 1\nbillion bytes, 1 gigabyte
5160
04:57:00,091 --> 04:57:02,370
then we can number them, arbitrarily.
5161
04:57:02,370 --> 04:57:05,850
Maybe this will be byte\n0, 1, 2, 3, 4, 5, 6, 7, 8.
5162
04:57:05,850 --> 04:57:09,260
Then, maybe, way down here in the bottom\n
5163
04:57:09,260 --> 04:57:13,020
We can just number these things,\nas might be our convention.
5164
04:57:13,021 --> 04:57:14,971
Let's draw that graphically.
5165
04:57:14,971 --> 04:57:17,351
Not with a billion squares,\nbut fewer than those.
5166
04:57:17,350 --> 04:57:19,670
And let's zoom in further,\nand consider that.
5167
04:57:19,670 --> 04:57:21,420
At this point in the\nstory, let's abstract
5168
04:57:21,420 --> 04:57:23,640
away all the hardware,\nand all the little wires
5169
04:57:23,640 --> 04:57:27,990
and just think of memory as taking\n
5170
04:57:27,991 --> 04:57:30,431
as taking up some number of bytes.
5171
04:57:30,431 --> 04:57:34,081
So, for instance, if you were to store\n
5172
04:57:34,080 --> 04:57:38,490
was 1 byte, it might be stored\nat this top left-hand location
5173
04:57:38,491 --> 04:57:40,456
of this black chip of memory.
5174
04:57:40,455 --> 04:57:44,550
If you were to store something like\n
5175
04:57:44,550 --> 04:57:47,820
it might use four of those bytes,\n
5176
04:57:47,820 --> 04:57:49,480
back-to-back-to-back, in this case.
5177
04:57:49,480 --> 04:57:53,530
If you were to store a long or a double,\n
5178
04:57:53,530 --> 04:57:55,650
So I'm filling in these\nsquares to represent
5179
04:57:55,651 --> 04:58:00,421
how much memory any given variable\n
5180
04:58:00,420 --> 04:58:03,490
1, or 4, or 8, in this case, here.
5181
04:58:03,491 --> 04:58:06,421
Well, from here, let's abstract\naway from all of the hardware
5182
04:58:06,420 --> 04:58:08,580
and really focus on\nmemory as being a grid.
5183
04:58:08,580 --> 04:58:11,911
Or, really, like a canvas that\nwe can paint any types of data
5184
04:58:13,111 --> 04:58:16,861
At the end of the day, all of this\n
5185
04:58:16,861 --> 04:58:20,761
But it's up to you and I to build\nabstractions on top of that.
5186
04:58:20,760 --> 04:58:24,390
Things like actual numbers,\ncolors, images, movies, and beyond.
5187
04:58:24,390 --> 04:58:26,701
But we'll start\nlower-level, here, first.
5188
04:58:26,701 --> 04:58:30,210
Suppose I had a program\nthat needs three integers.
5189
04:58:30,210 --> 04:58:33,060
A simple program whose purpose\nin life is to average your three
5190
04:58:33,061 --> 04:58:36,661
scores on an exam, or some such thing.
5191
04:58:36,661 --> 04:58:41,280
Suppose that your three scores were\n
5192
04:58:42,405 --> 04:58:47,290
Let's write a program that does\nthis kind of averaging for us.
5193
04:58:47,291 --> 04:58:49,121
Let me go back to VS Code, here.
5194
04:58:49,120 --> 04:58:52,530
Let me open up a file called scores.c.
5195
04:58:52,530 --> 04:58:55,091
Let me implement this as follows.
5196
04:58:55,091 --> 04:59:00,120
Let me include stdio.h at the\ntop, int main(void) as before.
5197
04:59:00,120 --> 04:59:05,580
Then, inside of main, let me\ndeclare score 1, which is 72.
5198
04:59:08,251 --> 04:59:11,401
Then, a third score, called\nscore 3, which is going to be 33.
5199
04:59:11,401 --> 04:59:15,001
Now, I'm going to use printf to print\n
5200
04:59:15,001 --> 04:59:16,780
and I can do this in\na few different ways.
5201
04:59:16,780 --> 04:59:22,111
But I'm going to print out %f, and\n
5202
04:59:22,111 --> 04:59:28,021
plus score 3, divided by 3,\nclose parentheses semicolon.
5203
04:59:28,021 --> 04:59:31,561
Some relatively simple arithmetic to\n
5204
04:59:31,561 --> 04:59:34,831
if I'm curious what my average grade\n
5205
04:59:35,881 --> 04:59:39,877
Let me, now, do make scores.
5206
04:59:39,877 --> 04:59:43,501
All right, so I've somehow\nmade an error already.
5207
04:59:43,501 --> 04:59:49,411
But this one is, actually, germane\nto a problem we, hopefully
5208
04:59:49,411 --> 04:59:51,120
won't encounter too frequently.
5209
04:59:52,120 --> 04:59:55,620
So underlined to score 1, plus\n
5210
04:59:55,620 --> 05:00:00,510
Format specifies type double, but\n
5211
05:00:02,791 --> 05:00:04,691
Because the arithmetic\nseems to check out.
5212
05:00:05,190 --> 05:00:08,820
AUDIENCE: So the computer is doing the\n
5213
05:00:08,820 --> 05:00:13,521
just gives out a value at the\nend because, well [INAUDIBLE]
5214
05:00:14,471 --> 05:00:15,901
And we'll come back to\nthis in more detail
5215
05:00:15,901 --> 05:00:18,783
but, indeed, what's happening here\n
5216
05:00:18,782 --> 05:00:20,740
obviously, because I\ndefine them right up here.
5217
05:00:20,741 --> 05:00:23,731
And I'm dividing by another\nint, 3, but the catch
5218
05:00:23,730 --> 05:00:28,150
is, recall that C when it performs math,\n
5219
05:00:28,151 --> 05:00:30,070
But integers are not\nfloating point values.
5220
05:00:30,070 --> 05:00:33,151
So if you actually want to get a\nprecise average for your score
5221
05:00:33,151 --> 05:00:37,021
without throwing away the remainder,\n
5222
05:00:37,021 --> 05:00:39,690
it turns out, we're going to have to--
5223
05:00:42,690 --> 05:00:46,980
[LAUGHTER] we're going to have to\n
5224
05:00:47,611 --> 05:00:50,491
And there's a few ways to\ndo this but the easiest way
5225
05:00:50,491 --> 05:00:52,801
for now, I'm going to go\nahead and do this up here
5226
05:00:52,800 --> 05:00:55,620
I'm going to change the\ndivide by 3 to divide by 3.0.
5227
05:00:55,620 --> 05:00:59,701
Because it turns out, long story short,\n
5228
05:00:59,701 --> 05:01:01,561
participating in an\narithmetic expression
5229
05:01:01,561 --> 05:01:03,991
like this is something\nlike a float, the rest
5230
05:01:03,991 --> 05:01:08,471
will be treated as promoted to\na floating point value as well.
5231
05:01:08,471 --> 05:01:13,756
So let me, now, recompile this\ncode with make scores, Enter.
5232
05:01:13,756 --> 05:01:17,761
This time it worked OK, because\nI'm treating a float as a float.
5233
05:01:19,861 --> 05:01:24,411
All right, my average is\n59.33333 and so forth.
5234
05:01:24,911 --> 05:01:27,600
So the math, presumably, checks out.
5235
05:01:27,600 --> 05:01:30,480
Floating point imprecision\nper last week aside.
5236
05:01:30,480 --> 05:01:33,541
But let's consider the\ndesign of this program.
5237
05:01:33,541 --> 05:01:40,941
What is, kind of, bad about it, or if\n
5238
05:01:40,940 --> 05:01:43,740
are we going to regret the\ndesign of this program?
5239
05:01:43,741 --> 05:01:45,251
What might not be ideal here?
5240
05:01:54,625 --> 05:01:58,480
DAVID MALAN: Yeah, so in this case,\n
5241
05:01:58,480 --> 05:02:01,400
So, if I'm hearing you\ncorrectly, this program
5242
05:02:01,401 --> 05:02:03,861
is only ever going to tell\nme this specific average.
5243
05:02:03,861 --> 05:02:05,991
I'm not even using\nsomething like, get int
5244
05:02:05,991 --> 05:02:09,051
or get float to get three different\nscores, so that's not good.
5245
05:02:09,050 --> 05:02:11,202
And suppose that we wait\nlater in the semester
5246
05:02:11,203 --> 05:02:12,661
I think other problems could arise.
5247
05:02:13,161 --> 05:02:15,280
AUDIENCE: Just thinking\nalso somewhat of an issue
5248
05:02:15,280 --> 05:02:17,161
that you can't reuse that number.
5249
05:02:17,161 --> 05:02:19,710
DAVID MALAN: I can't\nreuse the number because I
5250
05:02:19,710 --> 05:02:23,348
haven't stored the average in some\n
5251
05:02:23,348 --> 05:02:25,890
a big deal, but certainly, if\nI wanted to reuse it elsewhere
5252
05:02:26,911 --> 05:02:29,286
Let's fast-forward again, a\nlittle later in the semester
5253
05:02:29,286 --> 05:02:31,651
I don't just have three\ntest scores or exam scores
5254
05:02:34,951 --> 05:02:36,562
AUDIENCE: Yeah, if you\never want to have to take
5255
05:02:36,562 --> 05:02:39,161
the average of any number of\nscores other than 3, [INAUDIBLE]
5256
05:02:39,161 --> 05:02:42,370
DAVID MALAN: Yeah, I've sort\nof, capped this program at 3.
5257
05:02:42,370 --> 05:02:45,202
And honestly, this is, kind\nof, bordering on copy paste.
5258
05:02:45,203 --> 05:02:48,161
Even though the variables, yes, have\n
5259
05:02:49,061 --> 05:02:51,491
Imagine doing this for a\nwhole grade book for a class.
5260
05:02:51,491 --> 05:02:57,251
Having to score 4, 5, 6, 11, 10, 12,\n
5261
05:02:57,251 --> 05:02:59,681
You can imagine just\nhow ugly the code starts
5262
05:02:59,681 --> 05:03:02,896
to get if you're just defining variable\n
5263
05:03:02,896 --> 05:03:07,001
So it turns out, there are\nbetter ways, in languages like C
5264
05:03:07,001 --> 05:03:11,501
if you want to have multiple\nvalues stored in memory that
5265
05:03:11,501 --> 05:03:13,300
happen to be of the same data type.
5266
05:03:13,300 --> 05:03:14,681
Let's take a look back\nat this memory, here
5267
05:03:14,681 --> 05:03:16,806
to see what these things\nmight look like in memory.
5268
05:03:16,806 --> 05:03:18,431
Here's that grid of memory.
5269
05:03:18,431 --> 05:03:20,710
Each of these, recall, represents a byte.
5270
05:03:20,710 --> 05:03:23,951
To be clear, if I store\nscore 1 in memory first
5271
05:03:23,951 --> 05:03:25,390
how many bytes will it take up?
5272
05:03:28,690 --> 05:03:32,838
So I might draw a score 1 as\nfilling up this part of the memory.
5273
05:03:32,838 --> 05:03:36,131
It's up to the computer as to whether it\n
5274
05:03:36,131 --> 05:03:39,550
I'm just keeping the pictures clean\n
5275
05:03:39,550 --> 05:03:42,341
If I, then, declare another\nvariable, called score 2
5276
05:03:42,341 --> 05:03:44,991
it might end up over there,\nalso taking up 4 bytes.
5277
05:03:44,991 --> 05:03:47,591
And then score 3 might end up here.
5278
05:03:47,591 --> 05:03:51,140
So that's just representing what's going\n
5279
05:03:51,140 --> 05:03:54,940
But technically speaking, to\nbe clear, per week 0, what's
5280
05:03:54,940 --> 05:03:58,841
really being stored in the computer's\n
5281
05:03:58,841 --> 05:04:03,611
32 total, in this case,\nbecause 32 bits is 4 bytes.
5282
05:04:03,611 --> 05:04:07,541
But again, it gets boring\nquickly to think in and look
5283
05:04:09,021 --> 05:04:11,381
So we'll, generally, abstract\nthis away as just using
5284
05:04:11,381 --> 05:04:13,811
decimal numbers, in this case, instead.
5285
05:04:13,811 --> 05:04:18,431
But there might be a better way to\n
5286
05:04:18,431 --> 05:04:21,761
but maybe four, maybe\nfive, maybe 10, maybe more
5287
05:04:21,760 --> 05:04:27,370
by declaring one variable to store\n
5288
05:04:27,370 --> 05:04:30,010
or more individual variables.
5289
05:04:30,010 --> 05:04:34,510
The way to do this is by way\nof something known as an array.
5290
05:04:34,510 --> 05:04:42,580
An array is another type of data that\n
5291
05:04:42,580 --> 05:04:45,240
of the same type back-to-back-to-back.
5292
05:04:45,241 --> 05:04:46,491
That is to say, contiguously.
5293
05:04:46,491 --> 05:04:54,101
So an array can let you create\n
5294
05:04:54,100 --> 05:04:56,860
or even more than\nthat, but describe them
5295
05:04:56,861 --> 05:05:00,651
all using the same variable\nname, the same one name.
5296
05:05:00,651 --> 05:05:05,001
So for instance, if, for one\n
5297
05:05:05,001 --> 05:05:10,061
but I don't want to messily declare\n
5298
05:05:11,221 --> 05:05:13,390
This is today's first\nnew piece of syntax
5299
05:05:13,390 --> 05:05:15,550
the square brackets\nthat we're now seeing.
5300
05:05:15,550 --> 05:05:21,400
This line of code, here, is\nsimilar to int score 1 semicolon
5301
05:05:21,401 --> 05:05:24,621
or int score 1 equals 72 semicolon.
5302
05:05:24,620 --> 05:05:30,040
This line of code is declaring for\n
5303
05:05:30,041 --> 05:05:33,521
And that array is going\nto store three integers.
5304
05:05:34,030 --> 05:05:39,251
Because the type of that\narray is an int, here.
5305
05:05:39,251 --> 05:05:42,370
The square brackets tell the\ncomputer how many ints you want.
5306
05:05:43,241 --> 05:05:45,401
And the name is, of course, scores.
5307
05:05:45,401 --> 05:05:47,801
Which, in English, I've\ndeliberately pluralized
5308
05:05:47,800 --> 05:05:52,361
so that I can describe this array\n
5309
05:05:52,361 --> 05:05:57,230
So if I want to now assign values\n
5310
05:05:59,021 --> 05:06:04,421
I can say, scores bracket 0 equals\n
5311
05:06:04,420 --> 05:06:06,450
and scores bracket 2 equals 33.
5312
05:06:06,451 --> 05:06:08,201
The only thing weird\nthere is, admittedly
5313
05:06:08,201 --> 05:06:10,091
the square brackets which are still new.
5314
05:06:10,091 --> 05:06:14,081
But we're also, notice,\n0 indexing things.
5315
05:06:14,080 --> 05:06:16,605
To zero index means to\nstart counting at 0.
5316
05:06:16,605 --> 05:06:18,730
When we've talked about\nthat before, our for loops
5317
05:06:18,730 --> 05:06:20,260
have, generally, been zero indexed.
5318
05:06:20,260 --> 05:06:24,130
Arrays in C are zero indexed.
5319
05:06:24,131 --> 05:06:25,691
And you do not have a choice over that.
5320
05:06:25,690 --> 05:06:28,810
You can't start counting at 1\nin arrays because you prefer to
5321
05:06:28,811 --> 05:06:31,091
you'd be sacrificing\none of the elements.
5322
05:06:31,091 --> 05:06:33,881
You have to start in\narrays counting from 0.
5323
05:06:33,881 --> 05:06:37,390
So out of context, this\ndoesn't solve a problem
5324
05:06:37,390 --> 05:06:39,460
but it, definitely, is\ngoing to once we have more
5325
05:06:39,460 --> 05:06:41,170
than, even, three scores here.
5326
05:06:41,170 --> 05:06:44,010
In fact, let me change\nthis program a little bit.
5327
05:06:45,710 --> 05:06:48,280
And delete these three lines, here.
5328
05:06:48,280 --> 05:06:51,341
And replace it with a\nscores variable that's
5329
05:06:51,341 --> 05:06:54,401
ready to store three total integers.
5330
05:06:54,401 --> 05:06:58,391
And then, initialize them as\nfollows, scores bracket 0 is 72
5331
05:06:58,390 --> 05:07:02,560
as before, scores bracket 1 is\ngoing to be 73, scores bracket 2
5332
05:07:04,001 --> 05:07:08,329
Notice, I do not need to say\nint before any of these lines
5333
05:07:08,329 --> 05:07:10,121
because that's been\ntaken care of, already
5334
05:07:10,120 --> 05:07:14,830
for me on line 5, where I already\n
5335
05:07:17,591 --> 05:07:21,280
Now, down here, this code needs\n
5336
05:07:21,280 --> 05:07:23,560
three variables, score 1, 2, and 3.
5337
05:07:23,561 --> 05:07:28,211
I have 1 variable, but\nthat I can index into.
5338
05:07:28,210 --> 05:07:33,010
I'm going to, here, then, do scores\n
5339
05:07:33,010 --> 05:07:37,630
plus scores bracket 2, which is\n
5340
05:07:37,631 --> 05:07:39,161
giving me back those three integers.
5341
05:07:39,161 --> 05:07:42,120
But notice, I'm using the same\nvariable name, every time.
5342
05:07:42,120 --> 05:07:45,330
And again, I'm using this new square\n
5343
05:07:45,330 --> 05:07:50,850
index into the array to get at the first\n
5344
05:07:50,850 --> 05:07:53,100
and then, to do it again down here.
5345
05:07:53,100 --> 05:07:56,167
Now, this program, still not really\n
5346
05:07:56,168 --> 05:07:58,501
I still can only store three\nscores, but we'll come back
5347
05:07:58,501 --> 05:08:00,190
to something like that before long.
5348
05:08:00,190 --> 05:08:03,210
But for now, we're just introducing\n
5349
05:08:03,210 --> 05:08:09,240
whereby, I can now store multiple\nvalues in the same variable.
5350
05:08:09,241 --> 05:08:11,371
Well, let's enhance this a bit more.
5351
05:08:11,370 --> 05:08:14,920
Instead of hard coding these scores,\n
5352
05:08:14,920 --> 05:08:19,050
let's use get int to ask\nthe user for a score.
5353
05:08:19,050 --> 05:08:22,591
Let's, then, use get int to\nask the user for another score.
5354
05:08:22,591 --> 05:08:25,800
Let's use get int to ask\nthe user for a third score
5355
05:08:25,800 --> 05:08:28,661
storing them in those\nrespective locations.
5356
05:08:28,661 --> 05:08:34,080
And, now, if I go ahead and save\n
5357
05:08:35,161 --> 05:08:38,251
Now these errors should be\ngetting a little familiar.
5358
05:08:41,010 --> 05:08:42,135
Let me give folks a moment.
5359
05:08:45,361 --> 05:08:48,480
That was not intentional, so still\n
5360
05:08:50,580 --> 05:08:53,830
Now, I'm going to go back to the bottom\n
5361
05:08:54,330 --> 05:08:55,930
We're back in business, ./scores.
5362
05:08:55,931 --> 05:08:58,181
Now, the program is getting\na little more interesting.
5363
05:08:58,181 --> 05:09:02,280
So maybe, this year was better and I got\n
5364
05:09:05,161 --> 05:09:06,631
So now, it's a little more dynamic.
5365
05:09:06,631 --> 05:09:07,531
It's a little more interesting.
5366
05:09:07,530 --> 05:09:10,238
But it's still capping the number\n
5367
05:09:10,239 --> 05:09:15,001
But now, I've introduced another,\n
5368
05:09:15,001 --> 05:09:18,368
There's this expression in programming,\n
5369
05:09:18,368 --> 05:09:20,161
[SNIFFS AIR] something\nsmells a little off.
5370
05:09:20,161 --> 05:09:24,811
And there's something off here in\n
5371
05:09:24,811 --> 05:09:29,341
Does anyone see an opportunity to\n
5372
05:09:29,341 --> 05:09:32,491
if my goal, still, is to get three\n
5373
05:09:32,491 --> 05:09:34,691
without it smelling [SNIFF] kind of bad?
5374
05:09:35,190 --> 05:09:37,200
AUDIENCE: [INAUDIBLE] use a for loop?
5375
05:09:37,201 --> 05:09:40,219
That way you don't have to copy\nand paste all of those scores.
5376
05:09:40,219 --> 05:09:41,421
DAVID MALAN: Yeah, exactly.
5377
05:09:41,420 --> 05:09:43,282
Those lines of code\nare almost identical.
5378
05:09:43,282 --> 05:09:45,740
And honestly, the only thing\nthat's changing is the number
5379
05:09:45,741 --> 05:09:47,361
and it's just incrementing by 1.
5380
05:09:47,361 --> 05:09:49,591
We have all of the building\nblocks to do this better.
5381
05:09:49,591 --> 05:09:51,390
So let me go ahead and improve this.
5382
05:09:55,980 --> 05:10:00,411
So for int i gets 0, i\nless than 3, i plus plus.
5383
05:10:00,411 --> 05:10:03,320
Then, inside of this for loop,\nI can distill all three
5384
05:10:03,320 --> 05:10:05,120
of those lines into\nsomething more generic
5385
05:10:05,120 --> 05:10:10,790
like scores bracket i equals get\n
5386
05:10:10,791 --> 05:10:13,166
once, via get int, for a score.
5387
05:10:13,166 --> 05:10:16,261
So this is where arrays\nstart to get pretty powerful.
5388
05:10:16,260 --> 05:10:18,260
You don't have to hard\ncode, that is, literally
5389
05:10:18,260 --> 05:10:20,722
type in all of these magic\nnumbers like 0, 1, and 2.
5390
05:10:20,723 --> 05:10:22,431
You can start to do\nit, programmatically
5391
05:10:24,030 --> 05:10:25,611
So now, I've tightened things up.
5392
05:10:25,611 --> 05:10:28,491
I'm now, dynamically, getting\nthree different scores
5393
05:10:28,491 --> 05:10:31,027
but putting them in three\ndifferent locations.
5394
05:10:31,026 --> 05:10:34,730
And so this program, ultimately, is\n
5395
05:10:34,730 --> 05:10:41,780
Make scores, ./scores, and 100, 99,\n
5396
05:10:41,780 --> 05:10:43,700
But it's a little better designed, too.
5397
05:10:43,701 --> 05:10:45,620
If I really want to\nnitpick, there's something
5398
05:10:45,620 --> 05:10:47,361
that still smells, a little bit, here.
5399
05:10:47,361 --> 05:10:51,800
The fact that I have indeed, this\n
5400
05:10:51,800 --> 05:10:54,150
has to be the same as this number here.
5401
05:10:54,151 --> 05:10:56,431
Otherwise, who knows\nwhat's going to go wrong.
5402
05:10:56,431 --> 05:10:58,640
So what might be a\nsolution, per last week
5403
05:10:58,640 --> 05:11:01,221
to cleaning that code up further, too?
5404
05:11:01,221 --> 05:11:04,011
AUDIENCE: [INAUDIBLE]\nthe user's discretion
5405
05:11:04,010 --> 05:11:06,002
how many input scores [INAUDIBLE].
5406
05:11:06,003 --> 05:11:09,050
DAVID MALAN: OK, so we could leave\n
5407
05:11:09,050 --> 05:11:11,760
And so we could, actually,\ndo something like this.
5408
05:11:11,760 --> 05:11:13,460
Let me take this a few steps ahead.
5409
05:11:13,460 --> 05:11:20,490
Let me say something like, int n gets\n
5410
05:11:20,491 --> 05:11:24,861
then I could actually change this\n
5411
05:11:24,861 --> 05:11:27,230
and, indeed, make the\nwhole program dynamic?
5412
05:11:27,230 --> 05:11:29,931
Ask the human how many tests\nhave there been this semester?
5413
05:11:29,931 --> 05:11:31,761
Then, you can type in\neach of those scores
5414
05:11:31,760 --> 05:11:33,968
because the loop is going\nto iterate that many times.
5415
05:11:33,969 --> 05:11:37,281
And then you'll get the average\nof one test, two test, three--
5416
05:11:37,280 --> 05:11:41,780
well, lost another-- or however\nmany scores that were actually
5417
05:11:41,780 --> 05:11:45,021
specified by the user. Yeah, question?
5418
05:11:45,021 --> 05:11:50,026
AUDIENCE: How many bits or\nbytes get used in an array?
5419
05:11:50,026 --> 05:11:52,320
DAVID MALAN: How many\nbytes are used in an array?
5420
05:11:52,320 --> 05:11:56,784
AUDIENCE: [INAUDIBLE] point of\ndoing this is to save [INAUDIBLE]
5421
05:11:56,784 --> 05:11:59,760
DAVID MALAN: So the purpose of\nan array is not to save space.
5422
05:11:59,760 --> 05:12:03,270
It's to eliminate having\nmultiple variable names
5423
05:12:03,271 --> 05:12:05,161
because that gets very messy quickly.
5424
05:12:05,161 --> 05:12:09,241
If you have score 1, score 2,\nscore 3, dot, dot, dot, score 99
5425
05:12:09,241 --> 05:12:12,361
that's, like, 99 different\nvariables, potentially
5426
05:12:12,361 --> 05:12:18,421
that you could collapse into one\nvariable that has 99 locations.
5427
05:12:18,420 --> 05:12:20,490
At different indices, or indexes,
5428
05:12:20,491 --> 05:12:22,831
as some would say. The index\nfor an array
5429
05:12:22,830 --> 05:12:25,016
is whatever is in the square brackets.
5430
05:12:35,820 --> 05:12:37,541
DAVID MALAN: So it's a good question.
5431
05:12:37,541 --> 05:12:39,631
So if you-- I'm using\nints for everything--
5432
05:12:39,631 --> 05:12:41,820
and honestly, we don't\nreally need ints for scores
5433
05:12:41,820 --> 05:12:46,030
because I'm not likely to get a\n
5434
05:12:46,030 --> 05:12:47,880
And so you could use\ndifferent data types.
5435
05:12:47,881 --> 05:12:50,548
And that list we had on the screen,\nearlier, is not all of them.
5436
05:12:50,547 --> 05:12:54,030
There's a data type called short,\nwhich is shorter than an int
5437
05:12:54,030 --> 05:12:59,111
you could, technically, use char, in\n
5438
05:12:59,111 --> 05:13:01,201
Generally speaking, in\nthe year 2021, these
5439
05:13:01,201 --> 05:13:05,251
tend to be over optima--\noverly optimized decisions.
5440
05:13:05,251 --> 05:13:07,201
Everyone just uses\nints, even though no one
5441
05:13:07,201 --> 05:13:10,561
is going to get a test score that's 2\n
5442
05:13:11,521 --> 05:13:14,512
Years ago, memory was expensive.
5443
05:13:14,512 --> 05:13:16,470
And every one of your\ninstincts would have been
5444
05:13:16,471 --> 05:13:18,961
spot on because memory was so tight.
5445
05:13:18,960 --> 05:13:21,190
But, nowadays, we don't\nworry as much about it.
5446
05:13:21,690 --> 05:13:26,816
AUDIENCE: I have a question\nabout the error [INAUDIBLE].
5447
05:13:26,816 --> 05:13:30,865
Could it-- when you're doing a\ncash problem on the problem set--
5448
05:13:30,866 --> 05:13:34,271
DAVID MALAN: So what is the\ndifference between dividing two ints
5449
05:13:34,271 --> 05:13:36,640
and not getting an error, as\nyou might have encountered
5450
05:13:36,640 --> 05:13:40,181
in a program like cash,\nversus dividing two ints
5451
05:13:40,181 --> 05:13:42,411
and getting an error\nlike I did a moment ago?
5452
05:13:42,411 --> 05:13:46,541
The problem with the scenario I created\n
5453
05:13:46,541 --> 05:13:52,241
And I was telling printf to use a %f,\n
5454
05:13:52,241 --> 05:13:54,841
of dividing integers by another integer.
5455
05:13:54,841 --> 05:13:57,190
So it was printf that was yelling at me.
5456
05:13:57,190 --> 05:14:00,190
I'm guessing in the scenario you're\n
5457
05:14:00,190 --> 05:14:03,440
printf was not involved in\nthat particular line of code.
5458
05:14:03,440 --> 05:14:05,126
So that's the difference, there.
5459
05:14:05,920 --> 05:14:09,370
So we, now, have this\nability to create an array.
5460
05:14:09,370 --> 05:14:11,771
And an array can store multiple values.
5461
05:14:11,771 --> 05:14:15,710
What, then, might we do that's more\n
5462
05:14:16,210 --> 05:14:18,490
Well, let's take this one step further.
5463
05:14:18,491 --> 05:14:25,391
As opposed to just storing 72, 73, 33 or\n
5464
05:14:25,390 --> 05:14:30,190
because again, an array gives you one\n
5465
05:14:30,190 --> 05:14:32,620
or indices therein,\nbracket 0, bracket 1
5466
05:14:32,620 --> 05:14:35,591
bracket 2 on up, if it\nwere even bigger than that.
5467
05:14:35,591 --> 05:14:40,361
Let's, now, start to consider something\n
5468
05:14:40,361 --> 05:14:43,091
Chars, being 1 byte each,\nso they're even smaller
5469
05:14:43,091 --> 05:14:44,350
they take up much less space.
5470
05:14:44,350 --> 05:14:46,308
And, indeed, if I wanted\nto say a message like
5471
05:14:46,309 --> 05:14:48,460
hi I could use three variables.
5472
05:14:48,460 --> 05:14:52,780
If I wanted a program to print,\nhi, H-I exclamation point
5473
05:14:52,780 --> 05:14:57,490
I could, of course, store those in\n
5474
05:14:57,491 --> 05:15:00,971
And let's, for the sake of discussion,\n
5475
05:15:00,971 --> 05:15:03,941
Let me create a new\nprogram, now, in VS Code.
5476
05:15:03,940 --> 05:15:07,181
This time, I'm going to call it hi.c.
5477
05:15:07,181 --> 05:15:09,911
And I'm not going to bother\nwith the CS50 library.
5478
05:15:09,911 --> 05:15:11,920
I just need the standard\nI/O one, for now.
5479
05:15:13,480 --> 05:15:16,661
And then, inside of main, I'm going\n
5480
05:15:16,661 --> 05:15:20,021
And this is already, hopefully,\nstriking you as a bad idea.
5481
05:15:20,021 --> 05:15:22,570
But we'll go down this\nroad, temporarily
5482
05:15:22,570 --> 05:15:26,561
with c1, and c2, and, finally, c3.
5483
05:15:26,561 --> 05:15:29,921
Storing each character in\nthe phrase I want to print
5484
05:15:29,920 --> 05:15:33,710
and I'm going to print this\nin a different way than usual.
5485
05:15:35,140 --> 05:15:38,740
And we've, generally, dealt with\n
5486
05:15:38,741 --> 05:15:45,861
But %c, %c, %c, will let me print out\n
5487
05:15:45,861 --> 05:15:48,681
So, kind of, a stupid way\nof printing out a string.
5488
05:15:48,681 --> 05:15:51,201
So we already have a solution\nto this problem last week.
5489
05:15:51,201 --> 05:15:54,800
But let's poke around at what's\n
5490
05:15:58,736 --> 05:16:00,611
But we, again, could\nhave done this last week
5491
05:16:00,611 --> 05:16:03,791
with a string and just one\nvariable, or even, 0, at that.
5492
05:16:03,791 --> 05:16:07,480
But let's start converting\nthese characters
5493
05:16:07,480 --> 05:16:12,010
to their apparent numeric equivalents\n
5494
05:16:12,010 --> 05:16:16,570
Let me modify these %c's,\njust to be fun, to be %i's.
5495
05:16:16,570 --> 05:16:20,440
And let me add some spaces so there\n
5496
05:16:20,440 --> 05:16:24,610
Let me, now, recompile\nhi, and let me rerun it.
5497
05:16:24,611 --> 05:16:27,161
Just to guess, what should\nI see on the screen now?
5498
05:16:30,960 --> 05:16:32,296
AUDIENCE: The ASCII values?
5499
05:16:32,296 --> 05:16:34,020
DAVID MALAN: The ASCII values.
5500
05:16:34,021 --> 05:16:36,480
And it's intentional that\nI keep using the same word
5501
05:16:36,480 --> 05:16:42,510
hi, because it should be, hopefully,\n
5502
05:16:42,510 --> 05:16:46,380
Which is to say that C knows about\n
5503
05:16:46,381 --> 05:16:48,581
and can do this conversion\nfor us automatically.
5504
05:16:48,580 --> 05:16:51,930
And it seems to be doing it\nimplicitly for us, so to speak.
5505
05:16:51,931 --> 05:16:55,261
Notice that c1, c2 and\nc3 are, obviously, chars
5506
05:16:55,260 --> 05:16:58,680
but printf is able to tolerate\nprinting them as integers.
5507
05:16:58,681 --> 05:17:03,131
If I really wanted to be pedantic,\n
5508
05:17:03,131 --> 05:17:05,581
known as typecasting,\nwhere I can actually
5509
05:17:05,580 --> 05:17:10,870
convert one data type to another,\n
5510
05:17:10,870 --> 05:17:14,161
And we saw in week 0,\nchars, or characters
5511
05:17:14,161 --> 05:17:17,760
are just numbers, like 72, 73, and 33.
5512
05:17:17,760 --> 05:17:21,940
So I can use this parenthetical\n
5513
05:17:21,940 --> 05:17:26,883
[LAUGHTER] three chars to\nthree integers, instead.
5514
05:17:26,883 --> 05:17:28,800
So that's what I meant\nto type the first time.
5515
05:17:30,061 --> 05:17:33,541
So parenthesis, int,\nclose parenthesis says
5516
05:17:33,541 --> 05:17:39,101
take whatever variable comes after this,\n
5517
05:17:39,100 --> 05:17:42,900
The effect is going to be no different,\n
5518
05:17:42,901 --> 05:17:49,171
then running ./hi still works the same,\n
5519
05:17:49,920 --> 05:17:53,520
And we can do this all day long,\nchars to ints, floats to ints
5520
05:17:54,510 --> 05:17:56,148
Sometimes, it's equivalent.
5521
05:17:56,149 --> 05:17:58,066
Other times, you're going\nto lose information.
5522
05:17:58,065 --> 05:18:01,530
Taking a float to an\nint, just intuitively
5523
05:18:01,530 --> 05:18:04,050
is going to throw away everything\nafter the decimal point
5524
05:18:04,050 --> 05:18:06,940
because an int has no decimal point.
5525
05:18:06,940 --> 05:18:09,360
But, for now, I'm going to\nrewind to the version of this
5526
05:18:09,361 --> 05:18:13,411
that just did implicit-type\nconversion, or implicit casting
5527
05:18:13,411 --> 05:18:17,611
just to demonstrate that we can, indeed,\n
5528
05:18:18,210 --> 05:18:20,630
Let me go ahead and do\nthis, now, the week 1 way.
5529
05:18:21,631 --> 05:18:24,466
Let's just do printf, quote-unquote--
5530
05:18:24,466 --> 05:18:28,890
Actually, let's do this, string\ns equals quote-unquote hi
5531
05:18:28,890 --> 05:18:33,940
and then let's do a simple printf\n
5532
05:18:33,940 --> 05:18:36,780
So now I've rewound to last\nweek, where we began this story
5533
05:18:36,780 --> 05:18:40,920
but you'll notice that, if we\nkeep playing around with this--
5534
05:18:43,120 --> 05:18:47,730
Oh, and let me introduce the CS50 library\n
5535
05:18:47,730 --> 05:18:50,521
Let me go ahead and\nrecompile, rerun this
5536
05:18:50,521 --> 05:18:52,529
we seem to be coding in circles, here.
5537
05:18:52,528 --> 05:18:55,070
Like, I've just done the same\nthing multiple different ways.
5538
05:18:55,070 --> 05:18:57,661
But there's clearly\nan equivalence, then
5539
05:18:57,661 --> 05:19:01,239
between sequences of chars and strings.
5540
05:19:01,239 --> 05:19:03,031
And if you do it the\nreal pedantic way, you
5541
05:19:03,030 --> 05:19:07,650
have three different variables, c1, c2,\n
5542
05:19:07,651 --> 05:19:12,131
or you can just treat them all together\n
5543
05:19:12,131 --> 05:19:16,291
But it turns out that\nstrings are actually
5544
05:19:16,291 --> 05:19:22,320
implemented by the computer\nin a pretty now familiar way.
5545
05:19:22,320 --> 05:19:28,643
What might a string actually be\nas of this point in the story?
5546
05:19:28,643 --> 05:19:29,850
Where are we going with this?
5547
05:19:29,850 --> 05:19:31,183
Let me try to look further back.
5548
05:19:32,611 --> 05:19:34,861
AUDIENCE: Can a string like\nthis be an array of chars?
5549
05:19:34,861 --> 05:19:37,671
DAVID MALAN: Yeah, a string\nmight be, and indeed is, just
5550
05:19:39,061 --> 05:19:41,451
So last week we took for\ngranted that strings exist.
5551
05:19:41,451 --> 05:19:43,791
Technically, strings exist,\nbut they're implemented
5552
05:19:43,791 --> 05:19:47,331
as arrays of characters,\nwhich actually opens up
5553
05:19:47,330 --> 05:19:50,030
some interesting possibilities for us.
5554
05:19:50,030 --> 05:19:52,560
Because, let me see, let\nme see if I can do this.
5555
05:19:52,561 --> 05:19:55,820
Let me try to print out,\nnow, three integers again.
5556
05:19:55,820 --> 05:20:01,791
But if string s is but an array, as you\n
5557
05:20:01,791 --> 05:20:04,021
s bracket 1, and s bracket 2.
5558
05:20:04,021 --> 05:20:07,911
So maybe I can start poking\naround inside of strings
5559
05:20:07,911 --> 05:20:09,890
even though we didn't\ndo this last week, so I
5560
05:20:09,890 --> 05:20:11,521
can get at those individual values.
5561
05:20:11,521 --> 05:20:15,530
So make hi, ./hi and,\nvoila, there we go again.
5562
05:20:15,530 --> 05:20:20,468
It's the same 72, 73, 33, but\nnow, I'm sort of, hopefully
5563
05:20:20,469 --> 05:20:22,761
like, wrapping my mind around\nthe fact that, all right
5564
05:20:22,760 --> 05:20:25,911
a string is just an array of\ncharacters, and arrays, you
5565
05:20:25,911 --> 05:20:29,221
can index into them using this\nnew square bracket notation.
5566
05:20:29,221 --> 05:20:32,301
So I can get at any one of\nthese individual characters
5567
05:20:32,300 --> 05:20:38,315
and, heck, convert it to an\ninteger like we did in week 0.
5568
05:20:38,315 --> 05:20:41,271
Let me get a little curious now.
5569
05:20:41,271 --> 05:20:44,280
What else might be in\nthe computer's memory?
5570
05:20:44,280 --> 05:20:47,810
Well, let's-- I'll go back to the\n
5571
05:20:47,811 --> 05:20:50,121
Here might be how we\noriginally implemented hi
5572
05:20:50,120 --> 05:20:53,060
with three variables, c1, c2, c3.
5573
05:20:53,061 --> 05:20:55,761
Of course, that map to these\ndecimal digits or equivalent
5574
05:20:57,140 --> 05:20:59,570
But what was this\nlooking like in memory?
5575
05:20:59,570 --> 05:21:02,510
Literally, when you create a\nstring in memory, like this
5576
05:21:02,510 --> 05:21:05,501
string s equals quote-unquote hi,\nlet's consider what's going on
5577
05:21:05,501 --> 05:21:06,876
underneath the hood, so to speak.
5578
05:21:06,876 --> 05:21:11,751
Well, as an abstraction, a string,\n
5579
05:21:11,751 --> 05:21:13,177
it would seem, 3 bytes, right?
5580
05:21:13,177 --> 05:21:15,260
I've gotten rid of the\nbars, there, because if you
5581
05:21:15,260 --> 05:21:19,911
think of a string as a type, I'm just\n
5582
05:21:19,911 --> 05:21:24,471
But technically, a string, we've\njust revealed, is an array
5583
05:21:26,091 --> 05:21:28,010
So technically, if the\nstring is called s
5584
05:21:28,010 --> 05:21:30,230
s bracket 0 will give\nyou the first character
5585
05:21:30,230 --> 05:21:34,070
s bracket 1, the second,\nand s bracket 2, the third.
5586
05:21:34,070 --> 05:21:37,550
But let me ask this question now,\n
5587
05:21:37,550 --> 05:21:40,820
is the only thing in\nyour computer memory
5588
05:21:40,820 --> 05:21:45,050
and the ability, like a canvas to draw\n
5589
05:21:45,050 --> 05:21:46,881
or whatever on it, but\nthat's it, like this
5590
05:21:46,881 --> 05:21:50,031
is what your Mac, and PC, and\nphone ultimately reduce to.
5591
05:21:50,030 --> 05:21:53,990
Suppose that I'm running a piece\n
5592
05:21:53,991 --> 05:21:57,261
and now I write down\nbye exclamation point.
5593
05:21:57,260 --> 05:21:59,120
Well, where might that go in memory?
5594
05:22:00,105 --> 05:22:03,594
B-Y-E. And then the next thing I type\n
5595
05:22:03,594 --> 05:22:05,511
My memory just might get\nfilled up, over time
5596
05:22:05,510 --> 05:22:08,570
with things that you or\nsomeone else are typing.
5597
05:22:08,570 --> 05:22:14,841
But then how does the computer know if,\n
5598
05:22:14,841 --> 05:22:20,411
is right after H-I exclamation point\n
5599
05:22:23,690 --> 05:22:27,330
All we have are bytes, or 0s and 1s.
5600
05:22:27,330 --> 05:22:29,990
So if you were designing\nthis, how would you
5601
05:22:29,991 --> 05:22:32,541
implement some kind of\ndelimiter between the two?
5602
05:22:32,541 --> 05:22:34,521
Or figure out what the\nlength of a string is?
5603
05:22:36,408 --> 05:22:39,367
DAVID MALAN: OK, so the right\nanswer is use a nul character
5604
05:22:39,367 --> 05:22:41,450
and for those who don't\nknow, what does that mean?
5605
05:22:43,753 --> 05:22:45,710
DAVID MALAN: Yeah, so\nit's a special character.
5606
05:22:45,710 --> 05:22:47,780
Let me describe it as\na sentinel character.
5607
05:22:47,780 --> 05:22:49,835
Humans decided some\ntime ago that you know
5608
05:22:49,835 --> 05:22:52,820
what, if we want to delineate\nwhere one string ends
5609
05:22:52,820 --> 05:22:56,271
and where the next one begins,\nwe just need some special symbol.
5610
05:22:56,271 --> 05:22:59,450
And the symbol they'll use is\ngenerally written as backslash 0.
5611
05:22:59,450 --> 05:23:03,816
This is just shorthand notation\nfor literally eight 0 bits.
5612
05:23:06,800 --> 05:23:10,400
And the nickname for eight\n0 bits, in this context
5613
05:23:13,190 --> 05:23:16,170
And we can actually see this as follows.
5614
05:23:16,170 --> 05:23:18,173
If you look at the\ncorresponding decimal digits
5615
05:23:18,173 --> 05:23:20,841
like you could do by doing out\nthe math or doing the conversion
5616
05:23:20,841 --> 05:23:25,820
like we've done in code, you would\n
5617
05:23:25,820 --> 05:23:30,861
but then 1 extra byte that's sort of\n
5618
05:23:30,861 --> 05:23:33,381
And now I've just written\nit as the decimal number 0.
5619
05:23:33,381 --> 05:23:36,381
The implication of this is\nthat the computer is apparently
5620
05:23:36,381 --> 05:23:40,956
using, not 3 bytes to store\na word like hi, but 4 bytes.
5621
05:23:40,955 --> 05:23:46,310
Whatever the length of the string is,\n
5622
05:23:46,311 --> 05:23:48,901
that demarcates the end of the string.
5623
05:23:48,901 --> 05:23:50,941
So we might draw it like this instead.
5624
05:23:50,940 --> 05:23:55,610
And this character is, again,\npronounced nul, or written N-U-L.
5625
05:23:56,580 --> 05:23:59,330
If humans, at the end of the day,\n
5626
05:23:59,330 --> 05:24:01,163
they just needed to\ndecide, all right, well
5627
05:24:01,163 --> 05:24:04,251
how do we distinguish\none string from another?
5628
05:24:04,251 --> 05:24:06,920
It's a lot easier with\nchars, individually, it's
5629
05:24:06,920 --> 05:24:09,710
a lot easier with ints, it's\neven easier with floats, why?
5630
05:24:09,710 --> 05:24:13,880
Because, per that chart earlier,\n
5631
05:24:13,881 --> 05:24:16,070
Every int is always 4 bytes.
5632
05:24:16,070 --> 05:24:19,010
Every long is always 8 bytes.
5633
05:24:20,540 --> 05:24:24,021
Well, hi is 1, 2, 3 with\nan exclamation point.
5634
05:24:24,021 --> 05:24:27,290
Bye is 1, 2, 3, 4 with\nan exclamation point.
5635
05:24:27,290 --> 05:24:30,710
David is D-A-V-I-D, five\nwithout an exclamation point.
5636
05:24:30,710 --> 05:24:34,470
And so a string can be\nany number of bytes long
5637
05:24:34,471 --> 05:24:36,961
so you somehow need to\ndraw a line in the sand
5638
05:24:36,960 --> 05:24:40,967
to separate in memory\none string from another.
5639
05:24:40,967 --> 05:24:43,672
So what's the implication of this?
5640
05:24:43,672 --> 05:24:45,130
Well, let me go back to code, here.
5641
05:24:46,471 --> 05:24:51,390
This is a bit dangerous, but I'm going\n
5642
05:24:53,471 --> 05:24:57,511
So let me go ahead and\nrecompile, make hi.
5643
05:25:02,881 --> 05:25:06,811
Now let me go ahead and\nrerun make hi, ./hi, Enter.
5644
05:25:07,841 --> 05:25:10,921
So you can actually see in the\ncomputer, unbeknownst to you
5645
05:25:10,920 --> 05:25:14,090
previously, that there's indeed\nsomething else going on there.
5646
05:25:14,091 --> 05:25:17,140
And if I were to make one\nother variant of this program--
5647
05:25:17,140 --> 05:25:19,890
let's get rid of just this\none word and let's have two.
5648
05:25:19,890 --> 05:25:21,810
So let me give myself\nanother string called t
5649
05:25:21,811 --> 05:25:26,070
for instance, just this common\n
5650
05:25:26,070 --> 05:25:29,161
Let me, then print out with %s.
5651
05:25:29,161 --> 05:25:35,045
And let me also print out with %s,\n
5652
05:25:35,045 --> 05:25:38,580
Let me recompile this program,\nand obviously the out--
5653
05:25:38,580 --> 05:25:41,730
ugh-- this is what happens\nwhen I go too fast.
5654
05:25:41,730 --> 05:25:45,001
All right, third mistake\ntoday, close quote.
5655
05:25:52,471 --> 05:25:54,871
Now we have a program that's\nprinting both hi and bye
5656
05:25:54,870 --> 05:25:58,980
only so that we can consider what's\n
5657
05:25:58,980 --> 05:26:04,471
If s is storing hi and\napparently one bonus byte that
5658
05:26:04,471 --> 05:26:07,501
demarcates the end of that\nstring, bye is apparently
5659
05:26:07,501 --> 05:26:10,673
going to fit into the\nlocation directly after.
5660
05:26:10,673 --> 05:26:13,591
And it's wrapping around, but that's\n
5661
05:26:13,591 --> 05:26:16,260
But bye, B-Y-E exclamation\npoint is taking up
5662
05:26:16,260 --> 05:26:23,208
1, 2, 3, 4, plus a fifth byte, as well.
5663
05:26:23,208 --> 05:26:27,841
All right, any questions on this\n
5664
05:26:27,841 --> 05:26:29,820
And we'll contextualize\nthis, before long
5665
05:26:29,820 --> 05:26:32,100
so that this isn't just\nlike, OK, who really cares?
5666
05:26:32,100 --> 05:26:34,990
This is going to be the source\nof actually implementing things.
5667
05:26:34,991 --> 05:26:37,771
In fact for problem set 2, like\ncryptography, and encryption
5668
05:26:37,771 --> 05:26:39,728
and scrambling actual human messages.
5669
05:26:40,771 --> 05:26:44,911
AUDIENCE: So normally if\nyou were to not use string
5670
05:26:44,911 --> 05:26:47,741
you would just make a character\narray that would declare
5671
05:26:47,741 --> 05:26:50,841
how many characters there are so\n
5672
05:26:51,591 --> 05:26:53,741
DAVID MALAN: A good\nquestion, too and let
5673
05:26:53,741 --> 05:26:56,376
me summarize as, if we were\ninstead to use chars all the time
5674
05:26:56,376 --> 05:26:59,501
we would indeed have to know in advance\n
5675
05:26:59,501 --> 05:27:03,010
string that you're storing, how, then,\n
5676
05:27:03,010 --> 05:27:05,260
because when we at CS50 wrote\nthe get string function
5677
05:27:05,260 --> 05:27:07,450
we obviously don't know\nhow long the words are
5678
05:27:07,451 --> 05:27:09,281
going to be that you all are typing in.
5679
05:27:09,280 --> 05:27:12,820
It turns out, two weeks from\nnow we'll see that get string
5680
05:27:12,820 --> 05:27:15,580
uses a technique known as\ndynamic memory allocation.
5681
05:27:15,580 --> 05:27:20,030
And it's going to grow or shrink\n
5682
05:27:22,181 --> 05:27:25,710
AUDIENCE: Why are we using a nul value?
5683
05:27:26,986 --> 05:27:28,111
DAVID MALAN: Good question.
5684
05:27:28,111 --> 05:27:31,140
Why are we using a nul value,\nisn't it wasting a byte?
5685
05:27:31,890 --> 05:27:37,471
But I claim there's really no other way\n
5686
05:27:37,471 --> 05:27:44,009
from the start of another, unless we\n
5687
05:27:44,008 --> 05:27:46,800
All we have, at the end of the day,\n
5688
05:27:46,800 --> 05:27:50,161
Therefore, all we can do is spend\nthose bits in some creative way
5689
05:27:51,780 --> 05:27:54,970
So we're minimally going to spend\n1 byte to solve this problem.
5690
05:27:55,471 --> 05:28:00,158
AUDIENCE: How does our memory device\n
5691
05:28:00,157 --> 05:28:03,530
the backslash n if we don't have\nit stored as a char?
5692
05:28:05,170 --> 05:28:08,950
how does the computer know to move\n
5693
05:28:08,951 --> 05:28:12,251
So backslash n, even though it\nlooks like two characters
5694
05:28:12,251 --> 05:28:16,151
it's actually stored as just 1\nbyte in the computer's memory.
5695
05:28:16,151 --> 05:28:18,618
There's a mapping between\nit and an actual number.
5696
05:28:18,617 --> 05:28:21,700
And you can see that, for instance,\n
5697
05:28:21,701 --> 05:28:25,485
AUDIENCE: So with that being\nstored would be the [INAUDIBLE].
5698
05:28:26,681 --> 05:28:32,471
If I had put a backslash n in my code here,\n
5699
05:28:32,471 --> 05:28:36,101
and here, that would actually shift\n
5700
05:28:36,100 --> 05:28:41,001
need to make room for a backslash n\nhere and another one over here.
5701
05:28:41,001 --> 05:28:43,173
So it would take two\nmore bytes, exactly.
5702
05:28:43,841 --> 05:28:50,311
AUDIENCE: So if hi exclamation\n
5703
05:28:50,311 --> 05:28:56,891
too as 72, 73, 33, if we are to\n
5704
05:28:56,890 --> 05:29:03,350
and convert them into binary how\n
5705
05:29:04,651 --> 05:29:06,651
DAVID MALAN: And what's\nthe last thing you said?
5706
05:29:08,066 --> 05:29:09,960
DAVID MALAN: It's context sensitive.
5707
05:29:09,960 --> 05:29:12,710
So if, at the end of the day, all\n
5708
05:29:12,710 --> 05:29:16,640
like 72, 73, 33, recall\nthat it's up to the program
5709
05:29:16,640 --> 05:29:19,730
to decide, based on context,\nhow to interpret them.
5710
05:29:19,730 --> 05:29:23,570
And I simplified this story in week 0\n
5711
05:29:23,570 --> 05:29:27,170
as RGB colors, and iMessage\nor a text messaging program
5712
05:29:27,170 --> 05:29:31,700
interprets them as letters, and\n
5713
05:29:31,701 --> 05:29:36,800
How those programs do it is by way\n
5714
05:29:37,341 --> 05:29:39,133
And in fact, later this\nsemester, we'll see
5715
05:29:39,133 --> 05:29:43,761
a data type via which you can represent\n
5716
05:29:43,760 --> 05:29:46,501
and red value, a green\nvalue, and a blue value.
5717
05:29:46,501 --> 05:29:48,861
So we'll see other data types as well.
5718
05:29:49,361 --> 05:29:53,581
AUDIENCE: It seems easy enough to just\n
5719
05:29:53,580 --> 05:29:56,450
so why do we have integers\nand long integers?
5720
05:29:56,451 --> 05:29:59,453
Why can't we make everything\nvariable in its data size?
5721
05:29:59,453 --> 05:30:01,161
DAVID MALAN: Really\ninteresting question.
5722
05:30:01,161 --> 05:30:04,370
Why could we not just make all\ndata types variable in size?
5723
05:30:04,370 --> 05:30:07,820
And some languages, some\nlibraries do exactly this.
5724
05:30:07,820 --> 05:30:11,361
C is an older language, and\nbecause memory was expensive
5725
05:30:12,561 --> 05:30:14,901
The reality was you\ngain benefits from just
5726
05:30:14,901 --> 05:30:17,271
standardizing the size of these things.
5727
05:30:17,271 --> 05:30:19,671
You also get performance\nincreases in the sense
5728
05:30:19,670 --> 05:30:23,880
that if you know every int is\n4 bytes, you can very quickly
5729
05:30:23,881 --> 05:30:26,480
and we'll see this next week,\njump from one integer to another
5730
05:30:26,480 --> 05:30:30,861
to another in memory just by adding\n
5731
05:30:30,861 --> 05:30:32,691
You can very quickly poke around.
5732
05:30:32,690 --> 05:30:35,782
Whereas, if you had variable\nlength numbers, you would have to
5733
05:30:35,782 --> 05:30:38,240
kind of, follow, follow, follow,\nlooking for the end of it.
5734
05:30:38,241 --> 05:30:41,041
Follow, follow-- you would have to\n
5735
05:30:41,041 --> 05:30:42,583
So that's a topic we'll come back to.
5736
05:30:42,582 --> 05:30:44,960
But it was generally for efficiency.
5737
05:30:46,431 --> 05:30:52,203
AUDIENCE: Why not store the\nnul character [INAUDIBLE]
5738
05:30:52,203 --> 05:30:55,781
DAVID MALAN: Good question\nwhy not store the--
5739
05:30:55,780 --> 05:30:59,800
why not store the nul\ncharacter at the beginning?
5740
05:30:59,800 --> 05:31:06,150
You could-- let's see, why\nnot store it at the beginning?
5741
05:31:09,341 --> 05:31:12,585
You could absolutely--\nwell, could you do this?
5742
05:31:15,841 --> 05:31:20,640
If you were to do that\nat the beginning--
5743
05:31:22,681 --> 05:31:24,888
No, because I finally thought\nof a problem with this.
5744
05:31:24,888 --> 05:31:26,743
If you store it at\nthe beginning instead
5745
05:31:26,743 --> 05:31:29,161
we'll see in just a moment how\nyou can actually write code
5746
05:31:29,161 --> 05:31:31,411
to figure out where\nthe end of a string is
5747
05:31:31,411 --> 05:31:33,811
and the problem there\nis you wouldn't necessarily
5748
05:31:33,811 --> 05:31:37,261
know if you eventually hit a\n0 at the end of the string
5749
05:31:37,260 --> 05:31:41,070
because it's the number 0 in the\n
5750
05:31:41,070 --> 05:31:44,440
or if it's the context of some\nother data type, altogether.
5751
05:31:44,440 --> 05:31:46,860
So the fact that we've standardized--
5752
05:31:46,861 --> 05:31:50,820
the fact that we've standardized\nstrings as ending with nul
5753
05:31:50,820 --> 05:31:54,916
means that we can reliably distinguish\n
5754
05:31:54,916 --> 05:31:56,820
And that's actually a\nperfect segue, now
5755
05:31:56,820 --> 05:31:59,954
to actually using this\nprimitive to build up
5756
05:31:59,954 --> 05:32:02,621
our own code that manipulates\nthese things that are lower level.
5757
05:32:03,820 --> 05:32:05,911
Let me create a new file called length.
5758
05:32:05,911 --> 05:32:10,260
And let's use this basic idea to\n
5759
05:32:10,260 --> 05:32:14,980
is after it's been stored in a variable.
5760
05:32:16,120 --> 05:32:20,790
Let me include both the CS50\nheader and the standard I/O header
5761
05:32:20,791 --> 05:32:25,511
give myself int main(void) again\n
5762
05:32:25,510 --> 05:32:28,320
Let me prompt the user for\na string s and I'll ask them
5763
05:32:28,320 --> 05:32:32,431
for a string like their name, here.
5764
05:32:32,431 --> 05:32:37,681
And then let me name it more\nverbosely name this time.
5765
05:32:37,681 --> 05:32:39,431
Now let me go ahead and do this.
5766
05:32:39,431 --> 05:32:44,521
Let me iterate over every\ncharacter in this string
5767
05:32:44,521 --> 05:32:46,440
in order to figure out\nwhat its length is.
5768
05:32:46,440 --> 05:32:49,320
So initially, I'm going\nto go ahead and say this
5769
05:32:49,320 --> 05:32:52,300
int length equals 0, because\nI don't know what it is yet.
5770
05:32:52,300 --> 05:32:53,550
So we're going to start at 0.
5771
05:32:53,550 --> 05:32:56,670
And then while the following is true--
5772
05:32:56,670 --> 05:33:01,630
while-- let me-- do I want to do this?
5773
05:33:01,631 --> 05:33:04,320
Let me change this to i,\njust for clarity, let me do
5774
05:33:04,320 --> 05:33:10,050
this, while name bracket i does not\n
5775
05:33:10,050 --> 05:33:13,440
So I typed it on the slide as N-U-L,\n
5776
05:33:13,440 --> 05:33:17,925
you actually use its numeric equivalent,\n
5777
05:33:17,925 --> 05:33:23,190
While name bracket i does not equal the\n
5778
05:33:23,190 --> 05:33:26,730
and increment i to i plus plus.
5779
05:33:26,730 --> 05:33:29,730
And then down here I'm going\nto print out the value of i
5780
05:33:29,730 --> 05:33:33,530
to see what we actually get,\nprinting out the value of i.
5781
05:33:33,530 --> 05:33:35,280
All right, so what's\ngoing to happen here?
5782
05:33:39,001 --> 05:33:43,831
./length and let me type in something\n
5783
05:33:45,001 --> 05:33:48,210
Let me try bye,\nexclamation point, Enter.
5784
05:33:50,131 --> 05:33:52,771
Let me try my own name, David, Enter.
5785
05:33:54,230 --> 05:33:56,140
So what's actually going on here?
5786
05:33:56,140 --> 05:33:58,751
Well, it seems that\nby way of this while loop
5787
05:33:58,751 --> 05:34:00,883
we are specifying a\nlocal variable called
5788
05:34:00,883 --> 05:34:03,841
i initialized to 0, because we're\n
5789
05:34:04,841 --> 05:34:08,311
I'm then asking the\nquestion, does location 0
5790
05:34:08,311 --> 05:34:13,561
that is i in the name string,\nwhich we now know is an array
5791
05:34:15,960 --> 05:34:19,905
Because if it doesn't, that means it's\n
5792
05:34:21,901 --> 05:34:25,171
Then, let's come back around to line\n
5793
05:34:26,850 --> 05:34:30,681
So does name bracket 1 not equal backslash 0?
5794
05:34:30,681 --> 05:34:36,331
Well, if it doesn't, and it won't\nif it's an i, or a y, or an a
5795
05:34:36,330 --> 05:34:39,750
based on what I typed in, we're\ngoing to increment i once more.
5796
05:34:39,751 --> 05:34:43,201
Fast-forward to the end of the story,\n
5797
05:34:43,201 --> 05:34:46,681
technically, one space\npast the end of the string
5798
05:34:46,681 --> 05:34:49,771
name bracket i will equal backslash 0.
5799
05:34:49,771 --> 05:34:54,221
So I don't increment i anymore, I\n
5800
05:34:54,221 --> 05:34:58,771
So what we seem to have here with some\n
5801
05:34:58,771 --> 05:35:03,331
is a program that figures out the length\n
5802
05:35:03,330 --> 05:35:06,120
Let's practice our abstraction\nand decompose this into
5803
05:35:06,120 --> 05:35:07,530
maybe, a helper function here.
5804
05:35:07,530 --> 05:35:11,370
Let me grab all of this\ncode here, and assume
5805
05:35:11,370 --> 05:35:15,841
for the sake of discussion for a moment,\n
5806
05:35:18,001 --> 05:35:21,091
And the length of the string\nis name that I want to get
5807
05:35:21,091 --> 05:35:25,260
and then I'll go ahead and print\nout, just as before with %i
5808
05:35:26,658 --> 05:35:28,951
So now I'm abstracting away\nthis notion of figuring out
5809
05:35:29,992 --> 05:35:32,730
That's an opportunity for me\nto create my own function.
5810
05:35:32,730 --> 05:35:35,775
If I want to create a\nfunction called string length
5811
05:35:35,776 --> 05:35:39,871
I'll claim that I want to\ntake a string as input
5812
05:35:39,870 --> 05:35:45,120
and what should I have this\nfunction return as its return type?
5813
05:35:45,120 --> 05:35:50,350
What should string length\npresumably return?
5814
05:35:53,280 --> 05:35:55,198
Float really wouldn't\nmake sense because we're
5815
05:35:55,198 --> 05:35:57,637
measuring things that are integers.
5816
05:35:57,637 --> 05:35:59,220
In this case, the length of something.
5817
05:35:59,221 --> 05:36:00,901
So indeed, let's have it return an int.
5818
05:36:00,901 --> 05:36:03,641
I can use the same\ncode as before, so I'm
5819
05:36:03,640 --> 05:36:06,435
going to paste what I\ncut earlier in the file.
5820
05:36:06,436 --> 05:36:10,921
The only thing I have to change\nis the name of the variable.
5821
05:36:10,920 --> 05:36:14,501
Because now this function,\nI decided arbitrarily
5822
05:36:14,501 --> 05:36:17,390
that I'm going to call it\ns, just to be more generic.
5823
05:36:17,390 --> 05:36:20,175
So I'm going to look at s\nbracket i at each location.
5824
05:36:20,175 --> 05:36:23,050
And I don't want to print it at the\n
5825
05:36:23,050 --> 05:36:25,510
What's the line of code I should\ninclude here if I actually
5826
05:36:25,510 --> 05:36:28,265
want to hand back the total length?
5827
05:36:31,372 --> 05:36:33,530
DAVID MALAN: Return i, in this case.
5828
05:36:33,530 --> 05:36:35,800
So I'm going to return i, not print it.
5829
05:36:35,800 --> 05:36:40,751
Because now, my main function can\n
5830
05:36:40,751 --> 05:36:42,791
and print it on the next line itself.
5831
05:36:42,791 --> 05:36:46,781
I just need a prototype, so that's\n
5832
05:36:46,780 --> 05:36:48,431
I'm going to rerun make length.
5833
05:36:48,431 --> 05:36:49,901
Hopefully I didn't screw up.
5834
05:36:49,901 --> 05:36:53,591
I didn't. ./length,\nI'll type in hi-- oops--
5835
05:36:56,140 --> 05:36:59,230
I'll type in bye again, and so forth.
5836
05:36:59,230 --> 05:37:02,963
So now we have a function that\n
5837
05:37:02,963 --> 05:37:05,381
Well, it turns out we didn't\nactually need this all along.
5838
05:37:05,381 --> 05:37:10,302
It turns out that we can get rid of my\n
5839
05:37:10,302 --> 05:37:12,760
I can definitely delete the\nwhole implementation down here.
5840
05:37:12,760 --> 05:37:16,420
Because it turns out, in\na file called string.h
5841
05:37:16,420 --> 05:37:19,780
which is a new header file today, we\n
5842
05:37:19,780 --> 05:37:23,950
called, more succinctly,\nstrlen, S-T-R-L-E-N. Which
5843
05:37:25,390 --> 05:37:29,501
This is a function that comes with C,\n
5844
05:37:29,501 --> 05:37:33,710
and it does what we just\nimplemented manually.
5845
05:37:33,710 --> 05:37:37,600
So here's an example of, admittedly, a\n
5846
05:37:38,741 --> 05:37:41,111
And how do you know what kinds\nof functions exist?
5847
05:37:41,111 --> 05:37:45,521
Well, let me pop out to my\nbrowser here, to a website that
5848
05:37:45,521 --> 05:37:48,716
is CS50's incarnation of\nwhat are called manual pages.
5849
05:37:48,716 --> 05:37:52,331
It turns out that in a lot\nof systems, Macs, and Unix
5850
05:37:52,330 --> 05:37:55,360
and Linux systems, including\nthe Visual Studio Code
5851
05:37:55,361 --> 05:37:57,280
instance that we have\nin the cloud, there
5852
05:37:57,280 --> 05:38:00,550
are publicly accessible\nmanual pages for functions.
5853
05:38:00,550 --> 05:38:04,030
They tend to be written very\nexpertly, in a way that's
5854
05:38:05,420 --> 05:38:09,911
So what we have here at\nmanual.cs50.io is CS50's version
5855
05:38:09,911 --> 05:38:13,001
of manual pages that have this\nless-comfortable mode that
5856
05:38:13,001 --> 05:38:15,550
give you a, sort of, cheat\nsheet of very frequently used
5857
05:38:15,550 --> 05:38:19,271
helpful functions in C. And\nwe've translated the expert
5858
05:38:19,271 --> 05:38:22,335
notation to things that a\nbeginner can understand.
5859
05:38:22,335 --> 05:38:26,451
So, for instance, let me go ahead and\n
5860
05:38:26,451 --> 05:38:30,460
You'll see that there's documentation\n
5861
05:38:30,460 --> 05:38:32,771
but more interestingly\ndown here, there's
5862
05:38:32,771 --> 05:38:35,111
a whole bunch of\nstring-related functions
5863
05:38:35,111 --> 05:38:36,881
that we haven't even seen most of, yet.
5864
05:38:36,881 --> 05:38:38,921
But there's indeed one\nhere called strlen
5865
05:38:38,920 --> 05:38:40,880
calculate the length of a string.
5866
05:38:40,881 --> 05:38:46,421
And so if I go to strlen here, I'll\n
5867
05:38:47,230 --> 05:38:49,661
And the way a manual\npage typically works
5868
05:38:49,661 --> 05:38:52,570
whether in CS50's format\nor any other system
5869
05:38:52,570 --> 05:38:55,210
is you see, typically, a\nsynopsis of what header
5870
05:38:55,210 --> 05:38:57,591
files you need to use the function.
5871
05:38:57,591 --> 05:39:00,221
So you would copy paste\nthese couple of lines here.
5872
05:39:00,221 --> 05:39:03,791
You see what the prototype\nis of the function so
5873
05:39:03,791 --> 05:39:06,794
that you know what its inputs are,\n
5874
05:39:06,794 --> 05:39:09,461
Then down below you might see a\ndescription, which in this case
5875
05:39:10,580 --> 05:39:12,430
This function calculates\nthe length of s.
5876
05:39:12,431 --> 05:39:15,370
Then you see what the\nreturn value is, if any
5877
05:39:15,370 --> 05:39:18,570
and you might even see an example, like\n
5878
05:39:18,570 --> 05:39:21,273
So these manual pages\nwhich are again, accessible
5879
05:39:21,273 --> 05:39:23,980
here, and we'll link to these in\n
5880
05:39:23,980 --> 05:39:26,771
are pretty much the place to\nstart when you want to figure out
5881
05:39:26,771 --> 05:39:29,471
has a wheel been invented already?
5882
05:39:29,471 --> 05:39:32,751
Is there a function that might help\n
5883
05:39:32,751 --> 05:39:36,161
so that I don't have to really\nget into the weeds of doing all
5884
05:39:36,161 --> 05:39:37,973
of those lower-level steps as I've had.
5885
05:39:37,973 --> 05:39:40,931
Sometimes the answer is going to be\n
5886
05:39:40,931 --> 05:39:43,421
But again the point of our\nhaving just done this together
5887
05:39:43,420 --> 05:39:46,210
is to reveal that even the\nfunctions you start taking for
5888
05:39:46,210 --> 05:39:50,396
granted, they all reduce to some\nof these basic building blocks.
5889
05:39:50,396 --> 05:39:53,861
At the end of the day, this is\n
5890
05:39:55,210 --> 05:39:57,320
We're just learning,\nnow, how to harness those
5891
05:39:57,320 --> 05:40:01,480
and how to manipulate them ourselves.
5892
05:40:08,065 --> 05:40:16,039
AUDIENCE: We did just see\n[INAUDIBLE] Is that so common
5893
05:40:16,040 --> 05:40:18,296
that we would have to\nspecify it, or is it not?
5894
05:40:18,295 --> 05:40:19,420
DAVID MALAN: Good question.
5895
05:40:19,420 --> 05:40:22,180
Is it so common that you would\nhave to specify it or not?
5896
05:40:22,181 --> 05:40:24,431
You do need to include its\nheader files because that's
5897
05:40:24,431 --> 05:40:25,931
where all of those prototypes are.
5898
05:40:25,931 --> 05:40:29,451
You don't need to worry about\nlinking it in with -l anything.
5899
05:40:29,451 --> 05:40:31,601
And in fact, moving\nforward, you do not ever
5900
05:40:31,600 --> 05:40:35,170
need to worry about linking in\n
5901
05:40:35,170 --> 05:40:39,200
We, the staff, have configured make to\n
5902
05:40:39,201 --> 05:40:41,291
We want you to understand\nthat it is doing it
5903
05:40:41,291 --> 05:40:43,601
but we'll take care of\nall of the -l's for you.
5904
05:40:43,600 --> 05:40:47,620
But the onus is on you for the\nprototypes and the header files.
5905
05:40:47,620 --> 05:40:51,411
Other questions on these\nrepresentations or techniques?
5906
05:40:51,911 --> 05:41:00,181
AUDIENCE: [INAUDIBLE] exclamation mark.
5907
05:41:00,181 --> 05:41:04,784
How does it actually define\nthe spaces [INAUDIBLE]?
5908
05:41:04,784 --> 05:41:06,181
DAVID MALAN: A good question.
5909
05:41:06,181 --> 05:41:09,960
If you were to have a string with actual\n
5910
05:41:09,960 --> 05:41:11,791
what would the computer actually do?
5911
05:41:11,791 --> 05:41:14,221
Well, for this, let me\ngo to asciichart.com.
5912
05:41:14,221 --> 05:41:19,140
Which is just a random website that's\n
5913
05:41:20,190 --> 05:41:22,780
This is, in fact, what we had\na screenshot of the other day.
5914
05:41:22,780 --> 05:41:26,348
And if you look here, it's a little\n
5915
05:41:26,348 --> 05:41:29,640
If a computer were to store a space, it\n
5916
05:41:29,640 --> 05:41:34,690
32, or technically, the pattern of 0s\n
5917
05:41:34,690 --> 05:41:37,501
All of the US English keys that\nyou might type on a keyboard
5918
05:41:37,501 --> 05:41:40,651
can be represented with a\nnumber, and using Unicode you
5919
05:41:40,651 --> 05:41:43,181
can express even things like\nemojis and other languages.
5920
05:41:43,681 --> 05:41:47,390
AUDIENCE: Are only strings\nfollowed by nul number
5921
05:41:47,390 --> 05:41:50,776
or let's say we had a series of\nnumbers, would each one of them
5922
05:41:52,105 --> 05:41:53,230
DAVID MALAN: Good question.
5923
05:41:53,230 --> 05:41:56,050
Only strings are accompanied\nby nuls at the end
5924
05:41:56,050 --> 05:41:59,021
because every other data type\nwe've talked about thus far
5925
05:41:59,021 --> 05:42:01,390
is of well defined finite length.
5926
05:42:01,390 --> 05:42:04,451
1 byte for char, 4 bytes\nfor ints and so forth.
5927
05:42:04,451 --> 05:42:08,501
If we think back to last week, we did\n
5928
05:42:08,501 --> 05:42:12,341
Integer overflow, because 4 bytes, heck,\n
5929
05:42:12,341 --> 05:42:14,530
We also talked about\nfloating point imprecision.
5930
05:42:14,530 --> 05:42:17,740
Thankfully in the world of scientific\n
5931
05:42:17,741 --> 05:42:21,191
there are libraries you can\nuse that draw inspiration
5932
05:42:21,190 --> 05:42:23,080
from this idea of a\nstring, and they might
5933
05:42:23,080 --> 05:42:26,900
use 9 bytes for an integer\nvalue or maybe 20 bytes
5934
05:42:26,901 --> 05:42:28,431
that you can count really high.
5935
05:42:28,431 --> 05:42:30,940
But they will then start to\nmanage that memory for you
5936
05:42:30,940 --> 05:42:34,220
and what they're really probably doing\n
5937
05:42:34,221 --> 05:42:37,331
and somehow remembering how\nlong the sequence of bytes is.
5938
05:42:37,330 --> 05:42:40,450
That's how these higher-level\nlibraries work, too.
5939
05:42:40,451 --> 05:42:41,960
All right, this has been a lot.
5940
05:42:41,960 --> 05:42:43,341
Let's take one more break here.
5941
05:42:43,341 --> 05:42:44,931
We'll do a seven-minute break here.
5942
05:42:44,931 --> 05:42:47,725
And when we come back, we'll\nflesh out a few more details.
5943
05:42:50,651 --> 05:42:55,661
So we just saw strlen as an\nexample of a function that
5944
05:42:55,661 --> 05:42:57,158
comes in the string library.
5945
05:42:57,158 --> 05:42:59,951
Let's start to take more of these\n
5946
05:42:59,951 --> 05:43:03,791
So we're not relying only on the\n
5947
05:43:03,791 --> 05:43:05,921
Let me switch over to VS Code.
5948
05:43:05,920 --> 05:43:10,300
And create a file called, say, string.c,
5949
05:43:10,300 --> 05:43:12,376
to apply this lesson\nlearned, as follows.
5950
05:43:12,376 --> 05:43:19,030
Let me include cs50.h,\nstdio.h, and this new thing
5951
05:43:19,030 --> 05:43:21,521
string.h as well, at the top.
5952
05:43:21,521 --> 05:43:23,958
I'm going to do the usual\nint main(void) here.
5953
05:43:23,958 --> 05:43:26,501
And then in this program suppose,\nfor the sake of discussion
5954
05:43:26,501 --> 05:43:29,800
that I didn't know about\n%s for printf or, heck
5955
05:43:29,800 --> 05:43:33,560
maybe early on there\nwas no %s format code.
5956
05:43:33,561 --> 05:43:36,681
And so there was no easy\nway to print strings.
5957
05:43:36,681 --> 05:43:40,091
Well, at least if we know that\n
5958
05:43:40,091 --> 05:43:44,081
we could use %c as a\nworkaround, a solution to that
5959
05:43:45,681 --> 05:43:49,181
So let me ask myself for a\nstring s by using get_string here
5960
05:43:49,181 --> 05:43:51,761
and I'll ask the user for some input.
5961
05:43:51,760 --> 05:43:57,520
And then, let me print out say, output\n
5962
05:43:58,721 --> 05:44:02,261
Now, the simplest way to do this, of\n
5963
05:44:02,260 --> 05:44:05,220
printf %s, and plug in\nthe s, and we're done.
5964
05:44:05,221 --> 05:44:07,991
But again, for the sake of\ndiscussion, I forgot about
5965
05:44:07,991 --> 05:44:12,081
or someone didn't implement %s,\nso how else could we do this?
5966
05:44:12,080 --> 05:44:16,060
Well, in pseudo code, or in English\n
5967
05:44:16,061 --> 05:44:23,171
this problem, printing out the string\n
5968
05:44:23,170 --> 05:44:26,680
How might we go about solving this?
5969
05:44:26,681 --> 05:44:28,407
Just in English, high-level?
5970
05:44:28,407 --> 05:44:29,990
What would your pseudo code look like?
5971
05:44:30,491 --> 05:44:33,829
AUDIENCE: You could\njust print each letter.
5972
05:44:33,829 --> 05:44:35,621
DAVID MALAN: OK, so\njust print each letter.
5973
05:44:35,620 --> 05:44:37,751
And maybe, more precisely,\nsome kind of loop.
5974
05:44:37,751 --> 05:44:41,291
Like, let's iterate over\nall of the characters in s
5975
05:44:43,550 --> 05:44:48,310
Well, for int i gets 0 is kind of the\n
5976
05:44:49,841 --> 05:44:51,626
OK, how long do I want to iterate?
5977
05:44:51,626 --> 05:44:53,501
Well, it's going to\ndepend on what I type in
5978
05:44:53,501 --> 05:44:55,561
but that's why we have strlen now.
5979
05:44:55,561 --> 05:45:00,341
So iterate up to the length of\ns, and then increment i with plus
5980
05:45:01,335 --> 05:45:04,931
And then let's just print\nout %c with no new line
5981
05:45:04,931 --> 05:45:07,271
because I want everything\non the same line
5982
05:45:07,271 --> 05:45:12,041
whatever the character\nis at s bracket i.
5983
05:45:12,041 --> 05:45:14,050
And then at the very\nend, I'll give myself
5984
05:45:14,050 --> 05:45:16,611
that new line, just to move the\ncursor down to the next line
5985
05:45:16,611 --> 05:45:18,611
so the dollar sign is\nnot in a weird place.
5986
05:45:18,611 --> 05:45:21,491
All right, so let's see if I\ndidn't screw up any of the code
5987
05:45:21,491 --> 05:45:26,951
make string, Enter, so far so good,\n
5988
05:45:30,280 --> 05:45:33,940
Let me do it once more with\nbye, Enter, and that works, too.
5989
05:45:33,940 --> 05:45:36,670
Notice I very deliberately\nand quickly gave myself
5990
05:45:36,670 --> 05:45:39,520
two spaces here and one space\nhere just because I, literally
5991
05:45:39,521 --> 05:45:42,881
wanted these things to line up properly,\n
5992
05:45:42,881 --> 05:45:46,091
But that was just a\ndeliberate formatting detail.
5993
05:45:47,780 --> 05:45:53,501
Which is a claim I've made before,\nbut it's not well-designed.
5994
05:45:53,501 --> 05:45:57,431
It is well-designed in that I'm using\n
5995
05:45:57,431 --> 05:45:59,921
like, I've not reinvented\na wheel, there's no line 15
5996
05:45:59,920 --> 05:46:02,530
or below, I didn't implement\nstring length myself.
5997
05:46:02,530 --> 05:46:07,900
So I'm at least practicing\nwhat I've preached.
5998
05:46:07,901 --> 05:46:12,621
But there's still an\nimperfection, a suboptimality.
5999
05:46:12,620 --> 05:46:15,170
This one's really subtle though.
6000
05:46:15,170 --> 05:46:18,590
And you have to think\nabout how loops work.
6001
05:46:18,591 --> 05:46:22,901
What am I doing that's\nnot super efficient?
6002
05:46:24,131 --> 05:46:27,439
AUDIENCE: [INAUDIBLE]\nover and over again.
6003
05:46:27,438 --> 05:46:29,230
DAVID MALAN: Yeah, this\nis a little subtle.
6004
05:46:29,230 --> 05:46:31,721
But if you think back to the\nbasic definition of a for loop
6005
05:46:31,721 --> 05:46:34,331
and recall when I highlighted\nthings last week, what happens?
6006
05:46:34,330 --> 05:46:37,090
Well, the first thing\nis that i gets set to 0.
6007
05:46:37,091 --> 05:46:38,570
Then we check the condition.
6008
05:46:38,570 --> 05:46:39,820
How do we check the condition?
6009
05:46:39,820 --> 05:46:42,640
We call strlen on s,\nwe get back an answer
6010
05:46:42,640 --> 05:46:49,070
like 3 if it's a H-I exclamation point\n
6011
05:46:49,070 --> 05:46:50,830
and then we print out the character.
6012
05:46:50,830 --> 05:46:53,320
Then we increment i from 0 to 1.
6013
05:46:54,728 --> 05:46:56,021
How do I recheck the condition?
6014
05:46:58,361 --> 05:47:01,151
Get back the same answer, 3.
6015
05:47:04,061 --> 05:47:08,951
So we print out another character. i\n
6016
05:47:11,170 --> 05:47:12,220
Well, what's the string length of s?
6017
05:47:16,120 --> 05:47:19,690
So I keep asking the same\nquestion sort of stupidly
6018
05:47:19,690 --> 05:47:22,480
because the string is, presumably,\nnever changing in length.
6019
05:47:22,480 --> 05:47:24,418
And indeed, every time\nI check that condition
6020
05:47:24,419 --> 05:47:25,961
that function is going to get called.
6021
05:47:25,960 --> 05:47:28,640
And every time, the answer\nfor hi is going to be 3.
6022
05:47:30,355 --> 05:47:35,111
So it's a marginal suboptimality,\nbut I could do better, right?
6023
05:47:35,111 --> 05:47:39,820
Don't ask multiple times questions\n
6024
05:47:39,820 --> 05:47:45,221
So how could I remember the answer to\n
6025
05:47:45,221 --> 05:47:49,011
How could I remember the\nanswer to this question?
6026
05:47:50,291 --> 05:47:51,707
AUDIENCE: Store it in a variable.
6027
05:47:51,706 --> 05:47:53,440
DAVID MALAN: So store\nit in a variable, right?
6028
05:47:53,440 --> 05:47:56,358
That's been our answer most any time\n
6029
05:47:57,381 --> 05:48:02,140
Well, I could do something like this,\n
6030
05:48:02,140 --> 05:48:05,460
Then I can just change\nthis function call.
6031
05:48:05,460 --> 05:48:07,420
Let me fix my spelling here.
6032
05:48:07,420 --> 05:48:11,620
Let me fix this to be comparing\n
6033
05:48:11,620 --> 05:48:14,501
Because now strlen is only\ncalled once on line 9.
6034
05:48:14,501 --> 05:48:17,001
And I'm reusing the value\nof that variable, a.k.a.
6035
05:48:17,001 --> 05:48:18,501
length, again, and again, and again.
6036
05:48:19,543 --> 05:48:24,021
Turns out that for loops let you\n
6037
05:48:24,021 --> 05:48:28,280
so we can do this a little\nmore elegantly all in one line.
6038
05:48:28,280 --> 05:48:31,030
And this is just some\nsyntactic improvement.
6039
05:48:31,030 --> 05:48:36,190
I could actually do something\nlike this, n equals strlen of s
6040
05:48:36,190 --> 05:48:39,010
and then I could just say n\nhere or I could call it length.
6041
05:48:39,010 --> 05:48:41,927
But heck, while I'm being succinct\n
6042
05:48:41,927 --> 05:48:46,361
So now it's just a marginal\nchange but I've now
6043
05:48:46,361 --> 05:48:50,291
declared two variables\ninside of my loop, i and n.
6044
05:48:50,291 --> 05:48:53,561
i is set to 0. n is set\nto the string length of s.
6045
05:48:53,561 --> 05:48:57,641
But now, hereafter, all of my condition\n
6046
05:48:57,640 --> 05:49:00,431
i less than n, and n is never changing.
6047
05:49:00,431 --> 05:49:02,269
All right, so a marginal\nimprovement there.
6048
05:49:02,269 --> 05:49:04,061
Now that I've used this\nnew function, let's
6049
05:49:04,061 --> 05:49:06,186
use some other functions\nthat might be of interest.
6050
05:49:06,186 --> 05:49:12,941
Let me write a quick program here\n
6051
05:49:12,940 --> 05:49:16,070
changes to uppercase some\nstring that the user types in.
6052
05:49:16,070 --> 05:49:19,751
So let me code a file\ncalled uppercase.c.
6053
05:49:19,751 --> 05:49:25,780
Up here I'll use my new friends,\n
6054
05:49:25,780 --> 05:49:31,330
So standard I/O, and string.h. So,\njust as before, int main(void).
6055
05:49:31,330 --> 05:49:33,880
And then inside of main, what\nI'm going to do this time
6056
05:49:33,881 --> 05:49:38,651
is let's ask the user for a string\n
6057
05:49:39,940 --> 05:49:44,390
And then let me print\nout something like after.
6058
05:49:44,390 --> 05:49:48,670
So that it-- just so I can see what\n
6059
05:49:48,670 --> 05:49:52,870
And then after this, let me\ndo the following, for int, i
6060
05:49:52,870 --> 05:49:56,290
equals 0, oh, let's\npractice that same lesson
6061
05:49:56,291 --> 05:50:02,050
so n equals the string length of\n
6062
05:50:02,050 --> 05:50:05,861
So really, nothing\nnew, fundamentally yet.
6063
05:50:05,861 --> 05:50:11,530
How do I now convert characters from\n
6064
05:50:11,530 --> 05:50:14,260
In other words, if I type\nin hi, H-I in lowercase
6065
05:50:14,260 --> 05:50:19,751
I want my program, now, to uppercase\n
6066
05:50:19,751 --> 05:50:23,030
Well how can I go about doing this?
6067
05:50:23,030 --> 05:50:25,271
Well you might recall\nthat there is this--
6068
05:50:25,271 --> 05:50:28,161
you might recall that\nthere is this ASCII chart.
6069
05:50:28,161 --> 05:50:31,116
So let's just consult this\nreal quick on asciichart.com.
6070
05:50:31,116 --> 05:50:35,771
We looked at this last week.\nNotice that capital A is 65,
6071
05:50:35,771 --> 05:50:39,701
capital B is 66, capital\nC is 67, and heck, here's
6072
05:50:39,701 --> 05:50:43,901
lowercase a, lowercase b,\nlowercase c, and that's 97, 98, 99.
6073
05:50:43,901 --> 05:50:47,241
And if I actually do some\nmath, there's a distance of 32.
6074
05:50:47,741 --> 05:50:49,901
So if I want to go from\nuppercase to lowercase
6075
05:50:49,901 --> 05:50:55,049
I can do 65 plus 32 will give me\n97 and that actually works out
6076
05:50:55,048 --> 05:50:56,591
across the board for everything else.
6077
05:50:56,591 --> 05:51:00,280
66 plus 32 gets me to 98 or lowercase b.
6078
05:51:00,280 --> 05:51:04,900
Or conversely, if you have a\nlowercase a, and its value is 97
6079
05:51:04,901 --> 05:51:11,111
subtract 32 and boom, you have capital\n
6080
05:51:11,111 --> 05:51:13,721
But now that we know that\nstrings are just arrays
6081
05:51:13,721 --> 05:51:17,591
and we know that characters,\nwhich are in those arrays
6082
05:51:17,591 --> 05:51:20,710
are just binary\nrepresentations of numbers
6083
05:51:20,710 --> 05:51:23,558
I think we can manipulate a\nfew of these things as follows.
6084
05:51:23,558 --> 05:51:25,390
Let me go back to my\nprogram here, and first
6085
05:51:25,390 --> 05:51:29,620
ask the question, if the current\n
6086
05:51:29,620 --> 05:51:33,190
is lowercase, let's\nforce it to uppercase.
6087
05:51:33,190 --> 05:51:34,510
So how am I going to do that?
6088
05:51:34,510 --> 05:51:40,720
If the character at s bracket i,\n
6089
05:51:40,721 --> 05:51:45,581
is greater than or equal to\nlowercase a, and s bracket
6090
05:51:45,580 --> 05:51:50,920
i is less than or equal to\nlowercase z, kind of a weird Boolean
6091
05:51:50,920 --> 05:51:55,720
expression but it's completely\nlegitimate, because in this array
6092
05:51:55,721 --> 05:51:58,491
s is a whole bunch of characters\nthat the humans typed in
6093
05:51:58,491 --> 05:52:01,781
because that's what a string is,\n
6094
05:52:01,780 --> 05:52:03,940
be a little nonsensical\nbecause when have you ever
6095
05:52:03,940 --> 05:52:05,591
compared numbers to letters?
6096
05:52:05,591 --> 05:52:11,829
But we know from week 0 lowercase a\n
6097
05:52:16,850 --> 05:52:20,650
And so that would allow us to answer\n
6098
05:52:21,670 --> 05:52:24,790
All right, so let me\nanswer that question.
6099
05:52:24,791 --> 05:52:27,401
If it is, what do I want to print out?
6100
05:52:27,401 --> 05:52:30,131
I don't want to print\nout the letter itself
6101
05:52:30,131 --> 05:52:33,550
I want to print out the\nletter minus 32, right?
6102
05:52:33,550 --> 05:52:37,420
Because if it happens to be a\nlowercase a, 97, 97 minus 32
6103
05:52:37,420 --> 05:52:39,790
gives me 65, which is\nuppercase A, and I know that
6104
05:52:39,791 --> 05:52:43,121
just from having stared\nat that chart in the past.
6105
05:52:43,120 --> 05:52:48,433
Else if the character is not\nbetween little a and little z
6106
05:52:48,433 --> 05:52:50,140
I'm just going to\nprint out the character
6107
05:52:50,140 --> 05:52:52,810
itself by printing s bracket i.
6108
05:52:52,811 --> 05:52:55,841
And at the very end of this, I'm\n
6109
05:52:55,841 --> 05:52:57,741
to move the cursor to the next line.
6110
05:52:57,741 --> 05:52:59,191
So again, it's a little wordy.
6111
05:52:59,190 --> 05:53:03,280
But this loop here, which I\nborrowed from our code previously
6112
05:53:03,280 --> 05:53:05,771
just iterates over the string, a.k.a.
6113
05:53:05,771 --> 05:53:08,890
array, character-by-character,\nthrough its length.
6114
05:53:08,890 --> 05:53:11,620
This line 11 here is\njust asking the question
6115
05:53:11,620 --> 05:53:15,130
if that current character,\nthe i-th character of s
6116
05:53:15,131 --> 05:53:18,161
is greater than or equal\nto little a and less
6117
05:53:18,161 --> 05:53:23,501
than or equal to little z, that\nis between 97 and 132, then
6118
05:53:23,501 --> 05:53:29,201
we're going to go ahead and\nforce it to uppercase instead.
6119
05:53:29,201 --> 05:53:33,550
All right, and let me zoom\nout here for just a second.
6120
05:53:33,550 --> 05:53:38,530
And sorry, I misspoke: 122, which\nis what you might have said.
6121
05:53:41,530 --> 05:53:44,540
Let me go ahead now and\ncompile and run this program.
6122
05:53:44,541 --> 05:53:50,471
So make uppercase, ./uppercase, and\n
6123
05:53:50,471 --> 05:53:52,781
And there's the capitalized\nversion, thereof.
6124
05:53:52,780 --> 05:53:55,181
Let me do it again, with\nmy own name in lowercase
6125
05:53:55,181 --> 05:53:57,361
and now it's capitalized as well.
6126
05:53:57,361 --> 05:53:59,120
Well, what could we do to improve this?
6127
05:54:00,221 --> 05:54:01,901
Let's stop reinventing wheels.
6128
05:54:01,901 --> 05:54:04,101
Let's go to the manual pages.
6129
05:54:04,100 --> 05:54:07,751
So let me go here and search for\nsomething like, I don't know
6130
05:54:09,881 --> 05:54:12,730
I did some auto complete\nhere, our little search box
6131
05:54:12,730 --> 05:54:14,980
is saying that, OK, there's\nan islower function
6132
05:54:14,980 --> 05:54:16,811
check whether a character is lowercase.
6133
05:54:17,901 --> 05:54:23,411
Well let me check, is lower, now I see\n
6134
05:54:27,163 --> 05:54:28,870
that's the header file\nI need to include.
6135
05:54:28,870 --> 05:54:32,830
This is the prototype for islower,\n
6136
05:54:35,591 --> 05:54:38,661
I feel like islower should\nreturn true or false.
6137
05:54:38,661 --> 05:54:42,940
So let's scroll down to the\ndescription and return value.
6138
05:54:42,940 --> 05:54:45,070
It returns, oh this is interesting.
6139
05:54:45,070 --> 05:54:49,631
And this is a convention in C. This\n
6140
05:54:49,631 --> 05:54:55,081
if c is a lowercase letter and 0\nif c is not a lowercase letter.
6141
05:54:57,491 --> 05:55:02,591
So like 1, negative 1, something that's\n
6142
05:55:02,591 --> 05:55:05,661
and 0 if it is not a lowercase letter.
6143
05:55:05,661 --> 05:55:07,420
So how can we use this building block?
6144
05:55:07,420 --> 05:55:09,490
Let me go back to my code here.
6145
05:55:09,491 --> 05:55:13,871
Let me add this file, include ctype.h.
6146
05:55:13,870 --> 05:55:17,380
And down here, let me get rid of\nthis cryptic expression, which
6147
05:55:17,381 --> 05:55:23,320
was kind of painful to come up with,\n
6148
05:55:26,230 --> 05:55:29,650
That should actually work but why?
6149
05:55:29,651 --> 05:55:34,781
Well, islower, again, returns a non-zero\n
6150
05:55:36,411 --> 05:55:37,675
That means it could return 1.
6151
05:55:37,675 --> 05:55:38,800
It could return negative 1.
6152
05:55:38,800 --> 05:55:40,631
It could return 50 or negative 50.
6153
05:55:40,631 --> 05:55:42,911
It's actually not\nprecisely defined, why?
6154
05:55:43,960 --> 05:55:48,010
This was a common convention to\nuse 0 to represent false and use
6155
05:55:48,010 --> 05:55:50,380
any other value to represent true.
6156
05:55:50,381 --> 05:55:54,401
And so it turns out, that\ninside of Boolean expressions
6157
05:55:54,401 --> 05:55:59,016
if you put a value like a function\n
6158
05:55:59,015 --> 05:56:00,640
that's going to be equivalent to false.
6159
05:56:00,640 --> 05:56:03,236
It's like the answer\nbeing no, it is not lower.
6160
05:56:03,236 --> 05:56:06,251
But you can also, in\nparentheses, put the name
6161
05:56:06,251 --> 05:56:10,181
of the function and its arguments,\n
6162
05:56:10,181 --> 05:56:15,491
Because we could do something like\n
6163
05:56:16,508 --> 05:56:19,091
Because that's the definition,\nif it returns a non-zero value
6164
05:56:20,021 --> 05:56:23,471
But a more succinct way to do that\n
6165
05:56:23,471 --> 05:56:28,371
If it's is lower, then print\nout the character minus 32.
6166
05:56:28,370 --> 05:56:30,850
So this would be the common\nway of using one of these
6167
05:56:30,850 --> 05:56:34,286
is- functions to check if\nthe answer is true or false.
6168
05:56:37,070 --> 05:56:38,931
DAVID MALAN: OK, well we might be done.
6169
05:56:43,780 --> 05:56:47,440
It would be incorrect to check for\n
6170
05:56:47,440 --> 05:56:49,810
You want to check for the opposite of 0.
6171
05:56:51,131 --> 05:56:56,081
Or more succinctly, like I did by\n
6172
05:56:56,080 --> 05:56:58,820
Let me see what happens here.
6173
05:56:58,820 --> 05:57:02,951
So this is great, but some of you\n
6174
05:57:03,940 --> 05:57:06,490
A moment ago when we were on\nthe manual pages searching
6175
05:57:06,491 --> 05:57:09,641
for things related to lowercase,\nwhat might be another building
6176
05:57:13,420 --> 05:57:14,960
Based on what's on the screen here?
6177
05:57:18,401 --> 05:57:21,359
There's a function that would literally\n
6178
05:57:21,359 --> 05:57:24,293
so I don't have to get into the\nweeds of negative 32, plus 32.
6179
05:57:24,293 --> 05:57:25,751
I don't have to consult that chart.
6180
05:57:25,751 --> 05:57:29,381
Someone has solved this\nproblem for me in the past.
6181
05:57:29,381 --> 05:57:33,941
And let's see if I can\nactually get back to it.
6182
05:57:34,780 --> 05:57:36,800
Let me go ahead, now, and use this.
6183
05:57:36,800 --> 05:57:39,490
So instead of doing\ns bracket i minus 32
6184
05:57:39,491 --> 05:57:44,141
let's use a function that someone else\n
6185
05:57:44,681 --> 05:57:47,511
And now it's going to\ndo the solution for me.
6186
05:57:47,510 --> 05:57:54,790
So if I rerun make uppercase, and then\n
6187
05:57:54,791 --> 05:57:56,381
now it's working as expected.
6188
05:57:56,381 --> 05:58:00,131
And honestly, if I read the\ndocumentation for toupper
6189
05:58:00,131 --> 05:58:03,431
by going back to its man page,\nor manual page, what you'll see
6190
05:58:03,431 --> 05:58:08,681
is that it says if it's lowercase,\n
6191
05:58:09,311 --> 05:58:13,174
If it's not lowercase, it's already\nuppercase, it's punctuation
6192
05:58:13,173 --> 05:58:14,966
it will just return\nthe original character.
6193
05:58:14,966 --> 05:58:18,161
Which means, thanks to this\nfunction, I can actually
6194
05:58:18,161 --> 05:58:21,911
tighten this up significantly,\nget rid of all of my conditional
6195
05:58:21,911 --> 05:58:26,291
there, and just print out\nthe toupper return value
6196
05:58:26,291 --> 05:58:29,320
and leave it to whoever wrote\nthat function to figure out
6197
05:58:29,320 --> 05:58:33,730
if something's uppercase or lowercase.
6198
05:58:33,730 --> 05:58:38,080
All right, questions on\nthese kinds of tricks?
6199
05:58:38,080 --> 05:58:41,350
Again, it all reduces to\nweek 0 basics, but we're just
6200
05:58:41,350 --> 05:58:43,010
building these abstractions on top.
6201
05:58:43,510 --> 05:58:45,468
AUDIENCE: I'm wondering\nif there's any way just
6202
05:58:45,469 --> 05:58:49,370
to import all packages under\na certain subdomain instead
6203
05:58:49,370 --> 05:58:51,380
of having to do multiple\n[INAUDIBLE] statements
6204
05:58:51,381 --> 05:58:52,673
kind of like a star [INAUDIBLE]
6205
05:58:54,440 --> 05:58:57,380
There is no easy way in C\nto say, give me everything.
6206
05:58:57,381 --> 05:58:59,931
That was for, historically,\nperformance reasons.
6207
05:58:59,931 --> 05:59:03,201
They want you to be explicit\nas to what you want to include.
6208
05:59:03,201 --> 05:59:05,991
In other languages like\nPython, Java, one of which
6209
05:59:05,991 --> 05:59:08,774
we'll see later this term, you\ncan say, give me everything.
6210
05:59:08,774 --> 05:59:11,691
But that, actually, tends to be best\n
6211
05:59:11,690 --> 05:59:14,260
execution or compilation of your code.
6212
05:59:14,760 --> 05:59:17,105
AUDIENCE: Does toupper\naccommodate for special characters?
6213
05:59:17,600 --> 05:59:20,240
Does toupper accommodate special\ncharacters like punctuation?
6214
05:59:20,741 --> 05:59:22,701
If I read the documentation\nmore pedantically
6215
05:59:23,971 --> 05:59:27,201
It will properly hand me\nback an exclamation point
6216
05:59:28,861 --> 05:59:33,230
So if I do make uppercase here,\nand let me do ./upper, sorry--
6217
05:59:33,230 --> 05:59:37,881
./uppercase, hi with an exclamation\n
6218
05:59:37,881 --> 05:59:40,070
pass it through unchanged. Yeah?
6219
05:59:40,070 --> 05:59:43,460
AUDIENCE: Do we have access to a\nfunction that would do all of that
6220
05:59:43,460 --> 05:59:45,850
but just to the screen\nrather than to [INAUDIBLE]
6221
05:59:45,850 --> 05:59:47,810
DAVID MALAN: Really good question, too.
6222
05:59:47,811 --> 05:59:52,371
No, we do not have access to a function\n
6223
05:59:52,370 --> 05:59:56,001
with CS50's library that will just\n
6224
05:59:56,001 --> 05:59:58,431
In C, that's actually\neasier said than done.
6225
05:59:59,811 --> 06:00:04,070
So stay tuned for another language\n
6226
06:00:04,070 --> 06:00:06,771
All right, so what does\nthis leave us with?
6227
06:00:06,771 --> 06:00:08,780
There's just a-- let's\ncome full circle now
6228
06:00:08,780 --> 06:00:11,751
to where we began today where we\n
6229
06:00:12,350 --> 06:00:16,070
Recall that we talked about rm\ntaking a command line argument.
6230
06:00:16,070 --> 06:00:18,730
The file you want to delete,\nwe talked about clang
6231
06:00:18,730 --> 06:00:20,480
taking command line\narguments, that again
6232
06:00:20,480 --> 06:00:22,400
modify the behavior of the program.
6233
06:00:22,401 --> 06:00:25,941
How is it that maybe you and I\ncan start to write programs that
6234
06:00:25,940 --> 06:00:28,100
actually take command line arguments?
6235
06:00:28,100 --> 06:00:31,880
Well here is where I\ncan finally explain why
6236
06:00:31,881 --> 06:00:35,001
we've been typing int\nmain(void) for the past week
6237
06:00:35,001 --> 06:00:38,751
and just asking that you take on faith\n
6238
06:00:38,751 --> 06:00:45,081
Well, by default in C, at least\n
6239
06:00:45,080 --> 06:00:48,270
there are only two official\nways to write main functions.
6240
06:00:48,271 --> 06:00:50,721
You might see other formats\nonline, but they're generally
6241
06:00:50,721 --> 06:00:53,131
not consistent with the\ncurrent specification.
6242
06:00:53,131 --> 06:00:56,421
This, again, was sort of a\nboilerplate for the simplest
6243
06:00:56,420 --> 06:00:59,030
function we might write last\nweek, and recall that we've
6244
06:00:59,030 --> 06:01:00,470
been doing this the whole time.
6245
06:01:00,471 --> 06:01:05,251
(Void) What that (void) means, for all\n
6246
06:01:05,251 --> 06:01:08,151
and you have written thus far,\nis that none of our programs
6247
06:01:08,151 --> 06:01:11,301
that we've written take\ncommand line arguments.
6248
06:01:11,300 --> 06:01:13,370
That's what the void there means.
6249
06:01:13,370 --> 06:01:18,210
It turns out that main is the way you\n
6250
06:01:18,210 --> 06:01:20,001
in fact, take command\nline arguments, that
6251
06:01:20,001 --> 06:01:24,021
is words after the command\nin your terminal window.
6252
06:01:24,021 --> 06:01:26,480
If you want to actually not\nuse get_int or get_string
6253
06:01:26,480 --> 06:01:30,230
you want the human to be able to\n
6254
06:01:31,100 --> 06:01:34,200
And just run-- print\nhello, David on the screen.
6255
06:01:34,201 --> 06:01:38,721
You can use command line arguments,\nwords after the program name
6256
06:01:41,010 --> 06:01:44,720
So we're going to change this in a\n
6257
06:01:44,721 --> 06:01:48,191
but something that's now a bit\nmore familiar syntactically.
6258
06:01:48,190 --> 06:01:52,700
If you change that (void) in main\n
6259
06:01:52,701 --> 06:01:57,741
int argc, comma, string argv,\nopen bracket, close bracket
6260
06:01:57,741 --> 06:02:00,891
you are now giving yourself\naccess to writing programs
6261
06:02:00,890 --> 06:02:03,170
that take command line arguments.
6262
06:02:03,170 --> 06:02:06,380
Argc, which stands for\nargument count is going
6263
06:02:06,381 --> 06:02:10,671
to be an integer that stores how many\n
6264
06:02:10,670 --> 06:02:13,310
C automatically gives that to you.
6265
06:02:13,311 --> 06:02:16,971
String argv stands for\nargument vector, that's
6266
06:02:16,971 --> 06:02:21,361
going to be an array of all of the words\n
6267
06:02:21,361 --> 06:02:23,390
So with today's building\nblock of an array
6268
06:02:23,390 --> 06:02:26,240
we have the ability now to let\nthe humans type as many words
6269
06:02:26,241 --> 06:02:28,161
or as few words, as\nthey want at the prompt.
6270
06:02:28,161 --> 06:02:31,161
C is going to automatically put\nthem in an array called argv
6271
06:02:31,161 --> 06:02:36,620
and it's going to tell us how many\n
6272
06:02:36,620 --> 06:02:40,320
The int, as the return type here,\n
6273
06:02:40,320 --> 06:02:43,611
Let's use this definition\nto make, maybe
6274
06:02:43,611 --> 06:02:45,230
just a couple of simple programs.
6275
06:02:45,230 --> 06:02:47,330
But in problem set 2\nwe will actually use
6276
06:02:47,330 --> 06:02:50,730
this to control the\nbehavior of your own code.
6277
06:02:50,730 --> 06:02:57,381
Let me code up a file called\nargv.0 just to keep it aptly named.
6278
06:02:59,960 --> 06:03:01,501
Let me go ahead and include--
6279
06:03:02,001 --> 06:03:05,210
That is not the right name of a\nprogram, let's start that over.
6280
06:03:05,210 --> 06:03:09,710
Let's go ahead and code up argv.c.
6281
06:03:11,061 --> 06:03:17,151
include cs50.h, include\nstdio.h, int main, not void
6282
06:03:17,151 --> 06:03:24,286
let's actually say int, argc, string,\n
6283
06:03:24,286 --> 06:03:26,661
No numbers in between because\nyou don't know, in advance
6284
06:03:26,661 --> 06:03:29,570
how many words the human's\ngoing to type at their prompt.
6285
06:03:29,570 --> 06:03:31,021
Now let's go ahead and do this.
6286
06:03:31,021 --> 06:03:35,061
Let's write a very simple program that\n
6287
06:03:35,061 --> 06:03:36,921
whoever the name is that gets typed.
6288
06:03:36,920 --> 06:03:40,520
But not using get_string, let's\ninstead have the human just
6289
06:03:40,521 --> 06:03:44,151
type their name at the prompt, just like\n
6290
06:03:44,151 --> 06:03:46,431
so it's just one and\ndone when you hit Enter.
6291
06:03:47,870 --> 06:03:52,640
Let me go ahead then and do this,\nprintf, quote-unquote, hello
6292
06:03:52,640 --> 06:03:55,760
comma, and instead of world\ntoday, I want to print out
6293
06:03:55,760 --> 06:03:57,630
whatever the human typed in.
6294
06:03:57,631 --> 06:04:03,111
So let's go ahead and do\nthis, argv, bracket 0 for now.
6295
06:04:03,111 --> 06:04:07,341
But I don't think this is quite\nwhat I want because, of course
6296
06:04:07,341 --> 06:04:12,631
that's going to literally print\nout argv, bracket, 0, bracket.
6297
06:04:12,631 --> 06:04:16,771
I need a placeholder, so let me\n
6298
06:04:16,771 --> 06:04:20,780
So if argv is an array, but\nit's an array of strings
6299
06:04:20,780 --> 06:04:24,740
then argv bracket 0 is\nitself a single string.
6300
06:04:24,741 --> 06:04:27,711
And so it can be plugged\ninto that %s placeholder.
6301
06:04:27,710 --> 06:04:30,001
Let me go ahead and save my program.
6302
06:04:30,001 --> 06:04:33,600
And compile argv, so far, so good.
6303
06:04:33,600 --> 06:04:37,431
Let me now type in my name\nafter the name of the program.
6304
06:04:38,241 --> 06:04:42,541
I'm literally typing an extra word,\n
6305
06:04:42,541 --> 06:04:45,550
OK, it's apparently a little\nbuggy in a couple of ways.
6306
06:04:45,550 --> 06:04:48,760
I forgot my /n but\nthat's not a huge deal.
6307
06:04:48,760 --> 06:04:53,220
But apparently, inside of\nargv is literally everything
6308
06:04:53,221 --> 06:04:55,531
that humans typed in including\nthe name of the program.
6309
06:04:55,530 --> 06:05:00,510
So logically, how do I print out hello,\n
6310
06:05:00,510 --> 06:05:01,980
the actual name of the program?
6311
06:05:03,721 --> 06:05:05,311
AUDIENCE: Change the index to 1.
6312
06:05:06,061 --> 06:05:10,201
So presumably index to 1, if that's\n
6313
06:05:11,201 --> 06:05:15,671
So let's do make argv\nagain, ./argv, Enter.
6314
06:05:17,890 --> 06:05:19,951
So this is another form of null.
6315
06:05:19,951 --> 06:05:23,581
But this is user error, now, on my part.
6316
06:05:23,580 --> 06:05:25,330
I didn't do exactly what I said I would.
6317
06:05:25,830 --> 06:05:26,790
AUDIENCE: You forgot the parameter.
6318
06:05:26,791 --> 06:05:28,691
DAVID MALAN: Yeah, I\nforgot the parameter.
6319
06:05:29,960 --> 06:05:31,710
I should probably deal\nwith that, somehow
6320
06:05:31,710 --> 06:05:33,552
so that people aren't\nbreaking my program
6321
06:05:33,552 --> 06:05:35,260
and printing out random\nthings, like null.
6322
06:05:35,260 --> 06:05:39,030
But if I do, say, ./argv David,\nnow you see hello, David.
6323
06:05:39,030 --> 06:05:42,330
I can get a little curious,\nlike what's at location 2?
6324
06:05:42,330 --> 06:05:47,670
Well, we can see: make argv,\n./argv David, Enter.
6325
06:05:47,670 --> 06:05:49,170
All right, so just nothing is there.
6326
06:05:49,170 --> 06:05:52,462
But it turns out, in a couple of weeks,\n
6327
06:05:52,462 --> 06:05:54,570
and see if we can't crash\nprograms deliberately
6328
06:05:54,570 --> 06:05:57,061
because nothing is\nstopping me from saying
6329
06:05:57,061 --> 06:06:00,730
oh what's at location 2\nmillion, for instance?
6330
06:06:00,730 --> 06:06:02,611
We could really start to get curious.
6331
06:06:02,611 --> 06:06:04,681
But for now, we'll do the right thing.
6332
06:06:04,681 --> 06:06:08,620
But let's now make sure the human has\n
6333
06:06:08,620 --> 06:06:15,181
So let's say this, if argc equals\n
6334
06:06:15,181 --> 06:06:19,021
and one more word after that, go\nahead and trust that in argv 1
6335
06:06:19,021 --> 06:06:21,241
as you proposed, is the person's name.
6336
06:06:21,241 --> 06:06:26,071
Else, let's go ahead and default\n
6337
06:06:26,070 --> 06:06:30,120
like, well, if we don't get a name\n
6338
06:06:31,561 --> 06:06:34,306
So now we're programming defensively.
6339
06:06:34,306 --> 06:06:37,350
This time the human, even if they\n
6340
06:06:37,350 --> 06:06:40,225
or they give us too many names,\n
6341
06:06:40,225 --> 06:06:42,150
because I now have some\nerror handling here.
6342
06:06:42,151 --> 06:06:46,291
Because, again, argc is argument\n
6343
06:06:51,001 --> 06:06:52,800
Let me make the same mistake as before.
6344
06:06:53,311 --> 06:06:55,171
I don't get this weird null behavior.
6345
06:06:55,170 --> 06:06:56,610
I get something well-defined.
6346
06:06:57,870 --> 06:07:01,111
I could do David Malan, but\nthat's not currently supported.
6347
06:07:01,111 --> 06:07:05,550
I would need to alter my logic to\n
6348
06:07:06,605 --> 06:07:08,030
So what's the point of this?
6349
06:07:08,030 --> 06:07:09,780
At the moment, it's\njust a simple exercise
6350
06:07:09,780 --> 06:07:14,962
to actually give myself a way of taking\n
6351
06:07:14,962 --> 06:07:16,920
Because, consider, it's\njust more convenient in
6352
06:07:16,920 --> 06:07:18,930
this new, command-line-interface world.
6353
06:07:18,931 --> 06:07:23,118
If you had to use get string\nevery time you compile your code
6354
06:07:23,117 --> 06:07:24,450
it'd be kind of annoying, right?
6355
06:07:24,451 --> 06:07:28,201
You type make, then you might get a\n
6356
06:07:28,201 --> 06:07:31,951
Then you type in hello, or cash, or\n
6357
06:07:31,951 --> 06:07:33,591
it just really slows the process.
6358
06:07:33,591 --> 06:07:35,701
But in this\ncommand-line-interface world
6359
06:07:35,701 --> 06:07:39,031
if you support command line arguments,\n
6360
06:07:39,030 --> 06:07:42,431
Like, scrolling up and down in\n
6361
06:07:42,431 --> 06:07:46,690
You can just type commands more quickly\n
6362
06:07:46,690 --> 06:07:49,260
And you don't have to keep\nprompting the user, more
6363
06:07:49,260 --> 06:07:52,020
pedantically, for more and more info.
6364
06:07:52,021 --> 06:07:54,541
So any questions then on\ncommand line arguments?
6365
06:07:54,541 --> 06:07:58,261
Which, finally, reveals why\nwe had (void) initially
6366
06:07:58,260 --> 06:08:00,870
but what more we can now put in main.
6367
06:08:00,870 --> 06:08:03,330
That's how you take\ncommand line arguments.
6368
06:08:04,760 --> 06:08:06,870
AUDIENCE: If you were to put--
6369
06:08:06,870 --> 06:08:11,580
if you were to use argv, and you\n
6370
06:08:11,580 --> 06:08:14,183
would it still give you, like, a string?
6371
06:08:14,184 --> 06:08:15,766
Would that still be considered string?
6372
06:08:15,766 --> 06:08:17,184
Or would you consider [INAUDIBLE]?
6373
06:08:18,021 --> 06:08:20,811
If you were to type at\nthe command line something
6374
06:08:20,811 --> 06:08:24,921
like, not a word, but\nsomething like the number 42
6375
06:08:24,920 --> 06:08:27,710
that would actually be\ntreated as a string.
6376
06:08:28,550 --> 06:08:30,480
Because again, context matters.
6377
06:08:30,480 --> 06:08:33,201
So if your program is\ncurrently manipulating memory
6378
06:08:33,201 --> 06:08:36,771
as though it's characters or strings,\n
6379
06:08:36,771 --> 06:08:41,061
are, they will be interpreted\nas ASCII text, or Unicode text.
6380
06:08:41,061 --> 06:08:44,901
If we therefore go to the chart here,\n
6381
06:08:44,901 --> 06:08:48,771
then how do you distinguish numbers\n
6382
06:08:50,151 --> 06:08:58,641
Well, notice 65 is A, 97 is a,\nbut also 49 is 1, and 50 is 2.
6383
06:08:58,640 --> 06:09:01,760
So the designers of ASCII,\nand then later Unicode
6384
06:09:01,760 --> 06:09:04,940
realized well wait a minute,\nif we want to support programs
6385
06:09:04,940 --> 06:09:07,700
that let you type things\nthat look like numbers
6386
06:09:07,701 --> 06:09:10,611
even though they're not\ntechnically ints or floats
6387
06:09:10,611 --> 06:09:14,881
we need a way in ASCII and\nUnicode to represent numbers, too.
6388
06:09:16,131 --> 06:09:19,471
And it's a little silly that we have\n
6389
06:09:19,471 --> 06:09:22,123
But again, if you're in the\nworld of letters and characters
6390
06:09:22,123 --> 06:09:24,291
you've got to come up with\na mapping for everything.
6391
06:09:24,291 --> 06:09:26,050
And notice here, here's the dot.
6392
06:09:26,050 --> 06:09:30,650
Even if you were to represent 1.23\n
6393
06:09:30,651 --> 06:09:35,101
even the dot now is going to be\n
6394
06:09:35,100 --> 06:09:37,190
So again, context here matters.
6395
06:09:37,190 --> 06:09:41,630
All right, one final example\nto tease apart what this int is
6396
06:09:41,631 --> 06:09:44,101
and what it's been\ndoing here for so long.
6397
06:09:44,100 --> 06:09:49,040
So I'm going to add one\nbit of logic to a new file
6398
06:09:49,041 --> 06:09:52,011
that I'm going to call exit.c.
6399
06:09:53,390 --> 06:09:57,140
We're going to introduce something that\n
6400
06:09:57,140 --> 06:09:59,240
It turns out this is not\na feature we've used yet
6401
06:09:59,241 --> 06:10:01,501
but it's just useful to know about.
6402
06:10:01,501 --> 06:10:04,611
Especially when automating\ntests of your own code.
6403
06:10:04,611 --> 06:10:08,376
When it comes to figuring out if\na program succeeded or failed.
6404
06:10:08,376 --> 06:10:13,131
It turns out that main has one\n
6405
06:10:13,131 --> 06:10:18,591
An ability to signal to the user\n
6406
06:10:18,591 --> 06:10:22,021
And that's by way of\nmain's return value.
6407
06:10:22,021 --> 06:10:26,320
So I'm going to modify this\nprogram as follows, like this.
6408
06:10:26,320 --> 06:10:29,181
Suppose I want to write\na similar program that
6409
06:10:29,181 --> 06:10:32,161
requires that the user\ntype a word at the prompt.
6410
06:10:32,161 --> 06:10:36,710
So that argc has to be 2\nfor whatever design purpose.
6411
06:10:36,710 --> 06:10:43,251
If argc does not equal 2, I want to\n
6412
06:10:43,251 --> 06:10:46,850
I want to insist that the user\noperate the program correctly.
6413
06:10:46,850 --> 06:10:53,060
So I might give them an error message\n
6414
06:10:53,061 --> 06:10:55,441
But now I want to quit\nout of the program.
6415
06:10:56,570 --> 06:11:01,521
The right way, quote-unquote, to do\n
6416
06:11:01,521 --> 06:11:04,851
Now it's a little weird\nbecause no one called main yet
6417
06:11:04,850 --> 06:11:07,251
right, main just gets\ncalled automatically
6418
06:11:07,251 --> 06:11:09,561
but the convention is\nanytime something goes
6419
06:11:09,561 --> 06:11:14,361
wrong in a program you should\nreturn a non-zero value from main.
6420
06:11:16,041 --> 06:11:19,730
We don't need to get into the weeds of\n
6421
06:11:20,480 --> 06:11:26,030
But if you return 1, that is a clue to\n
6422
06:11:26,030 --> 06:11:27,690
device that something went wrong.
6423
06:11:29,931 --> 06:11:35,721
If everything works fine, like, let's go\n
6424
06:11:35,721 --> 06:11:40,881
before, quote-unquote argv bracket 1.
6425
06:11:40,881 --> 06:11:43,341
So this is just a version of\nthe program without an else.
6426
06:11:43,341 --> 06:11:45,651
So this is the same\nas doing, essentially
6427
06:11:45,651 --> 06:11:47,841
an else here like I did earlier.
6428
06:11:47,841 --> 06:11:51,001
I want to signal to the\ncomputer that all is well.
6429
06:11:52,550 --> 06:11:55,911
But strictly speaking, if\nI'm already returning here
6430
06:11:55,911 --> 06:11:58,820
I don't technically need, if\nI really want to be nitpicky
6431
06:11:58,820 --> 06:12:01,131
I don't technically need the\nelse because the only way
6432
06:12:01,131 --> 06:12:05,747
I'm going to get to line 11\nis if I didn't already return.
6433
06:12:07,440 --> 06:12:10,790
The only new thing here logically,\n
6434
06:12:10,791 --> 06:12:13,070
I'm returning a value from main.
6435
06:12:13,070 --> 06:12:14,990
That's something I\ncould always have done
6436
06:12:14,991 --> 06:12:19,551
because main has always been defined by\n
6437
06:12:19,550 --> 06:12:24,140
By default, main automatically,\n
6438
06:12:24,140 --> 06:12:27,111
If you've never once used the\nreturn keyword, which you probably
6439
06:12:27,111 --> 06:12:29,631
haven't in main, it just\nautomatically returns 0
6440
06:12:29,631 --> 06:12:31,556
and the system assumes\nthat all went well.
6441
06:12:31,556 --> 06:12:33,651
But now that we're starting\nto get a little more
6442
06:12:33,651 --> 06:12:35,781
sophisticated with our\ncode, and you know
6443
06:12:35,780 --> 06:12:39,740
the programmer, something went\n
6444
06:12:39,741 --> 06:12:44,871
You can exit out of them by returning\n
6445
06:12:44,870 --> 06:12:47,300
And this is fortuitous\nthat it's an int, right?
6446
06:12:49,370 --> 06:12:53,510
Unfortunately, in programming, there are\n
6447
06:12:54,501 --> 06:12:57,471
And int gives you 4\nbillion possible codes
6448
06:12:57,471 --> 06:13:00,716
that you can use, a.k.a. exit\nstatuses, to signify errors.
6449
06:13:00,716 --> 06:13:04,190
So if you've ever on your Mac\nor PC gotten some weird pop up
6450
06:13:04,190 --> 06:13:07,580
that an error happened, sometimes,\n
6451
06:13:07,580 --> 06:13:09,680
Maybe it's positive,\nmaybe it's negative.
6452
06:13:09,681 --> 06:13:14,431
It might say error code 123, or\n
6453
06:13:14,431 --> 06:13:18,570
What you're generally seeing, are\n
6454
06:13:18,570 --> 06:13:21,870
values from main in a program\nthat someone at Microsoft
6455
06:13:21,870 --> 06:13:25,380
or Apple, or somewhere else\nwrote, something went wrong
6456
06:13:25,381 --> 06:13:30,241
they are unnecessarily showing you,\n
6457
06:13:30,241 --> 06:13:33,361
If only, so that when you call\n
6458
06:13:33,361 --> 06:13:36,451
you can tell them what exit\nstatus you encountered
6459
06:13:36,451 --> 06:13:39,331
what error code you encountered.
6460
06:13:39,330 --> 06:13:43,650
All right, any questions\non exit statuses
6461
06:13:43,651 --> 06:13:48,841
which is the last of our new\nbuilding blocks, for now?
6462
06:13:50,300 --> 06:13:57,800
AUDIENCE: [INAUDIBLE] You know how\n
6463
06:13:57,800 --> 06:13:59,679
if you want to make [INAUDIBLE]
6464
06:14:00,346 --> 06:14:03,526
The question is can you\ndo things again and again
6465
06:14:03,526 --> 06:14:06,151
at the command line like you\ncould with get string and get int.
6466
06:14:06,151 --> 06:14:08,131
Which, by default,\nrecall are automatically
6467
06:14:08,131 --> 06:14:10,681
designed to keep prompting\nthe user in their own loop
6468
06:14:10,681 --> 06:14:14,221
until they give you an int, or a\n
6469
06:14:15,001 --> 06:14:16,471
You're going to get an\nerror message but then
6470
06:14:16,471 --> 06:14:18,263
you're going to be\nreturned to your prompt.
6471
06:14:18,262 --> 06:14:21,647
And it's up to you to type\nit correctly the next time.
6472
06:14:22,730 --> 06:14:27,695
AUDIENCE: [INAUDIBLE]\nautomatically for you.
6473
06:14:27,695 --> 06:14:29,570
DAVID MALAN: If you\ndo not return a value
6474
06:14:29,570 --> 06:14:32,990
explicitly main will\nautomatically return 0 for you
6475
06:14:32,991 --> 06:14:36,901
that is the way C simply works\nso it's not strictly necessary.
6476
06:14:36,901 --> 06:14:39,771
But now that we're starting\nto return values explicitly
6477
06:14:39,771 --> 06:14:42,351
if something goes wrong,\nit would be good practice
6478
06:14:42,350 --> 06:14:45,740
to also start returning a value\n
6479
06:14:48,036 --> 06:14:52,070
So let's now get out of\nthe weeds and contextualize
6480
06:14:52,070 --> 06:14:55,460
this for some actual problems that\n
6481
06:14:55,460 --> 06:14:57,390
by way of problem set 2 and beyond.
6482
06:15:00,001 --> 06:15:04,251
So here for instance, is a\nproblem that you might think back
6483
06:15:04,251 --> 06:15:08,241
to when you were a kid the\n
6484
06:15:08,241 --> 06:15:10,491
the grade level in which\nsome book is written.
6485
06:15:10,491 --> 06:15:14,001
If you're a young student, you\nmight read at first-grade level
6486
06:15:14,001 --> 06:15:15,501
or third-grade level in the US.
6487
06:15:15,501 --> 06:15:17,293
Or, if you're in college\npresumably, you're
6488
06:15:17,293 --> 06:15:19,206
reading at a university-level of text.
6489
06:15:19,205 --> 06:15:22,333
But what does it mean\nfor text, like in a book
6490
06:15:22,333 --> 06:15:24,501
or in an essay, or something\nlike that to correspond
6491
06:15:24,501 --> 06:15:25,850
to some kind of grade level?
6492
06:15:25,850 --> 06:15:29,210
Well, here's a quote-- a\ntitle of a childhood book.
6493
06:15:29,210 --> 06:15:31,850
One Fish, Two Fish, Red Fish, Blue Fish.
6494
06:15:31,850 --> 06:15:35,100
What might the grade level be for\n
6495
06:15:35,100 --> 06:15:37,850
Maybe, when you were a kid or if\n
6496
06:15:37,850 --> 06:15:40,520
these things, what might the\ngrade level of this thing be?
6497
06:15:47,643 --> 06:15:49,911
DAVID MALAN: Before grade\n1 is, in fact, correct.
6498
06:15:49,911 --> 06:15:51,550
So that's for really young kids?
6499
06:15:53,440 --> 06:15:56,470
These are pretty simple phrases, right?
6500
06:15:57,760 --> 06:16:00,220
I mean there's not even\nverbs in these sentences
6501
06:16:00,221 --> 06:16:04,301
they're just nouns and adjectives,\nand very short sentences.
6502
06:16:04,300 --> 06:16:06,460
And so that might be a\nheuristic we could use.
6503
06:16:06,460 --> 06:16:09,070
When analyzing text, well if\nthe words are kind of short
6504
06:16:09,070 --> 06:16:11,501
the sentences are kind of\nshort, everything's very simple
6505
06:16:11,501 --> 06:16:14,510
that's probably a very\nyoung, or early, grade level.
6506
06:16:14,510 --> 06:16:17,925
And so by one formulation, it might\n
6507
06:16:19,931 --> 06:16:22,282
Mr. and Mrs. Dursley, of\nnumber 4, Privet Drive
6508
06:16:22,282 --> 06:16:25,240
were proud to say that they were\n
6509
06:16:25,241 --> 06:16:27,221
They were the last\npeople you would expect
6510
06:16:27,221 --> 06:16:29,381
to be involved in anything\nstrange or mysterious
6511
06:16:29,381 --> 06:16:32,111
because they just didn't\nhold with such nonsense.
6512
06:16:33,043 --> 06:16:34,751
All right, what grade\nlevel is this book?
6513
06:16:36,039 --> 06:16:37,331
DAVID MALAN: OK, I heard third.
6514
06:16:38,846 --> 06:16:40,241
DAVID MALAN: Seventh, fifth.
6515
06:16:41,411 --> 06:16:44,800
But grade 7, according to\none particular measure.
6516
06:16:44,800 --> 06:16:49,062
And whether or not we can debate exactly\n
6517
06:16:49,062 --> 06:16:51,520
and maybe you're feeling ahead\nof your time, or behind now.
6518
06:16:51,521 --> 06:16:55,730
But here, we have a snippet of text.
6519
06:16:55,730 --> 06:17:00,820
What makes this text assume an older\n
6520
06:17:00,820 --> 06:17:03,951
a higher grade level, would you think?
6521
06:17:06,675 --> 06:17:09,370
DAVID MALAN: Yeah, it's longer,\ndifferent types of words
6522
06:17:09,370 --> 06:17:11,773
there's commas now in\nphrases, and so forth.
6523
06:17:11,774 --> 06:17:13,941
So there's just some kind\nof sophistication to this.
6524
06:17:13,940 --> 06:17:16,540
So it turns out for the\nupcoming problem set
6525
06:17:16,541 --> 06:17:19,631
among the things you'll do is\ntake, as input, texts like this
6526
06:17:20,771 --> 06:17:23,333
Considering, well, how\nmany words are in the text?
6527
06:17:23,332 --> 06:17:24,790
How many sentences are in the text?
6528
06:17:24,791 --> 06:17:26,636
How many letters are in the text?
6529
06:17:26,635 --> 06:17:30,430
And use those according to a\n
6530
06:17:30,431 --> 06:17:33,940
exactly, the grade level of some\n
6531
06:17:34,843 --> 06:17:37,050
Well what else are we going\nto do in the coming days?
6532
06:17:37,050 --> 06:17:39,670
Well I've alluded to this notion\nof cryptography in the past.
6533
06:17:39,670 --> 06:17:42,610
This notion of scrambling\ninformation in such a way
6534
06:17:42,611 --> 06:17:45,683
that you can hide the\ncontents of a message
6535
06:17:45,683 --> 06:17:47,890
from someone who might\notherwise intercept it, right?
6536
06:17:47,890 --> 06:17:50,390
The earliest form of this might\nalso be when you're younger
6537
06:17:50,390 --> 06:17:53,650
and you're in class, and you're passing\n
6538
06:17:53,651 --> 06:17:54,911
from yourself to someone else.
6539
06:17:54,911 --> 06:17:57,221
You don't want to necessarily\nwrite a note in English
6540
06:17:57,221 --> 06:17:59,381
or some other written\nlanguage, you might want
6541
06:17:59,381 --> 06:18:01,691
to scramble it somehow, or encrypt it.
6542
06:18:01,690 --> 06:18:04,720
Maybe you change the As\nto a B, and the Bs to a C.
6543
06:18:04,721 --> 06:18:07,031
So that if the teacher snaps\nit up and intercepts it
6544
06:18:07,030 --> 06:18:09,460
they can't actually\nunderstand what it is you've
6545
06:18:09,460 --> 06:18:11,420
written because it's encrypted.
6546
06:18:11,420 --> 06:18:13,870
So long as your friend,\nthe recipient of this note
6547
06:18:13,870 --> 06:18:16,150
knows how you manipulated it.
6548
06:18:16,151 --> 06:18:19,901
How you added or subtracted\nletters to each other
6549
06:18:19,901 --> 06:18:23,111
they can decrypt it, which\nis to reverse that process.
6550
06:18:23,111 --> 06:18:26,331
So formally, in the world of\ncryptography and computer science
6551
06:18:26,330 --> 06:18:28,390
this is another problem to solve.
6552
06:18:28,390 --> 06:18:31,434
Your input, though, when you have a\n
6553
06:18:31,434 --> 06:18:33,101
is what's generally known as plaintext.
6554
06:18:33,100 --> 06:18:37,240
There's some algorithm that's\ngoing to then encipher, or encrypt
6555
06:18:37,241 --> 06:18:40,361
that information, into what's\ncalled ciphertext, which
6556
06:18:40,361 --> 06:18:42,911
is the scrambled version that\ntheoretically can get safely
6557
06:18:42,911 --> 06:18:45,370
intercepted and your message\nhas not been spoiled
6558
06:18:45,370 --> 06:18:48,880
unless that interceptor\nactually knows what algorithm
6559
06:18:48,881 --> 06:18:51,411
you used inside of this process.
6560
06:18:51,411 --> 06:18:53,980
So that would be generally\nknown as a cipher.
6561
06:18:53,980 --> 06:18:57,341
The ciphers typically take,\nthough, not one input, but two.
6562
06:18:57,341 --> 06:19:01,945
If, for instance, your cipher\nis as simple as A becomes B
6563
06:19:01,945 --> 06:19:05,681
B becomes C, C becomes D,\ndot dot dot, Z becomes A
6564
06:19:05,681 --> 06:19:09,401
you're essentially adding one to\nevery letter and encrypting it.
6565
06:19:09,401 --> 06:19:12,011
Now that would be,\nwhat we call, the key.
6566
06:19:12,010 --> 06:19:15,730
You and the recipient both have to\n
6567
06:19:15,730 --> 06:19:19,541
in advance, what number you're\ngoing to use that day to rotate
6568
06:19:19,541 --> 06:19:21,221
or change all of these letters by.
6569
06:19:21,221 --> 06:19:24,671
Because when you add 1, they\nupon receiving your ciphertext
6570
06:19:24,670 --> 06:19:27,350
have to subtract 1 to\nget back the answer.
6571
06:19:27,350 --> 06:19:31,990
For instance, if the input,\nplaintext, is hi, as before
6572
06:19:31,991 --> 06:19:37,271
and the key is 1, the ciphertext using\n
6573
06:19:37,271 --> 06:19:41,980
otherwise known as the Caesar cipher,\n
6574
06:19:41,980 --> 06:19:45,668
So it's similar, but it's at\nleast scrambled at first glance.
6575
06:19:45,669 --> 06:19:47,711
And unless the teacher\nreally cares to figure out
6576
06:19:47,710 --> 06:19:50,681
what algorithm are they using today,\n
6577
06:19:50,681 --> 06:19:53,960
it's probably sufficiently\nsecure for your purposes.
6578
06:19:53,960 --> 06:19:55,420
How do you reverse the process?
6579
06:19:55,420 --> 06:19:58,450
Well, your friend gets this\nand reverses it by negative 1.
6580
06:19:58,451 --> 06:20:02,890
So I becomes H, J becomes I,\nand things like punctuation
6581
06:20:02,890 --> 06:20:05,320
remain untouched at\nleast in this scheme.
6582
06:20:05,320 --> 06:20:07,841
So let's consider one\nfinal example here.
6583
06:20:07,841 --> 06:20:15,341
If the input to the algorithm\nis Uijtxbtdt50, and the key
6584
06:20:17,350 --> 06:20:23,770
Such that now B should become A, and C\n
6585
06:20:23,771 --> 06:20:25,390
So we're going in the other direction.
6586
06:20:27,291 --> 06:20:30,261
Well if we spread all the letters\n
6587
06:20:30,260 --> 06:20:36,040
and we start subtracting one letter,\n
6588
06:20:36,041 --> 06:20:41,480
T becomes S, X becomes W, A, was, D, T--
6589
06:22:01,705 --> 06:22:06,530
DAVID J. MALAN: This is CS50, and\nthis is already week three.
6590
06:22:06,530 --> 06:22:10,056
And even as we've gotten much more\n
6591
06:22:10,056 --> 06:22:11,931
and some of the C stuff\nthat we've been doing
6592
06:22:11,931 --> 06:22:15,201
is all the more cryptic looking,\n
6593
06:22:15,201 --> 06:22:19,008
like, everything we've been doing\n
6594
06:22:19,008 --> 06:22:20,841
So keep that in mind,\nparticularly as things
6595
06:22:20,841 --> 06:22:23,591
seem like they're getting more\n
6596
06:22:23,591 --> 06:22:26,210
It's just a process of learning\na new language that ultimately
6597
06:22:26,210 --> 06:22:28,791
lets us express this process.
6598
06:22:28,791 --> 06:22:31,851
And of course, last week we really\n
6599
06:22:31,850 --> 06:22:33,600
inputs and outputs are represented.
6600
06:22:33,600 --> 06:22:37,940
And this thing here, a photograph\nthereof, is called what?
6601
06:22:39,989 --> 06:22:41,031
DAVID J. MALAN: RAM, I heard--
6602
06:22:41,030 --> 06:22:44,030
Random Access Memory or just\ngenerally known as memory.
6603
06:22:44,030 --> 06:22:46,640
And recall that we looked at\none of these little black chips
6604
06:22:46,640 --> 06:22:48,861
that contains all of the bytes--
6605
06:22:48,861 --> 06:22:50,151
all of the bits, ultimately.
6606
06:22:50,151 --> 06:22:52,761
It's just kind of a grid,\nsort of an artist's grid, that
6607
06:22:52,760 --> 06:22:55,880
allows us to think about every\none of these memory locations
6608
06:22:55,881 --> 06:22:58,401
as just having a number or\nan address, so to speak.
6609
06:22:58,401 --> 06:23:01,131
Like, this might be byte\nnumber 0 and then 1 and then 2
6610
06:23:01,131 --> 06:23:04,041
and then, maybe way down here\nagain, something like 2 billion
6611
06:23:04,041 --> 06:23:06,441
if you have 2 gigabytes of memory.
6612
06:23:06,440 --> 06:23:10,431
And so as we did that, we started to\n
6613
06:23:10,431 --> 06:23:14,120
to create kind of our own information,\n
6614
06:23:14,120 --> 06:23:16,740
just the basics like ints\nand floats and so forth.
6615
06:23:16,741 --> 06:23:18,831
But we also talked about strings.
6616
06:23:18,830 --> 06:23:21,440
And what is a string as you now know it?
6617
06:23:21,440 --> 06:23:24,560
How would you describe in\nlayperson's terms a string?
6618
06:23:26,841 --> 06:23:28,258
DAVID J. MALAN: An array of characters.
6619
06:23:28,258 --> 06:23:30,051
And an array, meanwhile--\nlet's go there.
6620
06:23:30,050 --> 06:23:34,940
How might someone else define an\n
6621
06:23:38,030 --> 06:23:41,431
AUDIENCE: Kind of like\nan indexed set of things.
6622
06:23:41,431 --> 06:23:43,421
DAVID J. MALAN: An indexed\nset of things-- not bad.
6623
06:23:43,420 --> 06:23:46,570
And I think a key characteristic to\n
6624
06:23:46,570 --> 06:23:48,251
does actually pertain to memory.
6625
06:23:50,170 --> 06:23:53,560
Byte after byte after byte\nis what constitutes an array.
6626
06:23:53,561 --> 06:23:56,171
And we'll see in a couple of\nweeks' time that there's actually
6627
06:23:56,170 --> 06:24:00,130
more interesting ways to use this same\n
6628
06:24:00,131 --> 06:24:03,601
things that are sort of two directional\n
6629
06:24:04,100 --> 06:24:07,330
But for now, all we've talked about\n
6630
06:24:07,330 --> 06:24:11,600
from left to right, top to bottom,\n
6631
06:24:11,600 --> 06:24:14,260
So today, we'll consider still an array.
6632
06:24:14,260 --> 06:24:17,380
But we won't focus so\nmuch on representation
6633
06:24:17,381 --> 06:24:18,820
of strings or other data types.
6634
06:24:18,820 --> 06:24:21,278
We'll actually now focus on\nthe other part of that process
6635
06:24:21,278 --> 06:24:24,310
of inputs becoming outputs,\nnamely the thing in the middle--
6636
06:24:25,300 --> 06:24:29,560
But we have to keep in mind, even though\n
6637
06:24:29,561 --> 06:24:32,971
thus far, certainly on the board\n
6638
06:24:32,971 --> 06:24:34,721
have the luxury of\njust kind of eyeballing
6639
06:24:34,721 --> 06:24:38,181
the whole thing with a bird's eye view\n
6640
06:24:38,681 --> 06:24:40,513
If I asked you where a\nparticular number is
6641
06:24:40,513 --> 06:24:43,510
like zero, odds are your eyes\nwould go right to where it is
6642
06:24:43,510 --> 06:24:46,510
and boom, problem solved\nin sort of one step.
6643
06:24:46,510 --> 06:24:51,620
But the catch is, with a computer\n
6644
06:24:51,620 --> 06:24:55,091
the human, can [INAUDIBLE] see\n
6645
06:24:55,091 --> 06:24:58,451
It's better to think of your\n
6646
06:24:58,451 --> 06:25:01,361
or more specifically an\narray of memory like this
6647
06:25:01,361 --> 06:25:05,861
as really being a set of closed\n
6648
06:25:05,861 --> 06:25:08,620
And only by opening\neach of those doors can
6649
06:25:08,620 --> 06:25:10,451
the computer actually\nsee what's in there
6650
06:25:10,451 --> 06:25:12,971
which is to say that the\ncomputer, unlike you, doesn't
6651
06:25:12,971 --> 06:25:16,841
have this bird's eye view of all\n
6652
06:25:16,841 --> 06:25:19,760
It has to much more\nmethodically look here
6653
06:25:19,760 --> 06:25:23,950
maybe look here, maybe look here, and\n
6654
06:25:23,951 --> 06:25:28,031
Now fortunately, we already have some\n
6655
06:25:28,030 --> 06:25:29,570
Boolean expressions, and the like--
6656
06:25:29,570 --> 06:25:31,480
where you could imagine\nwriting some code
6657
06:25:31,480 --> 06:25:36,041
that very methodically goes from left\n
6658
06:25:36,041 --> 06:25:39,851
more sophisticated that actually\n
6659
06:25:39,850 --> 06:25:43,210
And just remember that the\nconventions we've had since last week
6660
06:25:43,210 --> 06:25:47,681
now is that these arrays are\nzero indexed, so to speak.
6661
06:25:47,681 --> 06:25:52,511
To be zero indexed just means that the\n
6662
06:25:52,510 --> 06:25:56,290
So this is location 0, 1, 2, 3, 4, 5, 6.
6663
06:25:56,291 --> 06:25:59,681
And notice even though there\nare seven total doors here
6664
06:25:59,681 --> 06:26:02,291
the right-most one,\nof course, is called 6
6665
06:26:02,291 --> 06:26:04,300
just because we've\nstarted counting at 0.
6666
06:26:04,300 --> 06:26:09,041
So in the general case, if you\nhad n doors or n bytes of memory
6667
06:26:09,041 --> 06:26:13,841
0 would always be at the left, and n\n
6668
06:26:13,841 --> 06:26:18,221
That's sort of a generalization of just\n
6669
06:26:18,221 --> 06:26:22,151
All right, so let's revisit the problem\n
6670
06:26:22,151 --> 06:26:24,884
with in week zero, which was\nthis notion of searching.
6671
06:26:24,883 --> 06:26:26,800
And what does it mean\nto search for something?
6672
06:26:26,800 --> 06:26:29,578
Well, to find information-- and\nthis, of course, is omnipresent.
6673
06:26:29,579 --> 06:26:32,621
Anytime you take out your phone, you're\n
6674
06:26:32,620 --> 06:26:35,501
Any time you pull up a browser,\n
6675
06:26:35,501 --> 06:26:39,670
So search is kind of one of the\n
6676
06:26:41,238 --> 06:26:44,320
So let's consider how the Googles, the\n
6677
06:26:44,320 --> 06:26:48,021
are implementing something as\nseemingly familiar as this.
6678
06:26:48,021 --> 06:26:50,381
So here might be the problem statement.
6679
06:26:50,381 --> 06:26:52,811
We want some input to\nbecome some output.
6680
06:26:52,811 --> 06:26:54,161
What's that input going to be?
6681
06:26:54,161 --> 06:26:57,881
Maybe it's a bunch of closed doors\n
6682
06:26:57,881 --> 06:27:00,611
to get back an answer, true or false.
6683
06:27:00,611 --> 06:27:03,146
Is something we're\nlooking for there or not?
6684
06:27:03,146 --> 06:27:05,771
You can imagine taking this one\nstep further and trying to find
6685
06:27:05,771 --> 06:27:07,881
where is the thing you're looking for.
6686
06:27:07,881 --> 06:27:10,300
But for now, let's just take\none bite out of the problem.
6687
06:27:10,300 --> 06:27:14,920
Can we tell ourselves, true or\nfalse, is some number behind one
6688
06:27:14,920 --> 06:27:17,751
of these doors or lockers in memory?
6689
06:27:17,751 --> 06:27:22,120
But before we go there and start\n
6690
06:27:22,940 --> 06:27:27,040
Let's consider how we might\nlay the foundation of, like
6691
06:27:27,041 --> 06:27:30,469
comparing whether one algorithm\nis better than another.
6692
06:27:30,469 --> 06:27:32,261
We talked about\ncorrectness, and it sort of
6693
06:27:32,260 --> 06:27:35,800
goes without saying that any code you\n
6694
06:27:36,820 --> 06:27:39,701
Otherwise, what's the point if it\n
6695
06:27:39,701 --> 06:27:41,620
But we also talked about design.
6696
06:27:41,620 --> 06:27:45,640
And in your own words, what do we\n
6697
06:27:45,640 --> 06:27:48,830
designed at this stage than another?
6698
06:27:48,830 --> 06:27:52,240
How do you think about\nthis notion of design now?
6699
06:27:53,411 --> 06:27:55,661
AUDIENCE: Easier to understand\nor easier to institute.
6700
06:27:55,661 --> 06:27:57,350
DAVID J. MALAN: OK, so easier to understand.
6701
06:28:00,341 --> 06:28:03,173
DAVID J. MALAN: Efficiency, and what do\n
6702
06:28:07,190 --> 06:28:09,751
It doesn't use up too much\nmemory, and it isn't redundant.
6703
06:28:09,751 --> 06:28:11,480
So you can think about\ndesign along a few
6704
06:28:11,480 --> 06:28:13,438
of these axes-- sort of\nthe quality of the code
6705
06:28:13,438 --> 06:28:15,630
but also the quality of the performance.
6706
06:28:15,631 --> 06:28:20,179
And as our programs get bigger and\n
6707
06:28:20,179 --> 06:28:22,221
those kinds of things are\nreally going to matter.
6708
06:28:22,221 --> 06:28:24,221
And in the real world,\nif you start writing code
6709
06:28:24,221 --> 06:28:26,991
not just by yourself but with\nsomeone else, getting the design
6710
06:28:26,991 --> 06:28:30,501
right is just going to make it easier\n
6711
06:28:30,501 --> 06:28:33,030
write code, with just\nhigher probability.
6712
06:28:33,030 --> 06:28:36,650
So let's consider how we might focus\n
6713
06:28:36,651 --> 06:28:38,751
the efficiency, of an algorithm.
6714
06:28:38,751 --> 06:28:42,591
And the way we might talk about the\n
6715
06:28:42,591 --> 06:28:45,710
or how slow they are, is in\nterms of their running time.
6716
06:28:45,710 --> 06:28:49,260
That is to say, when they're\n
6717
06:28:49,260 --> 06:28:52,370
And we might measure this in\nseconds or milliseconds or minutes
6718
06:28:52,370 --> 06:28:54,710
or just some number of\nsteps in the general case
6719
06:28:54,710 --> 06:28:58,920
because presumably fewer steps, to\n
6720
06:28:58,920 --> 06:29:00,770
So how might we think\nabout running times?
6721
06:29:00,771 --> 06:29:03,831
Well, there's one general\nnotation we should define today.
6722
06:29:03,830 --> 06:29:07,970
So computer scientists tend to describe\n
6723
06:29:07,971 --> 06:29:12,201
or a piece of code, for that matter, in\n
6724
06:29:12,201 --> 06:29:14,991
This is literally a\ncapitalized O, a big O.
6725
06:29:14,991 --> 06:29:18,471
And this generally means that the\nrunning time of some algorithm
6726
06:29:18,471 --> 06:29:21,920
is on the order of such and such,\n
6727
06:29:21,919 --> 06:29:24,529
is just going to be a very\nsimple mathematical formula.
6728
06:29:24,529 --> 06:29:26,810
It's kind of a way of waving\nyour hands mathematically
6729
06:29:26,811 --> 06:29:31,131
to convey the idea of just how fast\n
6730
06:29:31,131 --> 06:29:32,961
without getting into\nthe weeds of like, it
6731
06:29:32,960 --> 06:29:36,690
took this many milliseconds or\n
6732
06:29:36,690 --> 06:29:40,640
So you might recall then from week\n
6733
06:29:41,721 --> 06:29:45,021
At the time, we just use this to\n
6734
06:29:45,021 --> 06:29:47,960
Recall that this red straight\nline was the first algorithm
6735
06:29:49,280 --> 06:29:54,230
The yellow line that's still\n
6736
06:29:54,230 --> 06:29:58,730
That line represented what\nalternative algorithm?
6737
06:29:59,730 --> 06:30:00,980
What is that second algorithm?
6738
06:30:01,791 --> 06:30:03,291
AUDIENCE: Like, two pages at a time.
6739
06:30:03,291 --> 06:30:06,441
DAVID J. MALAN: Two pages at a time, which\n
6740
06:30:06,440 --> 06:30:09,700
potentially double back a page if maybe\n
6741
06:30:10,201 --> 06:30:12,733
So it had a potential bug\nbut arguably solvable.
6742
06:30:12,733 --> 06:30:15,440
This last algorithm, though, was\n
6743
06:30:15,440 --> 06:30:18,950
strategy where I sort of unnecessarily\n
6744
06:30:18,951 --> 06:30:20,960
and then in half and\nthen in half, which
6745
06:30:20,960 --> 06:30:25,251
as dramatic as that was unnecessarily,\n
6746
06:30:25,251 --> 06:30:27,780
bites out of the\nproblem-- like 500 pages
6747
06:30:27,780 --> 06:30:33,720
the first time, another 250, another\n
6748
06:30:33,721 --> 06:30:36,944
And so we described its\nrunning time as this picture
6749
06:30:36,943 --> 06:30:39,861
there, though I didn't use that\n
6750
06:30:39,861 --> 06:30:42,591
But indeed, time to solve\nmight be measured just
6751
06:30:42,591 --> 06:30:44,510
abstractly in some unit of measure--
6752
06:30:44,510 --> 06:30:47,780
seconds, milliseconds, minutes, pages--
6753
06:30:49,471 --> 06:30:52,191
So let's now slap some numbers on this.
6754
06:30:52,190 --> 06:30:55,610
If we had n pages in that\nphone book, n just representing
6755
06:30:55,611 --> 06:30:57,980
a generic number, the\nfirst algorithm here
6756
06:30:57,980 --> 06:30:59,811
we might describe as taking n steps.
6757
06:30:59,811 --> 06:31:03,230
Second algorithm we might describe\n
6758
06:31:03,230 --> 06:31:05,870
maybe give or take one if we\nhave to double back but generally
6759
06:31:06,843 --> 06:31:09,050
And then this thing, if you\nremember your logarithms
6760
06:31:09,050 --> 06:31:11,008
was sort of a fundamentally\ndifferent formula--
6761
06:31:11,008 --> 06:31:14,460
log base 2 of n or just\nlog of n for short.
6762
06:31:14,460 --> 06:31:17,150
So this is of a fundamentally\ndifferent formula.
6763
06:31:17,151 --> 06:31:20,791
But what's noteworthy is that\nthese first two algorithms
6764
06:31:20,791 --> 06:31:24,411
even though, yes, the second\nalgorithm was hands down faster--
6765
06:31:24,411 --> 06:31:26,271
I mean, literally twice as fast--
6766
06:31:26,271 --> 06:31:30,921
when you start to zoom out and if\n
6767
06:31:30,920 --> 06:31:36,920
these first two start to look\nawfully similar to one another.
6768
06:31:36,920 --> 06:31:39,170
And if we keep zooming out\nand zooming out and zooming
6769
06:31:39,170 --> 06:31:41,450
out as n gets really large--
6770
06:31:41,451 --> 06:31:43,581
that is, the x-axis gets really long--
6771
06:31:43,580 --> 06:31:47,610
these first two algorithms start\nto become essentially the same.
6772
06:31:47,611 --> 06:31:50,751
And so this is where computer\nscientists use big O notation.
6773
06:31:50,751 --> 06:31:54,780
Instead of saying specifically,\nthis algorithm takes n steps.
6774
06:31:54,780 --> 06:31:57,980
And this one n divided by 2, a\ncomputer scientist would say
6775
06:31:57,980 --> 06:32:01,370
eh, each of those algorithms\ntakes on the order of n steps
6776
06:32:01,370 --> 06:32:03,380
or on the order of n over 2.
6777
06:32:04,131 --> 06:32:07,851
On the order of n over 2\nis pretty much the same
6778
06:32:07,850 --> 06:32:13,980
when n gets really large as being\n
6779
06:32:13,980 --> 06:32:18,841
So yes, in practice, it's obviously\n
6780
06:32:18,841 --> 06:32:22,311
But in the big picture, when n\nbecomes a million, a billion
6781
06:32:22,311 --> 06:32:24,980
the numbers are already\nso darn big at that point
6782
06:32:24,980 --> 06:32:28,041
that these are, as the\nshapes of these curves imply
6783
06:32:28,041 --> 06:32:30,291
pretty much functionally equivalent.
6784
06:32:30,291 --> 06:32:33,441
But this one still\nlooks better and better
6785
06:32:33,440 --> 06:32:36,830
as n gets large because it's\nrising so much less quickly.
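To put rough numbers on the three curves, here is a minimal C sketch. The function names are hypothetical, and it assumes the third algorithm stops once a single page is left.

```c
// Hypothetical helpers: step counts for the three phone book algorithms.

// First algorithm: one page at a time, so n steps.
long steps_one_at_a_time(long n)
{
    return n;
}

// Second algorithm: two pages at a time, so roughly n / 2 steps.
long steps_two_at_a_time(long n)
{
    return n / 2;
}

// Third algorithm: halve the problem until one page remains,
// which takes about log base 2 of n steps.
long steps_halving(long n)
{
    long steps = 0;
    while (n > 1)
    {
        n /= 2;
        steps++;
    }
    return steps;
}
```

For n = 1000 these return 1000, 500, and 9. For n = 1000000 they return 1000000, 500000, and 19, so the first two grow together while the third barely moves.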
6786
06:32:36,830 --> 06:32:39,020
And so here, a computer\nscientist would say
6787
06:32:39,021 --> 06:32:43,679
that that third algorithm was on the\n
6788
06:32:43,679 --> 06:32:45,471
And they don't have to\nbother with the base
6789
06:32:45,471 --> 06:32:49,341
because it's a smaller mathematical\n
6790
06:32:49,341 --> 06:32:52,251
a constant, multiplicative factor.
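As a worked illustration of why the base can be dropped (base 10 is chosen here only as an example), the change-of-base identity shows that switching bases just multiplies by a constant:

```latex
\log_2 n = \frac{\log_{10} n}{\log_{10} 2} \approx 3.32 \, \log_{10} n
```

Since big O ignores constant factors, O(log base 2 of n) and O(log base 10 of n) describe the same order of growth.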
6791
06:32:52,251 --> 06:32:54,201
So in short, what are\nthe takeaways here?
6792
06:32:54,201 --> 06:32:56,631
This is just a new\nvocabulary that we'll start
6793
06:32:56,631 --> 06:33:00,381
to use when we just want to describe\n
6794
06:33:00,381 --> 06:33:02,961
To make this more real, if\nany of you have implemented
6795
06:33:02,960 --> 06:33:08,650
a for loop at this point in any of your\n
6796
06:33:08,651 --> 06:33:11,951
where maybe n was the height\nof your pyramid or maybe n
6797
06:33:11,951 --> 06:33:15,911
was something else that you wanted\nto do n times, you wrote code
6798
06:33:15,911 --> 06:33:20,800
or you implemented an algorithm\n
6799
06:33:21,561 --> 06:33:23,711
So this is just a way now\nto retroactively start
6800
06:33:23,710 --> 06:33:27,850
describing with somewhat\nmathematical notation what we've
6801
06:33:27,850 --> 06:33:30,140
been doing in practice for a while now.
6802
06:33:30,140 --> 06:33:35,861
So here's a list of commonly seen\n
6803
06:33:35,861 --> 06:33:39,370
This is not a thorough list\nbecause you could come up
6804
06:33:39,370 --> 06:33:42,070
with an infinite number of\nmathematical formulas, certainly.
6805
06:33:42,070 --> 06:33:45,760
But the common ones we'll discuss\n
6806
06:33:45,760 --> 06:33:48,430
probably reduce to this list here.
6807
06:33:48,431 --> 06:33:50,681
And if you were to study\nmore computer science theory
6808
06:33:50,681 --> 06:33:52,263
this list would get longer and longer.
6809
06:33:52,263 --> 06:33:56,290
But for now, these are sort of the\n
6810
06:33:56,291 --> 06:33:59,061
All right, two other pieces\nof vocabulary, if you will
6811
06:33:59,061 --> 06:34:00,521
before we start to use this stuff--
6812
06:34:00,521 --> 06:34:03,791
so this, a big omega,\ncapital omega symbol
6813
06:34:03,791 --> 06:34:09,531
is used now to describe a lower bound\n
6814
06:34:09,530 --> 06:34:12,970
So to be clear, big O is\non the order of-- that
6815
06:34:12,971 --> 06:34:16,271
is, an upper bound-- on how\nmany steps an algorithm might
6816
06:34:16,271 --> 06:34:19,061
take, on the order of so many steps.
6817
06:34:19,061 --> 06:34:21,941
If you want to talk, though,\nfrom the other perspective, well
6818
06:34:21,940 --> 06:34:24,070
how few steps might my algorithm take?
6819
06:34:24,070 --> 06:34:26,681
Maybe in the so-called\nbest case, it'd be nice
6820
06:34:26,681 --> 06:34:29,471
if we had a notation to\njust describe what a lower
6821
06:34:29,471 --> 06:34:32,201
bound is because some\nalgorithms might be super fast
6822
06:34:32,201 --> 06:34:34,251
in these so-called best cases.
6823
06:34:34,251 --> 06:34:38,021
So the symbology is almost the\nsame, but we replace the big O
6824
06:34:39,140 --> 06:34:42,940
So to be clear, big O describes\nan upper bound and omega
6825
06:34:44,471 --> 06:34:46,901
And we'll see examples\nof this before long.
6826
06:34:46,901 --> 06:34:52,781
And then lastly, last one here, big\n
6827
06:34:52,780 --> 06:34:57,220
when you have a case where both\n
6828
06:34:57,221 --> 06:35:00,251
running time is the\nsame as the lower bound.
6829
06:35:00,251 --> 06:35:03,940
You can then describe it in one breath\n
6830
06:35:03,940 --> 06:35:08,021
instead of saying it's in big O\nand in omega of something else.
6831
06:35:08,021 --> 06:35:12,280
All right, so out of context, sort\n
6832
06:35:12,280 --> 06:35:15,824
but all they refer to is upper\nbounds, lower bounds, or when
6833
06:35:15,824 --> 06:35:17,241
they happen to be one and the same.
6834
06:35:17,241 --> 06:35:20,801
And we'll now introduce over time\n
6835
06:35:20,800 --> 06:35:23,350
apply these to concrete problems.
6836
06:35:23,350 --> 06:35:27,580
But first, let me pause to\nsee if there's any questions.
6837
06:35:43,920 --> 06:35:45,880
DAVID J. MALAN: Smaller n\nfunctions move faster.
6838
06:35:45,881 --> 06:35:50,621
So yes, if you have something\nlike n, that takes only n steps.
6839
06:35:50,620 --> 06:35:53,626
If you have a formula like n\n
6840
06:35:53,626 --> 06:35:56,260
that take more steps\nand therefore be slower.
6841
06:35:56,260 --> 06:35:58,170
So the larger the\nmathematical expression
6842
06:35:58,170 --> 06:36:02,640
the slower your algorithm is because the\n
6843
06:36:05,161 --> 06:36:07,530
AUDIENCE: So you want your\nn function to be small?
6844
06:36:07,530 --> 06:36:10,771
DAVID J. MALAN: You want your n function,\n
6845
06:36:10,771 --> 06:36:12,581
And in fact, the Holy\nGrail, so to speak
6846
06:36:12,580 --> 06:36:16,050
would be this last one here either\n
6847
06:36:16,050 --> 06:36:19,260
when an algorithm is on\nthe order of a single step.
6848
06:36:19,260 --> 06:36:23,730
That means it literally takes constant\n
6849
06:36:23,730 --> 06:36:26,850
100 steps, but a fixed,\nconstant number of steps.
6850
06:36:26,850 --> 06:36:30,480
That's the best because even\nas the phone book gets bigger
6851
06:36:30,480 --> 06:36:34,710
even as the data set you're\nsearching gets larger and larger
6852
06:36:34,710 --> 06:36:37,861
if something only takes a finite\nnumber of steps constantly
6853
06:36:37,861 --> 06:36:42,001
then it doesn't matter how big\nthe data set actually gets.
6854
06:36:42,001 --> 06:36:46,201
Questions as well on these notations--\n
6855
06:36:46,201 --> 06:36:47,671
This is actually very helpful.
6856
06:36:47,670 --> 06:36:49,530
I'm seeing pointing this way?
6857
06:36:52,587 --> 06:36:54,920
DAVID J. MALAN: What is the input\nto each of these functions?
6858
06:36:54,920 --> 06:36:58,070
It is an expression of how\nmany steps an algorithm takes.
6859
06:36:58,070 --> 06:37:00,440
So in fact, let me go\nahead and make this
6860
06:37:00,440 --> 06:37:03,580
more concrete with an actual\nexample here if we could.
6861
06:37:03,580 --> 06:37:06,710
So on stage here, we have\nseven lockers which represent
6862
06:37:06,710 --> 06:37:08,661
if you will, an array of memory.
6863
06:37:08,661 --> 06:37:10,760
And this array of\nmemory is maybe storing
6864
06:37:10,760 --> 06:37:14,510
seven integers, seven integers that\n
6865
06:37:14,510 --> 06:37:17,450
And if we want to search\nfor these values, how might
6866
06:37:18,408 --> 06:37:20,617
Well, for this, why don't\nwe make things interesting?
6867
06:37:20,617 --> 06:37:22,170
Would a volunteer like to come on up?
6868
06:37:22,170 --> 06:37:25,280
Have to be masked and on the\ninternet if you are comfortable.
6869
06:37:25,280 --> 06:37:28,550
Both of-- oh, there's someone putting\n
6870
06:37:34,741 --> 06:37:37,441
And in just a moment,\nour brave volunteer
6871
06:37:37,440 --> 06:37:41,431
is going to help me find a\nspecific number in the data set
6872
06:37:41,431 --> 06:37:42,881
that we have here on the screen.
6873
06:37:42,881 --> 06:37:46,591
So come on down, and I'll get things\n
6874
06:37:55,890 --> 06:37:57,473
DAVID J. MALAN: [? Nomira. ?] Nice to meet.
6875
06:37:58,061 --> 06:38:02,011
So here we have for Nomira seven\nlockers or an array of memory.
6876
06:38:02,010 --> 06:38:03,900
And behind each of\nthese doors is a number.
6877
06:38:03,901 --> 06:38:06,930
And the goal, quite simply,\nis, given this array of memory
6878
06:38:06,930 --> 06:38:12,041
as input, to return, true or false, is\n
6879
06:38:12,041 --> 06:38:14,611
So suppose I care about the number 0.
6880
06:38:14,611 --> 06:38:18,091
What would be the simplest,\nmost correct algorithm you could
6881
06:38:18,091 --> 06:38:22,561
apply in order to find us the number 0?
6882
06:38:22,561 --> 06:38:25,969
OK, try opening the first one.
6883
06:38:25,969 --> 06:38:28,511
All right, and maybe just step\naside so the audience can see.
6884
06:38:28,510 --> 06:38:30,520
I think you have not found 0 yet.
6885
06:38:31,841 --> 06:38:34,480
Let's move on to your next choice.
6886
06:38:36,359 --> 06:38:37,901
DAVID J. MALAN: Oh, go ahead, second door.
6887
06:38:38,776 --> 06:38:41,918
Let's just move from left to\nright, sort of searching our way.
6888
06:38:48,561 --> 06:38:51,591
All right, also not working\nout so well yet, but that's OK.
6889
06:38:51,591 --> 06:38:55,971
If you want to go on to the\nnext, we're still looking for 0.
6890
06:38:57,291 --> 06:38:58,863
All right, it's not so good yet.
6891
06:39:04,460 --> 06:39:10,230
No, that's a-- all\nright, very well done.
6892
06:39:13,390 --> 06:39:16,690
All right, so I kind of set you\nup for a fairly slow algorithm
6893
06:39:16,690 --> 06:39:18,911
but let me just ask you\nto describe what is it
6894
06:39:18,911 --> 06:39:21,895
you did by following\nthe steps I gave you.
6895
06:39:21,895 --> 06:39:24,021
AUDIENCE: I just went one\nby one to each character.
6896
06:39:24,021 --> 06:39:26,021
DAVID J. MALAN: You went one\nby one to each character
6897
06:39:26,021 --> 06:39:27,741
if you want to talk into here.
6898
06:39:27,741 --> 06:39:29,501
So you went one by\none to each character.
6899
06:39:29,501 --> 06:39:32,980
And would you say that algorithm,\nleft to right, is correct?
6900
06:39:35,320 --> 06:39:37,361
AUDIENCE: Or, yes, in the scenario.
6901
06:39:37,361 --> 06:39:38,861
DAVID J. MALAN: OK, yes in this scenario.
6902
06:39:39,881 --> 06:39:40,421
What's going through your mind?
6903
06:39:40,420 --> 06:39:42,460
AUDIENCE: Because it's not the\nmost efficient way to do it.
6904
06:39:43,341 --> 06:39:45,971
So we see a contrast here\nbetween correctness and design.
6905
06:39:45,971 --> 06:39:48,721
I mean, I do think it was correct\n
6906
06:39:50,320 --> 06:39:52,850
But it took some number of steps.
6907
06:39:52,850 --> 06:39:54,640
So in fact, this would be an algorithm.
6908
06:39:54,640 --> 06:39:56,710
It has a name, called linear search.
6909
06:39:56,710 --> 06:39:59,201
And, [? Nomira, ?] as you\ndid, you kind of walked along
6910
06:39:59,201 --> 06:40:01,096
a line going from left to right.
6911
06:40:01,721 --> 06:40:04,241
If you had gone from right\nto left, would the algorithm
6912
06:40:04,241 --> 06:40:07,391
have been fundamentally better?
6913
06:40:08,140 --> 06:40:09,245
DAVID J. MALAN: OK, and why?
6914
06:40:09,245 --> 06:40:11,620
AUDIENCE: Because the zero is\nhere in the first scenario.
6915
06:40:11,620 --> 06:40:15,220
But if it was like, the zero is in\n
6916
06:40:15,221 --> 06:40:19,381
DAVID J. MALAN: Yeah, and so here is\n
6917
06:40:19,381 --> 06:40:20,631
becomes a little less obvious.
6918
06:40:20,631 --> 06:40:23,068
You would absolutely have\ngiven yourself a better result
6919
06:40:23,067 --> 06:40:25,150
if you had just happened\nto start from the right
6920
06:40:25,151 --> 06:40:27,141
or if I had pointed you\nto start over there.
6921
06:40:27,140 --> 06:40:30,268
But the catch is if I asked her to\n
6922
06:40:30,268 --> 06:40:31,600
well, that would have backfired.
6923
06:40:31,600 --> 06:40:33,308
And this time, it\nwould have taken longer
6924
06:40:33,309 --> 06:40:35,811
to find that number because\nit's way over here instead.
6925
06:40:35,811 --> 06:40:40,331
And so in the general case, going\n
6926
06:40:40,330 --> 06:40:43,900
is probably as correct as you can\n
6927
06:40:43,901 --> 06:40:47,878
about the order of these numbers-- and\n
6928
06:40:47,878 --> 06:40:49,960
Some of them are smaller,\nsome of them are bigger.
6929
06:40:49,960 --> 06:40:51,668
There doesn't seem to\nbe rhyme or reason.
6930
06:40:51,669 --> 06:40:55,691
Linear search is about as good as you\n
6931
06:40:55,690 --> 06:40:57,414
a priori about the numbers.
6932
06:40:57,414 --> 06:41:00,081
So I have a little thank you gift\nhere, a little CS stress ball.
6933
06:41:00,080 --> 06:41:03,040
Round of applause for\nour first volunteer.
6934
06:41:08,030 --> 06:41:12,050
Let's try to formalize what I\njust described as linear search
6935
06:41:12,050 --> 06:41:15,298
because indeed, no matter which\nend [? Nomira ?] had started on
6936
06:41:15,298 --> 06:41:17,091
I could have kind of\nchanged up the problem
6937
06:41:17,091 --> 06:41:19,431
to make sure that it\nappears to be running slow.
6938
06:41:20,451 --> 06:41:23,631
If zero were among those doors,\n
6939
06:41:24,620 --> 06:41:30,230
So let's now try to translate what\n
6940
06:41:30,230 --> 06:41:32,570
pseudo code as from week zero.
6941
06:41:32,570 --> 06:41:35,091
So with pseudo code, we\njust need a terse English-like
6942
06:41:35,091 --> 06:41:38,131
or any language, syntax\nto describe what we did.
6943
06:41:38,131 --> 06:41:40,640
So here might be one formulation\nof what [? Nomira ?] did.
6944
06:41:40,640 --> 06:41:45,050
For each door, from left to right,\n
6945
06:41:46,640 --> 06:41:51,990
Else, at the very end of the program,\n
6946
06:41:53,161 --> 06:41:55,161
And by the seventh door,\n[? Nomira ?] had indeed
6947
06:41:55,161 --> 06:41:58,791
returned true by saying,\nwell, there is the zero.
6948
06:41:58,791 --> 06:42:01,431
But let's consider if this\npseudo code is now correct
6949
06:42:02,510 --> 06:42:06,720
First of all, normally, when we've\n
6950
06:42:06,721 --> 06:42:10,461
And yet down here, return\nfalse is aligned with the for.
6951
06:42:10,460 --> 06:42:14,060
Why did I not indent the return\nfalse, or put another way
6952
06:42:14,061 --> 06:42:21,141
why did I not do if number is behind\n
6953
06:42:21,140 --> 06:42:24,216
Why would that version of this\ncode have been problematic?
6954
06:42:34,681 --> 06:42:37,050
DAVID J. MALAN: OK, I'm not sure\nit's because of redundancy.
6955
06:42:37,050 --> 06:42:39,361
Let me go ahead and\njust make this explicit.
6956
06:42:39,361 --> 06:42:42,870
If I had instead done\nelse return false, I
6957
06:42:42,870 --> 06:42:47,302
don't think it's so much redundancy\nthat I'd be worried about.
6958
06:42:47,302 --> 06:42:48,510
Let me bounce somewhere else.
6959
06:42:49,170 --> 06:42:52,710
AUDIENCE: Um, maybe\n[INAUDIBLE] for the entire list
6960
06:42:52,710 --> 06:42:54,091
after just checking one number.
6961
06:42:54,091 --> 06:42:56,280
DAVID J. MALAN: Yeah, it would\nbe returning false for--
6962
06:42:56,280 --> 06:42:58,155
even though I'd only\nlooked at-- [? Nomira ?]
6963
06:42:58,155 --> 06:42:59,460
had only looked at one element.
6964
06:42:59,460 --> 06:43:02,503
And it would have been as though if\n
6965
06:43:02,503 --> 06:43:05,971
she opens this up and says, nope,\n
6966
06:43:05,971 --> 06:43:09,069
That would give me an incorrect\nresult because obviously
6967
06:43:09,068 --> 06:43:11,611
at that stage in the algorithm,\nshe wouldn't have even looked
6968
06:43:11,611 --> 06:43:13,081
through any of the other doors.
6969
06:43:13,080 --> 06:43:16,870
So just the original indentation\nof this, if you will
6970
06:43:16,870 --> 06:43:19,560
without the [? else, ?]\nis correct because only
6971
06:43:19,561 --> 06:43:23,311
if I get to the bottom of this\nalgorithm or the pseudo code does
6972
06:43:23,311 --> 06:43:26,101
it make sense to conclude\nat that point, once she's
6973
06:43:26,100 --> 06:43:29,580
gone through all of the doors,\nthat nope, there's in fact--
6974
06:43:29,580 --> 06:43:32,911
the number I'm looking for is,\nin fact, not actually there.
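The problematic version, with return false tucked under an else, might look like this in C. This is only a sketch: the name search_buggy and the doors array are assumptions for illustration, not code from the lecture.

```c
#include <stdbool.h>

// Deliberately buggy sketch: the else branch gives up after one door.
bool search_buggy(const int doors[], int n, int number)
{
    for (int i = 0; i < n; i++)
    {
        if (doors[i] == number)
        {
            return true;
        }
        else
        {
            // Bug: returns false after checking only doors[0],
            // before any of the other doors have been opened.
            return false;
        }
    }
    // Only reached when there are no doors at all.
    return false;
}
```

With the number sitting behind the last of seven doors, this function still answers false, because it never looks past the first one.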
6975
06:43:32,911 --> 06:43:37,151
So how might we consider now the\nrunning time of this algorithm?
6976
06:43:37,151 --> 06:43:40,291
We have a few different\ntypes of vocabulary now.
6977
06:43:40,291 --> 06:43:43,441
And if we consider now how\nwe might think about this
6978
06:43:43,440 --> 06:43:46,980
let's start to translate it from\n
6979
06:43:46,980 --> 06:43:48,780
to something a little lower level.
6980
06:43:48,780 --> 06:43:52,181
We've been writing code using\nn and loops and the like.
6981
06:43:52,181 --> 06:43:56,701
So let's take this higher level\npseudo code and now just kind of
6982
06:43:56,701 --> 06:43:59,251
get a middle ground\nbetween English and C.
6983
06:43:59,251 --> 06:44:03,271
Let me propose that we think about\n
6984
06:44:03,271 --> 06:44:05,041
as being a little more pedantic.
6985
06:44:05,041 --> 06:44:13,181
For i from 0 to n minus 1, if number\n
6986
06:44:13,181 --> 06:44:15,881
Otherwise, at the end of\nthe program, return false.
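Carried all the way to C, that pseudo code could be sketched as follows. The function name linear_search and its parameters are assumptions for illustration; doors is taken to be an array of n integers.

```c
#include <stdbool.h>

// Sketch of linear search: check doors[0] through doors[n - 1] in order.
bool linear_search(const int doors[], int n, int number)
{
    // For i from 0 to n - 1
    for (int i = 0; i < n; i++)
    {
        // If number is behind doors[i], return true
        if (doors[i] == number)
        {
            return true;
        }
    }
    // Only after every door has been checked is it safe to return false
    return false;
}
```

Note that return false sits outside the loop, aligned with the for, just as in the pseudo code.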
6987
06:44:15,881 --> 06:44:17,881
Now I'm kind of mixing\nEnglish and C here
6988
06:44:17,881 --> 06:44:20,311
but that's reasonable if the\nreader is familiar with C
6989
06:44:21,901 --> 06:44:23,891
And notice this pattern here.
6990
06:44:23,890 --> 06:44:29,010
This is a way of just saying in pseudo\n
6991
06:44:29,010 --> 06:44:33,720
Start at 0 and then just\ncount up to n minus 1.
6992
06:44:33,721 --> 06:44:37,771
And recall n minus 1 is not one\nshy of the end of the array.
6993
06:44:37,771 --> 06:44:40,530
N minus 1 is the end of\nthe array because again, we
6994
06:44:41,940 --> 06:44:45,240
So this is a very common way\nof expressing this kind of loop
6995
06:44:45,241 --> 06:44:48,271
from the left all the way\nto the right of an array.
6996
06:44:48,271 --> 06:44:51,570
Doors I'm kind of implicitly\ntreating as the name of this array
6997
06:44:51,570 --> 06:44:54,361
like it's a variable from last\nweek that I defined as being
6998
06:44:54,361 --> 06:44:56,140
an array of integers in this case.
6999
06:44:56,140 --> 06:45:01,050
So doors bracket i means that\nwhen i is 0, it's this location.
7000
06:45:02,580 --> 06:45:06,150
When i is 7 or, more generally n minus--
7001
06:45:06,151 --> 06:45:10,631
sorry, 6 or, more generally, n\n
7002
06:45:10,631 --> 06:45:13,061
So same idea but a translation of it.
7003
06:45:13,061 --> 06:45:17,281
So now let's consider what the\n
7004
06:45:17,280 --> 06:45:20,370
If we have this menu of possible\nanswers to this question
7005
06:45:20,370 --> 06:45:23,290
how efficient or inefficient\nis this algorithm
7006
06:45:23,291 --> 06:45:26,011
let's take a look in the\ncontext of this pseudo code.
7007
06:45:26,010 --> 06:45:28,860
We don't even have to bother\ngoing all the way to C.
7008
06:45:28,861 --> 06:45:32,081
How do we go about analyzing\neach of these steps?
7009
06:45:33,960 --> 06:45:39,900
This outermost loop here for i from\n
7010
06:45:39,901 --> 06:45:42,151
is going to execute how many times?
7011
06:45:42,151 --> 06:45:45,781
How many times will that loop execute?
7012
06:45:45,780 --> 06:45:48,900
Let me give folks this\nmoment to think on it.
7013
06:45:48,901 --> 06:45:51,691
How many times is that\ngoing to loop here?
7014
06:45:54,721 --> 06:45:55,890
DAVID J. MALAN: n times, right?
7015
06:45:55,890 --> 06:45:58,080
Because it's from 0 to n minus 1.
7016
06:45:58,080 --> 06:46:00,690
And if it's a little weird to\nthink in from 0 to n minus 1
7017
06:46:00,690 --> 06:46:04,192
this is essentially the same\nmathematically as from 1 to n.
7018
06:46:04,192 --> 06:46:06,150
And that's perhaps a\nlittle more obviously more
7019
06:46:08,050 --> 06:46:12,541
So I might just make a note to myself\n
7020
06:46:12,541 --> 06:46:14,131
What about these inner steps?
7021
06:46:14,131 --> 06:46:17,521
Well, how many steps or seconds\ndoes it take to ask a question?
7022
06:46:19,651 --> 06:46:23,101
if the number you're looking\nfor is behind doors bracket i
7023
06:46:23,100 --> 06:46:25,931
well, as [? Nomira ?] did,\nthat's kind of like one step.
7024
06:46:25,931 --> 06:46:27,181
So you open the door and boom.
7025
06:46:27,181 --> 06:46:30,460
All right, maybe it's two steps,\n
7026
06:46:30,460 --> 06:46:32,681
So this is some constant\nnumber of steps.
7027
06:46:32,681 --> 06:46:34,440
Let's just call it one for simplicity.
7028
06:46:34,440 --> 06:46:37,860
How many steps or seconds\ndoes it take to return true?
7029
06:46:37,861 --> 06:46:40,224
I don't know exactly in\nthe computer's memory
7030
06:46:40,223 --> 06:46:41,640
but that feels like a single step.
7031
06:46:43,300 --> 06:46:46,320
So if this takes one\nstep, this takes one step
7032
06:46:46,320 --> 06:46:49,320
but only if the condition\nis true, it looks
7033
06:46:49,320 --> 06:46:53,730
like you're doing a constant\nnumber of things n times.
7034
06:46:53,730 --> 06:46:56,890
Or maybe you're doing\none additional step.
7035
06:46:56,890 --> 06:46:59,370
So in short, the only thing\nthat really matters here
7036
06:46:59,370 --> 06:47:02,580
in terms of the efficiency or\ninefficiency of the algorithm
7037
06:47:02,580 --> 06:47:05,855
is what are you doing again and again\n
7038
06:47:05,855 --> 06:47:07,230
the thing that's going to add up.
7039
06:47:07,230 --> 06:47:10,440
Doing one thing or two things\na constant number of times?
7040
06:47:11,370 --> 06:47:16,411
But looping, that's going to add up over\n
7041
06:47:16,411 --> 06:47:19,780
the bigger n is going to be and the\n
7042
06:47:19,780 --> 06:47:22,681
which is all to say if you\nwere to describe roughly
7043
06:47:22,681 --> 06:47:27,480
how many steps does this\nalgorithm take in big O notation
7044
06:47:27,480 --> 06:47:30,420
what might your instincts say?
7045
06:47:30,420 --> 06:47:35,710
How many steps is this algorithm on the\n
7046
06:47:40,471 --> 06:47:42,581
And indeed, that's going\nto be the case here.
7047
06:47:42,811 --> 06:47:44,894
Because you're essentially,\nat the end of the day
7048
06:47:44,894 --> 06:47:48,352
doing n things as an upper\nbound on running time.
7049
06:47:48,352 --> 06:47:50,310
And that's, in fact,\nexactly what happened
7050
06:47:50,311 --> 06:47:52,861
with [? Nomira. ?] She had\nto look at all n lockers
7051
06:47:52,861 --> 06:47:55,811
before finally getting\nto the right answer.
7052
06:47:55,811 --> 06:47:58,831
But what if she got\nlucky and the number we
7053
06:47:58,830 --> 06:48:01,740
were looking for was not\nat the end of the array
7054
06:48:01,741 --> 06:48:04,441
but was at the beginning of the array?
7055
06:48:04,440 --> 06:48:06,010
How might we think about that?
7056
06:48:06,010 --> 06:48:09,480
Well, we have a nomenclature for this\n
7057
06:48:09,480 --> 06:48:12,091
Remember, omega notation\nis a lower bound.
7058
06:48:12,091 --> 06:48:18,600
So given this menu of possible running\n
7059
06:48:18,600 --> 06:48:23,257
what might the omega notation be\n
7060
06:48:24,631 --> 06:48:26,611
DAVID J. MALAN: Omega of 1, and why that?
7061
06:48:28,333 --> 06:48:30,751
DAVID J. MALAN: Right, because if\njust by chance she gets lucky
7062
06:48:30,751 --> 06:48:33,661
and the number she's looking\nfor is right there where
7063
06:48:33,661 --> 06:48:35,850
she begins the algorithm, that's it.
7064
06:48:36,901 --> 06:48:39,570
Maybe it's two steps if you have\nto unlock the door and open it
7065
06:48:39,570 --> 06:48:41,100
but it's a constant number of steps.
7066
06:48:41,100 --> 06:48:43,170
And the way we describe\nconstant number of steps
7067
06:48:43,170 --> 06:48:45,390
is just with a single number like 1.
7068
06:48:45,390 --> 06:48:49,350
So the omega notation for linear\nsearch might be omega of 1
7069
06:48:49,350 --> 06:48:53,010
because in the best case, she might just\n
7070
06:48:53,010 --> 06:48:56,220
But in the worst case, we need to\n
7071
06:48:58,350 --> 06:49:01,050
So again there's this way\nnow of talking symbolically
7072
06:49:01,050 --> 06:49:06,510
about best cases and worst cases\n
7073
06:49:06,510 --> 06:49:09,240
Theta notation, just\nas a little trivia now
7074
06:49:09,241 --> 06:49:12,326
is it applicable based on the\ndefinition I gave earlier?
7075
06:49:13,201 --> 06:49:15,991
DAVID J. MALAN: OK, no, because you\n
7076
06:49:15,991 --> 06:49:18,481
when those two bounds,\nupper and lower, happen
7077
06:49:18,480 --> 06:49:21,100
to be the same for shorthand\nnotation, if you will.
7078
06:49:21,100 --> 06:49:25,890
So it suffices here to talk about\njust big O and omega notation.
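One way to see the two bounds side by side is to count how many doors get opened. In this sketch, the function name, the steps out-parameter, and the door values are all hypothetical.

```c
#include <stdbool.h>

// Linear search that also reports how many doors were opened.
bool linear_search_counted(const int doors[], int n, int number, int *steps)
{
    *steps = 0;
    for (int i = 0; i < n; i++)
    {
        (*steps)++; // opening one door counts as one step
        if (doors[i] == number)
        {
            return true;
        }
    }
    return false;
}
```

With seven doors holding, say, 5, 3, 7, 1, 9, 4, 0, searching for 5 takes 1 step (the best case, omega of 1), while searching for 0 takes all 7 (the worst case, big O of n).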
7079
06:49:25,890 --> 06:49:28,240
Well, what if we are a\nlittle smarter about this?
7080
06:49:28,241 --> 06:49:31,711
Let me go ahead and sort\nof semi-secretly here
7081
06:49:32,741 --> 06:49:34,966
But first, how about\none other volunteer?
7082
06:49:34,966 --> 06:49:37,591
One other volunteer-- you have\nto be comfortable with your mask
7083
06:49:37,591 --> 06:49:39,870
and your being on the internet.
7084
06:49:42,541 --> 06:49:44,241
Yes, you want to come on down?
7085
06:49:45,241 --> 06:49:48,001
And don't look at what I'm\ndoing because I'm going to--
7086
06:49:52,701 --> 06:49:55,191
take your time and\ndon't look up this way
7087
06:49:55,190 --> 06:49:58,911
because I need a moment to\nrearrange all of the numbers.
7088
06:49:58,911 --> 06:50:01,791
And actually, if you could stay\nright there before coming up
7089
06:50:01,791 --> 06:50:05,271
just an awkward few seconds\nwhile I finish hiding the numbers
7090
06:50:08,251 --> 06:50:10,850
DAVID J. MALAN: I will be right with you.
7091
06:50:10,850 --> 06:50:15,470
Actually, if-- do you want to\nwarm up the crowd for a moment
7092
06:50:16,644 --> 06:50:18,061
So you want to introduce yourself?
7093
06:50:27,861 --> 06:50:30,561
DAVID J. MALAN: All right,\nI think I am ready.
7094
06:50:30,561 --> 06:50:32,061
Thank you for stalling there.
7095
06:50:33,330 --> 06:50:34,070
DAVID J. MALAN: And I didn't catch your name.
7096
06:50:35,811 --> 06:50:36,801
AUDIENCE: Rave, like a party.
7097
06:50:39,291 --> 06:50:41,121
So Rave has kindly volunteered now.
7098
06:50:41,120 --> 06:50:43,085
And I'm going to give you an\nadditional advantage this time.
7099
06:50:43,760 --> 06:50:47,540
DAVID J. MALAN: Unbeknownst to you, I\n
7100
06:50:48,890 --> 06:50:50,480
So they're not in the same\nrandom order like they
7101
06:50:50,480 --> 06:50:52,523
were for [? Nomira. ?]\nYou now have the advantage
7102
06:50:52,523 --> 06:50:55,039
to know that the numbers are\nsorted from small to big.
7103
06:50:55,580 --> 06:50:59,540
DAVID J. MALAN: Given that, and given perhaps\n
7104
06:50:59,541 --> 06:51:03,501
with the phone book, where might you\n
7105
06:51:06,169 --> 06:51:07,961
DAVID J. MALAN: Let's find\nnumber six this time.
7106
06:51:07,960 --> 06:51:09,300
Let's make things interesting.
7107
06:51:11,661 --> 06:51:12,411
DAVID J. MALAN: OK, so the middle.
7108
06:51:14,030 --> 06:51:14,780
DAVID J. MALAN: --that would be right here.
7109
06:51:16,820 --> 06:51:18,681
And you find, sadly, the number five.
7110
06:51:22,742 --> 06:51:24,825
DAVID J. MALAN: All right, and\njust to keep it uniform
7111
06:51:24,826 --> 06:51:27,441
just like I did, I opened to the\nright half of the phone book.
7112
06:51:27,710 --> 06:51:29,240
DAVID J. MALAN: Let's keep it similar.
7113
06:51:30,583 --> 06:51:32,542
DAVID J. MALAN: All right,\nand, uh, a little too far
7114
06:51:32,543 --> 06:51:34,431
even though I know you\nwanted to go one over.
7115
06:51:34,431 --> 06:51:35,440
AUDIENCE: All good, all good.
7116
06:51:35,440 --> 06:51:37,161
DAVID J. MALAN: And now we're\ngoing to go which direction?
7117
06:51:37,161 --> 06:51:38,578
AUDIENCE: Over here in the middle.
7118
06:51:38,578 --> 06:51:40,681
DAVID J. MALAN: Right, and\nvoila, the number six.
7119
06:51:40,681 --> 06:51:42,201
All right, so very nicely done.
7120
06:51:44,980 --> 06:51:46,681
A little stressful for you as well.
7121
06:51:47,570 --> 06:51:50,151
So here we see by nature\nof the locker door
7122
06:51:50,151 --> 06:51:54,711
still being open sort of an\nartifact of the greater efficiency
7123
06:51:54,710 --> 06:51:57,920
it would seem, of this\nalgorithm because now that Rave
7124
06:51:57,920 --> 06:52:00,830
was given the assumption that\n
7125
06:52:00,830 --> 06:52:04,670
on the left to large on the right,\n
7126
06:52:04,670 --> 06:52:07,970
and conquer algorithm from week zero\n
7127
06:52:09,561 --> 06:52:13,011
And simply by starting in\nthe middle and realizing
7128
06:52:13,010 --> 06:52:17,030
OK, too small, then by going to\n
7129
06:52:17,030 --> 06:52:20,181
went a little too far, then by\ngoing to the left half, which
7130
06:52:20,181 --> 06:52:24,021
Rave able to find in just\nthree steps instead of seven
7131
06:52:24,021 --> 06:52:28,081
the number six in this case that\nwe were actually searching for.
7132
06:52:28,080 --> 06:52:32,250
So you can see that this would\nseem to be more efficient.
7133
06:52:32,251 --> 06:52:35,061
Let's consider for just\na moment is it correct.
7134
06:52:35,061 --> 06:52:40,611
If I had used different numbers but\n
7135
06:52:40,611 --> 06:52:43,739
would it still have\nworked this algorithm?
7136
06:52:45,530 --> 06:52:48,280
Like, why would it still\nhave worked, do you think?
7137
06:52:51,061 --> 06:52:52,811
DAVID J. MALAN: Yeah, so\nso long as the numbers
7138
06:52:52,811 --> 06:52:55,121
are always in the same\norder from left to right
7139
06:52:55,120 --> 06:52:58,330
or, heck, they could even be in reverse\n
7140
06:52:58,330 --> 06:53:02,770
the decisions that Rave was making--\n
7141
06:53:02,771 --> 06:53:05,181
would guide us to the\nsolution no matter what.
7142
06:53:05,181 --> 06:53:07,820
And it would seem to take fewer steps.
7143
06:53:07,820 --> 06:53:10,580
So if we consider now the\npseudo code for this algorithm
7144
06:53:10,580 --> 06:53:12,890
let's take a look how we\nmight describe binary search.
7145
06:53:12,890 --> 06:53:15,760
So binary search we might\ndescribe with something like this.
7146
06:53:15,760 --> 06:53:19,001
If the number is behind the middle\n
7147
06:53:19,001 --> 06:53:21,070
then we can just return true.
7148
06:53:21,070 --> 06:53:24,651
Else if the number is\nless than the middle door
7149
06:53:24,651 --> 06:53:27,161
so if six is less than whatever\nis behind the middle door
7150
06:53:27,161 --> 06:53:29,501
then Rave would have\nsearched the left half.
7151
06:53:29,501 --> 06:53:32,050
Else if the number is\ngreater than the middle door
7152
06:53:32,050 --> 06:53:34,060
Rave would have searched the right half.
7153
06:53:34,061 --> 06:53:38,201
Else, if there are no doors-- and\n
7154
06:53:38,201 --> 06:53:40,070
this up top just to keep things clean.
7155
06:53:40,070 --> 06:53:43,751
If there's no doors, what should Rave\n
7156
06:53:43,751 --> 06:53:47,230
if I gave her no lockers to work with?
7157
06:53:48,280 --> 06:53:50,380
But this is an important\ncase to consider
7158
06:53:50,381 --> 06:53:54,341
because if in the process of\nsearching locker by locker
7159
06:53:54,341 --> 06:53:58,991
we might have whittled down the\n
7160
06:53:58,991 --> 06:54:01,731
to one door to zero\ndoors-- and at that point
7161
06:54:01,730 --> 06:54:03,591
we might have had no\ndoors left to search.
7162
06:54:03,591 --> 06:54:06,401
So we have to naturally have a\nscenario for just considering
7163
06:54:07,480 --> 06:54:11,271
So it's not to say that maybe I don't\n
7164
06:54:11,271 --> 06:54:13,271
But as she divides and\ndivides and divides
7165
06:54:13,271 --> 06:54:17,081
if she runs out of lockers to ask those\n
7166
06:54:17,080 --> 06:54:20,020
if I ran out of phone book\npages to tear in half
7167
06:54:20,021 --> 06:54:23,570
I too might have had to\nreturn false as in this case.
7168
06:54:23,570 --> 06:54:26,861
So how can we now describe\nthis a little more like C
7169
06:54:26,861 --> 06:54:30,070
just to give ourselves a variable\n
7170
06:54:30,070 --> 06:54:33,291
Well, I might talk about\ndoors as being an array.
7171
06:54:33,291 --> 06:54:36,851
And so if I want to express the middle\n
7172
06:54:38,830 --> 06:54:40,630
I'm assuming that\nsomeone has done the math
7173
06:54:40,631 --> 06:54:43,811
to figure out what the middle door\n
7174
06:54:43,811 --> 06:54:46,301
And then doors, if the\nnumber we're looking for
7175
06:54:46,300 --> 06:54:49,830
is less than doors bracket\nmiddle, then search doors
7176
06:54:49,830 --> 06:54:53,590
zero through doors middle minus 1.
7177
06:54:53,591 --> 06:54:57,971
So again, this is a more pedantic way of\n
7178
06:54:57,971 --> 06:55:00,520
search the left half,\nsearch the right half--
7179
06:55:00,520 --> 06:55:07,151
but start to now describe it in\n
7180
06:55:07,151 --> 06:55:08,954
like we did with our array notation.
7181
06:55:08,954 --> 06:55:10,871
The last scenario, of\ncourse, is if the number
7182
06:55:10,870 --> 06:55:13,120
is greater than\ndoors bracket middle
7183
06:55:13,120 --> 06:55:16,420
then Rave would have wanted to\nsearch the middle door plus 1--
7184
06:55:16,420 --> 06:55:21,610
so 1 over-- through doors n minus 1--
7185
06:55:22,690 --> 06:55:25,750
So again, just a way of sort of\n
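The array-style pseudocode above can be sketched in C roughly as follows. This is a minimal sketch, not code from the lecture: the function name binary_search and the variables low and high are assumptions introduced here to track which doors are still in play.

```c
#include <stdbool.h>

// Hypothetical helper: returns true if target is behind one of the
// n doors, assuming doors[] is sorted from small to large.
bool binary_search(const int doors[], int n, int target)
{
    int low = 0;       // leftmost door still in play
    int high = n - 1;  // rightmost door still in play

    // Once low passes high, there are no doors left to search.
    while (low <= high)
    {
        int middle = low + (high - low) / 2;

        if (doors[middle] == target)
        {
            return true;
        }
        else if (target < doors[middle])
        {
            high = middle - 1;  // keep doors[low] through doors[middle - 1]
        }
        else
        {
            low = middle + 1;   // keep doors[middle + 1] through doors[high]
        }
    }

    // The "no doors" case from the pseudocode.
    return false;
}
```

Passing n as 0 makes the loop body never run, so the function falls straight through to the "no doors" case and returns false.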
7186
06:55:27,349 --> 06:55:31,180
So how might we translate\nthis now into big O notation?
7187
06:55:31,181 --> 06:55:38,230
Well, in the worst case, how many\n
7188
06:55:39,431 --> 06:55:43,390
Given seven doors or given\nmore generically n doors
7189
06:55:43,390 --> 06:55:47,980
how many times could she go left or go\n
7190
06:55:50,471 --> 06:55:53,148
What's the way to think about that?
7191
06:55:57,640 --> 06:56:00,515
And even if you're not feeling wholly\n
7192
06:56:00,515 --> 06:56:03,611
still, pretty much in programming and\n
7193
06:56:03,611 --> 06:56:06,791
any time we talk about some algorithm\n
7194
06:56:06,791 --> 06:56:10,541
in half, in half, in half,\nor any other multiple
7195
06:56:10,541 --> 06:56:12,941
it's probably involving\nlogarithms in some sense.
7196
06:56:12,940 --> 06:56:15,760
And log base 2 of n essentially\nrefers to the number
7197
06:56:15,760 --> 06:56:21,520
of times you can divide n by 2 until\n
7198
06:56:21,521 --> 06:56:23,771
or equivalently zero doors left.
7199
06:56:24,730 --> 06:56:28,631
So we might say that indeed,\nbinary search is in big O of log n
7200
06:56:28,631 --> 06:56:32,800
because the door that Rave\nopened last, this one
7201
06:56:32,800 --> 06:56:34,720
happened to be three doors away.
7202
06:56:34,721 --> 06:56:37,151
And actually, if you do\nthe math here, that roughly
7203
06:56:37,151 --> 06:56:38,961
works out to be exactly that case.
7204
06:56:38,960 --> 06:56:43,001
If we add one, that's sort of out\n
7205
06:56:43,001 --> 06:56:46,300
we were able to search it\nin just three total steps.
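To make the "divide in half until nothing is left" idea concrete, here is a small sketch that counts the halvings directly. The function name halving_steps is an assumption for illustration, and integer division stands in for throwing away half of the doors.

```c
// Hypothetical helper: counts how many times n doors can be
// halved (using integer division) before none remain.
int halving_steps(int n)
{
    int steps = 0;
    while (n > 0)
    {
        n /= 2;   // discard half of the remaining doors
        steps++;
    }
    return steps;
}
```

With seven doors, the count goes 7, 3, 1, 0, which is three halvings and matches the three steps in the demonstration. Doubling the number of doors adds only one more step.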
7206
06:56:46,300 --> 06:56:48,341
What about omega notation, though?
7207
06:56:48,341 --> 06:56:51,581
Like, in the best case, Rave\nmight have gotten lucky.
7208
06:56:51,580 --> 06:56:53,530
She opened the door, and there it is.
7209
06:56:53,530 --> 06:56:59,330
So how might we describe a lower bound\n
7210
06:57:03,580 --> 06:57:08,170
So here too, we see that in some cases\n
7211
06:57:08,170 --> 06:57:10,060
like, they're pretty equivalent.
7212
06:57:10,061 --> 06:57:15,191
And so this is why sometimes\n
7213
06:57:15,190 --> 06:57:17,650
case in the worst case\nbecause honestly, in general
7214
06:57:17,651 --> 06:57:19,961
who really cares if you just\nget lucky once in a while
7215
06:57:19,960 --> 06:57:21,640
and your algorithm is super fast?
7216
06:57:21,640 --> 06:57:24,611
What you probably care about\nis what's the worst case.
7217
06:57:25,751 --> 06:57:29,530
how long am I going to be sitting\n
7218
06:57:29,530 --> 06:57:35,167
or beach ball trying to give myself\n
7219
06:57:35,168 --> 06:57:38,001
Well, odds are, you're going to\n
7220
06:57:38,001 --> 06:57:39,791
So indeed, moving\nforward, we'll generally
7221
06:57:39,791 --> 06:57:43,091
talk about the running time of\n
7222
06:57:43,091 --> 06:57:45,140
a little less so in terms of omega.
7223
06:57:45,140 --> 06:57:47,501
But understanding the\nrange can be important
7224
06:57:47,501 --> 06:57:53,061
depending on the nature of the data that\n
7225
06:57:53,061 --> 06:57:55,871
All right, let me pause and\nsee if there are any questions.
7226
06:58:03,210 --> 06:58:05,791
AUDIENCE: So this method\nis clearly more efficient
7227
06:58:05,791 --> 06:58:10,800
but it requires that the information\n
7228
06:58:10,800 --> 06:58:14,131
How do you ensure that you\ncan compile information
7229
06:58:14,131 --> 06:58:15,626
in a particular order at scale?
7230
06:58:15,626 --> 06:58:17,501
DAVID J. MALAN: Yeah, it's\na really good question.
7231
06:58:17,501 --> 06:58:20,376
And if I can generalize it, how do\n
7232
06:58:20,376 --> 06:58:22,920
at scale, which algorithm is better?
7233
06:58:22,920 --> 06:58:25,800
I've sort of led us down\nthis road of implying
7234
06:58:25,800 --> 06:58:27,900
that Rave's second\nalgorithm, binary search
7235
06:58:27,901 --> 06:58:29,941
is better because it's so much faster.
7236
06:58:29,940 --> 06:58:33,960
It's log of n in the worst\ncase instead of big O of n.
7237
06:58:33,960 --> 06:58:37,591
But Rave was given an advantage when\n
7238
06:58:38,582 --> 06:58:40,290
And so that sort of\ninvites the question
7239
06:58:40,291 --> 06:58:42,031
well, given a whole\nbunch of random data
7240
06:58:42,030 --> 06:58:45,070
either a small data set or, heck,\n
7241
06:58:45,070 --> 06:58:48,901
billions of pieces of data,\nshould you sort it first
7242
06:58:48,901 --> 06:58:51,511
from smallest to\nlargest and then search?
7243
06:58:51,510 --> 06:58:56,280
Or should you just dive right\nin and search it linearly?
7244
06:58:56,280 --> 06:58:57,998
Like, how might you think about that?
7245
06:58:57,998 --> 06:58:59,791
If you are Google, for\ninstance, and you've
7246
06:58:59,791 --> 06:59:03,451
got millions, billions of web pages,\n
7247
06:59:03,451 --> 06:59:06,210
because it's always going to work\neven though it might be slow?
7248
06:59:06,210 --> 06:59:09,181
Or should they invest the time\nin sorting all of that data--
7249
06:59:10,991 --> 06:59:13,261
and then search it more efficiently?
7250
06:59:13,260 --> 06:59:15,798
Like, how do you decide\nbetween those options?
7251
06:59:15,798 --> 06:59:18,091
AUDIENCE: If you're sorting\nthe data, then wouldn't you
7252
06:59:18,091 --> 06:59:20,934
have to go through all of the data?
7253
06:59:20,934 --> 06:59:23,101
DAVID J. MALAN: Yeah, if you had\nto sort the data first--
7254
06:59:23,100 --> 06:59:25,060
and we don't yet formally\nknow how to do this.
7255
06:59:25,061 --> 06:59:27,478
But obviously, as humans, we\ncould probably figure it out.
7256
06:59:27,477 --> 06:59:29,640
You do have to look at\nall of the data anyway.
7257
06:59:29,640 --> 06:59:33,120
And so you're sort of wasting your\n
7258
06:59:35,041 --> 06:59:37,259
But maybe it depends a bit more.
7259
06:59:37,258 --> 06:59:39,300
Like, that's absolutely\nright, and if you're just
7260
06:59:39,300 --> 06:59:42,420
searching for one thing in life,\n
7261
06:59:42,420 --> 06:59:46,080
to sort it and then search it because\n
7262
06:59:46,080 --> 06:59:48,240
But what's another scenario\nin which you might not
7263
06:59:48,241 --> 06:59:53,771
worry about that whereby it might\n
7264
06:59:54,271 --> 07:00:00,940
AUDIENCE: [INAUDIBLE] you can go\n
7265
07:00:00,940 --> 07:00:02,170
to find out what's happening.
7266
07:00:02,170 --> 07:00:03,212
DAVID J. MALAN: Yeah, exactly.
7267
07:00:03,212 --> 07:00:05,753
So if your problem is a\nGoogle-like problem where
7268
07:00:05,753 --> 07:00:08,710
you have more than just one user\n
7269
07:00:08,710 --> 07:00:11,140
website page, probably you\nshould incur the cost up front
7270
07:00:11,140 --> 07:00:14,980
and sort the whole thing because\n
7271
07:00:14,980 --> 07:00:17,170
is going to be faster,\nfaster, faster because it's
7272
07:00:17,170 --> 07:00:20,800
going to [INAUDIBLE] algorithm of binary\n
7273
07:00:20,800 --> 07:00:23,681
that's going to add up\nto be way fewer steps
7274
07:00:23,681 --> 07:00:25,971
than doing linear search multiple times.
7275
07:00:25,971 --> 07:00:27,851
So again, kind of\ndepends on the use case
7276
07:00:27,850 --> 07:00:29,710
and kind of depends on\nhow important it is.
7277
07:00:29,710 --> 07:00:32,411
And this happens even\nin real world contexts.
7278
07:00:32,411 --> 07:00:35,260
I think back always to graduate\n
7279
07:00:35,260 --> 07:00:36,970
to analyze some large data set.
7280
07:00:36,971 --> 07:00:39,761
And honestly, it was actually\neasier at the time for me
7281
07:00:39,760 --> 07:00:42,670
to write pretty inefficient\nbut hopefully correct
7282
07:00:42,670 --> 07:00:43,897
code because you know what?
7283
07:00:43,898 --> 07:00:47,230
I could just go to sleep for eight hours\n
7284
07:00:47,730 --> 07:00:50,695
I didn't have to bother writing\n
7285
07:00:50,695 --> 07:00:51,820
to run it more efficiently.
7286
07:00:52,320 --> 07:00:55,881
Because I was the only user, and I\n
7287
07:00:55,881 --> 07:00:58,061
And so this was kind of\na reasonable approach
7288
07:00:58,061 --> 07:01:01,911
reasonable until I woke up eight\n
7289
07:01:01,911 --> 07:01:05,320
And now I had to spend another eight\n
7290
07:01:05,320 --> 07:01:07,271
But even there, you\nsee an example where
7291
07:01:07,271 --> 07:01:09,251
what is your most precious resource?
7292
07:01:11,080 --> 07:01:13,330
Is it time to write the code?
7293
07:01:13,330 --> 07:01:15,458
Is it the amount of memory\nthe computer is using?
7294
07:01:15,458 --> 07:01:18,251
These are all resources we'll start\n
7295
07:01:18,251 --> 07:01:20,440
depends on what your goals are.
7296
07:01:20,440 --> 07:01:23,411
Any questions, then, on\nupper bounds, lower bounds
7297
07:01:23,411 --> 07:01:26,620
or each of these two\nsearches, linear or binary?
7298
07:01:27,351 --> 07:01:29,940
AUDIENCE: So just, when you're\ncalculating running time
7299
07:01:29,940 --> 07:01:34,677
does the sorting step\ncount for that time?
7300
07:01:34,677 --> 07:01:37,510
DAVID J. MALAN: When analyzing running\n
7301
07:01:37,510 --> 07:01:39,670
If you want it to, if you actually do it.
7302
07:01:39,670 --> 07:01:41,260
At the moment, it did not apply.
7303
07:01:41,260 --> 07:01:45,460
I just gave Rave the luxury of\nknowing that the data was sorted.
7304
07:01:45,460 --> 07:01:48,880
But if I really wanted to charge\nher for the amount of time
7305
07:01:48,881 --> 07:01:52,091
it took to find that number six,\nI should have added the time
7306
07:01:52,091 --> 07:01:54,073
to sort plus the time to search.
7307
07:01:54,073 --> 07:01:55,780
And in fact, that's\na road we'll go down.
7308
07:01:55,780 --> 07:01:57,530
Why don't we go ahead and\npace ourselves as before?
7309
07:01:57,530 --> 07:01:58,948
Let's take a 10 minute break here.
7310
07:01:58,948 --> 07:02:01,491
And when we come back, we'll\nwrite some actual code.
7311
07:02:01,491 --> 07:02:05,399
So we've seen a couple of searches--\n
7312
07:02:05,399 --> 07:02:06,941
to be fair, we saw back in week zero.
7313
07:02:06,940 --> 07:02:10,150
But let's actually translate at\n
7314
07:02:10,151 --> 07:02:13,331
using this building block from\nlast week where we can actually
7315
07:02:13,330 --> 07:02:17,180
define an array if we want, like an\n
7316
07:02:17,181 --> 07:02:18,911
So let me switch over to VS Code here.
7317
07:02:18,911 --> 07:02:22,271
Let me go ahead and start\na program called numbers.c.
7318
07:02:22,271 --> 07:02:25,300
And in numbers.c, let me go ahead here.
7319
07:02:25,300 --> 07:02:29,201
And how about let's include\nour familiar header files?
7320
07:02:31,030 --> 07:02:35,690
I'll include stdio.h that we can\n
7321
07:02:35,690 --> 07:02:38,771
And now I'm going to go ahead\nand give myself int main void.
7322
07:02:38,771 --> 07:02:40,460
No command line arguments today.
7323
07:02:40,460 --> 07:02:41,593
So I'll leave that as void.
7324
07:02:41,593 --> 07:02:43,300
And I'm going to go\nahead and give myself
7325
07:02:43,300 --> 07:02:45,771
an array of how about seven numbers?
7326
07:02:45,771 --> 07:02:48,581
So I'll call it int numbers bracket 7.
7327
07:02:48,580 --> 07:02:50,620
And then I can fill\nthis array with numbers.
7328
07:02:50,620 --> 07:02:54,460
Like, numbers brackets 0 can be\n
7329
07:02:54,460 --> 07:02:58,668
could be the number 6, and numbers\n
7330
07:02:58,669 --> 07:03:01,211
And this is the same list that\nwe saw with [? Nomira ?] a bit
7331
07:03:01,210 --> 07:03:03,530
ago where it was 4, then 6, then 8.
7332
07:03:04,280 --> 07:03:06,700
There's actually another\nsyntax I can show you here.
7333
07:03:06,701 --> 07:03:09,581
If you know in advance\nin a C program that you
7334
07:03:09,580 --> 07:03:14,750
want an array of certain values and you\n
7335
07:03:14,751 --> 07:03:17,771
you want, you can actually do this\n
7336
07:03:17,771 --> 07:03:20,921
You can say, don't worry\nabout how big this is.
7337
07:03:20,920 --> 07:03:23,980
It's going to be implicit by\nway of these curly braces.
7338
07:03:23,980 --> 07:03:28,971
Here, I can do 4, 6, 8, 2,\n7, 5, 0, close curly brace.
7339
07:03:28,971 --> 07:03:31,091
So it's a somewhat new\nuse of curly braces.
7340
07:03:31,091 --> 07:03:35,140
But this has the effect of giving\n
7341
07:03:35,140 --> 07:03:36,800
of which are a whole bunch of integers.
7342
07:03:37,390 --> 07:03:41,320
The compiler can infer it from what's\n
7343
07:03:41,320 --> 07:03:44,501
And it seems to be of\nsize 1, 2, 3, 4, 5, 6, 7.
7344
07:03:44,501 --> 07:03:49,870
And all seven elements will be\n
7345
07:03:50,690 --> 07:03:53,140
So just a minor optimization\ncode wise to tighten up
7346
07:03:53,140 --> 07:03:56,451
what would have otherwise been\n
7347
07:03:56,451 --> 07:03:59,441
Now let's go ahead and implement\nlinear search, as we called it.
7348
07:03:59,440 --> 07:04:02,483
And you can do this in a bunch of\n
7349
07:04:02,483 --> 07:04:09,190
For int i gets 0, i is\nless than 7, i plus plus.
7350
07:04:09,190 --> 07:04:12,161
Then inside of my loop, I'm\ngoing to ask the question, well
7351
07:04:12,161 --> 07:04:17,381
if the numbers at location i\nequals equals, as we asked of
7352
07:04:17,381 --> 07:04:21,041
[? Nomira, ?] the number 0, then I'm\n
7353
07:04:21,041 --> 07:04:25,811
like printf found backslash n.
7354
07:04:25,811 --> 07:04:27,641
And then I'm going to return 0.
7355
07:04:27,640 --> 07:04:30,400
Just because of last week's\ndiscussion of returning
7356
07:04:30,401 --> 07:04:34,511
a value for main when all is well,\n
7357
07:04:34,510 --> 07:04:37,420
just to signal that indeed,\nI found what I'm looking for.
7358
07:04:37,420 --> 07:04:44,920
Otherwise, on what line do I want to\n
7359
07:04:44,920 --> 07:04:46,960
and return something other than 0?
7360
07:04:46,960 --> 07:04:51,220
Right, I don't think I want an else\n
7361
07:04:51,221 --> 07:04:55,390
So on what line would you prefer I\n
7362
07:04:55,390 --> 07:04:58,850
of not found and I'll return an error?
7363
07:05:04,370 --> 07:05:06,078
So at the end of the\nfor loop because you
7364
07:05:06,079 --> 07:05:07,911
want to give the\nprogram or our volunteer
7365
07:05:07,911 --> 07:05:11,341
earlier a chance to go through all\n
7366
07:05:11,341 --> 07:05:14,061
But if you go through the whole\nthing, through the whole loop
7367
07:05:14,061 --> 07:05:17,991
at the very end, you probably just\n
7368
07:05:17,991 --> 07:05:20,421
and then return\nsomething like positive 1
7369
07:05:20,420 --> 07:05:22,400
just to signify that an error happened.
7370
07:05:22,401 --> 07:05:24,531
And again, this was a\nminor detail last week.
7371
07:05:24,530 --> 07:05:28,730
Any time main is successful, the\n
7372
07:05:30,210 --> 07:05:33,380
And if something goes wrong, like you\n
7373
07:05:33,381 --> 07:05:37,311
you might return something other than\n
7374
07:05:37,311 --> 07:05:39,563
or even negative numbers if you want.
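The linear search just described might look roughly like this when pulled out into functions. This is a sketch under assumptions: the lecture writes everything inside main, whereas the names linear_search and report are introduced here only so the logic can be reused and checked.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: checks each of the n numbers from left to right.
bool linear_search(const int numbers[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (numbers[i] == target)
        {
            return true;
        }
    }

    // Only after the whole loop has finished can we say it is absent.
    return false;
}

// Hypothetical helper mirroring main's behavior in the lecture:
// prints the outcome and returns 0 on success or 1 on failure.
int report(const int numbers[], int n, int target)
{
    if (linear_search(numbers, n, target))
    {
        printf("Found\n");
        return 0;
    }
    printf("Not found\n");
    return 1;
}
```

Calling report on the array 4, 6, 8, 2, 7, 5, 0 with a target of 0 returns 0, and a target of negative 1 returns 1. The "not found" branch sits after the loop rather than in an else, so every element gets a chance to match first.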
7375
07:05:39,562 --> 07:05:41,520
All right, well, let me\ngo ahead and save this.
7376
07:05:49,401 --> 07:05:51,861
All right, and it's found,\nas I would hope it would be.
7377
07:05:51,861 --> 07:05:55,070
And just as a little check, let's\n
7378
07:05:55,070 --> 07:05:59,151
not there, like the number negative 1.
7379
07:05:59,151 --> 07:06:02,001
Let me go ahead and recompile\nthe code with make numbers.
7380
07:06:02,001 --> 07:06:04,251
Let me rerun the code\nwith dot slash numbers
7381
07:06:04,251 --> 07:06:06,260
and hopefully-- whew, OK, not found.
7382
07:06:06,260 --> 07:06:08,833
So proof by example seems\nto be working correctly.
7383
07:06:08,833 --> 07:06:11,001
But let's make things a\nlittle more interesting now.
7384
07:06:11,001 --> 07:06:14,061
Right now, I'm using just\nan array of integers.
7385
07:06:14,061 --> 07:06:18,451
Let me go ahead and introduce\nmaybe an array of strings instead.
7386
07:06:18,451 --> 07:06:21,721
And maybe this time, I'll store a\n
7387
07:06:21,721 --> 07:06:23,461
but actual strings of names.
7388
07:06:24,690 --> 07:06:26,490
Well, let me go back to my code here.
7389
07:06:26,491 --> 07:06:30,831
I'm going to switch us over to\nmaybe a file called names.c.
7390
07:06:30,830 --> 07:06:34,490
And in here, I'll go\nahead and include cs50.h.
7391
07:06:37,911 --> 07:06:41,390
And I'm going to go ahead and\nfor now include a new friend
7392
07:06:41,390 --> 07:06:44,838
from last week, string.h, which gives\n
7393
07:06:44,838 --> 07:06:47,631
Int main void because I'm not going\n
7394
07:06:48,841 --> 07:06:53,690
And now if I want an array of strings,\n
7395
07:06:56,480 --> 07:06:58,460
And then I could start\ndoing like before.
7396
07:06:58,460 --> 07:07:01,940
Names bracket 0 could be someone\nlike Bill, and names bracket 1
7397
07:07:01,940 --> 07:07:05,100
could be someone like\nCharlie and so forth.
7398
07:07:05,100 --> 07:07:08,712
But there's this new\nimprovement I can make.
7399
07:07:08,712 --> 07:07:11,420
Let me just let the compiler figure\n
7400
07:07:11,420 --> 07:07:16,911
And using curly braces, I'll do Bill\n
7401
07:07:16,911 --> 07:07:24,050
George and then Ginny and then Percy and\n
7402
07:07:24,050 --> 07:07:27,291
All right, so now I have\nthese seven names as strings.
7403
07:07:32,091 --> 07:07:35,690
i is less than 7 as before,\ni plus plus as before.
7404
07:07:35,690 --> 07:07:39,260
And inside of the loop, let's this\n
7405
07:07:39,260 --> 07:07:41,990
and suppose we're searching\nfor Ron arbitrarily.
7406
07:07:41,991 --> 07:07:44,451
He is there, so we should\neventually find him.
7407
07:07:44,451 --> 07:07:51,890
Let me go ahead and say if names bracket\n
7408
07:07:51,890 --> 07:07:55,701
of my if condition, I'm going to\n
7409
07:07:55,701 --> 07:07:57,951
And I'm going to return 0\njust because all is well.
7410
07:07:57,951 --> 07:08:00,751
And I'm going to take your\nadvice from the get go this time
7411
07:08:00,751 --> 07:08:04,920
and, at the end of the loop, print out\n
7412
07:08:04,920 --> 07:08:08,030
I have not printed found, and\nI have not returned already.
7413
07:08:08,030 --> 07:08:12,200
So I'm just going to go ahead and\n
7414
07:08:12,201 --> 07:08:14,781
All right, let me go ahead and\ncross my fingers as always.
7415
07:08:17,670 --> 07:08:20,661
And it doesn't seem\nto like my code here.
7416
07:08:20,661 --> 07:08:22,730
This is perhaps a new\nerror that you might not
7417
07:08:22,730 --> 07:08:25,440
have seen yet in names.c line 11.
7418
07:08:25,440 --> 07:08:28,280
So that's this line\nhere, my if condition.
7419
07:08:28,280 --> 07:08:32,330
Result of comparison against a\nstring literal is unspecified.
7420
07:08:32,330 --> 07:08:34,752
Use an explicit string\ncomparison function instead.
7421
07:08:34,753 --> 07:08:37,460
I mean, that's kind of a mouthful,\n
7422
07:08:37,460 --> 07:08:39,960
you're probably not going to\nknow how to make sense of that.
7423
07:08:39,960 --> 07:08:43,490
But it does kind of draw our\nattention to something being awry
7424
07:08:43,491 --> 07:08:48,201
with the equality checking\nhere, with equals equals and Ron.
7425
07:08:48,201 --> 07:08:50,601
And here's where again\nwe've been telling
7426
07:08:50,600 --> 07:08:53,060
sort of a white lie for\nthe past couple of weeks.
7427
07:08:53,061 --> 07:08:57,261
Strings are a thing in C. Strings\nare a thing in programming.
7428
07:08:57,260 --> 07:08:59,030
But recall from last\nweek, I did disclaim
7429
07:08:59,030 --> 07:09:01,010
there's no such thing\nas a string data type
7430
07:09:01,010 --> 07:09:05,030
technically because it's not\na primitive in the way an int
7431
07:09:05,030 --> 07:09:08,570
and a float and a bool are that are\n
7432
07:09:08,570 --> 07:09:12,530
You can't just use equals\nequals to compare two strings.
7433
07:09:12,530 --> 07:09:15,470
You actually have to use\na special function that's
7434
07:09:15,471 --> 07:09:18,501
in this header file we talked\nbriefly about last week.
7435
07:09:18,501 --> 07:09:21,140
In that header file was\nstring length or strlen.
7436
07:09:21,140 --> 07:09:23,850
But there are other\nfunctions in there as well.
7437
07:09:23,850 --> 07:09:27,440
Let me, in fact, go ahead\nand open up the manual pages.
7438
07:09:32,120 --> 07:09:37,161
In string.h you can perhaps infer\n
7439
07:09:37,161 --> 07:09:40,973
the place of equals equals for today.
7440
07:09:43,041 --> 07:09:47,666
DAVID J. MALAN: So strcmp, S-T-R-C-M-P,\n
7441
07:09:47,666 --> 07:09:49,791
And if I click on that,\nwe'll see more information.
7442
07:09:49,791 --> 07:09:53,871
And indeed, if I click on strcmp,\nwe'll see under the synopsis
7443
07:09:53,870 --> 07:09:58,850
that, OK, I need to use the CS50 header\n
7444
07:09:58,850 --> 07:10:02,210
Here is its prototype,\nwhich is telling me
7445
07:10:02,210 --> 07:10:05,720
that strcmp takes two\nstrings, S1 and S2, that
7446
07:10:05,721 --> 07:10:07,251
are presumably going to be compared.
7447
07:10:07,251 --> 07:10:09,591
And it returns an integer,\nwhich is interesting.
7448
07:10:10,791 --> 07:10:14,091
The description of this function is\n
7449
07:10:15,230 --> 07:10:18,471
So uppercase or lowercase\nmatters, just FYI.
7450
07:10:18,471 --> 07:10:20,931
And then let's look at\nthe return value here.
7451
07:10:20,931 --> 07:10:25,221
The return value of this function\nreturns an int less than 0
7452
07:10:25,221 --> 07:10:32,841
if S1 comes before S2, 0 if S1 is the\n
7453
07:10:35,300 --> 07:10:39,140
So the reason that this function\n
7454
07:10:39,140 --> 07:10:41,751
bool, true or false, is\nthat it actually will
7455
07:10:41,751 --> 07:10:45,111
allow us to sort these things\n
7456
07:10:45,111 --> 07:10:49,341
if two strings come in this order or\n
7457
07:10:49,341 --> 07:10:51,440
you need three possible return values.
7458
07:10:51,440 --> 07:10:53,271
And a bool, of course,\nonly gives you two
7459
07:10:53,271 --> 07:10:56,841
but an int gives you like 4 billion\n
7460
07:10:56,841 --> 07:11:01,881
So 0 or a positive number or a negative\n
7461
07:11:01,881 --> 07:11:06,320
And the documentation goes on to explain\n
7462
07:11:06,320 --> 07:11:09,620
Recall that capital A\nis 65, capital B is 66
7463
07:11:09,620 --> 07:11:12,021
and it's those underlying\nASCII or Unicode
7464
07:11:12,021 --> 07:11:15,411
numbers that a computer uses to figure\n
7465
07:11:15,411 --> 07:11:17,541
or after it like in the dictionary.
7466
07:11:17,541 --> 07:11:20,311
But for our purposes now,\nwe only care about equality.
7467
07:11:20,311 --> 07:11:22,041
So I'm going to go ahead and do this.
7468
07:11:22,041 --> 07:11:26,181
If I want to compare names\nbracket i against Ron
7469
07:11:26,181 --> 07:11:33,681
I use str compare or strcmp, names\n
7470
07:11:33,681 --> 07:11:35,870
So it's a little more\ninvolved than actually
7471
07:11:35,870 --> 07:11:40,190
using equals equals, which\ndoes work for integers, longs
7472
07:11:41,361 --> 07:11:45,210
But for strings, it turns out we\n
7473
07:11:45,710 --> 07:11:47,990
Well, last week, recall\nwhat a string really is.
7474
07:11:47,991 --> 07:11:50,581
It's an array of characters.
7475
07:11:50,580 --> 07:11:54,050
And so whereas you can use equals\nequals for single characters
7476
07:11:54,050 --> 07:11:56,960
strcmp, as we'll\neventually see, is going
7477
07:11:56,960 --> 07:11:58,798
to compare multiple characters for us.
7478
07:11:59,841 --> 07:12:03,931
There's a loop needed, and that's\n
7479
07:12:03,931 --> 07:12:06,651
But it doesn't just work out of\n
7480
07:12:06,651 --> 07:12:10,628
That would literally be comparing\n
7481
07:12:10,628 --> 07:12:12,710
And we'll come back to\nthis next week as to what's
7482
07:12:12,710 --> 07:12:14,280
really going on under the hood.
7483
07:12:14,280 --> 07:12:18,501
So let me go ahead and fix one\nbug that I just realized I made.
7484
07:12:18,501 --> 07:12:23,811
I want to check if the return\nvalue of str compare is equal to 0
7485
07:12:23,811 --> 07:12:27,021
because per the documentation,\nthat meant they're the same.
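The corrected comparison can be sketched as a small function. This is an illustration under assumptions: the name find_name is made up here, and const char * is used in place of CS50's string type so the sketch needs only the standard library.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: linear search over n names, comparing with
// strcmp because == would not compare the characters themselves.
bool find_name(const char *names[], int n, const char *target)
{
    for (int i = 0; i < n; i++)
    {
        // strcmp returns 0 only when the two strings are the same.
        if (strcmp(names[i], target) == 0)
        {
            return true;
        }
    }
    return false;
}
```

Dropping the == 0 would invert the logic, because any nonzero return value, positive or negative, is treated as true.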
7486
07:12:27,021 --> 07:12:30,001
All right, let me go ahead\nand make names this time.
7487
07:12:31,070 --> 07:12:34,131
Dot slash names, Enter, found.
7488
07:12:34,131 --> 07:12:39,651
And just as a sanity check, let's\n
7489
07:12:39,651 --> 07:12:43,371
Searching now for Hermione\nafter recompiling the code
7490
07:12:44,931 --> 07:12:46,640
And she's not, in fact, found.
7491
07:12:46,640 --> 07:12:49,580
So here's just a similar\nimplementation of linear search
7492
07:12:49,580 --> 07:12:53,930
not for integers this time\nbut instead for strings
7493
07:12:53,931 --> 07:12:57,501
the subtlety really being we need\n
7494
07:12:57,501 --> 07:13:02,001
to actually do the legwork for us of\n
7495
07:13:02,001 --> 07:13:05,568
All right, questions on either of these\n
7496
07:13:05,568 --> 07:13:07,253
AUDIENCE: So, if I do [INAUDIBLE]
7497
07:13:07,253 --> 07:13:08,460
DAVID J. MALAN: Ah, good question.
7498
07:13:08,460 --> 07:13:12,620
If I had not fixed what I claimed was\n
7499
07:13:12,620 --> 07:13:15,201
and we saw an example of\nthis last week, actually.
7500
07:13:15,201 --> 07:13:21,831
If a function returns an integer,\n
7501
07:13:21,830 --> 07:13:25,100
when you get back 0, the\nexpression, the Boolean expression
7502
07:13:29,300 --> 07:13:33,830
If a function returns any positive\n
7503
07:13:33,830 --> 07:13:37,130
that's going to be\ninterpreted as true even
7504
07:13:37,131 --> 07:13:41,871
if it's positive or negative, whether\n
7505
07:13:41,870 --> 07:13:45,931
And so if I did this, this\nwould be saying the opposite.
7506
07:13:45,931 --> 07:13:51,291
So if I were to say this, if str compare\n
7507
07:13:51,291 --> 07:13:57,754
implicitly like saying this does not\n
7508
07:13:57,754 --> 07:14:00,171
but you don't want to check\nfor true because, again, we're
7509
07:14:01,710 --> 07:14:05,361
So the reason I did 0\nhere in this case is
7510
07:14:05,361 --> 07:14:08,971
that it explicitly checks for the return\n
7511
07:14:15,309 --> 07:14:17,351
DAVID J. MALAN: Yes, you might\nnot have seen this yet
7512
07:14:17,350 --> 07:14:20,940
but you can express the\nequivalent because if you
7513
07:14:20,940 --> 07:14:24,661
want to check if this is\nfalse, you can actually
7514
07:14:24,661 --> 07:14:27,661
use an exclamation point,\nknown as a bang in programming
7515
07:14:29,320 --> 07:14:32,524
So false becomes true,\ntrue becomes false.
7516
07:14:32,524 --> 07:14:34,441
So this would be another\nway of expressing it.
7517
07:14:34,440 --> 07:14:39,300
This is arguably a worse design, though,\n
7518
07:14:39,300 --> 07:14:43,021
says you should be checking\nfor 0 or a positive value
7519
07:14:43,021 --> 07:14:46,111
or a negative value, and this\nlittle trick, while correct
7520
07:14:46,111 --> 07:14:49,861
and I think you can make a reasonable\n
7521
07:14:49,861 --> 07:14:51,780
And I would argue instead\nfor the first way
7522
07:14:51,780 --> 07:14:53,968
checking for equals equals 0 instead.
7523
07:14:53,968 --> 07:14:55,800
And if that's a little\nsubtle, not to worry.
7524
07:14:55,800 --> 07:15:00,600
We'll come back to little syntactic\n
7525
07:15:00,600 --> 07:15:05,130
Other questions on linear\nsearch in these two forms.
7526
07:15:05,131 --> 07:15:06,811
Is there another hand or hands?
7527
07:15:08,971 --> 07:15:10,261
OK, just holler if I missed.
7528
07:15:10,260 --> 07:15:12,372
So let's now actually take\nthis one step further.
7529
07:15:12,372 --> 07:15:15,330
Suppose that we want to write a\n
7530
07:15:15,330 --> 07:15:19,470
a little more like a phone book that\n
7531
07:15:19,471 --> 07:15:21,390
just integers but actual phone numbers.
7532
07:15:21,390 --> 07:15:23,701
Well, we could escalate\nthings like this.
7533
07:15:23,701 --> 07:15:27,210
We could now have two arrays-- one\n
7534
07:15:27,210 --> 07:15:29,730
And I'm going to use\nstrings for the numbers now
7535
07:15:29,730 --> 07:15:32,370
the phone numbers, because\nin most communities
7536
07:15:32,370 --> 07:15:36,091
phone numbers might have dashes,\n
7537
07:15:36,091 --> 07:15:39,390
that really looks more like a string\n
7538
07:15:39,390 --> 07:15:42,940
Probably don't want to use an int lest\n
7539
07:15:42,940 --> 07:15:47,521
So let me switch back to VS Code here,\n
7540
07:15:47,521 --> 07:15:49,394
in a file called phonebook.c.
7541
07:15:49,394 --> 07:15:51,061
And now let me go ahead and do the same.
7542
07:15:52,741 --> 07:15:58,681
Let me include stdio.h,\nand let me include string.h.
7543
07:15:58,681 --> 07:16:01,741
I'm going to again do int main void.
7544
07:16:01,741 --> 07:16:05,071
And then inside of my program, I'm\n
7545
07:16:06,870 --> 07:16:09,361
String names will be\njust two of us this time.
7546
07:16:12,751 --> 07:16:15,241
And then I'll give myself--\noops, typo already.
7547
07:16:15,241 --> 07:16:18,151
If I want this to be an array, I\n
7548
07:16:18,151 --> 07:16:19,741
The compiler can count for me.
7549
07:16:19,741 --> 07:16:21,661
But I do need the square brackets.
7550
07:16:21,661 --> 07:16:28,111
Then for numbers, I'm again going to\n
7551
07:16:28,111 --> 07:16:33,870
the curly braces that how about\nCarter can be at 1-617-495-1000.
7552
07:16:33,870 --> 07:16:35,640
And how about my own number here--
7553
07:16:35,640 --> 07:16:39,361
1-949-468-- oh pattern appearing--
7554
07:16:42,960 --> 07:16:44,890
Well, I've just kind of lined things up.
7555
07:16:44,890 --> 07:16:47,820
So Carter's number is\napparently first in this array
7556
07:16:47,820 --> 07:16:51,161
and I'm claiming that he'll be\n
7557
07:16:51,161 --> 07:16:53,971
I, David, will be the first--\nthe second in the names array
7558
07:16:53,971 --> 07:16:56,628
and second in the numbers array.
7559
07:16:56,628 --> 07:16:59,460
If you want to have a little fun\n
7560
07:16:59,460 --> 07:17:01,630
or call me some time at that number.
7561
07:17:01,631 --> 07:17:05,311
So now let's actually use\nthis data in some way.
7562
07:17:05,311 --> 07:17:08,729
Let's go ahead and actually search\n
7563
07:17:11,850 --> 07:17:16,450
There's two of us this time-- so i less\n
7564
07:17:16,451 --> 07:17:18,841
And now I'm going to practice\nwhat I preached earlier
7565
07:17:18,841 --> 07:17:22,800
and I'm going to use str compare\nto find my name in this case.
7566
07:17:22,800 --> 07:17:29,460
And I'm going to say if strcmp of names\n
7567
07:17:29,460 --> 07:17:33,100
and that equals 0,\nmeaning they're the same
7568
07:17:33,100 --> 07:17:35,970
then just as before, I'm going to\n
7569
07:17:35,971 --> 07:17:37,681
But this time, I'm going to\nmake the program more useful
7570
07:17:37,681 --> 07:17:39,460
and not just say found or not found.
7571
07:17:39,460 --> 07:17:43,411
Now I'm implementing a phone book, like\n
7572
07:17:43,411 --> 07:17:46,741
So I'm going to say something\nlike, quote unquote, found percent
7573
07:17:46,741 --> 07:17:53,191
s backslash n and then actually\nplug in numbers bracket i
7574
07:17:53,190 --> 07:17:56,730
to correspond to the\ncurrent name bracket i.
7575
07:17:56,730 --> 07:17:58,585
And then I'll return 0 as before.
7576
07:17:58,585 --> 07:18:00,960
And then down here if we get\nall the way through the loop
7577
07:18:00,960 --> 07:18:04,480
and David's not there for some reason,\n
7578
07:18:05,940 --> 07:18:10,950
So let me go ahead and compile this\n
7579
07:18:10,951 --> 07:18:13,601
and it seems to have found the number.
7580
07:18:13,600 --> 07:18:17,490
So this code I'm going\nto claim is correct.
7581
07:18:17,491 --> 07:18:20,551
It's kind of stupid because I've\n
7582
07:18:20,550 --> 07:18:22,050
app that only supports two people.
7583
07:18:22,050 --> 07:18:23,863
They're only going to be me and Carter.
7584
07:18:23,864 --> 07:18:26,281
This would be like downloading\nthe contacts app on a phone
7585
07:18:26,280 --> 07:18:28,198
and you can only call\ntwo people in the world.
7586
07:18:28,198 --> 07:18:30,181
There's no ability to\nadd names or edit things.
7587
07:18:30,181 --> 07:18:33,221
That, of course, could come later\n
7588
07:18:33,221 --> 07:18:35,054
But for now for the\nsake of discussion, I've
7589
07:18:35,054 --> 07:18:37,811
just hardcoded two\nnames and two numbers.
7590
07:18:37,811 --> 07:18:40,651
But for what it does, I\nclaim this is correct.
7591
07:18:40,651 --> 07:18:43,951
It's going to find me\nand print out my number.
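A minimal sketch of the two-array phone book described so far. Assumptions: plain const char * stands in for CS50's string type, the search is wrapped in a hypothetical lookup function instead of living in main, and the last digits of the second number are a placeholder because they do not appear in the transcript.

```c
#include <stddef.h>
#include <string.h>

// Two parallel arrays: names[i] is assumed to belong with numbers[i].
static const char *names[] = {"Carter", "David"};
static const char *numbers[] = {
    "+1-617-495-1000",
    "+1-949-468-XXXX" // placeholder: the remaining digits are not in the transcript
};

// Linear search over names; returns the number at the same index,
// or NULL if the name is not in the phone book.
const char *lookup(const char *name)
{
    for (int i = 0; i < 2; i++)
    {
        if (strcmp(names[i], name) == 0)
        {
            return numbers[i];
        }
    }
    return NULL;
}
```

Nothing in the code itself ties names[i] to numbers[i]; the pairing only holds as long as both arrays are kept in the same order.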
7592
07:18:45,841 --> 07:18:49,921
Let's start to now consider if\nwe're not just using arrays
7593
07:18:49,920 --> 07:18:52,212
but are we using them well?
7594
07:18:52,212 --> 07:18:55,170
We started to use them last week,\n
7595
07:18:55,170 --> 07:18:59,040
And what might I even mean by\nusing an array well or designing
7596
07:19:01,001 --> 07:19:06,300
Any critiques or concerns\nwith why this might not
7597
07:19:06,300 --> 07:19:08,460
be the best road for us\nto be going down when
7598
07:19:08,460 --> 07:19:12,900
I want to implement something like a\n
7599
07:19:12,901 --> 07:19:15,901
It seems all too vulnerable\nto just mistakes.
7600
07:19:15,901 --> 07:19:19,981
For instance, if I screw up the actual\n
7601
07:19:19,980 --> 07:19:24,550
such that it's now more or less than\n
7602
07:19:24,550 --> 07:19:27,780
it feels like there's not a tight\n
7603
07:19:27,780 --> 07:19:31,501
of data, and it's just sort of\ntrusting the honor system
7604
07:19:31,501 --> 07:19:37,800
that any time I use names bracket i\n
7605
07:19:38,521 --> 07:19:40,440
If you're the one writing\nthe code, you're probably
7606
07:19:40,440 --> 07:19:42,001
not going to really screw this up.
7607
07:19:42,001 --> 07:19:44,626
But if you start collaborating\nwith someone else or the program
7608
07:19:44,626 --> 07:19:48,061
is getting much, much longer, the\n
7609
07:19:48,061 --> 07:19:52,471
remember that you're sort of just\n
7610
07:19:52,471 --> 07:19:54,781
like this is going to fail eventually.
7611
07:19:54,780 --> 07:19:57,900
Someone's not going to realize that,\n
7612
07:19:57,901 --> 07:20:00,901
And you're going to start outputting\n
7613
07:20:00,901 --> 07:20:05,070
is to say it'd be much nicer if\n
7614
07:20:05,070 --> 07:20:09,210
pieces of data, names and numbers,\n
7615
07:20:09,210 --> 07:20:13,260
that you're not just trusting that\n
7616
07:20:13,260 --> 07:20:16,990
and numbers, have this kind of\nrelationship with themselves.
7617
07:20:16,991 --> 07:20:19,561
So let's consider how\nwe might solve this.
7618
07:20:19,561 --> 07:20:23,761
A new feature today that we'll introduce\n
7619
07:20:23,760 --> 07:20:27,640
In C, we have the ability to\ninvent our own data types
7620
07:20:27,640 --> 07:20:30,916
if you will-- data types\nthat the authors of C decades
7621
07:20:30,916 --> 07:20:32,791
ago just didn't envision\nor just didn't think
7622
07:20:32,791 --> 07:20:36,241
were necessary because we can implement\n
7623
07:20:36,241 --> 07:20:38,641
just as you could create\ncustom puzzle pieces
7624
07:20:38,640 --> 07:20:40,721
or in C, you can create\ncustom functions.
7625
07:20:40,721 --> 07:20:45,121
So in C, can you create\nyour own types of data
7626
07:20:45,120 --> 07:20:49,260
that go beyond the built in ints\nand floats and even strings?
7627
07:20:49,260 --> 07:20:54,900
You can make, for instance, a\n
7628
07:20:54,901 --> 07:20:57,451
type in the context of\nelections or a person data type
7629
07:20:57,451 --> 07:21:00,311
more generically that might\nhave a name and a number.
7630
07:21:02,080 --> 07:21:07,830
Well, let me go here and propose\n
7631
07:21:07,830 --> 07:21:11,280
wouldn't it be nice if we\ncould have a person data type
7632
07:21:11,280 --> 07:21:13,830
and then we could have\nan array called people?
7633
07:21:13,830 --> 07:21:17,130
And maybe that array is our\nonly array with two things
7634
07:21:19,561 --> 07:21:22,501
But somehow, those data\ntypes, these persons
7635
07:21:22,501 --> 07:21:25,519
would have both a name and a\nnumber associated with them.
7636
07:21:25,519 --> 07:21:27,061
So we don't need two separate arrays.
7637
07:21:27,061 --> 07:21:31,601
We need one array of persons,\na brand new data type.
7638
07:21:33,251 --> 07:21:35,431
Well, if we want every\nperson in the world
7639
07:21:35,431 --> 07:21:37,681
or in this program to\nhave a name and a number
7640
07:21:37,681 --> 07:21:40,741
we literally write out\nfirst those two data types.
7641
07:21:40,741 --> 07:21:42,221
Give me a string called name.
7642
07:21:42,221 --> 07:21:45,211
Give me a string called\nnumber semicolon, after each.
7643
07:21:45,210 --> 07:21:48,390
And then we wrap that,\nthose two lines of code
7644
07:21:48,390 --> 07:21:51,091
with this syntax, which at first\nglance is a little cryptic.
7645
07:21:51,091 --> 07:21:52,771
It's a lot of words all of a sudden.
7646
07:21:52,771 --> 07:21:57,091
But typedef is a new keyword today\nthat defines a new data type.
7647
07:21:57,091 --> 07:22:00,870
This is the C keyword that\nlets you create your own data
7648
07:22:00,870 --> 07:22:02,460
type for the very first time.
7649
07:22:02,460 --> 07:22:07,201
Struct is another related keyword that\n
7650
07:22:07,201 --> 07:22:11,671
a simple data type, like an int or a\n
7651
07:22:13,300 --> 07:22:17,070
It's got some dimensions to it, like\n
7652
07:22:17,070 --> 07:22:19,620
or even 50 things inside of it.
7653
07:22:19,620 --> 07:22:23,670
The last word down here is the name\n
7654
07:22:23,670 --> 07:22:26,340
and it weirdly goes\nafter the curly braces.
7655
07:22:26,341 --> 07:22:30,120
But this is how you invent\na data type called person.
7656
07:22:30,120 --> 07:22:33,030
And what this code is\nimplying is that henceforth
7657
07:22:33,030 --> 07:22:38,341
the compiler clang will know that a\n
7658
07:22:38,341 --> 07:22:41,131
a string and a number that's a string.
7659
07:22:41,131 --> 07:22:44,131
And you don't have to worry\nabout having multiple arrays now.
7660
07:22:44,131 --> 07:22:48,311
You can just have an array\nof people moving forward.
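Written out, the syntax being described might look like this. Assumptions: CS50's library normally supplies the string type, so a one-line typedef stands in for it here (using const char * so string literals are safe), and make_person is a helper invented just for this sketch.

```c
// Stand-in for the string type that CS50's library normally provides.
typedef const char *string;

// typedef ... person; gives the structure a name that can be used like int or string.
typedef struct
{
    string name;
    string number;
}
person;

// Hypothetical helper: builds one person from a name and a number.
person make_person(string name, string number)
{
    person p;
    p.name = name;     // the dot goes inside the structure
    p.number = number;
    return p;
}
```

Note that the new type's name, person, comes after the closing curly brace, just before the semicolon.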
7661
07:22:48,311 --> 07:22:50,279
So how can we go about using this?
7662
07:22:50,278 --> 07:22:52,320
Well, let me go back to\nmy code from before where
7663
07:22:52,320 --> 07:22:53,701
I was implementing a phone book.
7664
07:22:53,701 --> 07:22:56,076
And why don't we enhance the\nphone book code a little bit
7665
07:22:56,076 --> 07:22:58,591
by borrowing some of that new syntax?
7666
07:22:58,591 --> 07:23:01,081
Let me go to the top of\nmy program above main
7667
07:23:01,080 --> 07:23:04,020
and define a type that's\na structure or a data
7668
07:23:04,021 --> 07:23:08,861
structure that has a name inside of\n
7669
07:23:08,861 --> 07:23:12,511
And the name of this new structure\n
7670
07:23:12,510 --> 07:23:17,911
Inside of my code now, let me go ahead\n
7671
07:23:17,911 --> 07:23:21,870
Let me give myself an array\ncalled people of size 2.
7672
07:23:21,870 --> 07:23:25,358
And I'm going to use the\nnon-terse way to do this.
7673
07:23:25,358 --> 07:23:26,940
I'm not going to use the curly braces.
7674
07:23:26,940 --> 07:23:31,501
I'm going to more pedantically spell out\n
7675
07:23:31,501 --> 07:23:35,221
at location 0, which is the\nfirst person in an array
7676
07:23:35,221 --> 07:23:37,051
because you always start counting at 0.
7677
07:23:37,050 --> 07:23:40,861
I'm going to give that person\na name of quote unquote Carter.
7678
07:23:40,861 --> 07:23:44,341
And the dot is admittedly one\nnew piece of syntax today too.
7679
07:23:44,341 --> 07:23:46,771
The dot means go inside\nof that structure
7680
07:23:46,771 --> 07:23:50,550
and access the variable called\n
7681
07:23:50,550 --> 07:23:52,710
Similarly, if I'm going\nto give Carter a number
7682
07:23:52,710 --> 07:23:57,390
I can go into people bracket 0 dot\n
7683
07:23:57,390 --> 07:24:02,310
as before plus 1-617-495-1000.
7684
07:24:02,311 --> 07:24:04,591
And then I can do the\nsame for myself here--
7685
07:24:04,591 --> 07:24:08,506
people bracket-- where should I go?
7686
07:24:08,506 --> 07:24:10,468
OK, one because again, two elements.
7687
07:24:10,468 --> 07:24:11,800
But we started counting at zero.
7688
07:24:11,800 --> 07:24:13,780
Dot name equals quote unquote David.
7689
07:24:13,780 --> 07:24:18,700
And then lastly, people bracket 1\n
7690
07:24:24,730 --> 07:24:27,611
So now if I scroll\ndown here to my logic
7691
07:24:27,611 --> 07:24:30,491
I don't think this part\nneeds to change too much.
7692
07:24:30,491 --> 07:24:35,041
I'm still, for the sake of discussion,\n
7693
07:24:35,041 --> 07:24:37,991
is 0 on up to but not through 2.
7694
07:24:37,991 --> 07:24:41,031
But I think this line\nof code needs to change.
7695
07:24:41,030 --> 07:24:49,271
How should I now refer to the\ni-th person's name as I iterate?
7696
07:24:49,271 --> 07:24:52,611
What should I compare quote\nunquote David to this time?
7697
07:24:54,411 --> 07:24:57,438
AUDIENCE: People bracket i dot name.
7698
07:24:57,438 --> 07:24:59,230
DAVID J. MALAN: Yeah, people\nbracket i dot name.
7699
07:24:59,530 --> 07:25:01,198
Because people is the name of the array.
7700
07:25:01,198 --> 07:25:04,841
Bracket i is the i-th person that we're\n
7701
07:25:04,841 --> 07:25:07,600
first zero, then one, maybe\nhigher if it had more people.
7702
07:25:07,600 --> 07:25:10,931
Then dot is our new syntax for\ngoing inside of a data structure
7703
07:25:10,931 --> 07:25:14,230
and accessing a variable therein\nwhich in this case is name.
7704
07:25:14,230 --> 07:25:16,640
And so I can compare\nDavid just as before.
7705
07:25:16,640 --> 07:25:21,251
So it's a little more verbose, but\n
7706
07:25:21,251 --> 07:25:26,831
because now these people are full\n
7707
07:25:26,830 --> 07:25:29,170
There's no more honor\nsystem inside of my loop
7708
07:25:29,170 --> 07:25:31,462
that this is going to line\nup because in just a moment
7709
07:25:31,462 --> 07:25:34,227
I'm going to fix this one last\nremnant of the previous version.
7710
07:25:34,227 --> 07:25:36,310
And if I can call back on\nyou again, what should I
7711
07:25:36,311 --> 07:25:39,191
change numbers bracket i to this time?
7712
07:25:39,190 --> 07:25:45,120
AUDIENCE: [INAUDIBLE] dot number.
7713
07:25:45,120 --> 07:25:46,751
DAVID J. MALAN: Dot number, exactly.
7714
07:25:46,751 --> 07:25:49,300
So gone is the honor\nsystem that just assumes
7715
07:25:49,300 --> 07:25:52,420
that bracket i in this array lines up\n
7716
07:25:54,791 --> 07:25:56,351
It's an array called people.
7717
07:25:56,350 --> 07:25:58,420
The things it stores are persons.
7718
07:25:58,420 --> 07:26:00,161
A person has a name and a number.
7719
07:26:00,161 --> 07:26:02,578
And so even though it's kind\nof marginal admittedly given
7720
07:26:02,578 --> 07:26:05,411
that this is a short program and\n
7721
07:26:05,411 --> 07:26:07,661
look more complicated\nat first glance, we're
7722
07:26:07,661 --> 07:26:10,901
now laying the foundation for just\n
7723
07:26:10,901 --> 07:26:13,541
can't screw up now the\nassociation of names
7724
07:26:13,541 --> 07:26:17,140
with numbers because every person's\n
7725
07:26:17,140 --> 07:26:21,070
encapsulated inside\nof the same data type.
7726
07:26:21,070 --> 07:26:22,631
And that's a term of art in CS.
7727
07:26:22,631 --> 07:26:26,081
Encapsulation means to\nencapsulate-- that is, contain--
7728
07:26:26,080 --> 07:26:28,190
related pieces of information.
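Here is roughly what the struct-based phone book could look like once both lines are changed. It is a sketch under assumptions: const char * stands in for CS50's string, the functions fill and find_number are hypothetical names that split up what the lecture does inside main, and the second number ends in a placeholder because its last digits are not in the transcript.

```c
#include <stddef.h>
#include <string.h>

typedef struct
{
    const char *name;
    const char *number;
}
person;

// Fills a two-element array one field at a time, the "non-terse" way.
void fill(person people[])
{
    people[0].name = "Carter";
    people[0].number = "+1-617-495-1000";

    people[1].name = "David";
    people[1].number = "+1-949-468-XXXX"; // placeholder: digits not in the transcript
}

// Linear search: compares people[i].name and, on a match,
// returns people[i].number from the very same element.
const char *find_number(const person people[], int n, const char *name)
{
    for (int i = 0; i < n; i++)
    {
        if (strcmp(people[i].name, name) == 0)
        {
            return people[i].number;
        }
    }
    return NULL;
}
```

Because the name and the number live inside the same people[i], there is no second array that could drift out of step with the first.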
7729
07:26:28,190 --> 07:26:34,030
And thus, we have a person that\n
7730
07:26:34,811 --> 07:26:36,671
And this just sets\nthe foundation for all
7731
07:26:36,670 --> 07:26:39,550
of the cool stuff we've talked\nabout and you use every day.
7732
07:26:40,353 --> 07:26:43,271
Well, recall that an image is a bunch\n
7733
07:26:43,271 --> 07:26:46,931
Every one of those dots\nhas RGB values associated
7734
07:26:46,931 --> 07:26:48,791
with it-- red, green, and blue.
7735
07:26:48,791 --> 07:26:52,121
You could imagine now creating\na structure in C probably where
7736
07:26:52,120 --> 07:26:55,900
maybe you have three values,\nthree variables-- one called red
7737
07:26:55,901 --> 07:26:57,761
one called green, one called blue.
7738
07:26:57,760 --> 07:27:00,341
And then you could name the\nthing not person but pixel.
7739
07:27:00,341 --> 07:27:04,271
And now you could store in C three\n
7740
07:27:04,271 --> 07:27:08,338
some green, some blue-- and collectively\n
7741
07:27:08,338 --> 07:27:11,381
And you could imagine doing something\n
7742
07:27:11,381 --> 07:27:14,800
Music, you might have three\n
7743
07:27:14,800 --> 07:27:17,021
the duration, the loudness of it.
7744
07:27:17,021 --> 07:27:20,480
And you can imagine coming up with\n
7745
07:27:20,480 --> 07:27:21,730
So this is a little low level.
7746
07:27:21,730 --> 07:27:24,280
We're just using like a\nfamiliar contacts application.
7747
07:27:24,280 --> 07:27:28,630
But we now have the way in code\nto express most any type of data
7748
07:27:28,631 --> 07:27:32,621
that we might want to implement\nor discuss ultimately.
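The pixel idea can be sketched the same way as person. Assumptions: each color is stored as one byte (0 through 255), and to_gray with its simple averaging formula is an invented example, not something from the lecture.

```c
#include <stdint.h>

// One dot in an image: some red, some green, some blue.
typedef struct
{
    uint8_t red;
    uint8_t green;
    uint8_t blue;
}
pixel;

// Hypothetical example of using the type: average the three
// channels and use that average for all three, giving a gray dot.
pixel to_gray(pixel p)
{
    uint8_t avg = (uint8_t) ((p.red + p.green + p.blue) / 3);
    pixel gray = {avg, avg, avg};
    return gray;
}
```

The three values travel together as one pixel, just as a name and a number travel together as one person.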
7749
07:27:32,620 --> 07:27:37,870
So any questions now on struct\nor defining our own types
7750
07:27:37,870 --> 07:27:42,470
the purposes for which are to use\n
7751
07:27:42,471 --> 07:27:45,640
now in a better design but\nalso to lay the foundation
7752
07:27:45,640 --> 07:27:50,240
for implementing cooler and cooler\n
7753
07:27:50,741 --> 07:27:52,074
AUDIENCE: What's the [INAUDIBLE]
7754
07:27:52,074 --> 07:27:55,074
DAVID J. MALAN: What's the difference\n
7755
07:27:55,911 --> 07:27:58,751
So slight side note, C\nis not object-oriented.
7756
07:27:58,751 --> 07:28:02,350
Languages like Java and C++ and\n
7757
07:28:02,350 --> 07:28:05,560
of, programmed yourself, had friends\n
7758
07:28:05,561 --> 07:28:09,791
languages in those languages they have\n
7759
07:28:10,811 --> 07:28:14,381
And objects can store not\njust data, like variables.
7760
07:28:14,381 --> 07:28:18,851
Objects can also store functions, and\n
7761
07:28:18,850 --> 07:28:20,740
But it's not sort of conventional.
7762
07:28:20,741 --> 07:28:24,131
In C, you have data\nstructures that store data.
7763
07:28:24,131 --> 07:28:29,140
In languages like Java and C++, you have\n
7764
07:28:29,771 --> 07:28:32,151
Python is an object-oriented\nlanguage as well.
7765
07:28:32,151 --> 07:28:35,631
So we'll see this issue in a few weeks,\n
7766
07:28:36,131 --> 07:28:38,116
AUDIENCE: Could you\nuse this [INAUDIBLE]?
7767
07:28:38,741 --> 07:28:41,381
Could you use this struct to\nredefine how an int is defined?
7768
07:28:42,611 --> 07:28:46,061
We talked a couple of times\nnow about integer overflow.
7769
07:28:46,061 --> 07:28:50,261
And most recently, you might have seen\n
7770
07:28:50,260 --> 07:28:52,841
that was literally related\nto an int overflow.
7771
07:28:52,841 --> 07:28:57,251
That's the result of ints only\nstoring 4 bytes or 32 bits
7772
07:28:57,251 --> 07:29:00,161
or even as long as 64 bits or 8 bytes.
7773
07:29:01,241 --> 07:29:03,881
But if you want to implement\nsome financial software
7774
07:29:03,881 --> 07:29:06,461
or some scientific or\nmathematical software that
7775
07:29:06,460 --> 07:29:10,330
allows you to count way bigger\nthan a typical int or a long
7776
07:29:10,330 --> 07:29:13,495
you could imagine John coming\nup with your own structure.
7777
07:29:13,495 --> 07:29:15,620
And in fact, in some\nlanguages there is a structure
7778
07:29:15,620 --> 07:29:19,730
called big int, which allows you\nto express even bigger numbers.
7779
07:29:20,330 --> 07:29:24,770
Well, maybe you store inside of\na big int an array of values.
7780
07:29:24,771 --> 07:29:27,852
And you somehow allow yourself\nto store more and more bits
7781
07:29:27,852 --> 07:29:29,810
based on how high you\nwant to be able to count.
7782
07:29:30,830 --> 07:29:34,200
We now have the ability to do\n
7783
07:29:34,201 --> 07:29:36,561
even if it's not built in for us.
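One possible shape for such a structure, purely as an illustration. Everything here is an assumption: the name bigint, the choice of 40 base-10 digits stored least significant first, and the increment function, which also uses a pointer and the -> operator that the lecture has not introduced at this point.

```c
#define DIGITS 40

// A made-up "big int": an array of base-10 digits,
// with digits[0] holding the ones place.
typedef struct
{
    int digits[DIGITS];
}
bigint;

// Adds 1 in place, carrying into higher digits as needed.
// Returns 0 on success, or 1 if all DIGITS digits overflowed.
int increment(bigint *n)
{
    for (int i = 0; i < DIGITS; i++)
    {
        if (n->digits[i] < 9)
        {
            n->digits[i]++;
            return 0;
        }
        n->digits[i] = 0; // 9 + 1 leaves 0 here and carries onward
    }
    return 1;
}
```

Counting higher is then a matter of making DIGITS bigger, rather than being stuck with 32 or 64 bits.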
7784
07:29:43,122 --> 07:29:45,830
DAVID J. MALAN: Could you define a name\n
7785
07:29:46,430 --> 07:29:48,347
It starts to get\nsyntactically a little messy
7786
07:29:48,347 --> 07:29:51,590
so I did it a little more\npedantic line by line.
7787
07:29:52,791 --> 07:29:57,271
AUDIENCE: [INAUDIBLE] function\nyou use for the function
7788
07:29:57,271 --> 07:29:59,701
at the bottom of the [INAUDIBLE].
7789
07:29:59,701 --> 07:30:03,389
Could you do something\nlike that [INAUDIBLE]?
7790
07:30:03,389 --> 07:30:05,759
DAVID J. MALAN: Prototypes--\nyou have to do in C. You
7791
07:30:05,759 --> 07:30:09,389
have to define anything you're going\n
7792
07:30:09,389 --> 07:30:11,139
to use before you actually use it.
7793
07:30:11,139 --> 07:30:15,190
So it is deliberate that I put it\n
7794
07:30:15,190 --> 07:30:19,230
Otherwise, the compiler would not know\n
7795
07:30:19,230 --> 07:30:22,111
use it here on what's line 14.
7796
07:30:22,111 --> 07:30:25,840
So it has to come first, or it has to\n
7797
07:30:25,840 --> 07:30:29,340
so that you include it at\nthe very top of your code.
7798
07:30:37,643 --> 07:30:39,350
DAVID J. MALAN: Yeah, good\nquestion, and we'll
7799
07:30:39,350 --> 07:30:42,860
come back to this later in the term when\n
7800
07:30:42,861 --> 07:30:44,991
and storing things in actual databases.
7801
07:30:44,991 --> 07:30:48,681
Generally speaking, even though we\n
7802
07:30:48,681 --> 07:30:52,161
or in the US, we have social security\n
7803
07:30:52,161 --> 07:30:56,420
often have other punctuation in it,\n
7804
07:30:57,741 --> 07:31:01,701
You could not store any of that syntax\n
7805
07:31:01,701 --> 07:31:03,241
You could only store numbers.
7806
07:31:03,241 --> 07:31:05,301
So one motivation for\nusing a string is just
7807
07:31:05,300 --> 07:31:08,931
I can store whatever the human wanted\n
7808
07:31:10,050 --> 07:31:13,789
Another reason for\nstoring things as strings
7809
07:31:13,789 --> 07:31:15,831
even if they look like\nnumbers, is in the context
7810
07:31:15,830 --> 07:31:17,448
of zip codes in the United States.
7811
07:31:17,449 --> 07:31:18,741
Again, we'll come back to this.
7812
07:31:18,741 --> 07:31:21,051
But long story short--\nyears ago, actually--
7813
07:31:21,050 --> 07:31:24,093
I was using Microsoft\nOutlook for my email client.
7814
07:31:24,093 --> 07:31:25,550
And eventually I switched to Gmail.
7815
07:31:25,550 --> 07:31:27,260
And this is like 10 plus years ago now.
7816
07:31:27,260 --> 07:31:31,790
And Outlook at the time let you export\n
7817
07:31:33,140 --> 07:31:34,971
More on that in the weeks to come too.
7818
07:31:34,971 --> 07:31:36,763
And that just means I\ncould download a text
7819
07:31:36,762 --> 07:31:40,070
file with all of my friends and\n
7820
07:31:40,070 --> 07:31:44,172
Unfortunately, I opened that same CSV\n
7821
07:31:44,172 --> 07:31:46,130
just to kind of spot\ncheck it and see if what's
7822
07:31:46,131 --> 07:31:47,761
in there was what was expected.
7823
07:31:47,760 --> 07:31:51,230
And I must have instinctively hit,\n
7824
07:31:51,230 --> 07:31:54,260
And Excel at least has this habit\n
7825
07:31:54,260 --> 07:31:56,990
If things look like numbers,\nit treats them as numbers.
7826
07:31:56,991 --> 07:31:58,401
And Apple Numbers does this too.
7827
07:31:58,401 --> 07:32:00,441
Google Spreadsheets\ndoes this too nowadays.
7828
07:32:00,440 --> 07:32:07,400
But long story short, I then imported\n
7829
07:32:07,401 --> 07:32:11,121
And now 10 plus years later, I'm still\n
7830
07:32:11,120 --> 07:32:17,001
members whose zip codes are in\nCambridge, Massachusetts 2138
7831
07:32:17,001 --> 07:32:20,661
which is missing the 0 because\nwe here in Cambridge are 02138.
7832
07:32:20,661 --> 07:32:23,780
And that's because I\ntreated or I let Excel
7833
07:32:23,780 --> 07:32:26,990
treat what looks like a number\nas an actual number or int
7834
07:32:26,991 --> 07:32:29,901
and now leading zeros become a\n
7835
07:32:29,901 --> 07:32:33,261
mean nothing, but in the\nmail system, they do--
7836
07:32:34,431 --> 07:32:36,013
All right, other final questions here.
7837
07:32:39,021 --> 07:32:42,861
DAVID J. MALAN: Yeah, so could I have\n
7838
07:32:42,861 --> 07:32:47,091
array to solve the problem\nearlier of having just one array?
7839
07:32:47,091 --> 07:32:51,111
Yes, but one, I would argue\nit's less readable, especially
7840
07:32:51,111 --> 07:32:53,001
as I get lots of names and numbers.
7841
07:32:53,001 --> 07:32:56,091
And two, that too is also kind\nof relying on the honor system.
7842
07:32:56,091 --> 07:32:59,300
It would be all too easy to omit some\n
7843
07:33:00,300 --> 07:33:04,370
So I would argue it too is not\nas good as introducing a struct.
7844
07:33:05,541 --> 07:33:10,431
Two dimensional arrays just means\n
7845
07:33:10,431 --> 07:33:12,440
All right, so now that\nwe have this ability
7846
07:33:12,440 --> 07:33:16,570
to store different types of data\nlike contacts in a phone book
7847
07:33:16,570 --> 07:33:18,320
having names and\naddresses, let's actually
7848
07:33:18,320 --> 07:33:21,140
take a step back and\nconsider how we might now
7849
07:33:21,140 --> 07:33:26,210
solve one of the original problems by\n
7850
07:33:26,210 --> 07:33:30,291
given in advance and considering,\n
7851
07:33:30,291 --> 07:33:33,261
costly, how time consuming is\nthat because that might tip
7852
07:33:33,260 --> 07:33:37,370
the scales in favor of sorting,\nthen searching, or maybe just
7853
07:33:37,370 --> 07:33:39,320
not sorting and only searching.
7854
07:33:39,320 --> 07:33:42,830
It'll give us a sense of just\nhow expensive, so to speak
7855
07:33:42,830 --> 07:33:44,705
sorting something actually is.
7856
07:33:44,705 --> 07:33:46,580
Well, what's the\nformulation of this problem?
7857
07:33:46,580 --> 07:33:48,140
It's the same thing as week zero.
7858
07:33:49,311 --> 07:33:51,480
We want it to be output as sorted.
7859
07:33:51,480 --> 07:33:54,920
So for instance, if we're\ntaking unsorted input as input
7860
07:33:54,920 --> 07:33:58,255
we want the sorted output as\nthe result. More concretely
7861
07:33:58,256 --> 07:33:59,631
if we've got numbers like these--
7862
07:33:59,631 --> 07:34:04,461
6 3 8 5 2 7 4 1, which are just\nrandomly arranged numbers--
7863
07:34:04,460 --> 07:34:08,870
we want to get back out 1 2 3 4 5 6 7 8.
7864
07:34:08,870 --> 07:34:10,800
So we just want those\nthings to be sorted.
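To make the input and output concrete before opening the black box, here is a sketch that leans on qsort from the C standard library as a stand-in for whatever algorithm ends up inside. The wrapper sort_ints and the helper compare_ints are names invented for this example.

```c
#include <stdlib.h>

// Comparison function for qsort: returns a negative value, 0, or a
// positive value, much like strcmp does for strings.
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

// Sorts n ints in place, smallest first, using the library's qsort.
void sort_ints(int values[], size_t n)
{
    qsort(values, n, sizeof(int), compare_ints);
}
```

Handing it 6 3 8 5 2 7 4 1 leaves the array as 1 2 3 4 5 6 7 8. It shows what sorted output means without yet saying how the sorting is done.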
7865
07:34:10,800 --> 07:34:12,830
So again, inside of\nthe black box here is
7866
07:34:12,830 --> 07:34:17,820
going to be one or more algorithms\n
7867
07:34:17,820 --> 07:34:20,041
So how might we go about doing this?
7868
07:34:20,041 --> 07:34:23,541
Well, just to vary things a bit\n
7869
07:34:23,541 --> 07:34:25,941
for a bit more audience participation.
7870
07:34:25,940 --> 07:34:28,150
But this time, we need\neight people if we may.
7871
07:34:28,151 --> 07:34:30,651
All of you have to be comfortable\nappearing on the internet.
7872
07:34:30,651 --> 07:34:33,526
OK, so this is actually quite\n
7873
07:34:33,526 --> 07:34:37,113
How about 1, 2, 3, 4, 5, 6, 7--
7874
07:34:37,113 --> 07:34:41,421
oh, OK, and someone volunteering\ntheir friend-- number eight.
7875
07:34:43,236 --> 07:34:45,111
And if you could, I'm\ngoing to set things up.
7876
07:34:45,111 --> 07:34:47,991
If you all could join Valerie,\nmy colleague over there
7877
07:34:47,991 --> 07:34:53,271
to give you a prop to use here,\nwe'll go ahead in just a moment
7878
07:34:53,271 --> 07:34:56,195
and try to find some numbers at hand.
7879
07:34:59,541 --> 07:35:05,181
In just a moment, each of our volunteers\n
7880
07:35:05,181 --> 07:35:09,741
And that integer is initially\ngoing to be in unsorted order.
7881
07:35:09,741 --> 07:35:13,461
And I claim that using an algorithm,\nstep by step instructions
7882
07:35:13,460 --> 07:35:18,181
we can probably sort these folks in\n
7883
07:35:18,181 --> 07:35:22,791
So they're in wardrobe right now just\n
7884
07:35:22,791 --> 07:35:27,306
with a jersey number on it, which will\n
7885
07:35:31,010 --> 07:35:35,630
Give us just a moment to finish\ngetting the attire ready.
7886
07:35:35,631 --> 07:35:40,401
They're being handed\na shirt and a number.
7887
07:35:40,401 --> 07:35:42,981
And let me ask the\naudience for just a moment.
7888
07:35:42,980 --> 07:35:47,120
As we have these numbers up here on the\n
7889
07:35:47,120 --> 07:35:48,713
They're just in random order.
7890
07:35:48,713 --> 07:35:49,881
And let me ask the audience.
7891
07:35:49,881 --> 07:35:55,341
How would you go about sorting\n
7892
07:35:55,341 --> 07:35:57,291
How would you go about sorting these?
7893
07:35:57,291 --> 07:35:58,498
Yeah, what are your thoughts?
7894
07:35:58,498 --> 07:36:04,687
AUDIENCE: [INAUDIBLE] the number\n
7895
07:36:05,271 --> 07:36:08,907
AUDIENCE: The following number is\n
7896
07:36:09,491 --> 07:36:11,161
AUDIENCE: If not, then [INAUDIBLE].
7897
07:36:11,161 --> 07:36:13,712
DAVID J. MALAN: OK, so just\nto recap, you would start
7898
07:36:13,712 --> 07:36:15,170
with one of the numbers on the end.
7899
07:36:15,170 --> 07:36:17,570
You would look to the number to\nthe right or to the left of it
7900
07:36:17,570 --> 07:36:18,890
depending on which end you start at.
7901
07:36:18,890 --> 07:36:21,473
And if it's out of order, you\nwould just start to swap things.
7902
07:36:22,710 --> 07:36:24,795
There's a whole bunch\nof mistakes to fix here
7903
07:36:24,795 --> 07:36:26,420
because things are pretty out of order.
7904
07:36:26,420 --> 07:36:29,420
But probably, if you start to\nsolve small problems at a time
7905
07:36:29,420 --> 07:36:32,270
you can achieve the end result of\n
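One way to read that first suggestion in code, as a sketch only: walk the array, compare each number with its neighbor, swap the pair if it is out of order, and repeat until nothing is left to fix. The function name swap_sort is made up; the approach is commonly known as bubble sort.

```c
// Repeatedly walks the array, swapping neighbors that are out of order.
// After each pass, the largest remaining value has moved to the end,
// so the next pass can stop one position earlier.
void swap_sort(int a[], int n)
{
    for (int pass = 0; pass < n - 1; pass++)
    {
        for (int i = 0; i < n - 1 - pass; i++)
        {
            if (a[i] > a[i + 1])
            {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}
```

Each swap fixes one small local problem, and enough of them add up to a fully sorted array.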
7906
07:36:32,271 --> 07:36:35,181
Other instincts, if you were\njust handed these numbers, how
7907
07:36:35,181 --> 07:36:38,438
you might go about sorting them?
7908
07:36:44,501 --> 07:36:46,280
DAVID J. MALAN: OK, I like that.
7909
07:36:46,280 --> 07:36:50,200
So to recap there, find the smallest\n
7910
07:36:51,431 --> 07:36:54,703
And then presumably, you could do\n
7911
07:36:54,703 --> 07:36:57,411
And that would seem to give you\n
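The second suggestion might be sketched like this: find the smallest number, put it at the front, then do the same for the rest. The function name is an assumption; the approach is commonly known as selection sort.

```c
// For each position i, finds the smallest value from i onward
// and swaps it into position i.
void selection_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        int smallest = i; // index of the smallest value seen so far
        for (int j = i + 1; j < n; j++)
        {
            if (a[j] < a[smallest])
            {
                smallest = j;
            }
        }
        int tmp = a[i];
        a[i] = a[smallest];
        a[smallest] = tmp;
    }
}
```

Unlike the neighbor-swapping idea, this one makes at most one swap per position, but it still has to look at every remaining number to find the smallest.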
7912
07:36:57,411 --> 07:37:00,070
And if you all are attired here--
7913
07:37:00,070 --> 07:37:03,161
do you want to come\non up if you're ready?
7914
07:37:03,161 --> 07:37:05,021
We had some [? felt ?] volunteers too.
7915
07:37:07,390 --> 07:37:09,881
So if you all would like\nto line yourselves up
7916
07:37:09,881 --> 07:37:12,131
facing the audience in\nexactly this order-- so
7917
07:37:12,131 --> 07:37:14,531
whoever is number zero\nshould be way over here
7918
07:37:14,530 --> 07:37:18,070
and whoever is number five\nshould be way over there.
7919
07:37:18,070 --> 07:37:21,280
Feel free to distance as much as you'd\n
7920
07:37:24,013 --> 07:37:25,181
And make a little more room.
7921
07:37:29,451 --> 07:37:31,094
DAVID J. MALAN: 4, hopefully 1.
7922
07:37:31,094 --> 07:37:32,261
Yeah, keep them to the side.
7923
07:37:38,050 --> 07:37:41,435
All right, so here, we have\nan array of eight numbers--
7924
07:37:41,436 --> 07:37:42,561
eight integers if you will.
7925
07:37:42,561 --> 07:37:45,131
And do you want to each say\na quick hello to the group?
7926
07:38:05,080 --> 07:38:07,265
AUDIENCE: Hi, I'm\nCeleste, and go Strauss.
7927
07:38:08,140 --> 07:38:11,291
Well, welcome all to the stage,\nand let's just visualize
7928
07:38:11,291 --> 07:38:13,791
perhaps organically, how you\neight would solve this problem.
7929
07:38:13,791 --> 07:38:16,901
So we currently have the numbers\n0 through 7 quite out of order.
7930
07:38:16,901 --> 07:38:20,869
Could you go ahead and just sort\nyourselves from 0 through 7?
7931
07:38:25,760 --> 07:38:28,560
DAVID J. MALAN: OK, so what did they just do?
7932
07:38:29,061 --> 07:38:30,621
First of all, yes, very well done.
7933
07:38:34,390 --> 07:38:37,398
How would you describe\nwhat they just did?
7934
07:38:38,230 --> 07:38:40,420
Could you go back into\nthat order on the screen--
7935
07:38:44,681 --> 07:38:47,800
And could you do exactly\nwhat you just did again?
7936
07:38:58,550 --> 07:39:01,841
All right, so admittedly, there's kind\n
7937
07:39:01,841 --> 07:39:05,510
except number four, are doing something\n
7938
07:39:05,510 --> 07:39:07,751
And that's not really how\na computer typically works.
7939
07:39:07,751 --> 07:39:11,260
Just like a computer can only look at\n
7940
07:39:11,260 --> 07:39:15,850
at a time, so can a computer only move\n
7941
07:39:15,850 --> 07:39:18,077
a locker, checking what's\nthere, moving it as needed.
7942
07:39:18,078 --> 07:39:21,161
So let's try this more methodically\n
7943
07:39:21,161 --> 07:39:26,710
If you all could randomize\nyourselves again to 5, 2, 7, 4, 1, 6, 3, 0
7944
07:39:26,710 --> 07:39:29,122
let's take the second of\nthose approaches first.
7945
07:39:29,122 --> 07:39:30,580
I'm going to look at these numbers.
7946
07:39:30,580 --> 07:39:33,230
And even though I as the human\ncan obviously see all the numbers
7947
07:39:33,230 --> 07:39:35,291
and I just kind of have the\nintuition for how to fix this
7948
07:39:35,291 --> 07:39:37,541
we got to be more methodical\nbecause eventually, we've
7949
07:39:37,541 --> 07:39:39,861
got to translate this to\npseudo code and then code.
7950
07:39:40,751 --> 07:39:43,561
I'm going to search for, as you\nproposed, the smallest number.
7951
07:39:43,561 --> 07:39:45,311
And I'm going to start\nfrom left to right.
7952
07:39:45,311 --> 07:39:48,461
I could do it right to left, but left\n
7953
07:39:48,460 --> 07:39:51,440
All right, 5 at this moment is\nthe smallest number I've seen.
7954
07:39:51,440 --> 07:39:54,179
So I'm going to remember that\nin a variable, if you will.
7955
07:39:54,179 --> 07:39:55,721
Now I'm going to take one more step--
7956
07:39:56,291 --> 07:39:59,621
OK, 2 I'm going to compare to the\n
7957
07:39:59,620 --> 07:40:03,521
I'm going to forget about 5 and only\n
7958
07:40:04,181 --> 07:40:07,451
7, nope-- I'm going to ignore that\n
7959
07:40:08,170 --> 07:40:11,530
4, 1-- OK, I'm going to\nupdate the variable in my mind
7960
07:40:11,530 --> 07:40:12,790
because that's indeed smaller.
7961
07:40:12,791 --> 07:40:15,491
Now obviously, we the humans\nknow that's getting pretty small.
7962
07:40:16,541 --> 07:40:19,991
I have to check all values to see\n
7963
07:40:19,991 --> 07:40:22,796
because 6 is not, 3 is not, but 0 is.
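The left-to-right search just described, where only one "smallest so far" value is remembered, can be sketched as follows. This is a minimal illustration rather than code from the lecture; the function name `index_of_smallest` and the variable `lineup` are assumptions chosen for the example.

```python
def index_of_smallest(numbers):
    """Return the index of the smallest value, scanning left to right."""
    smallest = 0  # index of the smallest value seen so far (the one "variable")
    for i in range(1, len(numbers)):
        # Forget the old candidate only when a strictly smaller value appears.
        if numbers[i] < numbers[smallest]:
            smallest = i
    return smallest


# The volunteers' starting order from the demonstration.
lineup = [5, 2, 7, 4, 1, 6, 3, 0]
print(index_of_smallest(lineup))  # 0 is at the far right, index 7
```

Every element has to be visited before the answer is known, which is why the scan cannot stop early at 1.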
7964
07:40:22,795 --> 07:40:23,920
And what's your name again?
7965
07:40:25,631 --> 07:40:32,320
Where should Celeste or number 0 go\n
7966
07:40:32,320 --> 07:40:33,951
All right, I'm seeing a lot of this.
7967
07:40:33,951 --> 07:40:37,181
So at the beginning of the array,\nso before doing this for real
7968
07:40:37,181 --> 07:40:38,921
let's have you pop out in front.
7969
07:40:38,920 --> 07:40:42,460
And could you all shift\nand make room for Celeste?
7970
07:40:42,460 --> 07:40:46,390
Is this a good idea to have all\nof them move or equivalently
7971
07:40:46,390 --> 07:40:48,911
move everything in the array\nto make room for Celeste
7972
07:40:52,030 --> 07:40:53,238
That felt like a lot of work.
7973
07:40:53,239 --> 07:40:56,561
And even though it happened pretty\n
7974
07:40:56,561 --> 07:40:58,401
to happen just to move her in place.
7975
07:40:58,401 --> 07:41:00,941
So what would be marginally\nsmarter perhaps--
7976
07:41:00,940 --> 07:41:03,320
a little more efficient, perhaps?
7977
07:41:07,414 --> 07:41:08,831
DAVID J. MALAN: OK, replace two values.
7978
07:41:08,830 --> 07:41:12,450
So if you want to go back to where\n
7979
07:41:12,451 --> 07:41:13,921
he's not in the right place.
7980
07:41:13,920 --> 07:41:15,150
He's got to move eventually.
7981
07:41:15,859 --> 07:41:18,661
If that's where Celeste belongs,\nwhy don't we just swap 5 and 0?
7982
07:41:18,661 --> 07:41:21,286
So if you want to go ahead and\nexchange places with each other.
7983
07:41:21,286 --> 07:41:22,591
Notice what's just happened.
7984
07:41:22,591 --> 07:41:25,780
The problem I'm trying to\nsolve has gotten smaller.
7985
07:41:25,780 --> 07:41:28,380
Instead of being size\n8, now it's size 7.
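To make the "shift everyone" versus "swap two people" trade-off concrete, here is a rough sketch that counts how many array positions get written in each case. The helper names `move_by_shifting` and `move_by_swapping` are hypothetical, and counting writes is just one simple way to compare the two ideas.

```python
def move_by_shifting(numbers, src, dst):
    """Move numbers[src] to index dst (dst <= src) by shifting the values in between."""
    value = numbers[src]
    writes = 0
    for k in range(src, dst, -1):
        numbers[k] = numbers[k - 1]  # each neighbor steps one place to the right
        writes += 1
    numbers[dst] = value
    return writes + 1  # include the final write of the moved value


def move_by_swapping(numbers, src, dst):
    """Exchange the two values in place; always two writes."""
    numbers[src], numbers[dst] = numbers[dst], numbers[src]
    return 2


shifted = [5, 2, 7, 4, 1, 6, 3, 0]
swapped = [5, 2, 7, 4, 1, 6, 3, 0]
print(move_by_shifting(shifted, 7, 0), shifted)
print(move_by_swapping(swapped, 7, 0), swapped)
```

Both versions put 0 at the front; the swap simply leaves 5 in another not-yet-sorted position instead of moving everyone.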
7986
07:41:28,381 --> 07:41:31,411
Now granted, I moved 5 to\nanother wrong location.
7987
07:41:31,411 --> 07:41:33,300
But if these numbers\nstarted off randomly
7988
07:41:33,300 --> 07:41:37,150
it doesn't really matter where 5 goes\n
7989
07:41:38,401 --> 07:41:41,791
And now if I go back, my loop\nis sort of coming back around.
7990
07:41:41,791 --> 07:41:46,050
I can ignore Celeste and make this\n
7991
07:41:46,050 --> 07:41:47,791
because I know she's in the right place.
7992
07:41:53,431 --> 07:41:57,960
Now I know as a human this\nshould be my next smallest.
7993
07:41:57,960 --> 07:42:02,460
But why, intuitively, should\nI keep going, do you think?
7994
07:42:02,460 --> 07:42:05,310
I can't sort of optimize as a\nhuman and just say, number 1
7995
07:42:05,311 --> 07:42:07,121
let's get you into the right place.
7996
07:42:07,120 --> 07:42:08,970
I still want to check the whole array.
7997
07:42:10,230 --> 07:42:12,837
AUDIENCE: Perhaps there's another 1.
7998
07:42:12,837 --> 07:42:14,670
DAVID J. MALAN: Maybe there's\nanother 1, and that
7999
07:42:14,670 --> 07:42:16,390
could be another problem altogether.
8000
07:42:17,631 --> 07:42:18,818
AUDIENCE: Could be another 0
8001
07:42:18,818 --> 07:42:20,611
DAVID J. MALAN: There could\nbe another 0 indeed
8002
07:42:20,611 --> 07:42:22,921
but I did go through\nthe list once, right?
8003
07:42:22,920 --> 07:42:24,330
And I kind of know there isn't.
8004
07:42:24,931 --> 07:42:27,661
AUDIENCE: You don't know that\nevery value is represented.
8005
07:42:27,661 --> 07:42:31,747
So maybe there's a [INAUDIBLE] You\n
8006
07:42:32,580 --> 07:42:34,960
DAVID J. MALAN: Yeah, I don't\nnecessarily know what is there.
8007
07:42:34,960 --> 07:42:39,300
And honestly, I only stipulated earlier\n
8008
07:42:39,300 --> 07:42:42,541
I could use two and remember the\n
8009
07:42:42,541 --> 07:42:44,491
I could use three variables, four.
8010
07:42:44,491 --> 07:42:47,801
But then I'm going to start to use\n
8011
07:42:47,800 --> 07:42:51,240
So if I've stipulated that I only have\n
8012
07:42:51,241 --> 07:42:53,551
I don't know anything\nmore about these elements
8013
07:42:53,550 --> 07:42:55,758
because the only thing I'm\nremembering at this moment
8014
07:42:55,758 --> 07:42:57,850
is number 1 is the\nsmallest element I've seen.
8015
07:43:01,070 --> 07:43:03,030
OK, I know that number\n1, and your name was--
8016
07:43:03,739 --> 07:43:06,211
DAVID J. MALAN: --Hannah is\nthe next smallest element.
8017
07:43:06,210 --> 07:43:08,701
I could have everyone move\nover to make room, but nope.
8018
07:43:09,420 --> 07:43:11,340
You know, even though you're\nso close to where I want you
8019
07:43:11,341 --> 07:43:13,530
I'm just going to keep it\nsimple and swap you two.
8020
07:43:13,530 --> 07:43:15,931
So granted, I've made the\nproblem a little worse.
8021
07:43:15,931 --> 07:43:19,561
But on average, I could get\nlucky too and just pop number 2
8022
07:43:20,640 --> 07:43:22,570
Now let me just accelerate this.
8023
07:43:22,570 --> 07:43:26,911
I can now ignore Hannah and Celeste,\n
8024
07:43:34,830 --> 07:43:37,920
So let's go ahead and swap 2 and 7.
8025
07:43:37,920 --> 07:43:40,590
And now I'll just kind of\norchestrate it verbally.
8026
07:43:40,591 --> 07:43:42,311
4, you're about to have to do something.
8027
07:43:46,111 --> 07:43:48,841
OK, 3-- could you swap with 4?
8028
07:43:48,841 --> 07:43:52,036
All right, now we have 7, 6, 4, 5.
8029
07:43:52,036 --> 07:43:54,780
OK, 4, could you swap with 7?
8030
07:44:02,521 --> 07:44:04,050
And now perhaps round of applause.
8031
07:44:05,341 --> 07:44:09,611
OK, hang on there one minute.
8032
07:44:09,611 --> 07:44:11,491
So we'll do this one other approach.
8033
07:44:11,491 --> 07:44:14,581
And my God, that felt so much\nslower than the first approach
8034
07:44:14,580 --> 07:44:17,400
but that's, one, because I was\n
8035
07:44:17,401 --> 07:44:22,051
But two, we were doing one thing at a\n
8036
07:44:22,050 --> 07:44:25,470
had the luxury of moving\nlike eight different CPUs--
8037
07:44:25,471 --> 07:44:28,051
brains, if you will-- were all\noperating at the same time.
8038
07:44:28,050 --> 07:44:29,501
And computers like that exist.
8039
07:44:29,501 --> 07:44:32,254
If you have a computer with\nmultiple cores, so to speak
8040
07:44:32,254 --> 07:44:34,171
that's like having a\ncomputer that technically
8041
07:44:34,170 --> 07:44:35,850
can do multiple things at once.
8042
07:44:35,850 --> 07:44:38,830
But software typically, at least\nas we've written it thus far
8043
07:44:38,830 --> 07:44:40,555
can only do one thing at a time.
8044
07:44:40,556 --> 07:44:42,431
So in a bit, we'll add\nup all of these steps.
8045
07:44:42,431 --> 07:44:44,223
But for now, let's take\none other approach.
8046
07:44:44,223 --> 07:44:46,501
If you all could reorder\nyourselves like that--
8047
07:44:46,501 --> 07:44:51,300
5, 2, 7, 4, 1, 6, 3, 0-- let's take\nthe other approach that
8048
07:44:51,300 --> 07:44:54,970
was recommended by just fixing small\n
8049
07:44:54,971 --> 07:44:57,091
So we're back in the original order.
8050
07:44:57,091 --> 07:44:59,172
5 and 2 are clearly out of order.
8051
07:44:59,881 --> 07:45:01,711
Let's just bite this problem off now.
8052
07:45:03,451 --> 07:45:04,890
Now let me take a next step.
8053
07:45:06,690 --> 07:45:09,360
There's a gap, yes, but that\nmight not be a big deal.
8054
07:45:12,780 --> 07:45:15,510
OK, 7 and 1, let's have you swap.
8055
07:45:15,510 --> 07:45:18,480
7 and 6, let's have you swap.
8056
07:45:22,050 --> 07:45:23,850
Now let me pause for just a moment.
8057
07:45:26,830 --> 07:45:29,700
But have I improved the problem?
8058
07:45:29,701 --> 07:45:32,460
Right, I can't see-- like\nbefore, I can't optimize like
8059
07:45:32,460 --> 07:45:34,570
before because 0 is obviously not here.
8060
07:45:34,570 --> 07:45:38,521
So unless they're still way back there,\n
8061
07:45:40,661 --> 07:45:42,407
But have I made any improvements?
8062
07:45:43,616 --> 07:45:45,841
In what sense is this improved?
8063
07:45:45,841 --> 07:45:50,411
What's a concrete thing you\ncould point to that is better?
8064
07:45:50,911 --> 07:45:52,471
AUDIENCE: Sorted the highest number.
8065
07:45:52,471 --> 07:45:55,051
DAVID J. MALAN: I've sorted the\n
8066
07:45:55,050 --> 07:45:59,760
And conversely, if you prefer, Celeste\n
8067
07:45:59,760 --> 07:46:04,330
Now worst case, Celeste is going to\n
8068
07:46:04,330 --> 07:46:06,900
So I might need to do this\nthing like n total times
8069
07:46:06,901 --> 07:46:08,576
to move her all the way over.
8070
07:46:08,576 --> 07:46:09,701
But that might work out OK.
8071
07:46:23,741 --> 07:46:25,871
notice that the high\nvalues, as you noted
8072
07:46:25,870 --> 07:46:28,510
are sort of bubbling up, if you\nwill, to the end of the list.
8073
07:46:36,140 --> 07:46:37,730
5, 6, 7, of course, are good.
8074
07:46:37,730 --> 07:46:40,536
So now you can sort of see\nthe problem resolving itself.
8075
07:46:40,536 --> 07:46:42,161
And let's just do this part now faster.
8076
07:46:48,521 --> 07:46:53,238
All right, now 1 and 2,\n2, and 3, and 0, and good.
8077
07:46:53,238 --> 07:46:54,820
So we do have some optimization there.
8078
07:46:54,820 --> 07:46:57,221
We don't need to keep going\nbecause those all are sorted.
8079
07:47:01,091 --> 07:47:04,690
1 and 0-- and big round\nof applause in closing.
8080
07:47:09,251 --> 07:47:11,501
We need the puppets back,\nbut you can keep the shirts.
8081
07:47:11,501 --> 07:47:13,201
Thank you for volunteering here.
8082
07:47:13,201 --> 07:47:16,171
Feel free to make your\nway exits left or right.
8083
07:47:16,170 --> 07:47:18,120
And let's see if,\nthanks to our volunteers
8084
07:47:18,120 --> 07:47:24,361
here, we can't now formalize a little\n
8085
07:47:24,361 --> 07:47:28,530
I claim that the first algorithm\nour volunteers kindly acted out
8086
07:47:28,530 --> 07:47:30,091
is what's called selection sort.
8087
07:47:30,091 --> 07:47:35,341
And as the name implied, we selected\n
8088
07:47:35,341 --> 07:47:37,890
and again, working our\nway from left to right
8089
07:47:37,890 --> 07:47:42,460
putting Celeste into the right place,\n
8090
07:47:42,460 --> 07:47:45,420
So selection sort, as\nit's formally called
8091
07:47:45,420 --> 07:47:48,360
can be described, for instance,\nwith this pseudo code here--
8092
07:47:52,710 --> 07:47:54,780
This is just how we talk about arrays.
8093
07:47:54,780 --> 07:47:59,280
The left end is 0, the right end\n
8094
07:47:59,280 --> 07:48:01,080
n happened to be eight people.
8095
07:48:03,061 --> 07:48:06,601
So for i from 0 to n\nminus 1, what did I do?
8096
07:48:06,600 --> 07:48:11,760
I found the smallest number between\n
8097
07:48:13,170 --> 07:48:15,390
It's a little cryptic at\nfirst glance, but this
8098
07:48:15,390 --> 07:48:18,870
is just a very pseudo\ncode-like way of saying
8099
07:48:18,870 --> 07:48:22,320
find the smallest element\namong all eight volunteers
8100
07:48:22,320 --> 07:48:27,480
because if i starts at 0 and n minus\n
8101
07:48:27,480 --> 07:48:31,591
8, 8 people, so 8 minus\n1 is 7, this first
8102
07:48:31,591 --> 07:48:34,681
says find the smallest number\nbetween numbers bracket 0
8103
07:48:34,681 --> 07:48:37,710
and numbers bracket 7, if you will.
8104
07:48:38,911 --> 07:48:42,091
Swap the smallest number\nwith numbers bracket i.
8105
07:48:42,091 --> 07:48:45,570
So that's how we got Celeste from\n
8106
07:48:45,570 --> 07:48:47,701
We just swapped those two values.
8107
07:48:47,701 --> 07:48:50,201
What then happens next\nin this pseudo code?
8108
07:48:50,201 --> 07:48:52,261
i, of course, goes from 0 to 1.
8109
07:48:52,260 --> 07:48:54,420
And that's the technical\nway of saying now
8110
07:48:54,420 --> 07:48:58,170
find the smallest element among\nthe 7 remaining volunteers
8111
07:48:58,170 --> 07:49:01,930
ignoring Celeste this time because she\n
8112
07:49:01,931 --> 07:49:04,320
So the problem went\nfrom size 8 to size 7.
8113
07:49:04,320 --> 07:49:07,861
And if we repeat, size 6,\n5, 4, 3, 2, 1, until boom
8114
07:49:07,861 --> 07:49:10,181
it's all done at the very end.
8115
07:49:10,181 --> 07:49:13,561
So this is just one way of\nexpressing in pseudo code what
8116
07:49:13,561 --> 07:49:17,401
we did a little more organically\n
8117
07:49:17,401 --> 07:49:19,781
volunteered out in the audience.
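The selection sort pseudo code walked through above might translate into Python roughly like this. It is a sketch under the assumption that the list is sorted in place; the name `selection_sort` and the inner variable names are illustrative, not taken from the lecture slides.

```python
def selection_sort(numbers):
    """Sort the list in place by repeatedly selecting the smallest remaining value."""
    n = len(numbers)
    for i in range(n):  # i goes from 0 to n - 1
        # Find the smallest value between numbers[i] and numbers[n - 1].
        smallest = i
        for j in range(i + 1, n):
            if numbers[j] < numbers[smallest]:
                smallest = j
        # Swap it into position i; everything left of i is already in place.
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return numbers


print(selection_sort([5, 2, 7, 4, 1, 6, 3, 0]))
```

Each pass of the outer loop shrinks the unsorted part by one, matching the problem going from size 8 to 7 to 6 and so on.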
8118
07:49:19,780 --> 07:49:24,661
So if we consider, then, the\nefficiency of this algorithm
8119
07:49:24,661 --> 07:49:27,091
maybe abstracting it away\nnow as a bunch of doors
8120
07:49:27,091 --> 07:49:31,320
where the left most again is always\n
8121
07:49:31,320 --> 07:49:34,710
or equivalently, the second to last\n
8122
07:49:34,710 --> 07:49:38,911
is n minus 3 where n might\nbe 8 or anything else
8123
07:49:38,911 --> 07:49:43,980
how do we think about or quantify\n
8124
07:49:46,591 --> 07:49:49,591
I mean, that was a lot\nof steps to be adding up.
8125
07:49:49,591 --> 07:49:53,491
It's probably more than n, right,\n
8126
07:49:54,390 --> 07:49:59,100
It was like n plus n\nminus 1 plus n minus 2.
8127
07:50:01,440 --> 07:50:04,980
We got like the whole\nteam in the orchestra now.
8128
07:50:04,980 --> 07:50:09,541
Let me propose we think about it this\n
8129
07:50:09,541 --> 07:50:13,710
So the first time, I had to\nlook at n different volunteers.
8130
07:50:13,710 --> 07:50:17,490
n was 8 in this case, but generically,\n
8131
07:50:17,491 --> 07:50:19,591
in order to decide who was the smallest.
8132
07:50:19,591 --> 07:50:21,631
And sure enough, Celeste\nwas at the very end.
8133
07:50:21,631 --> 07:50:23,463
She happened to be all\nthe way to the right.
8134
07:50:23,463 --> 07:50:27,870
But I only knew that once I looked\nat all 8 or all n volunteers.
8135
07:50:27,870 --> 07:50:30,240
So that took me n steps first.
8136
07:50:30,241 --> 07:50:33,631
But once Celeste was swapped\ninto the right place, then
8137
07:50:33,631 --> 07:50:37,651
my problem was size n minus 1,\nand I had n minus 1 other people
8138
07:50:40,230 --> 07:50:44,186
Then after that, it's n minus 2 plus\n
8139
07:50:44,186 --> 07:50:45,631
dot until I had one final step.
8140
07:50:45,631 --> 07:50:48,820
And it's obvious that I only\nhave one human left to consider.
8141
07:50:48,820 --> 07:50:51,541
So we might wave our hands at\nthis with a little ellipsis
8142
07:50:51,541 --> 07:50:54,761
and just say dot dot dot\nplus 1 for the final step.
8143
07:50:54,760 --> 07:50:56,251
Now what does this actually equal?
8144
07:50:56,251 --> 07:50:57,841
Well, this is where you\nmight think back on, like
8145
07:50:57,841 --> 07:50:59,760
your high school math\nor physics textbook that
8146
07:50:59,760 --> 07:51:03,001
has a little cheat sheet at the end\n
8147
07:51:03,001 --> 07:51:05,850
That happens to work\nout mathematically to be
8148
07:51:05,850 --> 07:51:09,480
n times n plus 1 all divided by 2.
8149
07:51:09,480 --> 07:51:13,050
That's just what that recurrence,\n
8150
07:51:13,050 --> 07:51:15,661
So if you take on faith that\nthat math is correct, let's
8151
07:51:15,661 --> 07:51:19,890
just now multiply this\nout mathematically.
8152
07:51:19,890 --> 07:51:26,280
That's n squared plus n divided by 2 or\n
8153
07:51:26,280 --> 07:51:29,251
And here's where we're starting\n
8154
07:51:29,251 --> 07:51:34,440
Like, honestly, as n gets really\n
8155
07:51:34,440 --> 07:51:39,300
or a billion web pages in Google search\n
8156
07:51:39,300 --> 07:51:41,760
is going to matter the\nmost mathematically
8157
07:51:43,561 --> 07:51:46,201
Is n squared divided by\n2 the dominant factor
8158
07:51:46,201 --> 07:51:48,620
or is n divided by 2\nthe dominant factor?
8159
07:51:50,355 --> 07:51:51,480
DAVID J. MALAN: Yeah, n squared.
8160
07:51:51,480 --> 07:51:53,650
I mean, no matter what n\nis-- and the bigger it is
8161
07:51:53,651 --> 07:51:56,941
the bigger raising it to\nthe power 2 is going to be.
8162
07:51:57,690 --> 07:52:00,630
Let's just wave our hands at this\nbecause at the end of the day
8163
07:52:00,631 --> 07:52:04,140
as n gets really large, the dominant\n
8164
07:52:04,890 --> 07:52:08,650
Even the divided by 2, as I claimed earlier\n
8165
07:52:08,651 --> 07:52:11,320
the two straight lines if you\nkeep zooming out essentially
8166
07:52:11,320 --> 07:52:16,120
looked the same when n is large enough,\n
8167
07:52:16,850 --> 07:52:21,490
So that is to say a computer scientist\n
8168
07:52:21,491 --> 07:52:24,221
on the order of n squared steps.
8169
07:52:24,221 --> 07:52:25,871
That's an oversimplification.
8170
07:52:25,870 --> 07:52:28,960
If we really added it up, it's\nactually this many steps-- n
8171
07:52:28,960 --> 07:52:30,970
squared divided by 2 plus n over 2.
8172
07:52:30,971 --> 07:52:34,781
But again, if we want to just be able\n
8173
07:52:34,780 --> 07:52:38,170
performance, I think it's going to\n
8174
07:52:38,170 --> 07:52:44,080
order term to get a sense of what the\n
8175
07:52:44,080 --> 07:52:46,900
or what it even looks like graphically.
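Written out, the step count described above can be summarized as a short derivation. The worked value for n = 8 is added here only as an illustration.

```latex
\begin{align*}
n + (n-1) + (n-2) + \dots + 1 &= \frac{n(n+1)}{2} \\
                              &= \frac{n^2}{2} + \frac{n}{2} \\
                              &\in O(n^2)
\end{align*}
% Example with the eight volunteers: 8 + 7 + \dots + 1 = \frac{8 \cdot 9}{2} = 36
```

As n grows, the n squared over 2 term dwarfs n over 2, and the constant factor of one half is dropped as well, leaving n squared.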
8176
07:52:46,901 --> 07:52:50,531
All right, so with that said,\nwe might describe bubble sort
8177
07:52:52,151 --> 07:52:56,061
sorry, selection sort as\nbeing in big O of n squared.
8178
07:52:56,061 --> 07:53:01,391
But what if we consider now the\n
8179
07:53:01,390 --> 07:53:03,431
to talk about a lower bound?
8180
07:53:03,431 --> 07:53:07,666
In the best case, how many\nsteps does selection sort take?
8181
07:53:07,666 --> 07:53:09,041
Well, here, we need some context.
8182
07:53:09,041 --> 07:53:11,681
Like, what does it mean to be\nthe best case or the worst case
8183
07:53:13,451 --> 07:53:16,960
Like, what could you imagine meaning\n
8184
07:53:16,960 --> 07:53:20,011
trying to sort a bunch of numbers?
8185
07:53:20,011 --> 07:53:21,460
I got the whole crew here again.
8186
07:53:21,760 --> 07:53:23,411
AUDIENCE: They would already be sorted.
8187
07:53:23,411 --> 07:53:24,730
DAVID J. MALAN: All right, they're\nalready sorted, right?
8188
07:53:24,730 --> 07:53:28,390
I can't really imagine a better scenario\n
8189
07:53:28,390 --> 07:53:30,400
but they're already sorted for me.
8190
07:53:30,401 --> 07:53:35,531
But does this algorithm\nleverage that fact in practice?
8191
07:53:35,530 --> 07:53:38,470
Even if all of our humans\nhad lined up from 0 to 7
8192
07:53:38,471 --> 07:53:41,291
I'm pretty sure I would have\npretty naively started here.
8193
07:53:41,291 --> 07:53:43,031
And yes, Celeste happens to be here.
8194
07:53:43,030 --> 07:53:47,736
But I only know she needs to be here\n
8195
07:53:47,736 --> 07:53:50,361
And then I would have realized,\nwell, that was a waste of time.
8196
07:53:51,640 --> 07:53:53,861
But then what would I have done?
8197
07:53:53,861 --> 07:53:57,221
I would have ignored her position\n
8198
07:53:57,221 --> 07:54:00,621
I would have done the same thing now\n
8199
07:54:00,620 --> 07:54:03,681
So every time I walk through,\nI'm not doing much useful work.
8200
07:54:03,681 --> 07:54:06,221
But I am doing those\ncomparisons because I
8201
07:54:06,221 --> 07:54:09,861
don't know until I do the work that\n
8202
07:54:09,861 --> 07:54:14,800
So this would seem to imply that\n
8203
07:54:14,800 --> 07:54:18,100
scenario, even, a lower bound on the\n
8204
07:54:20,201 --> 07:54:21,831
DAVID J. MALAN: A little louder?
8205
07:54:22,771 --> 07:54:24,611
DAVID J. MALAN: It's still\ngoing to be n squared
8206
07:54:24,611 --> 07:54:30,181
in fact, because the code I'm giving\n
8207
07:54:30,181 --> 07:54:34,620
from any of that scenario because\nit just mindlessly continues
8208
07:54:36,131 --> 07:54:41,371
So in this case, yes, I would claim that\n
8209
07:54:43,190 --> 07:54:44,940
So those are the kinds\nof numbers to beat.
8210
07:54:44,940 --> 07:54:48,120
It seems like the upper bound\nand lower bound of selection
8211
07:54:50,491 --> 07:54:52,741
And so we can also describe\nselection sort, therefore
8212
07:54:52,741 --> 07:54:54,078
as being in theta of n squared.
8213
07:54:54,078 --> 07:54:56,911
That's the first algorithm we've\n
8214
07:54:56,911 --> 07:54:59,010
which is to say that it's kind of slow.
8215
07:54:59,010 --> 07:55:00,955
I mean, maybe other\nalgorithms are slower
8216
07:55:00,955 --> 07:55:02,580
but this isn't the best starting point.
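One way to check the claim that selection sort does the same amount of work even on sorted input is to count comparisons. The sketch below uses a hypothetical helper, `selection_sort_comparisons`, and counts pairwise comparisons, so the totals are n(n - 1)/2 rather than the lecture's slightly looser n(n + 1)/2 tally; both grow like n squared.

```python
def selection_sort_comparisons(numbers):
    """Run selection sort on a copy and return how many comparisons were made."""
    numbers = list(numbers)  # work on a copy so the caller's list is untouched
    n = len(numbers)
    comparisons = 0
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1  # one comparison per remaining element, sorted or not
            if numbers[j] < numbers[smallest]:
                smallest = j
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return comparisons


print(selection_sort_comparisons([5, 2, 7, 4, 1, 6, 3, 0]))  # scrambled input
print(selection_sort_comparisons([0, 1, 2, 3, 4, 5, 6, 7]))  # already sorted
```

Both calls report the same count, since nothing in the loops depends on the input's initial order.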
8217
07:55:03,730 --> 07:55:07,291
Well, there's a reason that I guided us\n
8218
07:55:07,291 --> 07:55:09,791
Even though you verbally proposed\nthem in a different order
8219
07:55:09,791 --> 07:55:12,960
this second algorithm we did is\ngenerally known as bubble sort.
8220
07:55:12,960 --> 07:55:15,570
And I deliberately used\nthat word a bit ago
8221
07:55:15,570 --> 07:55:19,291
saying the big values are\nbubbling their way up to the right
8222
07:55:19,291 --> 07:55:22,351
to kind of capture the fact that,\nindeed, this algorithm works
8223
07:55:22,980 --> 07:55:25,230
But let's consider if\nit's better or worse.
8224
07:55:25,230 --> 07:55:28,320
So here, we have pseudo\ncode for bubble sort.
8225
07:55:28,320 --> 07:55:30,291
You could write this\ntoo in different ways.
8226
07:55:30,291 --> 07:55:32,911
But let's consider what\nwe did on the stage.
8227
07:55:32,911 --> 07:55:36,210
We repeated the following\nn minus 1 times.
8228
07:55:36,210 --> 07:55:39,751
We initialized at least, even though\n
8229
07:55:39,751 --> 07:55:44,791
a variable like i from 0\nto n minus 2, n minus 2.
8230
07:55:44,791 --> 07:55:46,171
And then I asked this question.
8231
07:55:46,170 --> 07:55:52,830
If numbers bracket i and numbers\n
8232
07:55:54,611 --> 07:55:56,941
So again, I just did it more\nintuitively by pointing
8233
07:55:56,940 --> 07:55:59,310
but this would be a way,\nwith a bit of pseudo code
8234
07:55:59,311 --> 07:56:00,508
to describe what's going on.
8235
07:56:00,508 --> 07:56:03,091
But notice that I'm doing something\na little differently here.
8236
07:56:03,091 --> 07:56:07,091
I'm iterating from i\nequals 0 to n minus 2.
8237
07:56:07,591 --> 07:56:11,100
Well, if I'm comparing two\nthings, left hand and right hand
8238
07:56:11,100 --> 07:56:13,050
I'd still want to start at 0.
8239
07:56:13,050 --> 07:56:15,810
But I don't want to go\nall the way to n minus 1
8240
07:56:15,811 --> 07:56:19,409
because then, I'd be going past\nthe boundary of my array, which
8241
07:56:19,951 --> 07:56:22,441
I want to make sure that my\nleft hand-- i, if you will--
8242
07:56:22,440 --> 07:56:27,181
stops at n minus 2 so that when\nI plus 1 in my pseudo code
8243
07:56:27,181 --> 07:56:29,971
I'm looking at the last two\nelements, not the last element
8244
07:56:31,260 --> 07:56:33,093
That's actually a common\nprogramming mistake
8245
07:56:33,094 --> 07:56:34,980
that we'll undoubtedly\nsoon make by going
8246
07:56:34,980 --> 07:56:37,240
beyond the boundaries of your array.
8247
07:56:37,241 --> 07:56:43,891
So this pseudo code, then, allows me to\n
8248
07:56:43,890 --> 07:56:46,091
and swap them if they're out of order.
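The bubble sort pseudo code described here could be sketched in Python as below. The function name `bubble_sort` is an assumption for illustration; the loop bounds follow the transcript, with the inner index stopping at n minus 2 so that `i + 1` never runs past the end of the list.

```python
def bubble_sort(numbers):
    """Sort the list in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(numbers)
    for _ in range(n - 1):      # repeat the whole pass n - 1 times
        for i in range(n - 1):  # i goes from 0 to n - 2, so i + 1 is at most n - 1
            if numbers[i] > numbers[i + 1]:
                numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
    return numbers


print(bubble_sort([5, 2, 7, 4, 1, 6, 3, 0]))
```

Letting `i` reach n minus 1 would make `numbers[i + 1]` index past the end of the list, which is the boundary mistake mentioned above.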
8249
07:56:46,091 --> 07:56:50,550
Why do I repeat the whole\nthing n minus 1 times?
8250
07:56:50,550 --> 07:56:55,861
Like, why does it not suffice\njust to do this loop here?
8251
07:56:55,861 --> 07:56:59,341
Think what happened with Celeste.
8252
07:56:59,341 --> 07:57:03,841
Why do I repeat this whole\nthing n minus 1 times?
8253
07:57:12,216 --> 07:57:14,591
DAVID J. MALAN: Indeed, and I think\nif I can recap accurately
8254
07:57:14,591 --> 07:57:16,001
think back to Celeste again.
8255
07:57:16,001 --> 07:57:18,491
And I'm sorry to keep calling\non you as our number 0.
8256
07:57:18,491 --> 07:57:22,031
Each time through bubble\nsort, she only moved one step.
8257
07:57:22,030 --> 07:57:25,640
And so in total, if there's n\nlocations, at the end of the day
8258
07:57:25,640 --> 07:57:30,291
she needs to move n minus 1 steps to get\n
8259
07:57:30,291 --> 07:57:34,691
And so this inner loop, if you\n
8260
07:57:34,690 --> 07:57:37,060
that just fixes some of the problems.
8261
07:57:37,061 --> 07:57:40,511
But it doesn't fix all of the problems\n
8262
07:57:42,021 --> 07:57:45,771
And so how might we quantify the\nrunning time of this algorithm?
8263
07:57:45,771 --> 07:57:49,030
Well, one way to see it is to just\n
8264
07:57:49,030 --> 07:57:53,530
The outer loop repeats n\nminus 1 times by definition.
8265
07:57:54,850 --> 07:58:00,320
The inner loop, the for loop,\nalso iterates n minus 1 times.
8266
07:58:00,820 --> 07:58:02,901
Because it's going from 0 to n minus 2.
8267
07:58:02,901 --> 07:58:06,941
And if that's hard to think about,\n
8268
07:58:06,940 --> 07:58:09,681
if you just add 1 to\nboth ends of the formula.
8269
07:58:09,681 --> 07:58:14,271
So that means you're doing n\nminus 1 things n minus 1 times.
8270
07:58:14,271 --> 07:58:16,901
So I literally multiply how\nmany times the outer loop
8271
07:58:16,901 --> 07:58:20,320
is running by how many times the\n
8272
07:58:20,320 --> 07:58:23,681
sort of FOIL method n minus 1 squared.
8273
07:58:23,681 --> 07:58:25,661
And I could multiply\nthat whole thing out.
8274
07:58:25,661 --> 07:58:28,690
Well, let's consider this just\na little more methodically here.
8275
07:58:28,690 --> 07:58:32,742
If I have n minus 1 on the\nouter, n minus 1 on the inner--
8276
07:58:32,742 --> 07:58:33,950
let's go ahead and FOIL this.
8277
07:58:33,951 --> 07:58:37,541
So n squared minus n\nminus n plus 1, combine
8278
07:58:37,541 --> 07:58:40,931
like terms-- n squared minus 2n plus 1.
8279
07:58:40,931 --> 07:58:45,831
And now which of these terms is clearly\n
8280
07:58:46,811 --> 07:58:48,081
DAVID J. MALAN: --the n squared.
8281
07:58:48,080 --> 07:58:50,620
So yes, even though\nminus 2n is a good thing
8282
07:58:50,620 --> 07:58:53,021
because it's subtracting off\nsome of the time required
8283
07:58:53,021 --> 07:58:56,050
plus 1 is not that big a thing,\n
8284
07:58:56,050 --> 07:58:58,480
n gets really large, like\nin the millions or billions
8285
07:58:58,480 --> 07:59:02,870
certainly, that bubble sort, too,\nis on the order of n squared.
8286
07:59:02,870 --> 07:59:05,470
It's not the same exactly\nas selection sort.
8287
07:59:05,471 --> 07:59:07,390
But as n gets big,\nhonestly, we're barely
8288
07:59:07,390 --> 07:59:09,890
going to be able to notice\nthe difference most likely.
8289
07:59:09,890 --> 07:59:13,550
And so it too might be said to\nbe on the order of n squared.
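The multiplication spoken through above, written out as a derivation:

```latex
\begin{align*}
(n-1)(n-1) &= n^2 - n - n + 1 \\
           &= n^2 - 2n + 1 \\
           &\in O(n^2)
\end{align*}
% Example with eight volunteers: (8-1)(8-1) = 49 comparisons
```

The minus 2n and plus 1 terms matter less and less as n grows, so only the n squared term is kept.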
8290
07:59:13,550 --> 07:59:18,890
And if we consider now the lower\n
8291
07:59:18,890 --> 07:59:23,080
here's where things get\npotentially interesting.
8292
07:59:23,080 --> 07:59:28,810
What might you claim is the running\n
8293
07:59:28,811 --> 07:59:32,591
And the best case, I claim, is when\n
8294
07:59:32,591 --> 07:59:35,081
Is our pseudo code going\nto take that into account?
8295
07:59:43,403 --> 07:59:45,070
DAVID J. MALAN: Yes, and that's the key word.
8296
07:59:45,070 --> 07:59:49,510
To summarize, in bubble sort, I do have\n
8297
07:59:49,510 --> 07:59:52,120
don't look at all n elements,\nthat I'm theoretically
8298
07:59:52,120 --> 07:59:53,620
just guessing if it's sorted or not.
8299
07:59:53,620 --> 07:59:55,841
Like, I obviously\nintuitively have to look
8300
07:59:55,841 --> 07:59:58,751
at every element to decide yay\nor nay, it's in the right order.
8301
07:59:58,751 --> 08:00:01,541
And my original pseudo code,\nthough, is pretty naive.
8302
08:00:01,541 --> 08:00:07,001
It's just going to blindly go back and\n
8303
08:00:08,170 --> 08:00:10,120
But what if I add a\nbit of an optimization
8304
08:00:10,120 --> 08:00:12,370
that you might have glimpsed\non the slide a moment ago
8305
08:00:12,370 --> 08:00:15,970
where if I compare two people and I\n
8306
08:00:15,971 --> 08:00:18,911
don't swap them, and I go all the\nway through the list comparing
8307
08:00:18,911 --> 08:00:22,480
every pair of adjacent\npeople, and I make no swaps
8308
08:00:22,480 --> 08:00:25,181
it would be kind of not\njust naive but stupid
8309
08:00:25,181 --> 08:00:28,721
to do that same process again\n
8310
08:00:28,721 --> 08:00:31,511
I'm not going to make\nany different decisions.
8311
08:00:31,510 --> 08:00:34,100
I'm going to do nothing\nagain, nothing again.
8312
08:00:34,100 --> 08:00:37,510
So at that point, it would be stupid,\n
8313
08:00:38,570 --> 08:00:42,431
So if I modify our pseudo code with\n
8314
08:00:44,411 --> 08:00:50,561
Inside of that same pseudo code, what\n
8315
08:00:50,561 --> 08:00:54,371
Like quit prematurely, before\nthe loops are finished running.
8316
08:00:54,370 --> 08:00:57,740
One of the loops has gone\nthrough per the indentation here.
8317
08:00:57,741 --> 08:01:00,551
But if I do a loop from\nleft to right and I
8318
08:01:00,550 --> 08:01:03,041
have made no swaps, which you\ncan think of as just being
8319
08:01:03,041 --> 08:01:06,611
one other variable that's plus plusing\n
8320
08:01:07,181 --> 08:01:09,100
if I've made no swaps\nfrom left to right
8321
08:01:09,100 --> 08:01:11,600
I'm not going to make any swaps\nthe next time around either.
8322
08:01:11,600 --> 08:01:14,390
So let's just quit at that point.
8323
08:01:14,390 --> 08:01:16,990
And that is to say in the\nbest case, if you will
8324
08:01:16,991 --> 08:01:20,771
when the list is already sorted,\n
8325
08:01:20,771 --> 08:01:25,820
might indeed be omega of n\nif you add that optimization
8326
08:01:25,820 --> 08:01:28,420
so as to short circuit\nall of that inefficient
8327
08:01:28,420 --> 08:01:34,190
looping to do it only as\nmany times as is necessary.
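[Editor's sketch, not part of the spoken lecture] The early-exit idea described above could look roughly like this in C. The function name bubble_sort, the returned pass count, and the use of a boolean flag (rather than the swap counter mentioned in the lecture) are assumptions made for illustration.

```c
#include <stdbool.h>

// Sorts array[0..n-1] in ascending order with bubble sort.
// Returns the number of passes made, so the early exit is observable.
// (Hypothetical helper: the name and return value are illustrative.)
int bubble_sort(int array[], int n)
{
    int passes = 0;
    for (int i = 0; i < n - 1; i++)
    {
        bool swapped = false;
        passes++;

        // After i passes, the last i elements are already in place.
        for (int j = 0; j < n - 1 - i; j++)
        {
            if (array[j] > array[j + 1])
            {
                int tmp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = tmp;
                swapped = true;
            }
        }

        // A full pass with no swaps means the list is sorted: quit early.
        if (!swapped)
        {
            break;
        }
    }
    return passes;
}
```

On an already sorted list this makes a single pass of n minus 1 comparisons, which is where the omega of n best case comes from.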
8328
08:01:34,190 --> 08:01:36,640
Let me pause to see if\nthere's any questions here.
8329
08:01:37,552 --> 08:01:46,399
AUDIENCE: [INAUDIBLE] to optimize the\n
8330
08:01:46,399 --> 08:01:47,441
DAVID J. MALAN: Good question.
8331
08:01:47,440 --> 08:01:53,050
If the running time of selection sort\n
8332
08:01:53,050 --> 08:01:58,841
of n squared but selection sort is in\n
8333
08:01:58,841 --> 08:02:01,661
in omega of n, which sounds better--
8334
08:02:01,661 --> 08:02:04,870
I think if I may, should we\njust always use bubble sort?
8335
08:02:04,870 --> 08:02:09,040
Yes if we think that we\nmight benefit over time
8336
08:02:09,041 --> 08:02:13,611
from a lot of good case\nscenarios or best case scenarios.
8337
08:02:13,611 --> 08:02:15,701
However, the goal at\nhand in just a bit is
8338
08:02:15,701 --> 08:02:17,841
going to be to do even\nbetter than both of these.
8339
08:02:17,841 --> 08:02:19,690
So hold that question\nfurther for a moment.
8340
08:02:20,190 --> 08:02:25,717
AUDIENCE: [INAUDIBLE] n minus 1?
8341
08:02:27,440 --> 08:02:31,251
So I say omega of n, but is it\ntechnically omega of n minus 1?
8342
08:02:31,251 --> 08:02:34,471
Maybe, but again, we're\nthrowing away lower order terms.
8343
08:02:34,471 --> 08:02:38,031
And that's an advantage because we're\n
8344
08:02:38,030 --> 08:02:41,540
Just like I plotted with the\ngreen and yellow and red chart
8345
08:02:41,541 --> 08:02:44,341
I just want to get a sense of\nthe shape of these algorithms
8346
08:02:44,341 --> 08:02:47,690
so that when n gets really\nlarge, which of these choices
8347
08:02:47,690 --> 08:02:49,545
is going to matter the most?
8348
08:02:49,545 --> 08:02:51,920
At the end of the day, it's\nactually perfectly reasonable
8349
08:02:51,920 --> 08:02:53,820
to use selection sort\nor bubble sort if you
8350
08:02:53,820 --> 08:02:56,570
don't have that much data because\n
8351
08:02:56,570 --> 08:02:58,881
My God, our computers\nnowadays are 1 gigahertz
8352
08:02:58,881 --> 08:03:02,800
2 gigahertz, 1 billion things per\n
8353
08:03:02,800 --> 08:03:05,300
But if we have large data sets,\nas we will later in the term
8354
08:03:05,300 --> 08:03:07,830
and as you might in the real world,\n
8355
08:03:07,830 --> 08:03:09,890
then you're going to want\nto be more thoughtful.
8356
08:03:09,890 --> 08:03:11,600
And that's where we're going today.
8357
08:03:11,600 --> 08:03:14,450
All right, so let's actually see\nthis visualized a little bit.
8358
08:03:14,451 --> 08:03:16,370
In a moment, I'm going\nto change screens here
8359
08:03:16,370 --> 08:03:21,710
to open up what is a little\nvisualization tool that will give us
8360
08:03:21,710 --> 08:03:25,280
a sense of how these things actually\n
8361
08:03:25,280 --> 08:03:27,181
than our humans are able\nto do here on stage.
8362
08:03:27,181 --> 08:03:31,940
So here is another visualization of a\n
8363
08:03:31,940 --> 08:03:35,030
Short bars mean small numbers,\ntall bars mean big numbers.
8364
08:03:35,030 --> 08:03:37,280
So instead of having the\nnumbers on their torsos here
8365
08:03:37,280 --> 08:03:42,050
we just have bars that are small or tall\n
8366
08:03:42,050 --> 08:03:44,990
Let me go ahead, and I\npreconfigured this in advance
8367
08:03:44,991 --> 08:03:46,411
to operate somewhat quickly.
8368
08:03:46,411 --> 08:03:49,911
Let's go ahead and do selection\nsort by clicking this button.
8369
08:03:49,911 --> 08:03:52,580
And you'll see some pink bars flying by.
8370
08:03:52,580 --> 08:03:56,000
And that's like me walking\nleft and right, left and right
8371
08:03:56,001 --> 08:03:58,501
to select the next smallest number.
8372
08:03:58,501 --> 08:04:01,940
And so what you'll see happening on\n
8373
08:04:01,940 --> 08:04:04,880
is Celeste, if you will, and\nall of the other smaller numbers
8374
08:04:04,881 --> 08:04:08,390
are appearing on the left while\n
8375
08:04:10,170 --> 08:04:13,537
So again, we no longer have to\ntouch the smaller numbers here.
8376
08:04:13,538 --> 08:04:16,370
So that's why the problem is getting\n
8377
08:04:17,151 --> 08:04:20,781
But you can notice now\nvisually, look at how many times
8378
08:04:22,280 --> 08:04:25,070
This is why things\nthat are n squared tend
8379
08:04:25,070 --> 08:04:29,190
to be frowned upon if avoidable because\n
8380
08:04:29,690 --> 08:04:31,970
When I was walking through, I\nkept pointing at the same humans
8381
08:04:34,021 --> 08:04:37,111
So let's see if bubble sort looks\nor feels a little different.
8382
08:04:37,111 --> 08:04:40,548
Let me re-randomize the thing, and let\n
8383
08:04:40,548 --> 08:04:43,341
And as you might infer, there's\n
8384
08:04:43,341 --> 08:04:44,633
not all of which we'll look at.
8385
08:04:46,100 --> 08:04:48,931
Same pink coloration, but it's\ndoing something different.
8386
08:04:48,931 --> 08:04:52,221
It's two pink bars going through\nagain and again comparing
8387
08:04:53,911 --> 08:04:57,411
And you'll see that the largest\n
8388
08:04:57,411 --> 08:05:02,480
to the right, but the smaller\nnumbers, like our number 0 was
8389
08:05:02,480 --> 08:05:04,648
is only slowly making its way over.
8390
08:05:06,561 --> 08:05:09,261
And it's going to take a while\nto get all the way to the left.
8391
08:05:09,260 --> 08:05:12,920
And here too, notice how\nmany times the same bars
8392
08:05:12,920 --> 08:05:16,950
are becoming pink, how many times the\n
8393
08:05:17,960 --> 08:05:21,501
Because it's only solving one\nproblem at a time on each pass.
8394
08:05:21,501 --> 08:05:25,078
And each time we do that, we're stepping\n
8395
08:05:25,078 --> 08:05:28,161
And now granted, I could speed this\n
8396
08:05:28,161 --> 08:05:32,480
but my God, this is only, what, like\n
8397
08:05:33,501 --> 08:05:36,681
Like, this is what n squared\nlooks like and feels like.
8398
08:05:36,681 --> 08:05:38,931
And now I'm just trying to\ncome up with words to say
8399
08:05:38,931 --> 08:05:40,640
until we get to the finish line here.
8400
08:05:40,640 --> 08:05:43,611
Like, this would be annoying if\nthis is the speed of sorting
8401
08:05:43,611 --> 08:05:47,494
and this is why I sort of secretly\n
8402
08:05:47,493 --> 08:05:49,911
because it would have taken\nus an annoying number of steps
8403
08:05:49,911 --> 08:05:51,570
to get that in place for her.
8404
08:05:51,570 --> 08:05:54,530
So those two algorithms are n squared.
8405
08:05:54,530 --> 08:05:56,626
Can we do, in fact, better?
8406
08:05:56,626 --> 08:05:59,751
Well, to save the best algorithm for\n
8407
08:06:00,251 --> 08:06:04,771
And when we come back, we'll\ndo even better than n squared.
8408
08:06:06,841 --> 08:06:11,541
So the challenge at hand is to\ndo better than selection sort
8409
08:06:11,541 --> 08:06:14,631
and better than bubble sort\nand ideally not just marginally
8410
08:06:14,631 --> 08:06:16,820
better but fundamentally better.
8411
08:06:16,820 --> 08:06:20,330
Just like in week zero, that third\n
8412
08:06:20,330 --> 08:06:23,220
was sort of fundamentally\nfaster than the other two.
8413
08:06:23,221 --> 08:06:26,311
So can we do better than something\non the order of n squared?
8414
08:06:26,311 --> 08:06:28,671
Well, I bet we can if\nwe start to approach
8415
08:06:28,670 --> 08:06:30,440
the problem a little differently.
8416
08:06:30,440 --> 08:06:32,562
The sorts we've done\nthus far, generally known
8417
08:06:32,562 --> 08:06:34,520
as comparison sorts--\nand that kind of captures
8418
08:06:34,521 --> 08:06:38,361
the reality that we were doing a huge\n
8419
08:06:38,361 --> 08:06:41,721
And you kind of saw that in the vertical\n
8420
08:06:41,721 --> 08:06:43,216
was being compared again and again.
8421
08:06:43,216 --> 08:06:45,591
But there's this programming\ntechnique, and it's actually
8422
08:06:45,591 --> 08:06:48,501
a mathematical technique\nknown as recursion
8423
08:06:48,501 --> 08:06:50,311
that we've actually seen before.
8424
08:06:50,311 --> 08:06:53,361
And this is a building\nblock or a mental model
8425
08:06:53,361 --> 08:06:56,661
we can bring to bear on the problem\nto solve the sorting problem
8426
08:06:56,661 --> 08:06:58,100
sort of fundamentally differently.
8427
08:06:58,100 --> 08:07:00,980
But first, let's look at it\nin a more familiar context.
8428
08:07:00,980 --> 08:07:07,550
A little bit ago, I proposed this pseudo\n
8429
08:07:07,550 --> 08:07:10,490
And notice that what was\ninteresting about this code
8430
08:07:10,491 --> 08:07:14,331
even though I didn't call it out at the\n
8431
08:07:14,330 --> 08:07:17,090
Like, I claim this is\nan algorithm for search
8432
08:07:17,091 --> 08:07:21,111
and yet it seems a little unfair\nthat I'm using the verb search
8433
08:07:21,111 --> 08:07:23,320
inside of the algorithm for search.
8434
08:07:23,320 --> 08:07:26,080
It's like in English, sort of\ndefining a word by using the word.
8435
08:07:26,080 --> 08:07:28,210
Normally, you shouldn't\nreally get away with that.
8436
08:07:28,210 --> 08:07:30,791
But there's something\ninteresting about this technique
8437
08:07:30,791 --> 08:07:35,471
here because even though this\nwhole thing is a search algorithm
8438
08:07:35,471 --> 08:07:40,871
and I'm using my own algorithm to\n
8439
08:07:40,870 --> 08:07:42,880
the key feature here\nthat doesn't normally
8440
08:07:42,881 --> 08:07:46,031
happen in English when you\ndefine a word in terms of a word
8441
08:07:46,030 --> 08:07:49,661
is that when I search the left\n
8442
08:07:51,170 --> 08:07:52,450
I'm using the same algorithm.
8443
08:07:52,451 --> 08:07:55,421
But the problem is, by\ndefinition, half as large.
8444
08:07:55,420 --> 08:07:58,540
So this isn't going to be a\ncyclical argument in the same way.
8445
08:07:58,541 --> 08:08:02,111
This approach, by using\nsearch within search
8446
08:08:02,111 --> 08:08:05,651
is going to whittle the problem down\n
8447
08:08:05,651 --> 08:08:08,541
one door or no doors remains.
8448
08:08:08,541 --> 08:08:11,081
And so recursion is a\nprogramming technique
8449
08:08:11,080 --> 08:08:14,290
whereby a function calls itself.
8450
08:08:14,291 --> 08:08:18,581
And we haven't seen this yet in C, and\n
8451
08:08:18,580 --> 08:08:22,482
But in C, you can have\na function call itself.
8452
08:08:22,483 --> 08:08:24,190
And the form that\ntakes is like literally
8453
08:08:24,190 --> 08:08:28,540
using the function's name inside of\n
8454
08:08:28,541 --> 08:08:32,671
We've actually seen an opportunity\nfor this once before too.
8455
08:08:33,670 --> 08:08:35,920
Here's that same pseudo code\nfor searching for someone
8456
08:08:35,920 --> 08:08:37,540
in an actual, physical phone book.
8457
08:08:37,541 --> 08:08:40,061
And notice these yellow lines here.
8458
08:08:40,061 --> 08:08:44,171
We described those in week zero\nas inducing a loop, a cycle.
8459
08:08:44,170 --> 08:08:48,670
And this is a very procedural approach,\n
8460
08:08:48,670 --> 08:08:50,830
are very mechanically,\nif you will, telling
8461
08:08:50,830 --> 08:08:54,730
me to go back to line three to\ndo this kind of looping thing.
8462
08:08:54,730 --> 08:08:59,111
But really, what that's doing in the\n
8463
08:08:59,111 --> 08:09:04,480
book is it's just telling me to search\n
8464
08:09:04,480 --> 08:09:08,260
I'm doing it more mechanically\nagain by sort of telling myself
8465
08:09:08,260 --> 08:09:09,850
what line number to go back to.
8466
08:09:09,850 --> 08:09:12,760
But that's equivalent to just telling\n
8467
08:09:12,760 --> 08:09:15,310
search the right half, the\nkey thing being the left
8468
08:09:15,311 --> 08:09:18,081
half and the right half are\nsmaller than the original problem.
8469
08:09:18,080 --> 08:09:21,990
It would be a bug if I just said search\n
8470
08:09:21,991 --> 08:09:23,741
because obviously, you\nnever get anywhere.
8471
08:09:23,741 --> 08:09:25,901
But if you search the\nhalf, the half, the half
8472
08:09:25,901 --> 08:09:27,831
the problem gets smaller and smaller.
8473
08:09:27,830 --> 08:09:34,540
So let's reformulate week zero's phone\n
8474
08:09:34,541 --> 08:09:39,501
but recursive whereby in\nthis search algorithm
8475
08:09:39,501 --> 08:09:42,940
AKA binary search, formerly\ncalled divide and conquer, I'm
8476
08:09:42,940 --> 08:09:46,300
going to literally use also\nthe keyword search here.
8477
08:09:46,300 --> 08:09:48,310
Notice among the benefits\nof doing this is it
8478
08:09:48,311 --> 08:09:51,159
kind of tightens the code up,\nmakes it a little more succinct
8479
08:09:51,158 --> 08:09:53,201
even though that's kind\nof a fringe benefit here.
8480
08:09:53,201 --> 08:09:56,710
But it's an elegant\nway too of describing
8481
08:09:56,710 --> 08:10:01,570
a problem by just having\na function use itself
8482
08:10:01,570 --> 08:10:05,690
to solve a smaller puzzle at hand.
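[Editor's sketch, not part of the spoken lecture] A recursive search of this kind might be written in C as follows, assuming a sorted array of ints in place of doors or phone book pages. The name search and its low/high parameters are illustrative choices, not taken from the lecture.

```c
#include <stdbool.h>

// Returns true if target appears in the sorted range array[low..high]
// (both ends inclusive). Hypothetical signature, for illustration only.
bool search(const int array[], int low, int high, int target)
{
    // Base case: no elements remain, so the target is not here.
    if (low > high)
    {
        return false;
    }

    int middle = low + (high - low) / 2;

    if (array[middle] == target)
    {
        return true;
    }
    else if (target < array[middle])
    {
        // Same algorithm, but on the left half only.
        return search(array, low, middle - 1, target);
    }
    else
    {
        // Same algorithm, but on the right half only.
        return search(array, middle + 1, high, target);
    }
}
```

Each recursive call works on a range about half the size of the previous one, which is why the self-reference terminates instead of going in circles.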
8483
08:10:05,690 --> 08:10:08,740
So let's now consider a\nfamiliar problem, a smaller
8484
08:10:08,741 --> 08:10:11,493
version than the one you've dabbled\nwith-- this sort of pyramid
8485
08:10:11,492 --> 08:10:12,700
this half pyramid from Mario.
8486
08:10:12,701 --> 08:10:15,431
And let's throw away the parts\nthat aren't that interesting
8487
08:10:15,431 --> 08:10:19,780
and just consider how we might, up\n
8488
08:10:19,780 --> 08:10:21,880
this left aligned pyramid, if you will.
8489
08:10:21,881 --> 08:10:28,991
Let me go over here, and let me create\n
8490
08:10:28,991 --> 08:10:32,441
And in this file, I'm going to\ngo ahead and include cs50.h.
8491
08:10:32,440 --> 08:10:36,251
And I'm going to include stdio.h.
8492
08:10:36,251 --> 08:10:41,800
And the goal at hand is to implement in\n
8493
08:10:41,800 --> 08:10:43,640
this and exactly this pyramid.
8494
08:10:43,640 --> 08:10:46,473
So no get_string or any of that--\n
8495
08:10:46,473 --> 08:10:49,760
and print exactly this\npyramid of height 4 here.
8496
08:10:50,931 --> 08:10:55,306
Well, let me go ahead, and in main,\n
8497
08:10:55,306 --> 08:10:56,931
well, we'll go ahead and generalize it.
8498
08:10:56,931 --> 08:10:58,763
Let's go ahead and ask\nthe user for height.
8499
08:10:58,763 --> 08:11:00,611
We're using get_int as before.
8500
08:11:00,611 --> 08:11:02,861
And I'll store that in a\nvariable called height.
8501
08:11:02,861 --> 08:11:05,111
And then let me go ahead\nand simply call the function
8502
08:11:05,111 --> 08:11:06,771
draw passing in that height.
8503
08:11:06,771 --> 08:11:09,401
So for the moment, let me\nassume that someone somewhere
8504
08:11:09,401 --> 08:11:10,991
has implemented a draw function.
8505
08:11:10,991 --> 08:11:14,171
And this, then, is the\nentirety of my program.
8506
08:11:14,170 --> 08:11:17,330
All right, unfortunately, C does\nnot come with a draw function.
8507
08:11:17,330 --> 08:11:19,180
So let me go ahead and invent one.
8508
08:11:19,181 --> 08:11:20,661
It doesn't need to return a value.
8509
08:11:20,661 --> 08:11:23,210
It just needs to print\nsomething-- a so-called side effect.
8510
08:11:23,210 --> 08:11:27,820
So I'm going to define a function\n
8511
08:11:27,820 --> 08:11:30,640
I'll call it n for number, but\nI could call it anything I want.
8512
08:11:32,210 --> 08:11:37,600
I'm going to go ahead and print out a\n
8513
08:11:38,350 --> 08:11:42,070
The salient features here are that this\n
8514
08:11:43,181 --> 08:11:46,600
And now in height four, the\nfirst row has one brick.
8515
08:11:49,631 --> 08:11:52,461
That's a nice pattern that I\ncan probably represent in code.
8516
08:11:53,570 --> 08:11:55,751
Well, how about for int i gets--
8517
08:11:55,751 --> 08:11:57,311
let me do it the old school way--
8518
08:11:58,091 --> 08:12:02,890
And then i is less than or equal to n.
8519
08:12:04,541 --> 08:12:08,171
so I'm going from 1 to 4 just\nto keep myself sane here.
8520
08:12:08,170 --> 08:12:11,050
And then inside of this\nloop, what do I want to do?
8521
08:12:11,050 --> 08:12:12,920
Well, let me keep it\nconventional, in fact.
8522
08:12:12,920 --> 08:12:16,330
Let me just change this to\nbe the more conventional 0
8523
08:12:16,330 --> 08:12:20,530
to n even though it might not be\n
8524
08:12:21,911 --> 08:12:24,326
On row 1, I want two\nbricks, dot dot dot.
8525
08:12:26,890 --> 08:12:28,600
But I'm being more conventional.
8526
08:12:28,600 --> 08:12:32,800
So on each row, how many\nbricks do I want to print?
8527
08:12:32,800 --> 08:12:34,240
Well, I think I want to do this.
8528
08:12:34,241 --> 08:12:40,061
For int j, for instance, common to\n
8529
08:12:40,061 --> 08:12:47,291
let's start j at 0 and do this\nso long as j is less than i plus 1
8530
08:12:50,861 --> 08:12:55,030
Well, again, when i equals 0, that's\n
8531
08:12:55,030 --> 08:12:57,010
When i equals 1, that's the second row.
8532
08:12:57,791 --> 08:13:00,611
And dot dot dot, when i\nis 3, I want four bricks.
8533
08:13:00,611 --> 08:13:03,820
So again, I have to add 1 to i\nto get the total number of bricks
8534
08:13:03,820 --> 08:13:05,480
that I want to print to the screen.
8535
08:13:05,480 --> 08:13:10,181
So inside of this nested for loop,\n
8536
08:13:13,330 --> 08:13:17,472
I'm going to save the new\nline for about here instead.
8537
08:13:17,473 --> 08:13:19,181
All right, the last\nthing I'm going to do
8538
08:13:19,181 --> 08:13:22,911
is copy and paste the prototype\nat the top of the file.
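[Editor's sketch, not part of the spoken lecture] The iterative draw function described above might look like this. The main function that reads the height with get_int from cs50.h is left out so the sketch depends only on the standard library.

```c
#include <stdio.h>

// Prototype, as would be pasted at the top of the file.
void draw(int n);

// Prints a left-aligned pyramid of height n using nested loops.
void draw(int n)
{
    // Rows are numbered 0 through n - 1.
    for (int i = 0; i < n; i++)
    {
        // Row i gets i + 1 bricks.
        for (int j = 0; j < i + 1; j++)
        {
            printf("#");
        }
        // The newline comes once per row, after the inner loop.
        printf("\n");
    }
}
```

Calling draw(4) prints four rows of one, two, three, and four hashes.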
8539
08:13:23,980 --> 08:13:27,088
And again, this is of\nnow week one, week two.
8540
08:13:27,088 --> 08:13:29,381
Wouldn't necessarily come to\nyour mind as quickly as it
8541
08:13:29,381 --> 08:13:32,741
might to mine after all this practice,\n
8542
08:13:32,741 --> 08:13:35,591
of what you yourself did\nalready for Mario-- printing out
8543
08:13:35,591 --> 08:13:38,690
a pyramid that hopefully in a\nmoment is going to look like this.
8544
08:13:38,690 --> 08:13:40,730
So let me go back to my code.
8545
08:13:40,730 --> 08:13:44,920
Let me run make iteration, and\nlet me do dot slash iteration.
8546
08:13:46,901 --> 08:13:50,201
Seems to be correct, and let's assume\n
8547
08:13:56,131 --> 08:13:59,820
So this is indeed an example\nof iteration-- doing something
8548
08:14:02,521 --> 08:14:05,521
Like, I literally have a function\n
8549
08:14:05,521 --> 08:14:09,931
But I can think about implementing\n
8550
08:14:11,070 --> 08:14:13,080
And it's not strictly\nnecessary for this problem
8551
08:14:13,080 --> 08:14:15,720
because this problem honestly\nis not that complicated
8552
08:14:15,721 --> 08:14:17,747
to solve once you have\npractice under your belt.
8553
08:14:17,747 --> 08:14:20,580
Certainly the first time around,\n
8554
08:14:20,580 --> 08:14:23,610
But now that you kind of\nassociate, OK, row one
8555
08:14:23,611 --> 08:14:26,370
with one brick, row two with two\n
8556
08:14:27,701 --> 08:14:30,131
But how else could we\nthink about this problem?
8557
08:14:30,131 --> 08:14:33,300
Well, this physical structure,\nthese bricks, in some sense
8558
08:14:33,300 --> 08:14:39,341
is a recursive structure, a structure\n
8559
08:14:40,721 --> 08:14:45,961
Well, if I were to ask you the question,\n
8560
08:14:45,960 --> 08:14:49,080
look like, you would point,\nof course, to this picture.
8561
08:14:49,080 --> 08:14:55,530
But you could also kind of\ncleverly say to me, well
8562
08:14:55,530 --> 08:15:00,600
it's actually a pyramid of\nheight 3 plus 1 additional row.
8563
08:15:00,600 --> 08:15:02,597
And here's that cyclical\nargument, right?
8564
08:15:02,598 --> 08:15:05,431
Kind of obnoxious to do typically\n
8565
08:15:05,431 --> 08:15:07,561
because you're defining one\nthing in terms of itself.
8566
08:15:07,561 --> 08:15:08,769
What's a pyramid of height 4?
8567
08:15:08,769 --> 08:15:12,541
Well, it's a pyramid of\nheight 3 plus 1 more row.
8568
08:15:12,541 --> 08:15:15,300
But we can kind of leverage\nthis logic in code.
8569
08:15:15,300 --> 08:15:16,951
Well, what's a pyramid of height 3?
8570
08:15:16,951 --> 08:15:19,230
Well, it's a pyramid of\nheight 2 plus 1 more row.
8571
08:15:19,230 --> 08:15:21,300
Fine, what's a pyramid of height 2?
8572
08:15:21,300 --> 08:15:23,730
Well, it's a pyramid of\nheight 1 plus 1 more row.
8573
08:15:23,730 --> 08:15:26,730
And then hopefully, this process\n
8574
08:15:26,730 --> 08:15:29,350
the pyramid is getting\nsmaller and smaller.
8575
08:15:29,350 --> 08:15:32,970
So you're not going to have this\n
8576
08:15:32,971 --> 08:15:36,911
infinitely many times because when\n
8577
08:15:36,911 --> 08:15:38,491
the end of the pyramid, fine.
8578
08:15:38,491 --> 08:15:39,961
What is a pyramid of height 1?
8579
08:15:39,960 --> 08:15:42,990
Well, it's a pyramid of no\nheight plus one more row.
8580
08:15:42,991 --> 08:15:45,631
And at that point, things\njust get negative--
8581
08:15:46,411 --> 08:15:48,396
Things just would otherwise go negative.
8582
08:15:48,396 --> 08:15:49,771
And so you can just kind of stop.
8583
08:15:49,771 --> 08:15:52,291
The base case is when\nthere is no more pyramid.
8584
08:15:52,291 --> 08:15:56,441
So there's a way to draw a line in the\n
8585
08:15:56,440 --> 08:16:00,540
But this idea of defining a physical\n
8586
08:16:00,541 --> 08:16:06,541
or code in terms of itself actually lets\n
8587
08:16:06,541 --> 08:16:08,521
Let me go back to my code here.
8588
08:16:08,521 --> 08:16:14,161
Let me go ahead and create one\n
8589
08:16:14,161 --> 08:16:20,460
that leverages this idea of this\n
8590
08:16:22,440 --> 08:16:26,100
Let me go ahead and include\nstdio.h, int main void.
8591
08:16:26,100 --> 08:16:30,360
And then inside of main, I'm going\n
8592
08:16:30,361 --> 08:16:34,831
height equals get_int,\nasking the user for height.
8593
08:16:34,830 --> 08:16:38,230
And then I'm going to go ahead\nand call draw passing in height.
8594
08:16:38,230 --> 08:16:39,881
So that's going to stay the same.
8595
08:16:39,881 --> 08:16:45,941
I even am going to make my prototype\n
8596
08:16:45,940 --> 08:16:48,181
And now I'm going to\nimplement draw down here
8597
08:16:48,181 --> 08:16:49,951
with that same prototype, of course.
8598
08:16:49,951 --> 08:16:52,831
But the code now is going\nto be a little different.
8599
08:16:54,431 --> 08:16:59,971
Well, first of all, if you ask\nme to draw a pyramid of height n
8600
08:16:59,971 --> 08:17:02,941
I'm going to be kind of a wise\nass here and say, well, just
8601
08:17:02,940 --> 08:17:05,370
draw a pyramid of n minus 1--
8602
08:17:06,122 --> 08:17:08,580
All right, but there's still\na little more work to be done.
8603
08:17:08,580 --> 08:17:13,020
What happens after I print or\ndraw a pyramid of height n minus 1
8604
08:17:13,021 --> 08:17:17,701
according to our structural\ndefinition a moment ago?
8605
08:17:17,701 --> 08:17:22,831
What remains after drawing a pyramid\n
8606
08:17:25,201 --> 08:17:26,761
We need one more row of hashes.
8607
08:17:26,760 --> 08:17:28,110
OK, so I can do that, right?
8608
08:17:28,111 --> 08:17:29,574
I'm OK with the single loops.
8609
08:17:29,574 --> 08:17:30,991
There's no nesting necessary here.
8610
08:17:30,991 --> 08:17:35,371
I'm just going to do this-- for\nint i gets 0, i is less than n
8611
08:17:35,370 --> 08:17:37,545
which is the height that's\npassed in, i plus plus.
8612
08:17:37,545 --> 08:17:39,420
And then inside of this\nloop, I'm very simply
8613
08:17:39,420 --> 08:17:41,100
going to print out a single hash.
8614
08:17:41,100 --> 08:17:45,010
And then down here, I'm going to\n
8615
08:17:45,960 --> 08:17:48,080
I might not be as comfortable\nwith nested loops.
8616
08:17:49,080 --> 08:17:52,020
What does this loop do\nhere on lines 17 through 20?
8617
08:17:52,021 --> 08:17:57,901
It literally prints n hashes by\n
8618
08:17:59,491 --> 08:18:02,291
So that's sort of week one style syntax.
8619
08:18:02,291 --> 08:18:05,101
But this is kind of trippy\nnow because I've somehow
8620
08:18:05,100 --> 08:18:09,841
boiled down the implementation of\n
8621
08:18:11,611 --> 08:18:15,611
But this is problematic as\nis because in this case
8622
08:18:15,611 --> 08:18:22,261
my draw function, notice, is always\n
8623
08:18:23,251 --> 08:18:28,561
But ideally, when do I want\nthis cyclical process to stop?
8624
08:18:28,561 --> 08:18:32,371
When do I want to not call draw anymore?
8625
08:18:35,657 --> 08:18:37,740
When I get to the top of\nthe pyramid, when n is 1
8626
08:18:37,741 --> 08:18:40,081
or heck, when the pyramid's\nall gone and n equals 0.
8627
08:18:40,080 --> 08:18:42,420
I can pick any line in\nthe sand, so long as it's
8628
08:18:42,420 --> 08:18:44,050
sort of at the end of the process.
8629
08:18:44,050 --> 08:18:45,841
Then I don't want to call draw anymore.
8630
08:18:45,841 --> 08:18:48,210
So maybe what I should do is this.
8631
08:18:48,210 --> 08:18:54,810
If n equals equals 0, there's\nreally nothing to draw.
8632
08:18:54,811 --> 08:18:58,701
So I'm just going to go\nahead and return like this.
8633
08:18:58,701 --> 08:19:01,281
Otherwise, I'm going\nto go ahead and draw
8634
08:19:01,280 --> 08:19:04,598
n minus 1 rows and then one more row.
8635
08:19:04,598 --> 08:19:06,140
And I could express this differently.
8636
08:19:06,140 --> 08:19:08,900
I could do something like this,\nwhich would be equivalent.
8637
08:19:08,901 --> 08:19:13,851
I could say something like if n\nis greater than or equal to 0
8638
08:19:13,850 --> 08:19:15,823
then go ahead and draw the row.
8639
08:19:15,823 --> 08:19:17,030
But I like it this way first.
8640
08:19:17,030 --> 08:19:18,948
For now, I'm going to\ngo with the original way
8641
08:19:18,948 --> 08:19:22,100
just to ask a simple question and\n
8642
08:19:23,330 --> 08:19:26,100
And heck, just to be\nsuper safe, just in case
8643
08:19:26,100 --> 08:19:28,400
the user types in a\nnegative number, let me also
8644
08:19:28,401 --> 08:19:31,341
just check if n is a negative number,\n
8645
08:19:32,570 --> 08:19:35,850
I'm not returning a value because\nagain, the function is void.
8646
08:19:35,850 --> 08:19:38,040
It doesn't need or have a return value.
8647
08:19:38,041 --> 08:19:40,221
So just saying return suffices.
8648
08:19:40,221 --> 08:19:45,021
But if n equals 1 or 2\nor 3 or anything higher
8649
08:19:45,021 --> 08:19:50,480
it is reasonable to draw a pyramid of\n
8650
08:19:50,480 --> 08:19:55,041
of 4, 3, and then go ahead\nand print one more row.
8651
08:19:55,041 --> 08:20:00,811
So this is an example now of code\n
8652
08:20:02,510 --> 08:20:07,280
But this so-called base case\nensures, this conditional ensures
8653
08:20:07,280 --> 08:20:09,050
that we're not going to do this forever.
8654
08:20:09,050 --> 08:20:11,661
Otherwise, we literally would\ndo this infinitely many times
8655
08:20:11,661 --> 08:20:14,690
and something bad is\nprobably going to happen.
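[Editor's sketch, not part of the spoken lecture] Putting the pieces above together, a recursive draw function with a base case that also covers negative input could be sketched as follows. As before, main and get_int are omitted so the sketch depends only on the standard library.

```c
#include <stdio.h>

// Prints a left-aligned pyramid of height n recursively.
void draw(int n)
{
    // Base case: nothing to draw for zero or negative heights.
    if (n <= 0)
    {
        return;
    }

    // First draw the pyramid of height n - 1...
    draw(n - 1);

    // ...then add one more row of n bricks.
    for (int i = 0; i < n; i++)
    {
        printf("#");
    }
    printf("\n");
}
```

Without the base case, draw would keep calling itself with ever smaller values of n and never reach the printing code.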
8656
08:20:14,690 --> 08:20:18,591
All right, let me go ahead and\n
8657
08:20:18,591 --> 08:20:22,611
OK, no syntax errors-- dot slash\nrecursion, Enter, height of 4
8658
08:20:24,951 --> 08:20:28,761
If only because some of you have run\n
8659
08:20:28,760 --> 08:20:32,780
let me get rid of the base case\n
8660
08:20:34,341 --> 08:20:36,510
Oh, and actually, now\nit's actually catching it.
8661
08:20:36,510 --> 08:20:39,411
So the compiler is smart\nenough here to realize
8662
08:20:39,411 --> 08:20:42,830
that all paths through this\nfunction will call itself.
8663
08:20:42,830 --> 08:20:45,661
AKA, it's going to loop forever.
8664
08:20:45,661 --> 08:20:47,451
So let me do the first thing.
8665
08:20:47,451 --> 08:20:49,911
Suppose I only check for n equaling 0.
8666
08:20:49,911 --> 08:20:53,751
Let me go ahead and recompile\nthis code with make recursion.
8667
08:20:53,751 --> 08:20:56,390
And now let me just be\nkind of uncooperative.
8668
08:20:56,390 --> 08:21:00,560
When I run this program, still\nworks for 4, still works for 0.
8669
08:21:00,561 --> 08:21:03,471
What if I do like negative 100?
8670
08:21:03,471 --> 08:21:07,281
Have any of you experienced a\nsegmentation fault or core dump?
8671
08:21:08,600 --> 08:21:13,341
Like, this means I have somehow\n
8672
08:21:13,341 --> 08:21:17,271
And in short, I actually called\nthis function thousands of times
8673
08:21:17,271 --> 08:21:19,911
accidentally, it would seem\nnow, until the program just
8674
08:21:19,911 --> 08:21:22,774
bailed on me because I eventually\ntouched memory in the computer
8675
08:21:23,690 --> 08:21:25,501
That'll make even more sense next week.
8676
08:21:25,501 --> 08:21:27,001
But for now, it's simply a bug.
8677
08:21:27,001 --> 08:21:28,791
And I can avoid that\nbug in this context
8678
08:21:28,791 --> 08:21:33,261
probably not your own pset context,\n
8679
08:21:33,260 --> 08:21:35,700
allow for negative numbers at all.
8680
08:21:35,701 --> 08:21:38,061
So with this building\nblock in place, what
8681
08:21:38,061 --> 08:21:41,781
can we now do in terms of\nthose same numbers to sort?
8682
08:21:41,780 --> 08:21:44,751
Well, it turns out there's a\n
8683
08:21:44,751 --> 08:21:46,341
And there's bunches of others too.
8684
08:21:46,341 --> 08:21:51,283
But merge sort is a nice one to discuss\n
8685
08:21:51,283 --> 08:21:53,451
is going to do better than\nselection sort and bubble
8686
08:21:53,451 --> 08:21:55,851
sort, that is, better than n squared.
8687
08:21:55,850 --> 08:21:58,598
But the catch is it's a\nlittle harder to think about.
8688
08:21:58,598 --> 08:22:01,640
In fact, I'll act it out myself with\n
8689
08:22:01,640 --> 08:22:05,721
rather than humans because recursion\n
8690
08:22:05,721 --> 08:22:08,061
to wrap your mind around,\ntypically a bit of practice.
8691
08:22:08,061 --> 08:22:10,269
But I'll see if we can't\nwalk through it methodically
8692
08:22:10,269 --> 08:22:12,620
enough such that this comes to light.
8693
08:22:12,620 --> 08:22:16,820
So here's the pseudo code I propose\n
8694
08:22:16,820 --> 08:22:20,001
In the spirit of recursion,\nthis sorting algorithm
8695
08:22:20,001 --> 08:22:25,381
literally calls itself by using\n
8696
08:22:25,381 --> 08:22:26,961
So how does merge sort work?
8697
08:22:26,960 --> 08:22:30,380
It sort of obnoxiously says, well, if\n
8698
08:22:30,381 --> 08:22:33,081
go sort the left half, then\ngo sort the right half
8699
08:22:33,080 --> 08:22:34,790
and then merge the two together.
8700
08:22:34,791 --> 08:22:36,043
Now obnoxious in what sense?
8701
08:22:36,043 --> 08:22:38,751
Well, if I just asked you to sort\n
8702
08:22:38,751 --> 08:22:40,431
well, go sort that\nthing and then go sort
8703
08:22:40,431 --> 08:22:43,098
that thing, what was the point\nof asking you in the first place?
8704
08:22:43,098 --> 08:22:45,441
But the key is that\neach of these lines is
8705
08:22:45,440 --> 08:22:48,240
sorting a smaller piece of the problem.
8706
08:22:48,241 --> 08:22:50,811
So eventually, we'll be\nable to pare this down
8707
08:22:50,811 --> 08:22:54,771
into something that doesn't go on\n
8708
08:22:56,480 --> 08:22:59,300
There's a scenario where we\njust check, wait a minute
8709
08:22:59,300 --> 08:23:01,881
if there's only one\nnumber to sort, that's it.
8710
08:23:01,881 --> 08:23:03,831
Quit then because you're all done.
8711
08:23:03,830 --> 08:23:06,950
So there has to be this base\ncase in any use of recursion
8712
08:23:06,951 --> 08:23:11,451
to make sure that you don't\nmindlessly call yourself forever.
8713
08:23:11,451 --> 08:23:13,741
You've got to stop at some point.
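[Editor's sketch, not part of the spoken lecture] The merge sort pseudo code above could be sketched in C like this, assuming an array of ints and a caller-supplied scratch array tmp of the same length. The names merge_sort and merge, the tmp parameter, and the index-based signature are illustrative assumptions. The merge step is the one the lecture walks through next.

```c
#include <string.h>

// Merges the sorted halves array[low..mid] and array[mid+1..high]
// into sorted order, using tmp as scratch space.
static void merge(int array[], int tmp[], int low, int mid, int high)
{
    int left = low;       // next unmerged element of the left half
    int right = mid + 1;  // next unmerged element of the right half
    int k = low;          // next free slot in tmp

    // Repeatedly take the smaller of the two front elements.
    while (left <= mid && right <= high)
    {
        if (array[left] <= array[right])
        {
            tmp[k++] = array[left++];
        }
        else
        {
            tmp[k++] = array[right++];
        }
    }

    // One half is exhausted; copy whatever remains of the other.
    while (left <= mid)
    {
        tmp[k++] = array[left++];
    }
    while (right <= high)
    {
        tmp[k++] = array[right++];
    }

    // Copy the merged run back into the original array.
    memcpy(&array[low], &tmp[low], (size_t)(high - low + 1) * sizeof(int));
}

// Sorts array[low..high] (inclusive) in ascending order.
void merge_sort(int array[], int tmp[], int low, int high)
{
    // Base case: zero or one number is already sorted.
    if (low >= high)
    {
        return;
    }

    int mid = low + (high - low) / 2;
    merge_sort(array, tmp, low, mid);       // sort the left half
    merge_sort(array, tmp, mid + 1, high);  // sort the right half
    merge(array, tmp, low, mid, high);      // merge the sorted halves
}
```

For an array of 8 numbers, a call would be merge_sort(numbers, tmp, 0, 7), where tmp is a second array of 8 ints.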
8714
08:23:13,741 --> 08:23:16,921
So let's focus on the\nthird of these steps.
8715
08:23:16,920 --> 08:23:21,688
What does it mean to merge two\nlists, two halves of a list
8716
08:23:21,688 --> 08:23:23,480
just because this is\napparently going to be
8717
08:23:23,480 --> 08:23:25,650
a key ingredient-- so\nhere, for instance
8718
08:23:25,651 --> 08:23:28,251
are two halves of a list of size 8.
8719
08:23:28,251 --> 08:23:31,370
We have the numbers 2-- and I'll call\n
8720
08:23:36,561 --> 08:23:41,070
Notice that the left half at the\n
8721
08:23:41,070 --> 08:23:45,361
and the right half, 0, 1, 3, 6,\nis sorted as well.
8722
08:23:45,361 --> 08:23:48,230
So that's a good thing because\nit means that theoretically, I've
8723
08:23:48,230 --> 08:23:49,640
sorted the left half already.
8724
08:23:49,640 --> 08:23:51,980
I've sorted the right half\nalready before we began.
8725
08:23:51,980 --> 08:23:54,050
I just need to merge these two halves.
8726
08:23:54,050 --> 08:23:56,161
What does it mean to merge two halves?
8727
08:23:56,161 --> 08:23:57,911
Well, for the sake of\ndiscussion, I'm just
8728
08:23:57,911 --> 08:24:03,701
going to turn over most of the numbers\n
8729
08:24:04,721 --> 08:24:07,751
There's two halves here, left and right.
8730
08:24:07,751 --> 08:24:09,820
At the moment, I'm\nonly going to consider
8731
08:24:09,820 --> 08:24:13,893
the leftmost element of each half--\n
8732
08:24:13,893 --> 08:24:15,100
and the one on the left here.
8733
08:24:15,100 --> 08:24:18,161
How do I merge these two lists together?
8734
08:24:18,161 --> 08:24:22,901
Well, if I look at 2 and I look at 0,\n
8735
08:24:23,863 --> 08:24:25,570
So I'm going to grab\nthe 0, and I'm going
8736
08:24:25,570 --> 08:24:28,510
to put it into its own place\non this new shelf here.
8737
08:24:28,510 --> 08:24:34,661
And now I'm going to consider,\nas part of my iteration
8738
08:24:34,661 --> 08:24:37,631
the beginning of this list and\nthe new beginning of this list.
8739
08:24:37,631 --> 08:24:39,491
So I'm now comparing 2 and 1.
8740
08:24:40,570 --> 08:24:42,760
I'm going to go ahead and grab the 1.
8741
08:24:42,760 --> 08:24:45,490
Now I'm going to compare the\nbeginning of the left list
8742
08:24:45,491 --> 08:24:47,801
and the new beginning of\nthe right list, 2 and 3.
8743
08:24:49,379 --> 08:24:51,671
Now I'm going to compare the\nbeginning of the left list
8744
08:24:51,670 --> 08:24:53,650
and the beginning of\nthe right list, 4 and 3.
8745
08:24:55,870 --> 08:24:58,611
Now I'm going to compare the 4\nagainst the beginning and end
8746
08:24:58,611 --> 08:25:00,191
it turns out, of the second list--
8747
08:25:01,348 --> 08:25:03,640
Now I'm going to compare the\nbeginning of the left list
8748
08:25:03,640 --> 08:25:05,183
and the beginning of the right list--
8749
08:25:06,791 --> 08:25:09,371
I'm realizing this is not going\nto end well because I left
8750
08:25:09,370 --> 08:25:10,780
too much distance between the numbers.
8751
08:25:10,780 --> 08:25:12,698
But that has nothing to\ndo with the algorithm.
8752
08:25:12,698 --> 08:25:14,201
7 is the beginning of the left list.
8753
08:25:14,201 --> 08:25:15,820
6 is the beginning of the right list.
8754
08:25:17,320 --> 08:25:20,771
And at the risk of\nknocking all of these over
8755
08:25:20,771 --> 08:25:27,671
if I now make room for this\nelement, we have hopefully
8756
08:25:27,670 --> 08:25:34,760
sorted the whole thing by having merged\n
8757
08:25:40,809 --> 08:25:43,101
I'm a little worried that's\njust getting sarcastic now
8758
08:25:43,100 --> 08:25:48,230
but we now have merged two half lists.
8759
08:25:48,230 --> 08:25:51,791
We haven't done the guts of the\n
8760
08:25:52,791 --> 08:25:55,521
But I claim that that\nis how mechanically you
8761
08:25:57,320 --> 08:25:59,420
You keep looking at the\nbeginning of each list
8762
08:25:59,420 --> 08:26:02,001
and you just kind of\nweave them together based
8763
08:26:02,001 --> 08:26:05,431
on which one belongs\nfirst based on its size.
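The "finger on the beginning of each list" procedure can be sketched in Python like this. It is a hedged illustration under assumptions: the function name merge and the index names i and j are hypothetical, and both inputs are assumed to be sorted already, as the two halves on the shelf were.

```python
def merge(left, right):
    """Weave two already-sorted lists into one sorted list."""
    merged = []
    i = 0  # "finger" on the beginning of the left list
    j = 0  # "finger" on the beginning of the right list

    # Compare the two front elements and take whichever is smaller.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One list has run out; whatever remains in the other is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    print(merge([2, 4, 5, 7], [0, 1, 3, 6]))
```

Each element is appended exactly once, which is why a single merge takes a number of steps proportional to the total number of elements.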
8764
08:26:05,431 --> 08:26:07,820
So if you agree that\nthat was a reasonable way
8765
08:26:07,820 --> 08:26:12,381
to merge two lists together,\nlet's go ahead and focus lastly
8766
08:26:12,381 --> 08:26:15,621
on what it means to\nactually sort the left half
8767
08:26:15,620 --> 08:26:17,797
and sort the right half of\na whole bunch of numbers.
8768
08:26:17,797 --> 08:26:19,880
And for this, I'm going\nto go ahead and order them
8769
08:26:19,881 --> 08:26:21,681
in this seemingly random order.
8770
08:26:21,681 --> 08:26:24,561
And I just have a little cheat\n
8771
08:26:24,561 --> 08:26:26,691
And I'm going to start at\nthe very top this time.
8772
08:26:26,690 --> 08:26:29,751
And hopefully, these will\nnot fall down at any point.
8773
08:26:29,751 --> 08:26:35,661
But I'm just deliberately putting\n
8774
08:26:41,361 --> 08:26:43,701
Hopefully this won't fall over.
8775
08:26:43,701 --> 08:26:48,351
Here is now an array of\nsize 8 with eight integers.
8776
08:26:49,651 --> 08:26:52,693
I could use selection sort and just\n
8777
08:26:52,692 --> 08:26:55,460
I could use bubble sort and just\ncompare pairs, pairs, pairs.
8778
08:26:55,460 --> 08:26:58,190
But those are going to be on\nthe order of big O of n squared.
8779
08:26:58,190 --> 08:27:00,360
My hope is to do\nfundamentally better here.
8780
08:27:00,361 --> 08:27:02,120
So let's see if we can do better.
8781
08:27:02,120 --> 08:27:04,161
All right, so let me\nlook now at my code.
8782
08:27:05,850 --> 08:27:07,670
How do I implement merge sort?
8783
08:27:07,670 --> 08:27:09,663
Well, if there's only\none number, I quit.
8784
08:27:10,580 --> 08:27:12,630
There's eight numbers,\nso that's not applicable.
8785
08:27:12,631 --> 08:27:14,963
I'm going to go ahead and\nsort the left half of numbers.
8786
08:27:14,963 --> 08:27:16,820
All right, here's the left half--
8787
08:27:18,830 --> 08:27:21,590
Do I sort an array of size 4?
8788
08:27:21,591 --> 08:27:24,530
Well, here's where the\nrecursion kicks in.
8789
08:27:24,530 --> 08:27:26,510
How do you sort a list of size 4?
8790
08:27:26,510 --> 08:27:28,670
Well, there's the pseudocode\non the board.
8791
08:27:28,670 --> 08:27:31,850
I sort the left half\nof the list of size 4.
8792
08:27:37,041 --> 08:27:38,841
All right, now I have a list of size 2.
8793
08:27:50,841 --> 08:27:52,370
If only one number, I'm done.
8794
08:27:53,594 --> 08:27:55,011
All right, what was the next step?
8795
08:27:55,010 --> 08:27:56,540
You have to now rewind in time.
8796
08:27:56,541 --> 08:28:00,871
I just sorted the left half of\nthe left half of the left half.
8797
08:28:07,100 --> 08:28:11,420
So now at this point in the story,\n
8798
08:28:11,420 --> 08:28:14,210
the 5 is sorted, and the 2 is sorted.
8799
08:28:14,210 --> 08:28:18,771
But what's the third and final step\n
8800
08:28:20,010 --> 08:28:21,960
So here's the left,\nhere's the right list.
8801
08:28:21,960 --> 08:28:23,210
How do I merge these together?
8802
08:28:23,210 --> 08:28:25,890
I compare the lists,\nand I put the 2 there.
8803
08:28:25,890 --> 08:28:27,951
I only have the 5\nleft, and I do that.
8804
08:28:27,951 --> 08:28:30,351
So now we see some visible progress.
8805
08:28:32,580 --> 08:28:37,220
We started to sort the left half of\n
8806
08:28:39,260 --> 08:28:42,450
We've just sorted the left\nhalf of the left half.
8807
08:28:42,451 --> 08:28:45,501
So what comes after sorting\nthe left half of anything?
8808
08:28:46,052 --> 08:28:48,260
All right, here's the sort\nof same nonsensical thing.
8809
08:28:55,951 --> 08:28:59,241
So that's the 4, and that's the 7.
8810
08:29:00,471 --> 08:29:05,431
In total, I've now sorted the\nleft half of the original thing.
8811
08:29:07,453 --> 08:29:08,661
Wait a minute, wait a minute.
8812
08:29:10,911 --> 08:29:13,580
I have sorted the left\nhalf of the left half
8813
08:29:13,580 --> 08:29:16,710
and I've sorted the right\nhalf of the left half.
8814
08:29:16,710 --> 08:29:19,070
What do I now need to do lastly?
8815
08:29:19,070 --> 08:29:21,001
Merge those two lists together.
8816
08:29:21,001 --> 08:29:22,922
So again, I put my\nfinger on the beginning
8817
08:29:22,922 --> 08:29:24,630
of this list, the\nbeginning of this list.
8818
08:29:24,631 --> 08:29:26,721
And if you want, I'll do the same\nthing as when I merged last time
8819
08:29:26,721 --> 08:29:28,191
to be clear what I'm comparing.
8820
08:29:28,190 --> 08:29:30,800
2 and 4-- the 2 obviously comes first.
8821
08:29:36,030 --> 08:29:40,280
The 5 comes next and then\nlastly, of course, the 7.
8822
08:29:40,280 --> 08:29:44,370
Notice that the 2, 4, 5, 7 are now sorted.
8823
08:29:44,370 --> 08:29:47,163
So the original left half is sorted.
8824
08:29:47,163 --> 08:29:49,370
And I'll do the rest a little\nfaster because, my God
8825
08:29:49,370 --> 08:29:50,745
this feels like it takes forever.
8826
08:29:50,745 --> 08:29:52,790
But I bet we're on to something here.
8827
08:29:54,681 --> 08:29:56,931
I've just sorted the left\nhalf of the original.
8828
08:29:56,931 --> 08:29:58,640
Sort the right half of the original.
8829
08:29:59,661 --> 08:30:02,241
I sort the left half of the right half.
8830
08:30:03,411 --> 08:30:05,721
I sort the left half of the left half.
8831
08:30:06,681 --> 08:30:08,721
I sort the right half of the left half.
8832
08:30:09,620 --> 08:30:11,751
Now I merge the two together.
8833
08:30:11,751 --> 08:30:14,550
The 1 comes first, the 6 comes next.
8834
08:30:14,550 --> 08:30:18,740
Now I sort the right\nhalf of the right half.
8835
08:30:25,591 --> 08:30:27,440
So that's the third step of that phase.
8836
08:30:27,440 --> 08:30:32,120
Now where are we in the stor-- oh\n
8837
08:30:32,120 --> 08:30:36,591
We have sorted the left\nhalf of the right half
8838
08:30:36,591 --> 08:30:38,541
and the right half of the right half.
8839
08:30:40,623 --> 08:30:42,791
So I'm going to compare,\nand I'm going to move those
8840
08:30:42,791 --> 08:30:44,951
down just to make clear\nwhat I'm comparing
8841
08:30:44,951 --> 08:30:46,661
the beginning of both sublists.
8842
08:30:57,370 --> 08:30:59,290
And then lastly comes the 6.
8843
08:30:59,291 --> 08:31:01,241
All right, where are we in the story?
8844
08:31:01,241 --> 08:31:03,581
We've now sorted the\nleft half of the original
8845
08:31:03,580 --> 08:31:05,140
and the right half of the original.
8846
08:31:07,390 --> 08:31:09,431
All right, so I'm going\nto make the same point.
8847
08:31:09,431 --> 08:31:12,050
And this is actually\nliterally what we did earlier
8848
08:31:12,050 --> 08:31:16,120
because I deliberately demoed those\n
8849
08:31:37,061 --> 08:31:41,121
And lastly-- this is when\nwe run out of memory--
8850
08:31:41,120 --> 08:31:45,370
the 7 over there is actually in place.
8851
08:31:47,751 --> 08:31:50,227
OK, so admittedly, a\nlittle harder to explain
8852
08:31:50,227 --> 08:31:52,310
and honestly, it gets a\nlittle trippy because it's
8853
08:31:52,311 --> 08:31:55,341
so easy to forget about\nwhere you are in the story
8854
08:31:55,341 --> 08:31:58,041
because we're constantly\ndiving into the algorithm
8855
08:31:58,041 --> 08:31:59,611
and then backing back out of it.
8856
08:31:59,611 --> 08:32:02,331
But in code, we could\nexpress this pretty correctly
8857
08:32:02,330 --> 08:32:05,360
and, it turns out, pretty\nefficiently because what
8858
08:32:05,361 --> 08:32:09,021
I was doing, even though it's\nlonger when I do it verbally
8859
08:32:09,021 --> 08:32:12,620
I was touching these elements a\nminimal number of times, right?
8860
08:32:12,620 --> 08:32:15,890
I wasn't going back and forth, back\n
8861
08:32:16,850 --> 08:32:21,600
I was deliberately only ever merging\n
8862
08:32:21,600 --> 08:32:24,560
So every time we merge, even\nthough I was doing it quickly
8863
08:32:24,561 --> 08:32:28,730
my fingers were only touching\neach of the elements once.
8864
08:32:28,730 --> 08:32:34,512
And how many times did we divide,\n
8865
08:32:34,512 --> 08:32:36,470
Well, we started with\nall of the elements here
8866
08:32:36,471 --> 08:32:37,679
and there were eight of them.
8867
08:32:37,679 --> 08:32:41,221
And then we moved them\n1, 2, 3 positions.
8868
08:32:41,221 --> 08:32:47,781
So the height of this visualization,\n
8869
08:32:47,780 --> 08:32:50,640
If I started with 8, turns\nout if you do the arithmetic
8870
08:32:50,640 --> 08:32:54,740
this is log n height\nbecause 2 to the 3 is 8.
8871
08:32:54,741 --> 08:32:57,111
But for now, just trust\nthat this is a log n height.
8872
08:32:58,640 --> 08:33:02,631
Well, it's of width n because\nthere's n elements any time
8873
08:33:03,830 --> 08:33:08,097
So technically, I was kind of\n
8874
08:33:08,098 --> 08:33:09,681
is the first time I've needed shelves.
8875
08:33:09,681 --> 08:33:13,370
With the human examples, we just had the\n
8876
08:33:13,881 --> 08:33:16,561
Here, I was sort of using\nmore and more memory.
8877
08:33:16,561 --> 08:33:19,131
In fact, I was using like\nfour times as much memory
8878
08:33:19,131 --> 08:33:21,291
even though that was just\nfor visualization's sake.
8879
08:33:21,291 --> 08:33:25,401
Merge sort actually requires that you\n
8880
08:33:25,401 --> 08:33:28,406
to move the elements into when\nyou're merging them together.
8881
08:33:28,405 --> 08:33:31,280
But if I really wanted and if I\n
8882
08:33:31,280 --> 08:33:33,950
honestly, I could have just gone back\n
8883
08:33:33,951 --> 08:33:35,611
That would have been sufficient.
8884
08:33:35,611 --> 08:33:40,550
So merge sort uses more memory\nfor this merging process
8885
08:33:40,550 --> 08:33:43,460
but the advantage of\nusing more memory is
8886
08:33:43,460 --> 08:33:49,041
that the total running time, if you can\n
8887
08:33:49,041 --> 08:33:51,921
The big O notation for\nmerge sort, it turns out
8888
08:33:51,920 --> 08:33:54,890
is actually going to be n times log n.
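The n times log n figure can be sanity-checked by counting how many times elements get moved during merging. The sketch below is an illustration under assumptions: count_moves is a hypothetical helper, and it only models the counting argument (each merge touches every element in its range once), not an actual sort.

```python
import math


def count_moves(n):
    """Count element moves made while merge sorting a list of length n."""
    # Base case: one element (or none) needs no merging.
    if n <= 1:
        return 0
    half = n // 2
    # Moves to sort each half, plus n moves to merge the halves together.
    return count_moves(half) + count_moves(n - half) + n


if __name__ == "__main__":
    for n in (8, 1024):
        print(n, count_moves(n), n * int(math.log2(n)), n * n)
```

For 8 elements this gives 24 moves, which is 8 times 3, compared with 64 for n squared; the gap widens quickly as n grows.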
8889
08:33:54,890 --> 08:33:57,411
And even if you're a little\nrusty still on your logarithms
8890
08:33:57,411 --> 08:34:02,751
we saw in week zero and again\n
8891
08:34:05,241 --> 08:34:07,611
That's faster than linear\nsearch, which was n.
8892
08:34:07,611 --> 08:34:13,191
So n times log n is, of course,\n
8893
08:34:13,190 --> 08:34:16,190
So it's sort of lower on this little\n
8894
08:34:16,190 --> 08:34:19,710
which is to suggest that it's running\n
8895
08:34:19,710 --> 08:34:22,701
And in fact, if we consider\nthe best case running time
8896
08:34:22,701 --> 08:34:27,006
turns out it's not quite as good\nas bubble sort with omega of n
8897
08:34:27,006 --> 08:34:29,631
where you can just sort of abort\nif you realize, wait a minute
8898
08:34:30,681 --> 08:34:35,210
Merge sort, you actually have to do that\n
8899
08:34:35,210 --> 08:34:40,530
So it's actually in omega and\n
8900
08:34:40,530 --> 08:34:42,591
So again, a trade-off\nthere because if you
8901
08:34:42,591 --> 08:34:44,901
happen to have a data set\nthat is very often sorted
8902
08:34:44,901 --> 08:34:47,026
honestly, you might want\nto stick with bubble sort.
8903
08:34:47,026 --> 08:34:49,851
But in the general case,\nwhere the data is unsorted
8904
08:34:49,850 --> 08:34:53,181
n log n is sounding\nbetter than n squared.
8905
08:34:53,181 --> 08:34:55,190
Well, what does it\nactually look or feel like?
8906
08:34:55,190 --> 08:34:58,591
Give me a moment to just change\nover to our visualization here.
8907
08:34:58,591 --> 08:35:02,570
And we'll see with this example\nwhat merge sort looks like
8908
08:35:02,570 --> 08:35:04,530
depicted with now these vertical bars.
8909
08:35:04,530 --> 08:35:07,220
So same algorithm, but instead\nof my numbers on shelves
8910
08:35:07,221 --> 08:35:12,331
here is a random array\nof numbers being sorted.
8911
08:35:12,330 --> 08:35:14,480
And you can see it being\ndone half at a time.
8912
08:35:14,480 --> 08:35:18,080
And you see sort of remnants\nof the previous bars.
8913
08:35:21,861 --> 08:35:26,390
Let me zoom out so you can\nactually see the height here.
8914
08:35:26,390 --> 08:35:29,091
Let me go ahead and randomize\nthis again and run merge sort.
8915
08:35:29,670 --> 08:35:34,460
Now you can see the second array and\n
8916
08:35:34,460 --> 08:35:38,490
And even though this one looks way\n
8917
08:35:38,491 --> 08:35:40,531
it does seem to be moving faster.
8918
08:35:40,530 --> 08:35:44,030
And it seems to be merging halves\ntogether, and boom, it's done.
8919
08:35:44,030 --> 08:35:48,350
So let's actually see, in conclusion,\n
8920
08:35:48,350 --> 08:35:51,440
and consider that moving forward\nas we write more and more code
8921
08:35:51,440 --> 08:35:54,710
the goal is, again, not just to be\n
8922
08:35:54,710 --> 08:35:58,181
And one measure of design is\ngoing to indeed be efficiency.
8923
08:35:58,181 --> 08:36:02,480
So here we have, in final, a\nvisualization of three algorithms--
8924
08:36:02,480 --> 08:36:05,361
selection sort, bubble\nsort, and merge sort--
8925
08:36:06,811 --> 08:36:09,980
And let's see what these algorithms\n
8926
08:36:09,980 --> 08:36:12,216
Oh, if we can dim the\nlights for dramatic effect--
8927
08:36:16,521 --> 08:36:20,331
selection's on top, bubble on\nbottom, merge in the middle.
8928
08:39:13,991 --> 08:39:18,011
DAVID J. MALAN: Well, this is CS50,\n
8929
08:39:18,011 --> 08:39:19,991
and recall that last\nweek, week three, we
8930
08:39:19,991 --> 08:39:22,931
began to explore the inside of\na computer's memory a bit more.
8931
08:39:22,932 --> 08:39:25,992
We talked about arrays, which\nwere just chunks of memory
8932
08:39:25,991 --> 08:39:28,811
back to back to back that really\n
8933
08:39:28,812 --> 08:39:32,082
to bottom, and this is actually a\n
8934
08:39:32,081 --> 08:39:34,121
new to programming,\nand certainly new to C.
8935
08:39:34,121 --> 08:39:39,131
You've seen this approach of just using\n
8936
08:39:40,522 --> 08:39:45,731
So for instance, here is a photo taken\n
8937
08:39:45,731 --> 08:39:49,151
and this is an opportunity to\nexplore exactly what happens
8938
08:39:49,151 --> 08:39:52,271
if we start to zoom in and zoom in and\n
8939
08:39:52,272 --> 08:39:56,022
any TV show like CSI, or\nwhatever, or any movie that
8940
08:39:56,022 --> 08:40:01,961
explores forensic information might\n
8941
08:40:01,961 --> 08:40:05,354
on an image like this to see\nwhat the glint in someone's eye
8942
08:40:05,354 --> 08:40:08,022
is because that reveals the license\nplate number of someone that
8943
08:40:08,917 --> 08:40:10,792
Something that's a little\nover the top there
8944
08:40:10,792 --> 08:40:14,022
but there's an opportunity here to\n
8945
08:40:14,022 --> 08:40:17,022
For instance, let's zoom on\nthis puppet here's eye and let's
8946
08:40:17,022 --> 08:40:19,332
zoom in a little more to\nsee what might be reflected.
8947
08:40:19,331 --> 08:40:21,941
Let's zoom in a little\nmore, and that's it.
8948
08:40:21,941 --> 08:40:24,412
There's only a finite\namount of information
8949
08:40:24,412 --> 08:40:26,531
if you have an image\nrepresented in this way.
8950
08:40:26,531 --> 08:40:29,682
We're using pixels-- these dots on\n
8951
08:40:29,682 --> 08:40:32,141
because if you're only using\na finite amount of memory
8952
08:40:32,141 --> 08:40:35,472
then at the end of the day, you can only\n
8953
08:40:35,472 --> 08:40:39,282
At least I don't really see in this\n
8954
08:40:39,281 --> 08:40:42,011
or something like that that you\n
8955
08:40:42,011 --> 08:40:45,041
So today we'll explore these\nkinds of representations
8956
08:40:45,042 --> 08:40:47,862
of how you might use memory\nin new and interesting ways
8957
08:40:47,862 --> 08:40:51,222
to represent now, very\nfamiliar things, but also
8958
08:40:51,222 --> 08:40:54,432
start to explore what some of the\n
8959
08:40:54,432 --> 08:40:58,211
But consider after all that this doesn't\n
8960
08:40:58,211 --> 08:41:00,522
as many pixels as something\nlike this other image
8961
08:41:00,522 --> 08:41:04,492
you can imagine just doing something\n
8962
08:41:04,491 --> 08:41:07,181
And if you think of an image as\njust having rows and columns
8963
08:41:07,182 --> 08:41:09,492
these rows otherwise known\nas scan lines-- something
8964
08:41:09,491 --> 08:41:13,061
we'll explore in the coming week--\n
8965
08:41:13,062 --> 08:41:17,472
by just using two different\nvalues, maybe a zero and a one.
8966
08:41:17,472 --> 08:41:21,502
Or yellow and purple, or vice versa,\n
8967
08:41:21,502 --> 08:41:25,691
Now in practice, recall we talked\n
8968
08:41:25,691 --> 08:41:32,774
but maybe an R, a G, and a B value--\n
8969
08:41:32,775 --> 08:41:33,942
but we'll come back to that.
8970
08:41:33,941 --> 08:41:35,649
That would just be a\nmore involved image.
8971
08:41:35,650 --> 08:41:41,472
But for fun, if today you want to tackle\n
8972
08:41:41,472 --> 08:41:44,891
if you go to this URL here,\nwe've put together an opportunity
8973
08:41:47,562 --> 08:41:51,162
If you go to this URL here, that'll\n
8974
08:41:51,162 --> 08:41:53,502
If you have a laptop\nwith you today that'll
8975
08:41:53,502 --> 08:41:56,902
look a little something like this, which\n
8976
08:41:56,901 --> 08:42:01,241
So if you'd like to go ahead and use\n
8977
08:42:01,241 --> 08:42:04,691
feature to color in those\nindividual squares if you'd like
8978
08:42:04,691 --> 08:42:08,111
see if you can't make something a little\n
8979
08:42:08,112 --> 08:42:12,202
and we'll exhibit some of the best or\n
8980
08:42:12,202 --> 08:42:15,425
So let's transition then to something\n
8981
08:42:15,424 --> 08:42:17,592
And not all of you have\nused, presumably, Photoshop
8982
08:42:17,592 --> 08:42:20,842
but you're probably generally familiar\n
8983
08:42:20,842 --> 08:42:23,062
and creating images\nor photos or the like.
8984
08:42:23,062 --> 08:42:25,992
And here is a screenshot\nof Photoshop's color picker
8985
08:42:25,991 --> 08:42:27,978
via which you can\nchange what color you're
8986
08:42:27,978 --> 08:42:30,311
going to draw with the paint\nbrush, or what color you're
8987
08:42:30,312 --> 08:42:32,292
going to fill in with the paint bucket.
8988
08:42:32,292 --> 08:42:34,391
It's representative of any\nkind of graphical tool.
8989
08:42:34,391 --> 08:42:36,801
And there's a lot of\ninformation in here
8990
08:42:36,801 --> 08:42:39,281
but there's perhaps some\nfamiliar terms now--
8991
08:42:39,281 --> 08:42:43,151
R, G, and B. In fact, right\nnow this is Photoshop's way
8992
08:42:43,151 --> 08:42:45,851
of saying you're about to fill\nin your background or foreground
8993
08:42:45,851 --> 08:42:48,041
with the color black,\nand that appears to be
8994
08:42:48,042 --> 08:42:51,492
represented with an R, a G, and\na B value of zero, zero, zero.
8995
08:42:51,491 --> 08:42:57,341
Or alternatively, using a\nhash symbol and then 000000.
8996
08:42:57,342 --> 08:42:59,801
And if some of you have\nalready made web pages before
8997
08:42:59,801 --> 08:43:01,691
and you know a little\nbit of HTML and CSS
8998
08:43:01,691 --> 08:43:04,031
you probably are familiar\nwith this kind of syntax--
8999
08:43:04,031 --> 08:43:07,891
a hash symbol and then six, or\n
9000
08:43:07,891 --> 08:43:10,391
And if we look at a few different\ncolors here, for instance
9001
08:43:10,391 --> 08:43:12,491
here might be the\nrepresentation of white.
9002
08:43:12,491 --> 08:43:18,671
Now the R, the G, and the B values\n
9003
08:43:18,671 --> 08:43:23,471
Or alternatively, it looks like\n
9004
08:43:23,472 --> 08:43:26,950
could represent that same\ncolor white with FFFFFF.
9005
08:43:26,950 --> 08:43:28,242
And let's just do a few others.
9006
08:43:28,241 --> 08:43:32,981
Here is red, and it turns out that\n
9007
08:43:37,909 --> 08:43:39,702
So there's perhaps a\npattern here emerging.
9008
08:43:39,702 --> 08:43:43,782
Here is green, zero, 255, zero, a.k.a.
9009
08:43:43,781 --> 08:43:48,022
00FF00, or lastly, here\nblue, which is no red
9010
08:43:48,022 --> 08:43:51,731
no green but apparently a lot\nof blue, 255 again, a.k.a.
9011
08:43:53,831 --> 08:43:57,221
Now some of you, again, might\nhave seen this notation before
9012
08:43:57,222 --> 08:44:00,432
these zeros and these F's and all of\n
9013
08:44:00,432 --> 08:44:02,205
but this is another form of notation.
9014
08:44:02,205 --> 08:44:04,122
And in fact, we'll explore\nthis today-- really
9015
08:44:04,121 --> 08:44:06,851
is just a precondition for\ntalking about some other concepts.
9016
08:44:06,851 --> 08:44:10,002
But the ideas, ultimately,\nare really no different.
9017
08:44:10,002 --> 08:44:13,182
What we're about to see is\na different base system--
9018
08:44:13,182 --> 08:44:15,312
not just binary, not just\ndecimal, but something
9019
08:44:15,312 --> 08:44:17,231
we're about to call hexadecimal.
9020
08:44:17,231 --> 08:44:21,191
But first, recall that with RGB\nwe previously did the following.
9021
08:44:21,191 --> 08:44:23,592
Any RGB value-- red,\ngreen, blue-- just combines
9022
08:44:23,592 --> 08:44:26,121
some amount of red or green or blue.
9023
08:44:26,121 --> 08:44:30,702
So here we have 72, 73, 33, which in the\n
9024
08:44:33,761 --> 08:44:36,252
Just hi with an exclamation\npoint, but in the context
9025
08:44:36,252 --> 08:44:40,481
of a Photoshop-like program, this\nmight instead be representing
9026
08:44:40,481 --> 08:44:42,919
collectively, this shade\nof yellow, for instance
9027
08:44:42,919 --> 08:44:45,502
when you combine that much red,\nthat much green, that much blue.
9028
08:44:46,812 --> 08:44:49,062
If you've got a lot of\nred, no green, no blue
9029
08:44:49,062 --> 08:44:50,652
together that's going to give us red.
9030
08:44:50,651 --> 08:44:53,441
If you've got no red, a\nlot of green, no blue
9031
08:44:53,441 --> 08:44:55,211
that's going to give\nus, of course, green.
9032
08:44:55,211 --> 08:44:58,529
If you've got no red, no green,\na lot of blue, that of course
9033
08:44:59,572 --> 08:45:03,761
So there's a pattern emerging here\n
9034
08:45:05,952 --> 08:45:12,641
And it's maybe somehow equated with 255,\n
9035
08:45:12,641 --> 08:45:15,912
Meanwhile, if we combine one last\n
9036
08:45:16,991 --> 08:45:20,719
that's actually going to give us\na single white pixel like this.
9037
08:45:21,761 --> 08:45:25,479
Here was binary-- in the world of binary\n
9038
08:45:25,479 --> 08:45:26,772
Could have been anything else--
9039
08:45:26,772 --> 08:45:31,902
A or B, X or Y, but the world\nstandardized on these numerals
9040
08:45:32,741 --> 08:45:35,951
In our world's decimal system, of\n
9041
08:45:35,952 --> 08:45:39,461
As of today though, we're going to\n
9042
08:45:39,461 --> 08:45:43,346
in the context of images and also\n
9043
08:45:43,347 --> 08:45:45,195
and there's some conveniences to it.
9044
08:45:45,194 --> 08:45:47,112
Where now, you're going\nto be able to count up
9045
08:45:47,112 --> 08:45:49,961
to F in a notation called hexadecimal.
9046
08:45:49,961 --> 08:45:55,031
From zero through nine, then you keep\n
9047
08:45:55,031 --> 08:45:58,002
the idea being each of these,\neven though it's weirdly
9048
08:45:58,002 --> 08:46:02,141
a letter of the English alphabet,\n
9049
08:46:02,141 --> 08:46:07,601
It's not one zero for 10, or 1 1\n
9050
08:46:07,601 --> 08:46:10,961
these digits, so to speak, are\nindeed still just single symbols
9051
08:46:10,961 --> 08:46:14,572
and that's a characteristic of just\n
9052
08:46:14,572 --> 08:46:20,112
So how do we get from 00 and FF to\n
9053
08:46:20,112 --> 08:46:22,121
Well, this hexadecimal system, a.k.a.
9054
08:46:22,121 --> 08:46:25,546
base 16, just does the math\nfrom week zero and really
9055
08:46:25,546 --> 08:46:27,171
grade school, a little bit differently.
9056
08:46:27,171 --> 08:46:30,341
For instance, if you have a\nnumber that's got two digits
9057
08:46:30,342 --> 08:46:34,281
or hexadecimal digits as of today, the\n
9058
08:46:34,281 --> 08:46:37,871
Instead of powers of two or powers of\n
9059
08:46:37,871 --> 08:46:40,631
respectively, it's powers of 16.
9060
08:46:40,632 --> 08:46:43,362
So if we just do the math\nout, that's the ones column
9061
08:46:43,362 --> 08:46:46,092
this is the 16s column, and so forth.
9062
08:46:46,092 --> 08:46:49,101
Things get actually pretty big\npretty quickly in this system.
9063
08:46:49,101 --> 08:46:52,106
But now let's just consider how we\n
9064
08:46:52,106 --> 08:46:54,731
If you've got two hexadecimal\ndigits for which these hashes are
9065
08:46:54,731 --> 08:46:57,792
just placeholders, zero, zero\nis going to mathematically
9066
08:46:57,792 --> 08:47:00,292
equal the decimal number you\nand I know, of course, as zero.
9067
08:47:02,081 --> 08:47:06,401
16 times zero plus one times zero is\n
9068
08:47:06,401 --> 08:47:07,881
And we can count up from here.
9069
08:47:07,882 --> 08:47:10,391
This, in hexadecimal,\nwould be how a computer
9070
08:47:10,391 --> 08:47:12,191
represents the number we know as one.
9071
08:47:12,191 --> 08:47:14,182
It would be zero one in this case.
9072
08:47:14,182 --> 08:47:19,542
This would be two, three, four,\nfive, six, seven, eight, nine--
9073
08:47:19,542 --> 08:47:21,502
in decimal, we're about to go to 10.
9074
08:47:21,502 --> 08:47:24,572
But in hexadecimal, to be\nclear, what comes next?
9075
08:47:24,572 --> 08:47:33,382
So, apparently A, so 0A, 0B, which\n
9076
08:47:33,382 --> 08:47:36,472
So using hexadecimal is\njust an interesting way
9077
08:47:36,472 --> 08:47:40,312
of using single symbols\nnow, zero through F
9078
08:47:40,312 --> 08:47:43,262
to count from zero through 15.
9079
08:47:43,261 --> 08:47:46,011
And we'll see why it's 15 in a\n
9080
08:47:46,011 --> 08:47:50,181
anyone want to conjecture how\nin hexadecimal, a.k.a. hex
9081
08:47:50,182 --> 08:47:53,092
do we now count up one position higher?
9082
08:47:53,092 --> 08:47:56,792
What comes after 0F in hexadecimal?
9083
08:47:56,792 --> 08:47:59,062
So, one zero-- it's the\nsame kind of thing--
9084
08:47:59,062 --> 08:48:01,227
once you're at the highest\ndigit possible, F--
9085
08:48:01,226 --> 08:48:03,351
or in our decimal world\nthat would have been nine--
9086
08:48:03,351 --> 08:48:06,471
you add one more, nine wraps\naround to zero, or in this case
9087
08:48:08,182 --> 08:48:11,152
You carry the one and voila--\nnow we're representing
9088
08:48:11,151 --> 08:48:12,871
the number you and I know as 16.
9089
08:48:12,871 --> 08:48:14,811
And we could keep going\nforever, literally.
9090
08:48:14,812 --> 08:48:18,547
This could be 17, 18,\n19, 20, in decimal--
9091
08:48:18,546 --> 08:48:20,421
but let's just wave our\nhands at it and count
9092
08:48:20,421 --> 08:48:23,181
as high as we can-- dot,\ndot, dot-- the highest
9093
08:48:23,182 --> 08:48:26,542
we could count in hexadecimal\nwith two digits, just logically
9094
08:48:26,542 --> 08:48:28,342
would be what, in hexadecimal?
9095
08:48:31,312 --> 08:48:34,892
So yes, that's the biggest digit\n
9096
08:48:34,891 --> 08:48:38,524
So how high can you count in hexadecimal\n
9097
08:48:38,524 --> 08:48:39,981
Well, it's the same math as always.
9098
08:48:41,932 --> 08:48:48,301
15, so that's 16 times 15 plus\none times F, or one times 15--
9099
08:48:48,301 --> 08:48:52,702
that gives us 240 plus 15 in decimal,\n
9100
08:48:54,781 --> 08:48:57,871
So this hexadecimal system-- you may\n
9101
08:48:57,871 --> 08:49:00,621
and if you haven't we'll get to\n
9102
08:49:00,621 --> 08:49:03,351
or we just saw in the\ncontext of Photoshop-- just
9103
08:49:03,351 --> 08:49:09,502
has this shorthand notation of counting\n
9104
08:49:09,502 --> 08:49:13,132
Now it's marginal, but that's like\n
9105
08:49:13,132 --> 08:49:16,852
you need in order to count as high\n
9106
08:49:18,682 --> 08:49:22,492
In hexadecimal you can count\nas high using just two
9107
08:49:22,491 --> 08:49:25,849
and that difference is going to get\n
9108
08:49:25,849 --> 08:49:28,641
Let me stipulate for now, you're\n
9109
08:49:28,641 --> 08:49:31,792
in terms of just how many symbols\n
9110
08:49:31,792 --> 08:49:35,242
bigger and bigger numbers than that.
9111
08:49:35,241 --> 08:49:38,661
All right, let me pause here just to\n
9112
08:49:38,662 --> 08:49:42,081
on what we've called hexadecimal, which\n
9113
08:49:42,081 --> 08:49:48,769
as well as A through F.\nAny questions or confusion?
9114
08:49:48,769 --> 08:49:51,351
And if it feels like we're\nlingering a bit much on arithmetic
9115
08:49:51,351 --> 08:49:54,691
we're not really going to see other\n
9116
08:49:54,691 --> 08:49:58,822
These are the go-to three in a\nprogrammer's world, typically.
9117
08:50:01,600 --> 08:50:03,893
AUDIENCE: Does the hexadecimal\nsymbol take more storage
9118
08:50:06,612 --> 08:50:07,862
DAVID J. MALAN: Good question.
9119
08:50:07,862 --> 08:50:11,972
Does hexadecimal require more storage\n
9120
08:50:11,972 --> 08:50:16,202
Theoretically no, because this is\n
9121
08:50:16,202 --> 08:50:19,082
and we'll see in a concrete\nexample in a moment.
9122
08:50:19,081 --> 08:50:22,471
But inside of the computer, at the end\n
9123
08:50:22,472 --> 08:50:25,588
And using hexadecimal is not\nusing more or fewer bits
9124
08:50:25,588 --> 08:50:27,421
think of this as how\nyou might write it down
9125
08:50:27,421 --> 08:50:30,331
on a piece of paper, just how\nmany digits you're going to write
9126
08:50:30,331 --> 08:50:33,301
or on a computer screen, how many\n
9127
08:50:33,301 --> 08:50:36,572
but it doesn't change how the\n
9128
08:50:36,572 --> 08:50:39,691
because all they're representing at\n
9129
08:50:40,981 --> 08:50:45,211
If this-- a moment ago\nFF I claimed was 255--
9130
08:50:45,211 --> 08:50:47,252
let's just rewind to week\nzero and if we wanted
9131
08:50:47,252 --> 08:50:51,752
to count to 255 in binary, that's\n
9132
08:50:52,772 --> 08:50:54,604
And there's only a few\nof these numbers that
9133
08:50:54,604 --> 08:50:58,441
are useful to memorize, like 255 is as\n
9134
08:50:58,441 --> 08:51:02,342
if you start at zero, because two to the\n
9135
08:51:04,831 --> 08:51:09,031
So in binary, recall if you have\n
9136
08:51:09,031 --> 08:51:11,351
and I won't do out the\nmath pedantically here
9137
08:51:11,351 --> 08:51:13,726
but if I do do this plus\nthis plus this, dot, dot
9138
08:51:13,726 --> 08:51:16,752
dot-- that's also going to give me 255.
9139
08:51:16,752 --> 08:51:19,801
So this is what's interesting\nhere about hexadecimal.
9140
08:51:19,801 --> 08:51:24,211
It turns out that an upside of\nstoring values in hexadecimal
9141
08:51:24,211 --> 08:51:27,932
is that we're going to\nsee the first F represents
9142
08:51:27,932 --> 08:51:31,262
the left half of all these bits,\nand the second F in this case
9143
08:51:31,261 --> 08:51:33,791
represents the rightmost\nfour of these bits.
9144
08:51:33,792 --> 08:51:36,422
So it turns out hexadecimal\nis very useful when you
9145
08:51:36,421 --> 08:51:39,391
want to treat data in units of four.
9146
08:51:39,391 --> 08:51:42,542
It's not quite eight, but units\nof four, and that's not bad.
9147
08:51:42,542 --> 08:51:45,632
Which is why-- if you use two\ndigits like I have thus far
9148
08:51:45,632 --> 08:51:48,422
00 or FF or anything in between--
9149
08:51:48,421 --> 08:51:53,281
that's actually a convenient way of\n
9150
08:51:53,281 --> 08:51:57,451
One hex digit for the first four\n
9151
08:51:57,452 --> 08:52:00,152
And again, there's nothing new\nintellectually here per se
9152
08:52:00,151 --> 08:52:03,931
it's just a different way of\n
9153
08:52:05,011 --> 08:52:06,851
So in what context do we see this?
9154
08:52:06,851 --> 08:52:08,191
Well, we talked about\nmemory last week, and we're
9155
08:52:08,191 --> 08:52:09,774
going to talk more about it this week.
9156
08:52:09,775 --> 08:52:12,301
If this is my computer's\nRAM-- random access memory--
9157
08:52:12,301 --> 08:52:16,471
you can again think of each byte as\n
9158
08:52:18,031 --> 08:52:22,351
This might be zero, this might\nbe 2 billion, and so in the past
9159
08:52:22,351 --> 08:52:25,141
I've described these as just\nthis, using decimal numbers.
9160
08:52:25,141 --> 08:52:29,491
Here's byte zero, one, two, three,\n
9161
08:52:29,491 --> 08:52:30,941
would be here, and so forth.
9162
08:52:30,941 --> 08:52:35,432
But it turns out in the world of memory,\n
9163
08:52:35,432 --> 08:52:40,051
tend to count memory\nbytes using hexadecimal.
9164
08:52:40,051 --> 08:52:42,241
Partly just by convention,\nbut also partly
9165
08:52:42,241 --> 08:52:44,941
because it's a little more\nsuccinct and again, each digit
9166
08:52:44,941 --> 08:52:48,002
represents four bits, typically.
9167
08:52:48,002 --> 08:52:49,757
So what comes after F here?
9168
08:52:49,757 --> 08:52:51,632
Well, if I think about\nthe computer's memory
9169
08:52:51,632 --> 08:52:56,672
I normally might do\nafter F, which is 15, 16.
9170
08:52:56,671 --> 08:53:01,291
But instead, one zero, one\none, one two, one three-- this
9171
08:53:01,292 --> 08:53:05,912
is not 10, 11, 12, 13, because I claim\n
9172
08:53:05,912 --> 08:53:07,981
As per the previous\nslide, we already started
9173
08:53:07,981 --> 08:53:10,801
going into A's through\nF's, so you immediately
9174
08:53:10,801 --> 08:53:13,471
see here a possible problem.
9175
08:53:13,472 --> 08:53:16,442
Why is this now worrisome,\nif all of a sudden you're
9176
08:53:16,441 --> 08:53:22,151
seeing seemingly familiar\nnumbers like 10, 11, 12, 13?
9177
08:53:22,151 --> 08:53:24,288
We didn't really stumble\nacross this problem
9178
08:53:24,289 --> 08:53:25,871
when it was all zeros and ones before.
9179
08:53:26,974 --> 08:53:28,516
AUDIENCE: Try to do math [INAUDIBLE].
9180
08:53:30,645 --> 08:53:33,312
DAVID J. MALAN: Yeah, so if you're\nwriting some code in C that's
9181
08:53:33,312 --> 08:53:35,170
doing some math, you\nmight accidentally--
9182
08:53:35,169 --> 08:53:37,961
or the computer might accidentally\n
9183
08:53:37,961 --> 08:53:40,522
if they look in some context the same.
9184
08:53:40,522 --> 08:53:42,612
Any number on the board\nthat doesn't have a letter
9185
08:53:42,612 --> 08:53:46,402
is ambiguously hexadecimal\nor decimal at this point
9186
08:53:46,401 --> 08:53:48,111
and so how might we resolve this?
9187
08:53:48,112 --> 08:53:51,072
Well, it turns out that what\ncomputers typically do is this.
9188
08:53:51,072 --> 08:53:55,842
By convention, any time you\nsee 0x and then a number
9189
08:53:55,842 --> 08:53:58,272
that's a human convention of saying--
9190
08:53:58,272 --> 08:54:01,731
signaling to the reader that this\n
9191
08:54:01,731 --> 08:54:05,801
So if it's 0x10, that\nis not the number 10
9192
08:54:05,801 --> 08:54:10,971
that is the hexadecimal number one\n
9193
08:54:13,991 --> 08:54:16,511
And again, these are not the\nkinds of things to memorize
9194
08:54:16,511 --> 08:54:19,921
it's really just the system for\n
9195
08:54:19,921 --> 08:54:22,421
So henceforth today, we're going\nto start seeing hexadecimal
9196
08:54:23,831 --> 08:54:26,861
When you write code, you might even\n
9197
08:54:26,862 --> 08:54:29,362
but again, it's just a different\nway of representing numbers
9198
08:54:29,362 --> 08:54:32,621
and humans have different\nconventions for different contexts.
9199
08:54:32,621 --> 08:54:36,131
All right, so with that said, any\n
9200
08:54:36,132 --> 08:54:41,682
But here on out, we'll start\nusing it in some actual code.
9201
08:54:45,441 --> 08:54:49,182
So, let's go ahead and consider\nmaybe a familiar example.
9202
08:54:49,182 --> 08:54:52,932
Something involving code,\n
9203
08:54:52,932 --> 08:54:54,750
to a value like 50, in this case.
9204
08:54:54,750 --> 08:54:57,042
And then let's start to tinker\naround with what's going
9205
08:54:57,042 --> 08:54:58,752
on inside of the computer's memory.
9206
08:54:58,752 --> 08:55:01,551
In a moment I'm going to load\nup VS Code on my computer
9207
08:55:01,551 --> 08:55:04,871
and I'm going to go ahead and whip\n
9208
08:55:04,871 --> 08:55:08,592
a value like the number\n50 to a variable called n
9209
08:55:08,592 --> 08:55:14,397
but today, keep in mind that\nthat variable n and that value 50
9210
08:55:14,397 --> 08:55:16,764
is going to be stored somewhere\nin my computer's memory
9211
08:55:16,764 --> 08:55:19,932
and it turns out today we'll introduce\n
9212
08:55:19,932 --> 08:55:22,371
see where things are being stored.
9213
08:55:22,371 --> 08:55:24,072
So let me click over to VS Code here.
9214
08:55:24,072 --> 08:55:27,042
I'm going to create a\nprogram called address.c just
9215
08:55:27,042 --> 08:55:29,532
to explore the computer's\naddresses today, and I'm
9216
08:55:29,531 --> 08:55:34,061
going to do an include stdio.h,\nint main(void), as usual.
9217
08:55:34,062 --> 08:55:35,802
No command line arguments for now.
9218
08:55:35,801 --> 08:55:38,403
I'm going to declare that\nvariable n equals 50
9219
08:55:38,403 --> 08:55:40,611
and then I'm just going to\ngo ahead and print it out.
9220
08:55:40,612 --> 08:55:46,092
So nothing very interesting but I'll\n
9221
08:55:47,682 --> 08:55:50,672
Nothing here should be very\ninteresting to compile or run
9222
08:55:50,671 --> 08:55:53,171
but I'll do it just to make\nsure I didn't make any mistakes.
9223
08:55:53,171 --> 08:55:58,662
Looks like as expected, it simply\n
9224
08:55:58,662 --> 08:56:02,141
But let's consider then, what this\n
9225
08:56:02,141 --> 08:56:04,882
when it's actually run on your machine.
9226
08:56:04,882 --> 08:56:06,762
So here we have that grid of memory.
9227
08:56:06,761 --> 08:56:10,811
That variable n is an int,\nand if you think back
9228
08:56:10,812 --> 08:56:14,412
how many bytes typically\ndo we use for an int?
9229
08:56:15,491 --> 08:56:18,050
Four, so four bytes, or 32 bits.
9230
08:56:18,050 --> 08:56:21,851
So if each of these squares represents\n
9231
08:56:21,851 --> 08:56:25,173
in my memory, or RAM, is\nusing four of these squares.
9232
08:56:25,173 --> 08:56:27,881
Maybe it ends up over here just\n
9233
08:56:27,882 --> 08:56:29,092
used elsewhere, for instance.
9234
08:56:29,092 --> 08:56:30,842
Though I don't really\nknow, and frankly, I
9235
08:56:30,842 --> 08:56:33,634
don't really care where it ends\n
9236
08:56:33,633 --> 08:56:37,300
So the variable-- the value 50 is\n
9237
08:56:37,300 --> 08:56:40,941
Even though I've written it as\ndecimal, just like in my code--
9238
08:56:40,941 --> 08:56:45,544
let me again remind you that this is 32\n
9239
08:56:45,544 --> 08:56:48,711
it's just going to be very tedious if\n
9240
08:56:48,711 --> 08:56:51,711
so I'll use the more comfortable\nhuman decimal system.
9241
08:56:51,711 --> 08:56:54,502
So that's what's going on\ninside of the computer's memory.
9242
08:56:54,502 --> 08:56:58,932
So what if I actually wanted to\n
9243
08:56:58,932 --> 08:57:01,452
or maybe just knowing its location?
9244
08:57:01,452 --> 08:57:05,262
Well, this variable n\nindeed has a name, n--
9245
08:57:05,261 --> 08:57:09,123
that's a label of sorts for it--\n
9246
08:57:09,123 --> 08:57:11,831
technically at a specific address,\n
9247
08:57:11,831 --> 08:57:14,861
0x123, and it's 123\nbecause I really don't
9248
08:57:14,862 --> 08:57:17,781
care what it is, I just want an\n
9249
08:57:17,781 --> 08:57:24,311
So way over here off screen might be\n
9250
08:57:24,312 --> 08:57:28,222
It's in hexadecimal\nnotation just by convention.
9251
08:57:28,222 --> 08:57:32,052
So how can I actually see where\nmy variables are ending up
9252
08:57:32,051 --> 08:57:33,702
in memory if I'm curious to do so?
9253
08:57:33,702 --> 08:57:37,182
Well, let me go back to my\ncode here and let me actually
9254
08:57:37,182 --> 08:57:39,441
change this just a little bit.
9255
08:57:39,441 --> 08:57:44,741
Let me go ahead and introduce,\nfor instance, another symbol
9256
08:57:44,741 --> 08:57:48,941
here and another topic\naltogether, namely pointers.
9257
08:57:48,941 --> 08:57:54,471
So a pointer is a variable that\n
9258
08:57:54,472 --> 08:57:57,731
the location of some value\nor more specifically
9259
08:57:57,731 --> 08:58:01,042
the specific byte in which\nthat value is stored.
9260
08:58:01,042 --> 08:58:04,301
So again, if you think of your memory\n
9261
08:58:04,301 --> 08:58:07,061
zero at top left, 2 billion\nor whatever at bottom right
9262
08:58:07,062 --> 08:58:08,562
depending on how much RAM you have--
9263
08:58:08,562 --> 08:58:10,842
each of those things has\na location, or an address.
9264
08:58:10,842 --> 08:58:14,932
A pointer is just a variable\nstoring one such address.
9265
08:58:14,932 --> 08:58:20,112
So it turns out that in the world of\n
9266
08:58:20,112 --> 08:58:24,472
we can use if we want to see what\n
9267
08:58:24,472 --> 08:58:27,402
and those two operators,\nas of today, are these.
9268
08:58:27,401 --> 08:58:31,191
You can use the ampersand\noperator in C in a couple of ways.
9269
08:58:31,191 --> 08:58:34,121
We already saw it very briefly\nto do ampersand ampersand--
9270
08:58:34,121 --> 08:58:37,631
which kind of ANDs two\nBoolean expressions together
9271
08:58:37,632 --> 08:58:39,172
in the context of a conditional.
9272
08:58:40,182 --> 08:58:43,992
A single ampersand is\nthe address-of operator.
9273
08:58:43,991 --> 08:58:48,011
So literally, in your code, if you've\n
9274
08:58:48,011 --> 08:58:53,261
and you write &n, C is going to figure\n
9275
08:58:53,261 --> 08:58:55,731
variable n in the computer's memory.
9276
08:58:55,731 --> 08:59:01,362
And it's going to give you a number,\n
9277
08:59:01,362 --> 08:59:05,141
If you want to store that\naddress in a variable
9278
08:59:05,141 --> 08:59:11,202
even though yes, it's a number like\n
9279
08:59:11,202 --> 08:59:17,082
that you want to store not an int\n
9280
08:59:17,081 --> 08:59:20,711
And the syntax for doing that--\nsomewhat nonobviously-- is
9281
08:59:20,711 --> 08:59:24,432
to use an asterisk here,\na star operator, and you
9282
08:59:24,432 --> 08:59:26,231
say this when creating the variable.
9283
08:59:26,231 --> 08:59:30,731
If you want p to be a pointer, that\n
9284
08:59:32,412 --> 08:59:36,551
And the star just tells the computer,\n
9285
08:59:36,551 --> 08:59:40,002
this is the address of\nsomething that yes, is an int
9286
08:59:40,002 --> 08:59:41,761
but we're just being more precise.
9287
08:59:41,761 --> 08:59:44,662
So on the right-hand side you\nhave the address-of operator.
9288
08:59:44,662 --> 08:59:47,641
As always with the equal sign,\nyou copy from right to left.
9289
08:59:47,641 --> 08:59:51,592
Because &n is by definition the address\n
9290
08:59:51,592 --> 08:59:57,141
in a pointer, and the way to declare a\n
9291
08:59:57,141 --> 09:00:01,191
whose address you're storing, and then\n
9292
09:00:01,191 --> 09:00:04,701
indeed a pointer and not\njust a regular old int.
9293
09:00:04,702 --> 09:00:06,172
So let's see this in practice.
9294
09:00:06,171 --> 09:00:09,231
Let me go back to my own\nsource code here and let
9295
09:00:09,231 --> 09:00:11,241
me make just a couple of tweaks.
9296
09:00:11,241 --> 09:00:13,581
I'm going to leave n\nalone here but I'm going
9297
09:00:13,581 --> 09:00:18,121
to go ahead and initially just do this.
9298
09:00:18,121 --> 09:00:22,702
Let me say int star\np equals ampersand n
9299
09:00:22,702 --> 09:00:27,322
and then down here, I'm going to\n
9300
09:00:28,761 --> 09:00:33,531
And then even though yes, it's just\n
9301
09:00:33,531 --> 09:00:37,671
for integers, there's actually a special\n
9302
09:00:37,671 --> 09:00:40,881
pointers or addresses, and that's %p.
9303
09:00:40,882 --> 09:00:44,182
So now let's go ahead and\nrecompile this, make address--
9304
09:00:44,182 --> 09:00:49,231
so far so good-- ./address,\nEnter, and a little weirdly
9305
09:00:49,231 --> 09:00:53,871
but perhaps understandably now,\n
9306
09:00:53,871 --> 09:00:57,741
at which the variable n happened to\n
9307
09:00:59,241 --> 09:01:01,791
This computer has a lot\nmore memory so technically
9308
09:01:01,792 --> 09:01:07,852
it was stored at 0x7FFCB4578E5C.
9309
09:01:07,851 --> 09:01:10,011
Now that has no special\nsignificance to me.
9310
09:01:10,011 --> 09:01:12,241
It could have ended up\nsomewhere else altogether
9311
09:01:12,241 --> 09:01:15,741
but this is just where, in my\n
9312
09:01:15,741 --> 09:01:18,261
server to which I'm connected\nusing VS Code here--
9313
09:01:18,261 --> 09:01:20,859
that just happens to\nbe where n ended up.
9314
09:01:20,859 --> 09:01:23,691
And strictly speaking, I don't even\n
9315
09:01:23,691 --> 09:01:26,541
I could get rid of p\nand I could just say
9316
09:01:26,542 --> 09:01:30,262
print not just n, but the address\n
9317
09:01:30,261 --> 09:01:32,721
You don't need to temporarily\nstore it in a variable.
9318
09:01:32,722 --> 09:01:35,702
Let me just do make\naddress again, ./address
9319
09:01:35,702 --> 09:01:38,282
and now I see this address here.
9320
09:01:38,281 --> 09:01:41,826
And notice if I keep running the\n
9321
09:01:41,827 --> 09:01:44,452
There's other stuff presumably\ngoing on inside of the computer.
9322
09:01:44,452 --> 09:01:47,862
Maybe it's actually randomizing it so\n
9323
09:01:47,862 --> 09:01:50,362
That can actually be a security\nfeature underneath the hood
9324
09:01:50,362 --> 09:01:55,882
but this happens to be at that moment\n
9325
09:01:55,882 --> 09:01:58,852
quite like our picture a moment ago.
9326
09:01:58,851 --> 09:02:02,002
All right, so let me pause\nhere to see if there's now
9327
09:02:02,002 --> 09:02:03,531
any questions on what we just did.
9328
09:02:05,531 --> 09:02:07,752
AUDIENCE: Is there any\nway to control where
9329
09:02:07,752 --> 09:02:10,912
you are storing something in memory?
9330
09:02:10,912 --> 09:02:14,106
Does it even matter if\nit works, or does it just
9331
09:02:14,106 --> 09:02:16,632
matter that you could go in\nand locate where something is?
9332
09:02:16,632 --> 09:02:18,174
DAVID J. MALAN: Really good question.
9333
09:02:18,173 --> 09:02:20,741
Is there any way to control\nwhere something is in memory?
9334
09:02:20,741 --> 09:02:23,698
Short answer is yes, and this is\n
9335
09:02:23,699 --> 09:02:26,531
and we're going to do this today\n
9336
09:02:26,531 --> 09:02:31,601
because with this power of going to or\n
9337
09:02:31,601 --> 09:02:33,701
I could just arbitrarily\nright now write code
9338
09:02:33,702 --> 09:02:37,972
that stores a value at byte 2 billion,\n
9339
09:02:37,972 --> 09:02:42,132
But that also means potentially,\nI could start creepily looking
9340
09:02:42,132 --> 09:02:46,192
around at all of the computer's memory,\n
9341
09:02:46,191 --> 09:02:48,731
Maybe other programs, maybe\nother parts of programs
9342
09:02:48,731 --> 09:02:50,981
and indeed, this is a\npotential security threat
9343
09:02:50,981 --> 09:02:53,345
if suddenly you're able\nto just look anywhere
9344
09:02:53,345 --> 09:02:54,762
you want in the computer's memory.
9345
09:02:54,761 --> 09:02:59,381
Now, I'm overselling it a little bit\n
9346
09:02:59,382 --> 09:03:01,932
there are some defenses\nin place in compilers
9347
09:03:01,932 --> 09:03:05,301
and in our operating systems that\n
9348
09:03:05,301 --> 09:03:07,752
But this is still a very\nfrequent source of problems
9349
09:03:07,752 --> 09:03:10,152
and later today we'll\ntalk briefly about things
9350
09:03:10,151 --> 09:03:13,011
called stack overflow,\nwhich is not just a website
9351
09:03:13,011 --> 09:03:15,191
it is a problem that you can encounter.
9352
09:03:15,191 --> 09:03:17,711
Heap overflow, and more\ngenerally buffer overflows--
9353
09:03:17,711 --> 09:03:21,162
there's just so many things that can\n
9354
09:03:21,162 --> 09:03:24,761
and have any of you encountered\na segmentation fault yet?
9355
09:03:24,761 --> 09:03:26,681
I think we saw a few\nhands for that already.
9356
09:03:26,682 --> 09:03:29,262
You touched memory\nthat you shouldn't have
9357
09:03:29,261 --> 09:03:33,971
and odds are you did it most recently\n
9358
09:03:33,972 --> 09:03:37,362
Going to the left, or negative in an\n
9359
09:03:38,202 --> 09:03:42,412
And we'll explain today why it\nis you were able to do that.
9360
09:03:42,412 --> 09:03:44,891
Other questions on\nthese primitives so far?
9361
09:03:46,984 --> 09:03:50,109
AUDIENCE: [INAUDIBLE] pointer star p,\n
9362
09:03:51,391 --> 09:03:52,641
DAVID J. MALAN: Good question.
9363
09:03:53,932 --> 09:03:56,422
Let me rewind in time to the\nprevious version of this code
9364
09:03:56,421 --> 09:03:58,701
where I actually had\na variable called p.
9365
09:03:58,702 --> 09:04:02,512
Just like with variable\ndeclarations in the past
9366
09:04:02,511 --> 09:04:07,981
once you've declared a variable to\n
9367
09:04:07,981 --> 09:04:11,121
star, a.k.a. a pointer,\nyou don't thereafter
9368
09:04:11,121 --> 09:04:14,031
keep using the word\nint or now, the star.
9369
09:04:14,031 --> 09:04:15,831
Once you've declared it, that's it.
9370
09:04:15,831 --> 09:04:17,281
You only refer to it by name.
9371
09:04:17,281 --> 09:04:21,471
And so it's very\ndeliberate what I did here
9372
09:04:21,472 --> 09:04:24,022
saying that the type here is int star--
9373
09:04:24,022 --> 09:04:26,031
that is a pointer to an int--
9374
09:04:26,031 --> 09:04:28,971
but here I just said the name\nof the variable, as always.
9375
09:04:28,972 --> 09:04:31,672
I didn't repeat int, and\nI also didn't repeat star.
9376
09:04:31,671 --> 09:04:34,551
But at the risk of bending\none's mind a little bit, there
9377
09:04:34,551 --> 09:04:40,801
is unfortunately one other use for the\n
9378
09:04:40,801 --> 09:04:44,542
If you want to print out not\nthe address of something
9379
09:04:44,542 --> 09:04:49,622
but what is at a specific\naddress, you can actually do this.
9380
09:04:49,621 --> 09:04:54,981
If I want to print out the integer\n
9381
09:04:54,981 --> 09:04:59,421
I can actually use the star here, which\n
9382
09:04:59,421 --> 09:05:02,521
said but it has a different\nfunction here-- a different purpose.
9383
09:05:02,522 --> 09:05:04,922
So let me go ahead and do\nthis in two different ways.
9384
09:05:04,921 --> 09:05:06,726
I'm going to leave this\nline of code as is
9385
09:05:06,726 --> 09:05:08,601
but I'm going to add\nanother line of code now
9386
09:05:08,601 --> 09:05:12,561
that prints out what apparently\nwill be an integer, in a moment.
9387
09:05:12,562 --> 09:05:16,485
So %i backslash n, and I could\n
9388
09:05:16,485 --> 09:05:18,652
So there's really nothing\nspecial happening now, I'm
9389
09:05:18,651 --> 09:05:20,661
just adding a sort of\nmindless printing of n.
9390
09:05:20,662 --> 09:05:23,402
So make address, ./address--
9391
09:05:23,401 --> 09:05:26,961
there's the current address of\nn and there's the value of n.
9392
09:05:26,961 --> 09:05:29,932
But what's kind of\ncool about C here, too
9393
09:05:29,932 --> 09:05:34,222
is if you know that a value is\nat a specific address like p
9394
09:05:34,222 --> 09:05:37,952
there's one other use for this\nstar operator, the asterisk.
9395
09:05:37,952 --> 09:05:41,582
You can use it as the\nso-called dereference operator
9396
09:05:41,581 --> 09:05:44,431
which means go to that address.
9397
09:05:44,432 --> 09:05:50,062
And so here what we actually have\nis an example of a pointer p
9398
09:05:50,062 --> 09:05:54,992
which is an address like\n0x123 or 0x7FF and so forth.
9399
09:05:54,991 --> 09:05:58,551
But if you say star p now, you're\nnot redeclaring the variable
9400
09:05:58,551 --> 09:05:59,991
because I didn't mention int--
9401
09:05:59,991 --> 09:06:02,751
you're going to that address in p.
9402
09:06:02,752 --> 09:06:04,432
So let me recompile this now.
9403
09:06:04,432 --> 09:06:10,552
Make address, ./address,\nand just to be clear--
9404
09:06:12,082 --> 09:06:15,592
I'm first going to see the\npointer itself, 0x something.
9405
09:06:15,591 --> 09:06:18,456
What's the second line of output\nI should presumably see now?
9406
09:06:22,951 --> 09:06:27,271
So I'm hearing 50, and that's true\n
9407
09:06:27,271 --> 09:06:33,511
of n and print it in line seven, but\n
9408
09:06:33,512 --> 09:06:36,692
that's indeed going to just\nshow you the number n--
9409
09:06:39,482 --> 09:06:42,389
All right, any questions now on\n
9410
09:06:42,389 --> 09:06:44,222
I think this is confusing--\nthe fact that we
9411
09:06:44,222 --> 09:06:46,412
use the star for\nmultiplication, the fact
9412
09:06:46,411 --> 09:06:48,721
that we use the star\nto declare a pointer
9413
09:06:48,722 --> 09:06:51,961
but then we use a star in a third\nway to dereference the pointer
9414
09:06:53,012 --> 09:06:56,612
It's just too confusing, honestly,\n
9415
09:07:07,862 --> 09:07:09,112
DAVID J. MALAN: Good question.
9416
09:07:09,112 --> 09:07:12,682
Do you-- when you are using\nthe ampersand operator
9417
09:07:12,682 --> 09:07:14,631
to get the address of\nsomething, the onus
9418
09:07:14,631 --> 09:07:18,771
is on you at the moment to know\n
9419
09:07:22,042 --> 09:07:25,402
I wrote this code so I\nknow in line six that I'm
9420
09:07:25,402 --> 09:07:28,491
trying to get the address\nof what is an integer.
9421
09:07:28,491 --> 09:07:30,631
AUDIENCE: What about line eight?
9422
09:07:30,631 --> 09:07:34,351
DAVID J. MALAN: In line\neight you don't have
9423
09:07:34,351 --> 09:07:36,182
to worry about that-- good question.
9424
09:07:36,182 --> 09:07:40,211
Notice in line eight, I didn't tell\n
9425
09:07:40,211 --> 09:07:44,911
what kind of address I'm going\n
9426
09:07:44,911 --> 09:07:47,941
I told the compiler\nthat p, now and forever
9427
09:07:47,942 --> 09:07:50,402
is going to be the address of an int.
9428
09:07:50,402 --> 09:07:55,322
That's enough information in advance so\n
9429
09:07:55,322 --> 09:07:59,311
still knows on line eight\nthat p is a pointer to an int
9430
09:07:59,311 --> 09:08:02,731
and that way it will print out\nall four bytes at that address
9431
09:08:02,732 --> 09:08:06,649
not just part of it, and not\nmore than those four bytes.
9432
09:08:09,161 --> 09:08:10,661
AUDIENCE: Do pointers have pointers?
9433
09:08:10,661 --> 09:08:11,961
DAVID J. MALAN: Do\npointers have pointers?
9434
09:08:12,461 --> 09:08:16,091
We won't do this today by\nhaving pointers to pointers
9435
09:08:16,091 --> 09:08:19,781
but yes, you can use star\nstar, and then things get--
9436
09:08:21,671 --> 09:08:23,862
We won't do that today and\nwe won't do that often.
9437
09:08:23,862 --> 09:08:26,412
In fact Python, another language,\nis just a couple of weeks
9438
09:08:31,692 --> 09:08:33,552
That was-- more verbal\nfeedback like that
9439
09:08:33,552 --> 09:08:36,232
is helpful as we forge into\nthe more complicated stuff.
9440
09:08:38,269 --> 09:08:40,145
AUDIENCE: What's the\npoint of [INAUDIBLE]?
9441
09:08:43,432 --> 09:08:46,521
DAVID J. MALAN: What's the\npoint of printing the address?
9442
09:08:46,521 --> 09:08:49,811
AUDIENCE: Like, using the\naddress to [INAUDIBLE].
9443
09:08:50,741 --> 09:08:51,881
What's the point of doing this?
9444
09:08:51,881 --> 09:08:54,131
If you don't mind, let me--\nlet's get there in a moment.
9445
09:08:54,131 --> 09:08:56,831
This is not the common use case,\njust printing out the address--
9446
09:08:58,182 --> 09:09:00,762
At the moment we care only\nfor the sake of discussion.
9447
09:09:00,762 --> 09:09:02,814
We're soon going to start\nusing these addresses.
9448
09:09:02,813 --> 09:09:05,021
So hang in there just a\nlittle bit for that one, too
9449
09:09:05,021 --> 09:09:08,981
but it will solve some\nproblems for us before long.
9450
09:09:08,982 --> 09:09:12,672
So let's actually just now depict what\n
9451
09:09:15,052 --> 09:09:19,332
So if I toggle back here, let\nme redraw my computer's memory
9452
09:09:19,332 --> 09:09:22,781
now let me plop into the memory n,\n
9453
09:09:23,832 --> 09:09:25,992
Where is p in my computer's memory?
9454
09:09:25,991 --> 09:09:29,051
Specifically, I don't know and\n
9455
09:09:29,052 --> 09:09:31,102
run the program so for\nthe sake of discussion
9456
09:09:31,101 --> 09:09:36,072
let's just propose that if 50 ended\n
9457
09:09:36,072 --> 09:09:38,832
p ends up over here, at address--
9458
09:09:38,832 --> 09:09:42,022
whoops-- at whatever\naddress this is here.
9459
09:09:42,021 --> 09:09:44,471
But notice a couple of curiosities now.
9460
09:09:44,472 --> 09:09:47,982
If p is a pointer, it's\nthe address of something.
9461
09:09:47,982 --> 09:09:53,322
So the value in p should be an address,\n
9462
09:09:53,322 --> 09:09:57,432
0x123, and technically there's not\n
9463
09:09:57,432 --> 09:09:59,832
there's not even a 123\nthere per se-- there's
9464
09:09:59,832 --> 09:10:03,372
a pattern of bits that\nrepresents the address 0x123.
9465
09:10:03,372 --> 09:10:07,042
But again, that's week zero--\n
9466
09:10:07,042 --> 09:10:13,122
So if this is p, and this I claimed\n
9467
09:10:13,122 --> 09:10:15,592
Can someone conjecture here?
9468
09:10:15,591 --> 09:10:20,421
Because it turns out whether n\nis an int or a char or a bool
9469
09:10:20,421 --> 09:10:23,061
which are different\ntypes-- heck, even a long--
9470
09:10:23,061 --> 09:10:27,231
it turns out that p is always going\n
9471
09:10:36,868 --> 09:10:40,812
AUDIENCE: Perhaps it\nallocates eight bytes
9472
09:10:40,811 --> 09:10:44,319
but it doesn't know the type\nof the data [INAUDIBLE].
9473
09:10:45,362 --> 09:10:47,552
Maybe it's allocating eight bytes\n
9474
09:10:47,552 --> 09:10:50,072
Turns out that's OK because\nan address is an address.
9475
09:10:50,072 --> 09:10:53,641
It's really up to the programmer to\n
9476
09:10:55,741 --> 09:11:00,803
AUDIENCE: Maybe the first four for\n
9477
09:11:00,803 --> 09:11:06,393
is some null that [INAUDIBLE]\nwhere the pointer ends.
9478
09:11:06,394 --> 09:11:07,601
DAVID J. MALAN: OK, possibly.
9479
09:11:07,601 --> 09:11:10,572
It could be that pointers have\n
9480
09:11:10,572 --> 09:11:13,451
or something curious like that,\n
9481
09:11:13,451 --> 09:11:15,112
Turns out that's not the case.
9482
09:11:15,112 --> 09:11:18,641
It turns out that pointers\nnowadays typically, but not
9483
09:11:18,641 --> 09:11:21,281
always, are eight bytes, a.k.a.
9484
09:11:21,281 --> 09:11:24,461
64 bits, because you and\nI-- our Macs, our PCs
9485
09:11:24,461 --> 09:11:28,271
heck-- even our phones have a lot\n
9486
09:11:28,271 --> 09:11:30,161
Back in the day, a\npointer might have only
9487
09:11:30,161 --> 09:11:34,061
been 32 bits, or even only\neight bits way back in the day.
9488
09:11:34,061 --> 09:11:36,911
Consider 32 bits, because\n
9489
09:11:36,911 --> 09:11:40,451
How high can you count,\nroughly, if you've got 32 bits?
9490
09:11:40,451 --> 09:11:43,261
What's the number we keep rattling off?
9491
09:11:43,262 --> 09:11:48,422
32 bits is roughly 2 to\nthe 32, so it's 4 billion
9492
09:11:48,421 --> 09:11:52,631
and I keep saying it's 2 billion if you\n
9493
09:11:52,631 --> 09:11:55,891
there's a reason I keep saying\n2 billion bytes, two gigabytes
9494
09:11:55,891 --> 09:11:58,951
because for a very long time that\n
9495
09:12:00,482 --> 09:12:02,852
Because the pointers that\nthe computers were using
9496
09:12:02,851 --> 09:12:04,891
were only, for instance, 32 bits.
9497
09:12:04,891 --> 09:12:07,951
And with 32 bits, depending on whether\n
9498
09:12:07,951 --> 09:12:10,981
you can count as high as 2 billion,\nroughly, or maybe 4 billion
9499
09:12:10,982 --> 09:12:13,322
but you know what-- your\nMac, your PC, your phone
9500
09:12:13,322 --> 09:12:17,802
could not have had five gigabytes of\n
9501
09:12:17,802 --> 09:12:20,552
You certainly couldn't have had\n
9502
09:12:20,552 --> 09:12:22,532
which might be 8 gigabytes of memory--
9503
09:12:24,572 --> 09:12:28,862
Because with 4 bytes, or 32\nbits, you literally, physically
9504
09:12:28,862 --> 09:12:32,972
can't count that high, which means if I\n
9505
09:12:32,972 --> 09:12:36,662
would run out of numbers to describe\n
9506
09:12:37,991 --> 09:12:41,131
So pointers nowadays are\n64 bits, or eight bytes.
9507
09:12:41,881 --> 09:12:43,798
I can't even pronounce\nhow big that number is
9508
09:12:43,798 --> 09:12:46,411
but it's plenty for the\nnext many years, and so
9509
09:12:46,411 --> 09:12:48,241
we've drawn it that\nway on the board here.
9510
09:12:48,241 --> 09:12:49,862
Now let's just abstract this away.
9511
09:12:49,862 --> 09:12:51,569
Let's get rid of all\nthe other bytes that
9512
09:12:51,569 --> 09:12:54,271
are storing something or\nnothing else, and let's now
9513
09:12:54,271 --> 09:12:57,601
start to abstract away this\ncomplexity because the reality is
9514
09:12:59,491 --> 09:13:01,801
what is this useful for, or\nwhat do we-- do we actually
9515
09:13:04,322 --> 09:13:06,421
We're doing this so that\nyou see there's no magic.
9516
09:13:06,421 --> 09:13:09,311
We're just moving things around\nand poking around in memory.
9517
09:13:09,311 --> 09:13:12,151
But what a person would typically\ndo when talking about pointers
9518
09:13:12,152 --> 09:13:14,762
would literally be to\njust point at something.
9519
09:13:14,762 --> 09:13:17,312
I really don't care\nwhat address n is at
9520
09:13:17,311 --> 09:13:20,491
so it suffices in general, when\n
9521
09:13:20,491 --> 09:13:22,381
having a discussion\nwith another programmer
9522
09:13:22,381 --> 09:13:26,701
you just draw an arrow from the\n
9523
09:13:26,701 --> 09:13:31,830
because neither you nor I probably care\n
9524
09:13:31,830 --> 09:13:35,173
There's your pointer-- it's literally\n
9525
09:13:35,173 --> 09:13:37,381
So it turns out that these\npointers, these addresses
9526
09:13:37,381 --> 09:13:41,191
are not that dissimilar to what\nwe've done for hundreds of years
9527
09:13:41,192 --> 09:13:43,742
in the form of a postal system.
9528
09:13:43,741 --> 09:13:45,481
For instance, here is a post office--
9529
09:13:45,482 --> 09:13:48,092
here, no-- here is a\nmailbox, and suppose
9530
09:13:48,091 --> 09:13:50,791
that this is a mailbox labeled p.
9531
09:13:50,792 --> 09:13:53,552
It's a pointer, and suppose\nthere's another mailbox
9532
09:13:53,552 --> 09:13:57,402
way over there, which is just\n
9533
09:13:57,402 --> 09:13:59,192
What are we really talking about?
9534
09:13:59,192 --> 09:14:03,242
Well, you store in a computer's\n
9535
09:14:03,241 --> 09:14:07,201
or the word "hi" inside of your\n
9536
09:14:07,201 --> 09:14:11,281
But today we can also use\nthose same memory locations
9537
09:14:11,281 --> 09:14:12,911
to store the address of things.
9538
09:14:12,911 --> 09:14:16,711
For instance, if I\nopen this up here and I
9539
09:14:16,711 --> 09:14:20,432
see OK, the value inside of this\n
9540
09:14:21,722 --> 09:14:26,222
0x123-- that's like a\npointer, a breadcrumb leading
9541
09:14:26,222 --> 09:14:28,022
from one location in memory to another.
9542
09:14:28,021 --> 09:14:30,521
And in fact, would someone who's\nseated roughly over there--
9543
09:14:30,521 --> 09:14:33,121
do you mind getting the mail over there?
9544
09:14:33,122 --> 09:14:35,942
Any volunteers over in this section?
9545
09:14:35,942 --> 09:14:38,292
Just need you to get to\nthe mailbox before I do.
9546
09:14:40,832 --> 09:14:46,287
Whoever is gesturing most\nwildly, come on down.
9547
09:14:58,561 --> 09:15:01,441
OK, come on up to the edge of the\n
9548
09:15:01,442 --> 09:15:05,162
if this is p, that is\napparently n, but to make clear
9549
09:15:05,161 --> 09:15:07,981
what we're talking about when\nwe're storing 0x whatever values--
9550
09:15:07,982 --> 09:15:11,132
like 0x123, that's\nessentially equivalent to my
9551
09:15:11,131 --> 09:15:13,862
maybe pulling out something\nlike this and just
9552
09:15:13,862 --> 09:15:16,412
abstractly pointing\nto your mailbox there
9553
09:15:16,411 --> 09:15:20,671
or if you prefer,\npointing to the mailbox--
9554
09:15:28,021 --> 09:15:30,182
This is akin to me\npointing at your mailbox
9555
09:15:30,182 --> 09:15:32,224
and if you want to go\nahead and open your mailbox
9556
09:15:32,224 --> 09:15:38,562
and reveal to the crowd what's\ninside your mailbox labeled n.
9557
09:15:43,961 --> 09:15:46,582
We have a little CS50 stress\nball for your trouble.
9558
09:15:47,913 --> 09:15:50,621
So that's just to put a visual on\n
9559
09:15:50,622 --> 09:15:53,532
because it can get very abstract,\n
9560
09:15:53,531 --> 09:15:56,752
talking about addresses and memory and\n
9561
09:15:56,752 --> 09:15:59,669
But if you think about just walking\n
9562
09:15:59,669 --> 09:16:02,622
complex that's got a lot of\nmailboxes, those mailboxes
9563
09:16:02,622 --> 09:16:05,592
essentially are a big\nchunk of memory and each
9564
09:16:05,591 --> 09:16:07,451
of those mailboxes has an address--
9565
09:16:07,451 --> 09:16:10,182
this is apartment one, two,\nthree-- apartment 2 billion.
9566
09:16:10,182 --> 09:16:13,451
And inside of those\nmailboxes can go anything
9567
09:16:13,451 --> 09:16:15,621
that can be represented as information.
9568
09:16:15,622 --> 09:16:18,702
It could be a number\nlike n, or 50, or if you
9569
09:16:18,701 --> 09:16:21,101
prefer it could be a\nnumber that represents
9570
09:16:21,101 --> 09:16:22,991
the address of another mailbox.
9571
09:16:22,991 --> 09:16:26,171
And this is akin, really, if\nyou've ever had an apartment or you
9572
09:16:26,171 --> 09:16:28,991
and your parents have moved,\nto having a forwarding address.
9573
09:16:28,991 --> 09:16:31,362
It's like having the\nPost Office in the US
9574
09:16:31,362 --> 09:16:34,842
put some kind of piece of paper\nin your old mailbox saying
9575
09:16:34,841 --> 09:16:37,271
actually forward it\nto that other mailbox.
9576
09:16:37,271 --> 09:16:39,641
That really is all a pointer is doing.
9577
09:16:39,641 --> 09:16:41,351
At the end of the day,\nit's just a number
9578
09:16:41,351 --> 09:16:43,691
but it's a number being\nused in a different way
9579
09:16:43,692 --> 09:16:45,822
and it's the syntax\nthat we've introduced
9580
09:16:45,822 --> 09:16:49,631
not just int but int star,\nthat tells the computer how
9581
09:16:49,631 --> 09:16:54,101
to treat that number in\nthis slightly different way.
9582
09:16:54,101 --> 09:16:57,201
Are there any questions then, on this?
9583
09:16:59,322 --> 09:17:01,739
AUDIENCE: If you had a variable,\nlike int c, [INAUDIBLE]..
9584
09:17:06,072 --> 09:17:08,052
DAVID J. MALAN: If I did int c and--
9585
09:17:12,372 --> 09:17:14,502
Equal to n, so let me\nactually type it out.
9586
09:17:14,502 --> 09:17:16,631
If I give myself another\nline of code, tell me
9587
09:17:16,631 --> 09:17:22,612
one last time what to type.\nint c is equal to n, like this?
9588
09:17:22,612 --> 09:17:27,311
So this is OK, and I can't draw it\n
9589
09:17:27,311 --> 09:17:31,542
but this would be like creating another\n
9590
09:17:31,542 --> 09:17:35,592
down here, that stores\nan identical copy of 50
9591
09:17:35,591 --> 09:17:38,741
because the assignment operator\n
9592
09:17:39,561 --> 09:17:43,031
So that would just add one\nmore rectangle of size four
9593
09:17:45,752 --> 09:17:47,732
If I'm answering your\nquestion as intended.
9594
09:17:47,732 --> 09:17:52,592
OK, so that is week one style use of\n
9595
09:17:52,591 --> 09:17:55,411
I could, though, start copying\npointers but again, we'll
9596
09:17:55,411 --> 09:17:57,241
come back to some of that complexity.
9597
09:17:58,781 --> 09:18:00,281
AUDIENCE: That was a great question.
9598
09:18:02,201 --> 09:18:05,444
does the same pointer point\nto the new replica as well?
9599
09:18:05,444 --> 09:18:06,862
DAVID J. MALAN: Ah, good question.
9600
09:18:07,766 --> 09:18:12,461
And to repeat for the camera, if I\n
9601
09:18:12,461 --> 09:18:16,631
int c equals n, and I claim without\n
9602
09:18:16,631 --> 09:18:20,551
that this gives me another rectangle,\n
9603
09:18:22,042 --> 09:18:24,402
And this is what's important\nand really characteristic
9604
09:18:24,402 --> 09:18:28,362
of C. Nothing happens\nautomatically for you.
9605
09:18:28,362 --> 09:18:31,942
p is not going to be updated\nunless you update p in some way
9606
09:18:31,942 --> 09:18:34,482
so creating a third\nvariable called c-- even
9607
09:18:34,482 --> 09:18:36,882
if you're copying its\nvalue from right to left
9608
09:18:36,881 --> 09:18:40,061
that has no effect on\nanything else in the program.
9609
09:18:41,391 --> 09:18:47,561
So what have we seen that's perhaps\n
9610
09:18:47,561 --> 09:18:51,581
Well, recall that we talked quite a\n
9611
09:18:51,582 --> 09:18:57,461
to recap in layperson's terms, what is\n
9612
09:18:57,461 --> 09:18:59,551
So say-- well, let me\ntake a specific hand here.
9613
09:19:02,286 --> 09:19:03,661
AUDIENCE: An array of characters.
9614
09:19:06,332 --> 09:19:09,122
An array of characters, and we--
9615
09:19:09,122 --> 09:19:12,242
I claimed-- or revealed last week\nthat string is not technically
9616
09:19:12,241 --> 09:19:15,511
a feature built into C. It's\nnot an official data type
9617
09:19:15,512 --> 09:19:17,762
but every programmer\nin most any language
9618
09:19:17,762 --> 09:19:21,002
refers to sequences of\ncharacters-- words, letters
9619
09:19:22,811 --> 09:19:26,131
So the vernacular exists but\nthe data type doesn't typically
9620
09:19:26,131 --> 09:19:29,471
exist per se in C. So what\nwe're about to do, if you will
9621
09:19:29,472 --> 09:19:32,311
for dramatic effect, is take\noff some training wheels today.
9622
09:19:32,311 --> 09:19:36,811
The CS50 library implemented in the\n
9623
09:19:36,811 --> 09:19:38,941
we claim has had a\nbunch of things in it.
9624
09:19:38,942 --> 09:19:42,122
Prototypes for get_string,\nprototypes for get_int
9625
09:19:42,122 --> 09:19:44,641
and all of those other\nfunctions, but it turns out
9626
09:19:44,641 --> 09:19:48,841
it also is what defines the\nword "string" in such a way
9627
09:19:48,841 --> 09:19:51,341
that you all can use it\nthese past several weeks.
9628
09:19:51,341 --> 09:19:54,002
So let's take a look at an\nexample of a string in use.
9629
09:19:54,002 --> 09:19:56,042
Here, for instance,\nis a tiny bit of code
9630
09:19:56,042 --> 09:20:00,781
that uses the word "string,"\ncreating a variable called s
9631
09:20:00,781 --> 09:20:03,443
and then storing quote\nunquote, hi, exclamation point.
9632
09:20:03,444 --> 09:20:06,152
Let's consider what this looks\n
9633
09:20:06,152 --> 09:20:08,902
I don't care about all the other\n
9634
09:20:08,902 --> 09:20:11,912
and this per last week is\nhow "hi" might be stored.
9635
09:20:11,911 --> 09:20:14,671
h-i exclamation point and then\none more, as someone already
9636
09:20:14,671 --> 09:20:18,511
observed, that sentinel value--\nthat null character which
9637
09:20:18,512 --> 09:20:21,919
just means eight zero bits to\ndemarcate the end of that string
9638
09:20:21,919 --> 09:20:24,002
just in case there's\nsomething to the right of it
9639
09:20:24,002 --> 09:20:27,162
the computer can now distinguish\none string from another.
9640
09:20:27,161 --> 09:20:30,364
So last week we introduced\nthis new syntax.
9641
09:20:30,364 --> 09:20:32,281
Well, if strings are\njust arrays of characters
9642
09:20:32,281 --> 09:20:35,192
you can then very cleverly use\nthat square bracket notation
9643
09:20:35,192 --> 09:20:39,992
and go to location zero or one\nor two, which are like addresses
9644
09:20:39,991 --> 09:20:41,792
but they're relative to the string.
9645
09:20:41,792 --> 09:20:46,741
This could be at 0x123 or 0x456,\nbut with this bracket notation
9646
09:20:46,741 --> 09:20:49,741
zero is always the beginning\nof the string, one is the next
9647
09:20:49,741 --> 09:20:51,161
two is the next, and so forth.
9648
09:20:51,161 --> 09:20:55,921
So that was our array syntax\nfor indexing into an array.
9649
09:20:55,921 --> 09:20:58,832
But technically speaking, we\ncan go a little deeper today--
9650
09:20:58,832 --> 09:21:05,101
technically speaking, if hi is\n
9651
09:21:05,101 --> 09:21:11,072
it stands to reason that i is at\n
9652
09:21:14,072 --> 09:21:18,692
Now, I don't care about 123 per se,\n
9653
09:21:19,951 --> 09:21:23,461
Even in hex, if you just add\none when you start at 0x123
9654
09:21:23,461 --> 09:21:25,817
the next number is four,\nfive, six at the end.
9655
09:21:25,817 --> 09:21:27,692
I don't have to worry\nabout A's, B's, and C's
9656
09:21:27,692 --> 09:21:30,702
because I'm not counting\nthat high in this example.
9657
09:21:30,701 --> 09:21:34,891
So if that's the case, and\nmy computer is actually
9658
09:21:34,891 --> 09:21:42,631
laying out the word hi in memory\n
9659
09:21:42,631 --> 09:21:45,362
What exactly is s if,\nat the end of the day
9660
09:21:45,362 --> 09:21:51,391
H-I exclamation point null is storing--\n
9661
09:21:52,366 --> 09:21:54,241
Now that I've taken off\nthose training wheels
9662
09:21:54,241 --> 09:21:57,841
and shown you where H-I\nexclamation point null actually are
9663
09:21:59,582 --> 09:22:03,572
Well s, as always, is\nactually a variable.
9664
09:22:03,572 --> 09:22:05,612
Even in the code I\nproposed a moment ago
9665
09:22:05,612 --> 09:22:08,912
s is apparently a data type\nthat yes, doesn't come with C
9666
09:22:08,911 --> 09:22:11,461
but CS50's library makes it exist.
9667
09:22:11,461 --> 09:22:16,832
s is a variable of type string,\nso where is s in this picture?
9668
09:22:16,832 --> 09:22:20,792
Well, it turns out that\ns might be up here.
9669
09:22:20,792 --> 09:22:24,332
Again, I'm just drawing it anywhere\nfor the sake of discussion
9670
09:22:24,332 --> 09:22:28,502
but s is a variable\nper that line of code.
9671
09:22:28,502 --> 09:22:32,338
What s is storing,\napparently, I claim, is 0x123.
9672
09:22:32,338 --> 09:22:35,671
I actually don't really care about these\n
9673
09:22:35,671 --> 09:22:40,951
s is apparently, as of now, today,\n
9674
09:22:42,122 --> 09:22:44,672
Specifically, the first character in s.
9675
09:22:44,671 --> 09:22:46,771
And this is the last\npiece of the puzzle.
9676
09:22:46,771 --> 09:22:50,341
Last week we had this clever way\n
9677
09:22:50,341 --> 09:22:55,261
Well, it turns out that strings are\n
9678
09:22:55,262 --> 09:22:59,222
as a variable that is a\npointer, inside of which
9679
09:22:59,222 --> 09:23:02,262
is the address of the first\ncharacter in the string.
9680
09:23:02,262 --> 09:23:05,312
So if s points at the\nfirst character and you
9681
09:23:05,311 --> 09:23:07,862
can trust that backslash zero\nis at the end of the string
9682
09:23:07,862 --> 09:23:13,451
that's literally all you need to figure\n
9683
09:23:14,891 --> 09:23:16,502
Well, let's be a little more concrete.
9684
09:23:16,502 --> 09:23:20,162
In terms of this picture, if I've\n
9685
09:23:20,161 --> 09:23:25,322
it turns out all this time since\n
9686
09:23:25,322 --> 09:23:32,232
semi-secretly been an\nalias for char star.
9687
09:23:34,752 --> 09:23:36,201
So why does this make sense?
9688
09:23:36,201 --> 09:23:39,441
It's a little weird still,\nbut if in our previous example
9689
09:23:39,442 --> 09:23:43,031
we were able to store the address of\n
9690
09:23:45,192 --> 09:23:48,042
well, if as of now strings\nare just the address
9691
09:23:48,042 --> 09:23:53,472
of the first character in a string, then\n
9692
09:23:53,472 --> 09:23:57,222
because that means s is the\naddress of a character, the very
9693
09:23:57,222 --> 09:23:58,822
first character in the string.
9694
09:23:58,822 --> 09:24:02,802
Now, the string might have three letters\n
9695
09:24:02,802 --> 09:24:04,932
if it's a long paragraph,\nbut that's fine
9696
09:24:04,932 --> 09:24:06,849
because you can trust\nthat there's going to be
9697
09:24:06,849 --> 09:24:08,542
that null character at the very end.
9698
09:24:08,542 --> 09:24:12,281
So this is a general purpose\nway of representing strings
9699
09:24:12,281 --> 09:24:15,402
using this new mechanism in C.
9700
09:24:15,402 --> 09:24:18,582
So in fact, let me go ahead\nhere and introduce maybe
9701
09:24:18,582 --> 09:24:20,421
a couple of manipulations of this.
9702
09:24:20,421 --> 09:24:24,191
Let me go back to my code here, and\n
9703
09:24:24,192 --> 09:24:27,742
and let's instead now\ndo, for instance, this.
9704
09:24:27,741 --> 09:24:32,743
Let me add in the CS50 library,\nso we'll include cs50.h for now.
9705
09:24:32,743 --> 09:24:34,451
I'm going to go ahead\nand inside of main
9706
09:24:34,451 --> 09:24:37,331
give myself a string s\nequals hi exclamation point.
9707
09:24:37,332 --> 09:24:38,982
I don't type the backslash zero.
9708
09:24:38,982 --> 09:24:43,589
C does that for me automatically by\n
9709
09:24:43,588 --> 09:24:45,171
Now let me just go ahead and print it.
9710
09:24:45,171 --> 09:24:48,341
So this again is week 1 style stuff\n
9711
09:24:49,972 --> 09:24:55,122
So let me do make address, Enter,\n
9712
09:24:56,752 --> 09:25:00,701
But let's start to peel back\nsome of these layers here.
9713
09:25:00,701 --> 09:25:04,721
Let me first of all, get rid of\nthe CS50 library for a moment
9714
09:25:04,722 --> 09:25:09,012
and let me change string to char star.
9715
09:25:09,012 --> 09:25:11,262
And it's a little bit weird\nbut yes, the convention
9716
09:25:11,262 --> 09:25:15,260
is to say char, a space, then the\n
9717
09:25:16,302 --> 09:25:19,052
Strictly speaking though, you might\n
9718
09:25:19,052 --> 09:25:22,032
do it like this or like\nthis, but the canonical way
9719
09:25:22,031 --> 09:25:23,811
is typically to do it like that.
9720
09:25:23,811 --> 09:25:26,671
So now no more CS50 library, no\n
9721
09:25:26,671 --> 09:25:29,182
I'm just treating strings\nfor what they really are.
9722
09:25:29,182 --> 09:25:32,381
Let me go ahead and do\nmake address, Enter--
9723
09:25:32,381 --> 09:25:34,542
so far so good-- ./address--
9724
09:25:36,012 --> 09:25:40,211
So %s is a thing that comes with printf\n
9725
09:25:40,211 --> 09:25:44,262
terminology but strictly speaking\n
9726
09:25:44,262 --> 09:25:48,582
It's always been char star,\nso what this means now is I
9727
09:25:48,582 --> 09:25:52,122
can start to have some fun\nwith these basic ideas
9728
09:25:52,122 --> 09:25:55,252
even though this is not purposeful\n
9729
09:25:55,252 --> 09:25:59,262
But if s is this-- let me go back\n
9730
09:25:59,262 --> 09:26:01,752
Let's put those training wheels\nback on for just a moment
9731
09:26:01,752 --> 09:26:04,582
so that I can do one\nmanipulation at a time.
9732
09:26:04,582 --> 09:26:07,492
Here's my string s, as before.
9733
09:26:07,491 --> 09:26:10,542
Well, let me go ahead and\ndeclare a char called c
9734
09:26:10,542 --> 09:26:15,582
and let me store the first character\n
9735
09:26:15,582 --> 09:26:18,252
s bracket zero, and\nthat should give me h.
9736
09:26:18,252 --> 09:26:21,311
And then just for kicks, let\nme go ahead and do char star--
9737
09:26:21,311 --> 09:26:28,421
whoops-- let me go ahead and do\nchar star p equals ampersand c
9738
09:26:28,421 --> 09:26:30,851
and see what this\nactually prints for me.
9739
09:26:30,851 --> 09:26:34,222
Let me go ahead and\nprint out what p is here.
9740
09:26:34,222 --> 09:26:35,452
So we're just playing around.
9741
09:26:35,451 --> 09:26:39,041
So make address-- so\nfar so good-- ./address.
9742
09:26:39,042 --> 09:26:41,381
All right, so what have I just done?
9743
09:26:41,381 --> 09:26:46,511
I've just created a char c and\nstored in it the letter H, which
9744
09:26:46,512 --> 09:26:50,891
is the same thing as s bracket zero, then\n
9745
09:26:50,891 --> 09:26:53,752
and that's apparently 0x7FF whatever.
9746
09:26:55,002 --> 09:26:57,201
But I technically\ndidn't have to do that.
9747
09:26:57,201 --> 09:26:59,002
Let me go ahead and do two things now.
9748
09:26:59,002 --> 09:27:07,362
Instead of just printing p, let me go\n
9749
09:27:07,362 --> 09:27:09,822
Let me go ahead and do\nmake address, Enter--
9750
09:27:09,822 --> 09:27:12,972
so far so good-- ./address and--
9751
09:27:12,972 --> 09:27:15,732
damn it, what did I do wrong.
9752
09:27:15,732 --> 09:27:17,562
Oh shoot, I didn't want to do that.
9753
09:27:17,561 --> 09:27:21,141
Oh, I really made a mess of this.
9754
09:27:23,921 --> 09:27:27,191
That was supposed to be impressive\nbut it was the opposite.
9755
09:27:30,682 --> 09:27:34,542
So if I intended to do this,\nwhy are lines nine and 10
9756
09:27:36,822 --> 09:27:40,002
Didn't really intend to go here,\nbut let me try to save this.
9757
09:27:40,002 --> 09:27:47,351
Why are we seeing different addresses,\n
9758
09:27:55,482 --> 09:27:57,932
AUDIENCE: [INAUDIBLE]\nis the character c is
9759
09:27:57,932 --> 09:28:02,832
its own sort of location\nof the [INAUDIBLE]
9760
09:28:02,832 --> 09:28:04,874
and it's taking off just\nthe values [INAUDIBLE]..
9761
09:28:05,874 --> 09:28:08,045
So if I really wanted to\nweasel my way out of this
9762
09:28:08,044 --> 09:28:10,711
this is a great answer to the\nprevious question which was about
9763
09:28:10,711 --> 09:28:15,451
what if I introduce another variable,\n
9764
09:28:15,451 --> 09:28:18,151
and not in this case an\nint, but an actual char.
9765
09:28:18,152 --> 09:28:23,641
Here, I've made c be a copy of the\n
9766
09:28:24,741 --> 09:28:26,491
So if I were to draw\nit on the screen that
9767
09:28:26,491 --> 09:28:30,631
would give me a different\nrectangle in which this copy of h
9768
09:28:32,042 --> 09:28:33,991
So I didn't intend to\ndo this, but what you're
9769
09:28:33,991 --> 09:28:35,978
seeing is yes, the address of s--
9770
09:28:35,978 --> 09:28:38,311
and apparently that's at a\npretty low address by default
9771
09:28:38,311 --> 09:28:40,322
here-- then you're\nseeing the address of c.
9772
09:28:40,322 --> 09:28:43,201
But even though each\nof them is h, I claim
9773
09:28:43,201 --> 09:28:45,163
one is at a different address in memory.
9774
09:28:45,163 --> 09:28:46,621
And this has always been happening.
9775
09:28:46,622 --> 09:28:49,352
Any time you created one variable\n
9776
09:28:49,351 --> 09:28:51,269
or here, or here, or\nsomewhere else in memory.
9777
09:28:51,269 --> 09:28:54,271
Now for the first time all we're\n
9778
09:28:54,271 --> 09:28:57,731
the computer's memory to\nsee what is actually there.
9779
09:28:57,732 --> 09:29:01,382
So let me actually back\nthis up a little bit
9780
09:29:01,381 --> 09:29:04,752
and do what I intended to do here,\n
9781
09:29:04,752 --> 09:29:08,912
So if string s equals quote\nunquote, hi, let's go ahead
9782
09:29:08,911 --> 09:29:18,411
and give myself a pointer, called\n
9783
09:29:18,411 --> 09:29:22,251
All right, so now let me go ahead and\n
9784
09:29:24,394 --> 09:29:26,311
So we're just going to\ndo one thing at a time.
9785
09:29:26,311 --> 09:29:29,121
So make address, Enter, ./address.
9786
09:29:29,122 --> 09:29:34,222
There, at the moment, is the\n
9787
09:29:34,222 --> 09:29:36,141
What I meant to do now, was this.
9788
09:29:36,141 --> 09:29:39,082
If I want to print out\ntwo things this time
9789
09:29:39,082 --> 09:29:44,752
let me print out not only what p is,\n
9790
09:29:44,752 --> 09:29:48,771
Because if I claim that everyone from\n
9791
09:29:48,771 --> 09:29:51,741
s bracket zero just representing\nthe first character in s
9792
09:29:51,741 --> 09:29:54,981
by definition of strings\nbeing arrays of characters.
9793
09:29:54,982 --> 09:30:01,232
Then s, as of today, is itself\nthe address of a character
9794
09:30:02,122 --> 09:30:06,082
So if I now do make\naddress, and do ./address
9795
09:30:06,082 --> 09:30:08,842
this time I see the same exact things.
9796
09:30:13,588 --> 09:30:16,171
This is really the lamest sort\nof thing to be applauding over
9797
09:30:16,171 --> 09:30:21,932
but what we're demonstrating here is\n
9798
09:30:21,932 --> 09:30:23,622
of the first character in s.
9799
09:30:23,622 --> 09:30:26,292
So if we borrow some of our\nmental model from last week--
9800
09:30:26,292 --> 09:30:31,171
well, if s bracket zero is the first\n
9801
09:30:31,171 --> 09:30:33,711
that expression should be the same as s.
9802
09:30:33,711 --> 09:30:36,211
Now this isn't to say that we\nwould jump through these hoops
9803
09:30:36,211 --> 09:30:40,411
all the time with this much syntax,\n
9804
09:30:40,411 --> 09:30:46,531
that s is in fact, as I claimed a moment\n
9805
09:30:46,531 --> 09:30:50,012
Not even multiple characters, it's\n
9806
09:30:50,012 --> 09:30:53,942
but the key thing is it's the address\n
9807
09:30:53,942 --> 09:30:57,182
and per last week we\ntrust that C is going
9808
09:30:57,182 --> 09:31:00,241
to look for that null\ncharacter at the very end just
9809
09:31:00,241 --> 09:31:04,081
to make sure it knows where\nthe string actually ends.
9810
09:31:04,082 --> 09:31:07,677
All right, a question came up over here.
9811
09:31:21,942 --> 09:31:25,542
To summarize, on line\neight, when I am using %p--
9812
09:31:25,542 --> 09:31:28,542
that just means print a pointer\nvalue, so 0x something--
9813
09:31:30,942 --> 09:31:36,641
Previously, when we used %s, printf knew\n
9814
09:31:36,641 --> 09:31:40,841
of s, but h, i, exclamation point, and\n
9815
09:31:41,982 --> 09:31:47,202
%p is different. %p tells the\ncomputer to go to that address--
9816
09:31:47,201 --> 09:31:52,072
sorry, tells the computer to\nprint that address on the screen.
9817
09:31:52,072 --> 09:31:55,122
So this is where %s all\nthis time has been powerful.
9818
09:31:55,122 --> 09:31:59,322
The reason printf worked\nin week 1 and 2 and 3
9819
09:31:59,322 --> 09:32:02,622
was because printf was designed\nby some human years ago
9820
09:32:02,622 --> 09:32:05,652
to go to the address that's\nbeing passed in-- for instance
9821
09:32:05,652 --> 09:32:07,991
s-- and print out\ncharacter after character
9822
09:32:07,991 --> 09:32:11,652
after character until it sees the\nnull character backslash zero
9823
09:32:13,252 --> 09:32:16,841
So that's-- you're getting a lot\n
9824
09:32:16,841 --> 09:32:19,271
Today we're using\nsomething much simpler, %p
9825
09:32:19,271 --> 09:32:22,572
which just literally prints what s is.
9826
09:32:22,572 --> 09:32:24,311
And the reason we\ndon't do this in week 1
9827
09:32:24,311 --> 09:32:26,381
is just because this\nis like way too much
9828
09:32:26,381 --> 09:32:28,381
to be interesting when\nall you want to print out
9829
09:32:28,381 --> 09:32:29,901
is hi or hello, world, or the like.
9830
09:32:29,902 --> 09:32:31,872
But now what we're\nreally doing is revealing
9831
09:32:31,872 --> 09:32:34,302
what's been going on this whole time.
9832
09:32:34,302 --> 09:32:36,039
And let me make one other example here.
9833
09:32:36,038 --> 09:32:37,871
Let me go ahead and get\nrid of this variable
9834
09:32:37,872 --> 09:32:41,262
here and let me just print out a\n
9835
09:32:41,262 --> 09:32:45,492
I'm going to print out not just s\n
9836
09:32:46,542 --> 09:32:48,432
the address of every character in s.
9837
09:32:48,432 --> 09:32:52,713
So let's get the first letter\nin s and get its address
9838
09:32:52,713 --> 09:32:54,671
and I'm going to do copy\npaste for time's sake
9839
09:32:54,671 --> 09:32:57,881
but not something I would do frequently.
9840
09:32:57,881 --> 09:33:01,394
So let me print out the address of the\n
9841
09:33:01,394 --> 09:33:03,311
the third, and actually\neven the fourth, which
9842
09:33:03,311 --> 09:33:06,682
is the backslash zero, by doing this.
9843
09:33:06,682 --> 09:33:11,292
So when I compiled this program--\nmake address, ./address--
9844
09:33:11,292 --> 09:33:14,802
I should see two\nidentical values and then
9845
09:33:14,802 --> 09:33:17,292
additional values that\nare one byte away.
9846
09:33:17,292 --> 09:33:22,932
In my diagram a moment ago, my addresses\n
9847
09:33:22,932 --> 09:33:29,201
Now it starts at, by chance,\n0x402004, which is s.
9848
09:33:29,201 --> 09:33:32,741
0x402004 is the same thing\nas s because I'm just
9849
09:33:32,741 --> 09:33:35,351
saying go to the first character\nand then get its address.
9850
09:33:35,351 --> 09:33:36,851
Those are one and the same now.
9851
09:33:36,851 --> 09:33:42,762
And then after that\nis 0x402005, 006, 007
9852
09:33:42,762 --> 09:33:44,542
because that is just like the diagram.
9853
09:33:44,542 --> 09:33:48,342
Go to the i, to the exclamation\n
9854
09:33:48,341 --> 09:33:51,252
So all I'm doing now is using my\nnewfound understanding of what
9855
09:33:51,252 --> 09:33:54,612
ampersand does and what the star\n
9856
09:33:54,612 --> 09:33:57,510
I'm poking around in\nthe computer's memory.
9857
09:33:57,510 --> 09:33:59,052
Just to demonstrate there's no magic.
9858
09:33:59,052 --> 09:34:02,022
It's all there very deliberately\nbecause I or printf or someone
9859
09:34:11,254 --> 09:34:12,921
DAVID J. MALAN: Really good observation.
9860
09:34:12,921 --> 09:34:16,432
So it's indeed the case\nthat hi, unlike 50
9861
09:34:16,432 --> 09:34:21,652
is ending up at a very low address,\n
9862
09:34:21,652 --> 09:34:24,622
That's actually because,\nlong story short, strings
9863
09:34:24,622 --> 09:34:27,592
are often stored in a different\npart of the computer's memory--
9864
09:34:27,591 --> 09:34:29,691
more on that later\ntoday-- for efficiency.
9865
09:34:29,692 --> 09:34:32,902
There's actually only going to be one\n
9866
09:34:32,902 --> 09:34:36,182
point, and the computer is going to\n
9867
09:34:36,182 --> 09:34:39,112
but other values like\nints and floats and the
9868
09:34:39,112 --> 09:34:41,752
like-- they end up lower\nin memory by convention.
9869
09:34:41,752 --> 09:34:45,002
But a good observation, because\nthat is consistent here.
9870
09:34:45,002 --> 09:34:48,472
All right, so a couple final details\n
9871
09:34:48,472 --> 09:34:54,052
Let me go ahead and claim that\nwe implemented char star--
9872
09:34:54,052 --> 09:34:56,752
or rather, string as a\nchar star as follows.
9873
09:34:56,752 --> 09:34:59,091
As of last week we\nwere writing this code.
9874
09:34:59,091 --> 09:35:03,322
As of this week, we can now start\n
9875
09:35:03,322 --> 09:35:06,902
specifically, we invented\nin the CS50 library.
9876
09:35:06,902 --> 09:35:10,252
But it turns out you've seen a way\n
9877
09:35:11,991 --> 09:35:16,222
We played around last time with data\n
9878
09:35:16,222 --> 09:35:20,002
and briefly the typedef keyword,\nwhich defines a type for you.
9879
09:35:20,002 --> 09:35:22,012
And if I highlight\nwhat's interesting here
9880
09:35:22,012 --> 09:35:25,702
the way we invented a\nperson data type last time
9881
09:35:25,701 --> 09:35:28,761
was to define a person as having\ntwo variables inside of it--
9882
09:35:28,762 --> 09:35:33,959
a structure that encapsulates a\nname and encapsulates a number.
9883
09:35:33,959 --> 09:35:37,042
Now even though the syntax is a little\n
9884
09:35:37,042 --> 09:35:43,131
thing, notice that this could be a\n
9885
09:35:43,131 --> 09:35:47,421
If I want to create a type called\n
9886
09:35:47,421 --> 09:35:51,591
then I use typedef to make\nit defined to be char star.
9887
09:35:51,591 --> 09:35:55,311
So this is literally all\nthat has ever been in cs50.h
9888
09:35:55,311 --> 09:35:58,131
in addition to those prototypes\nof functions we've talked about.
9889
09:35:58,131 --> 09:36:01,191
typedef char star string\nis a one-line code
9890
09:36:01,192 --> 09:36:05,919
that brings the word string\nas a data type into existence
9891
09:36:05,919 --> 09:36:07,502
and that's all that's ever been there.
9892
09:36:07,502 --> 09:36:10,641
But the star, the char star,\nis just too much in week 1.
9893
09:36:10,641 --> 09:36:14,031
We wait until this point\nto peel back that layer.
9894
09:36:14,031 --> 09:36:16,521
Are there any questions, then,\non what a string is?
9895
09:36:16,521 --> 09:36:19,101
What star or the ampersand are doing?
9896
09:36:26,432 --> 09:36:30,031
If that is-- is that why when you\n
9897
09:36:30,031 --> 09:36:34,031
did, or almost did, problems arise.
9898
09:36:34,031 --> 09:36:36,332
And in fact yes, last\nweek we used str compare--
9899
09:36:36,332 --> 09:36:40,711
strcmp-- for a very deliberate\n
9900
09:36:40,711 --> 09:36:45,301
accidentally would have compared two\n
9901
09:36:50,574 --> 09:36:53,531
All right, well, before we give\n
9902
09:36:53,531 --> 09:36:54,762
we have lots of pieces of paper.
9903
09:36:54,762 --> 09:36:57,552
If anyone wants to come on up and\n
9904
09:36:57,552 --> 09:36:59,562
if you want to make your own\neight by eight grid of something
9905
09:36:59,561 --> 09:37:02,621
to share with the class if you're\n
9906
09:37:02,622 --> 09:37:05,352
Otherwise, let's take 10 minutes\nand we'll return after 10.
9907
09:37:05,351 --> 09:37:10,271
All right, so let's come\nback to this question of how
9908
09:37:10,271 --> 09:37:13,241
we can start to use these pointers\n
9909
09:37:14,332 --> 09:37:16,572
The goal ultimately\nnext week is going to be
9910
09:37:16,572 --> 09:37:20,292
to use these addresses to really\n
9911
09:37:20,292 --> 09:37:23,622
structures than just persons,\nlike last week, or candidates
9912
09:37:23,622 --> 09:37:25,422
in the context of an\nelectoral algorithm
9913
09:37:25,421 --> 09:37:28,991
if you will, and actually really use\n
9914
09:37:28,991 --> 09:37:32,051
to represent not just\nimages but maybe videos
9915
09:37:32,052 --> 09:37:34,552
and other two-dimensional\nstructures as well.
9916
09:37:34,552 --> 09:37:36,942
But for now, let's come back\nto this address example
9917
09:37:36,942 --> 09:37:41,922
whittle it down to just a hi initially,\n
9918
09:37:42,822 --> 09:37:45,762
So let me re-add the\nCS50 library just so we can
9919
09:37:45,762 --> 09:37:49,391
use our synonym for a moment,\nthat is the word string
9920
09:37:49,391 --> 09:37:51,521
and I'll redefine s as a string.
9921
09:37:51,521 --> 09:37:54,191
And what I didn't mention before\nis that these double quotes
9922
09:37:54,192 --> 09:37:57,042
that you've been using for some\n
9923
09:37:57,042 --> 09:38:00,281
The double quotes are\na clue to the compiler
9924
09:38:00,281 --> 09:38:04,671
that what is between them is in\nfact a string as we now know it
9925
09:38:04,671 --> 09:38:07,932
which means the compiler will\ndo all the work of figuring out
9926
09:38:07,932 --> 09:38:10,692
where to put the h, the\ni, the exclamation point
9927
09:38:10,692 --> 09:38:13,722
and even adding for you\nautomatically a backslash zero.
9928
09:38:13,722 --> 09:38:15,942
And what the compiler\nwill do for you, too
9929
09:38:15,942 --> 09:38:18,822
is figure out what address\nall four of those chars
9930
09:38:18,822 --> 09:38:22,692
ended up at and store it\nfor you in the variable s.
9931
09:38:22,692 --> 09:38:26,891
So that's why it just happens with\n
9932
09:38:26,891 --> 09:38:31,271
or even stars explicitly, but the star\n
9933
09:38:31,271 --> 09:38:33,761
string is just synonymous\nnow with char star.
9934
09:38:33,762 --> 09:38:37,732
It's not really as readable,\nbut it is now the same idea.
9935
09:38:37,732 --> 09:38:40,272
So I'll leave string in place\njust to do something week
9936
09:38:40,271 --> 09:38:43,941
1 style here for a moment, and let's go\n
9937
09:38:43,942 --> 09:38:49,391
So I'm going to use %c this time, and\n
9938
09:38:49,391 --> 09:38:54,521
and then I'm going to print out\ns bracket one and s bracket two
9939
09:38:54,521 --> 09:38:58,451
literally doing week three\nstyle from last week--
9940
09:38:58,451 --> 09:39:03,281
a printing of every character\nin s as though it were an array.
9941
09:39:03,281 --> 09:39:06,582
So ./address should give\nme h-i exclamation point.
9942
09:39:06,582 --> 09:39:09,822
And if I really want to get\ncurious, technically speaking
9943
09:39:09,822 --> 09:39:14,052
I could print out one more location,\n
9944
09:39:14,052 --> 09:39:19,572
make address ./address and there is,\n
9945
09:39:19,572 --> 09:39:25,002
I'm not seeing zero because I didn't\n
9946
09:39:25,002 --> 09:39:28,692
it's literally eight zero bits\n
9947
09:39:28,692 --> 09:39:30,322
if you will, in printf speak.
9948
09:39:30,322 --> 09:39:32,711
And so what I'm seeing here\nis like a blank symbol.
9949
09:39:32,711 --> 09:39:34,902
That just means there is\nsomething else there--
9950
09:39:34,902 --> 09:39:39,162
it's apparently all eight\nzero bits, but they are there
9951
09:39:39,161 --> 09:39:41,931
even though we're not seeing\nthem literally right now.
9952
09:39:41,932 --> 09:39:44,572
Well, let's go ahead and\npeel back one of these layers
9953
09:39:44,572 --> 09:39:48,491
and let me go ahead and get rid of\n
9954
09:39:48,491 --> 09:39:51,911
therefore, the word string because\n
9955
09:39:53,262 --> 09:39:56,141
I'm going to now do\nmake address, ./address
9956
09:39:56,141 --> 09:39:57,612
and it's the same exact thing.
9957
09:39:57,612 --> 09:40:00,982
And now, let's just focus on the hi\n
9958
09:40:00,982 --> 09:40:05,772
So I'm going to recompile one last time\n
9959
09:40:05,771 --> 09:40:10,361
Well, it turns out that the\narray notation we used last week
9960
09:40:10,362 --> 09:40:12,972
was technically some of\nthis syntactic sugar.
9961
09:40:12,972 --> 09:40:16,182
Sort of a neat way to use\nsyntax in a useful way
9962
09:40:16,182 --> 09:40:21,792
but we can see more explicitly today\n
9963
09:40:23,421 --> 09:40:25,161
Let me go ahead and do this.
9964
09:40:25,161 --> 09:40:30,401
Let me adventurously say I\nwant to print out not s bracket
9965
09:40:30,402 --> 09:40:36,192
zero, but I want to print out\n
9966
09:40:36,192 --> 09:40:38,442
So to be clear, what is s now?
9967
09:40:38,442 --> 09:40:39,792
It's the address of a string.
9968
09:40:41,292 --> 09:40:44,802
s is the address of the\nfirst char in a string
9969
09:40:44,802 --> 09:40:47,802
and again, that's sufficient for\n
9970
09:40:47,802 --> 09:40:50,722
the computer will see that there's\n
9971
09:40:50,722 --> 09:40:56,601
So s is specifically the address\n
9972
09:40:56,601 --> 09:40:59,652
So that means, using my\nnew syntax, if I want
9973
09:40:59,652 --> 09:41:02,944
to print out that first\ncharacter I can print out star
9974
09:41:02,944 --> 09:41:06,834
s, because recall that star is the\n
9975
09:41:06,834 --> 09:41:09,042
repeat the word char, you\ndon't repeat the word int--
9976
09:41:10,661 --> 09:41:13,181
That means go to that address.
9977
09:41:13,182 --> 09:41:18,012
Similarly, if I, in my newfound\nknowledge of how strings work
9978
09:41:18,012 --> 09:41:21,641
know that the h comes first,\nthen the i right after it
9979
09:41:21,641 --> 09:41:25,512
then the exclamation point, then\n
9980
09:41:25,512 --> 09:41:29,292
one byte apart, I could\nstart to do some arithmetic.
9981
09:41:29,292 --> 09:41:34,932
I could go to s plus 1 byte and\nprint out the second character
9982
09:41:34,932 --> 09:41:38,682
and I could print out\nwhatever is at s plus 2--
9983
09:41:38,682 --> 09:41:41,951
in fact, doing what's generally\nknown as pointer arithmetic.
9984
09:41:41,951 --> 09:41:44,951
Literally treating pointers\nas the numbers they are--
9985
09:41:44,951 --> 09:41:48,191
hexadecimal or decimal, doesn't really\n
9986
09:41:48,192 --> 09:41:51,022
And go ahead and add\none byte or two bytes
9987
09:41:51,021 --> 09:41:53,511
to them to start at the\nbeginning of a string
9988
09:41:53,512 --> 09:41:56,192
and just poke around from left to right.
9989
09:41:56,192 --> 09:42:00,262
So this now is equivalent to what we\n
9990
09:42:00,262 --> 09:42:05,031
notation, but now I'm reimplementing\n
9991
09:42:05,031 --> 09:42:09,182
plumbing, understanding ampersand\n
9992
09:42:09,182 --> 09:42:11,961
so if I remake this\nprogram and do ./address
9993
09:42:11,961 --> 09:42:14,489
I should still see\nh-i exclamation point.
9994
09:42:14,489 --> 09:42:16,822
But what I'm really doing is\njust kind of demonstrating
9995
09:42:16,822 --> 09:42:20,211
hopefully, my understanding\nof what really
9996
09:42:20,211 --> 09:42:22,072
is going on in the computer's memory.
9997
09:42:22,072 --> 09:42:24,591
Now, programmers who are\nmaybe trying to show off
9998
09:42:24,591 --> 09:42:25,971
might actually write this syntax.
9999
09:42:25,972 --> 09:42:28,597
I think the more common syntax\nwould be what we did last week--
10000
09:42:28,597 --> 09:42:30,332
s bracket zero, s bracket one.
10001
09:42:30,832 --> 09:42:32,707
It's just a little more\nreadable and we don't
10002
09:42:32,707 --> 09:42:36,891
need to brag about or care about\nthis underlying representation.
10003
09:42:36,891 --> 09:42:39,771
The square brackets last week\nwere an abstraction, if you will
10004
09:42:39,771 --> 09:42:42,081
on top of what is lower level math.
10005
09:42:42,082 --> 09:42:44,722
But that's all that's going\non underneath the hood.
10006
09:42:44,722 --> 09:42:48,171
We're poking around from\nbyte to byte to byte.
10007
09:42:48,171 --> 09:42:53,582
All right, let me pause here, see if\n
10008
09:42:56,292 --> 09:42:59,012
Let's do one more then, just\nto demonstrate that this is not
10009
09:43:00,531 --> 09:43:02,521
Let me go ahead and\nget rid of all of this
10010
09:43:02,521 --> 09:43:06,901
and let me give myself an array\nof numbers like I did last week.
10011
09:43:06,902 --> 09:43:09,182
So if I'm going to\ndeclare all the numbers
10012
09:43:09,182 --> 09:43:11,881
at once using this funky\ncurly brace notation
10013
09:43:11,881 --> 09:43:15,331
I can do like 4, 6, 8, 2, 7, 5, 0.
10014
09:43:15,332 --> 09:43:19,412
So seven different numbers inside\n
10015
09:43:20,432 --> 09:43:22,491
I don't, strictly speaking,\nneed to say seven.
10016
09:43:22,491 --> 09:43:24,241
The compiler is smart\nenough to figure out
10017
09:43:24,241 --> 09:43:26,612
how many numbers I put\nwith commas between them
10018
09:43:26,612 --> 09:43:31,112
and that just gives me an array\ncontaining 4, 6, 8, 2, 7, 5, 0.
10019
09:43:31,112 --> 09:43:34,561
So it turns out I can print each of\n
10020
09:43:34,561 --> 09:43:40,381
I can do a printf of %i backslash n,\n
10021
09:43:40,381 --> 09:43:44,401
and let me just do some quick copy/paste\n
10022
09:43:44,402 --> 09:43:49,241
Theoretically, that should\nprint out 4, 6, 8, and so forth.
10023
09:43:49,241 --> 09:43:52,381
But I can do the same sort\nof manipulation understanding
10024
09:43:52,381 --> 09:43:55,292
what pointers now are,\nusing pointer arithmetic.
10025
09:43:55,292 --> 09:43:59,101
So let me actually unwind this\nand just go back to one printf
10026
09:43:59,101 --> 09:44:02,551
and instead of printing numbers bracket\n
10027
09:44:02,552 --> 09:44:06,722
let me just go and print out\nwhatever is at that address--
10028
09:44:08,792 --> 09:44:11,222
Let me then print out\nthe second digit, which
10029
09:44:11,222 --> 09:44:16,412
is going to be whatever is at numbers\n
10030
09:44:16,411 --> 09:44:20,381
and do whatever is at numbers plus 2,\n
10031
09:44:20,381 --> 09:44:22,621
let me do it four more\ntimes and do what's
10032
09:44:22,622 --> 09:44:27,242
at location three, four, five, and six.
10033
09:44:27,241 --> 09:44:30,991
And that's seven total numbers\n
10034
09:44:30,991 --> 09:44:32,561
So let me just quickly run this.
10035
09:44:35,012 --> 09:44:37,742
There are those seven\ndigits being printed.
10036
09:44:37,741 --> 09:44:41,761
But there's something\nsubtle but also useful here.
10037
09:44:45,752 --> 09:44:47,891
Because I made an array of integers.
10038
09:44:47,891 --> 09:44:52,542
But think back-- how big is a\ntypical integer, have we claimed?
10039
09:44:52,542 --> 09:44:58,182
Four bytes, or 32 bits, so it's\nworth noting that I don't really
10040
09:44:58,182 --> 09:45:00,201
need to worry about that detail.
10041
09:45:00,201 --> 09:45:05,479
Notice that I did not do plus 4,\n
10042
09:45:05,480 --> 09:45:07,272
I, the programmer,\nstrictly speaking, don't
10043
09:45:07,271 --> 09:45:09,551
need to worry about how\nbig the data type is.
10044
09:45:09,552 --> 09:45:11,652
This is the power of pointer arithmetic.
10045
09:45:11,652 --> 09:45:17,292
The compiler is smart enough to know\n
10046
09:45:17,292 --> 09:45:21,802
that is the same as saying\ngo one more piece of data--
10047
09:45:22,841 --> 09:45:24,611
so if it's an int, move four.
10048
09:45:24,612 --> 09:45:26,232
If it's a second int, move eight.
10049
09:45:26,232 --> 09:45:27,962
If it's a third int, move 12.
10050
09:45:27,961 --> 09:45:31,182
Pointer arithmetic handles that\nannoying arithmetic for you
10051
09:45:31,182 --> 09:45:33,822
so you can just think of this\nas a number after a number
10052
09:45:33,822 --> 09:45:37,182
after a number that are back to\n
10053
09:45:38,531 --> 09:45:42,561
Which is only to say plus 1, plus 2,\n
10054
09:45:43,061 --> 09:45:48,481
Because the compiler knows what\n
10055
09:45:48,482 --> 09:45:51,872
Now, there's one other\ndetail I should reveal here
10056
09:45:51,872 --> 09:45:54,032
that I've taken for granted.
10057
09:45:54,031 --> 09:45:57,002
In the past I was using double\nquotes to represent strings
10058
09:45:57,002 --> 09:45:59,732
and I claim that the compiler's\nsmart enough to realize that oh
10059
09:45:59,732 --> 09:46:04,272
if I have double quote hi, that means\n
10060
09:46:04,271 --> 09:46:05,791
and then the backslash zero.
10061
09:46:08,161 --> 09:46:13,921
It turns out that you can actually treat\n
10062
09:46:13,921 --> 09:46:16,141
is itself a pointer,\nand this is actually
10063
09:46:16,141 --> 09:46:18,512
going to be something\nuseful in upcoming problems
10064
09:46:18,512 --> 09:46:22,082
when we want to pass arrays\naround in the computer's memory.
10065
09:46:22,082 --> 09:46:25,824
Notice that strictly speaking on line\n
10066
09:46:25,824 --> 09:46:27,781
There's no star, there's\nno ampersand-- there's
10067
09:46:27,781 --> 09:46:31,021
nothing new there, and yet\ninstantly on line seven
10068
09:46:31,021 --> 09:46:35,851
I'm pretending that it is the\naddress, and this is actually OK.
10069
09:46:35,851 --> 09:46:39,752
It turns out that an array\nreally can be treated
10070
09:46:39,752 --> 09:46:43,241
as the address of the first\nelement in that array.
10071
09:46:43,241 --> 09:46:47,439
The difference is that there's no\n
10072
09:46:47,440 --> 09:46:49,232
This is just part of\nthe phone number here
10073
09:46:49,232 --> 09:46:52,052
the ending in zero-- that's not\nlike a special backslash zero.
10074
09:46:52,052 --> 09:46:55,082
So this is something we're going to\n
10075
09:46:55,082 --> 09:46:58,802
There's this interrelationship\nbetween addresses and arrays
10076
09:46:58,802 --> 09:47:03,482
that just generally allows you to\n
10077
09:47:03,482 --> 09:47:05,882
but the math is taken care of for you.
10078
09:47:05,881 --> 09:47:10,322
Are there any questions then on this before\n
10079
09:47:19,144 --> 09:47:20,311
DAVID J. MALAN: Potentially.
10080
09:47:20,311 --> 09:47:24,271
If you go beyond the end of an array,\n
10081
09:47:24,271 --> 09:47:27,541
The problem is that that symptom\nis sometimes nondeterministic
10082
09:47:27,542 --> 09:47:30,542
which means that sometimes it\nwill happen, sometimes it won't.
10083
09:47:30,542 --> 09:47:34,502
It often depends on how far off the\n
10084
09:47:34,502 --> 09:47:36,991
You'll often not induce\nthe segmentation fault
10085
09:47:36,991 --> 09:47:39,781
if you just poke a little too\nfar, but if you go way too far
10086
09:47:41,192 --> 09:47:44,522
But we'll give you a tool today\n
10087
09:47:44,521 --> 09:47:46,541
exactly that kind of situation.
10088
09:47:46,542 --> 09:47:49,452
So let's go ahead now and do\n
10089
09:47:49,451 --> 09:47:51,961
but that actually comes back\nto that spoiler from earlier.
10090
09:47:51,961 --> 09:47:56,832
Let me go ahead and create a program\n
10091
09:47:56,832 --> 09:48:00,002
I'm going to go ahead and\nallow myself the CS50 library
10092
09:48:00,002 --> 09:48:03,482
not so much for string but so that\n
10093
09:48:03,482 --> 09:48:07,801
which is way easier than the way we'll\n
10094
09:48:07,800 --> 09:48:10,832
Let me give myself stdio.h,\ndo an int main(void)
10095
09:48:10,832 --> 09:48:13,742
not worrying about command line\n
10096
09:48:13,741 --> 09:48:18,061
and get an int i using get int, and\n
10097
09:48:18,061 --> 09:48:23,822
then let me give myself an int j, ask\n
10098
09:48:23,822 --> 09:48:27,991
and then let me go ahead and kind of\n
10099
09:48:27,991 --> 09:48:31,411
if i equals equals j,\nthen let's go ahead
10100
09:48:31,411 --> 09:48:36,481
and print out something like "same,"\n
10101
09:48:36,482 --> 09:48:40,152
and print out "different" if\nthey are not, in fact, the same.
10102
09:48:40,152 --> 09:48:44,312
So that would seem to be a program that\n
10103
09:48:44,311 --> 09:48:46,621
All right, so let's go\nahead and run make compare--
10104
09:48:48,811 --> 09:48:52,351
OK, i will be 50, j will be 50--
10105
09:48:54,582 --> 09:48:57,599
i will be 50, j will be 42.
10106
09:48:58,391 --> 09:49:02,701
So so far, so good in this\nfirst version of comparison.
10107
09:49:02,701 --> 09:49:05,771
But as you might see\nwhere I'm going with this
10108
09:49:05,771 --> 09:49:09,511
let's move away from integers and let's\n
10109
09:49:10,661 --> 09:49:13,261
So I could do string s over here--
10110
09:49:15,841 --> 09:49:22,711
Then I could do string t over\nhere, and GetString over here
10111
09:49:22,711 --> 09:49:25,442
asking the user for t this time, here.
10112
09:49:25,442 --> 09:49:26,972
And then I can compare the two.
10113
09:49:28,819 --> 09:49:30,152
and this is a common convention.
10114
09:49:30,152 --> 09:49:33,182
If you've used s for string already you\n
10115
09:49:33,182 --> 09:49:34,802
for simple demonstrations like this.
10116
09:49:34,802 --> 09:49:37,927
I'm going to compare the two, just like\n
10117
09:49:37,927 --> 09:49:41,882
Make compare-- so far\nso good-- ./compare--
10118
09:49:44,582 --> 09:49:47,792
Let me go ahead and\ntype in something like
10119
09:49:47,792 --> 09:49:52,762
hi, exclamation point and bye,\n
10120
09:49:52,762 --> 09:49:54,662
should definitely be different.
10121
09:49:54,661 --> 09:50:00,481
Let me run it again with hi, exclamation\n
10122
09:50:00,482 --> 09:50:02,432
Different-- maybe I messed up.
10123
09:50:02,432 --> 09:50:05,542
Let's maybe do it lowercase,\nmaybe that'll fix it.
10124
09:50:05,542 --> 09:50:07,862
But no, those two are different.
10125
09:50:07,862 --> 09:50:11,842
So to come back to what I described\nas a spoiler earlier, what's
10126
09:50:11,841 --> 09:50:16,019
the fundamental issue here, to be clear?
10127
09:50:16,019 --> 09:50:18,061
Why is it saying different\neven though I'm pretty
10128
09:50:18,061 --> 09:50:19,478
sure I typed the same thing twice?
10129
09:50:21,542 --> 09:50:24,961
Yeah, this is where it's now\nuseful to know that string has been
10130
09:50:24,961 --> 09:50:28,423
an abstraction-- a training wheel, if\n
10131
09:50:28,423 --> 09:50:30,631
still use GetString because\nthat's convenient still--
10132
09:50:30,631 --> 09:50:33,421
but if I change string\nto be char star, it's
10133
09:50:33,421 --> 09:50:39,661
a little more explicit as to what s and\n
10134
09:50:39,661 --> 09:50:42,121
that is the address of\na char. t is a pointer
10135
09:50:42,122 --> 09:50:44,282
to a char, that is\nthe address of a char.
10136
09:50:44,281 --> 09:50:47,432
Specifically, the first character\nin s and the first character
10137
09:50:49,211 --> 09:50:51,436
So if I'm comparing\nthese two it should stand
10138
09:50:51,436 --> 09:50:53,311
to reason that they're\ngoing to be different.
10139
09:50:53,811 --> 09:50:57,421
Because s might end up here in memory\n
10140
09:50:57,421 --> 09:51:00,542
Each time I call GetString, it is\n
10141
09:51:00,542 --> 09:51:02,531
to know that, wait a minute--\nyou typed the same thing.
10142
09:51:02,531 --> 09:51:04,051
I'm just going to hand\nyou back the same address.
10143
09:51:04,052 --> 09:51:06,872
That doesn't happen because we\n
10144
09:51:06,872 --> 09:51:10,502
Each time I call GetString,\nit returns, apparently
10145
09:51:10,502 --> 09:51:13,262
a different copy of the\nstring that was typed in.
10146
09:51:13,262 --> 09:51:15,572
A hi over here and a hi over here.
10147
09:51:15,572 --> 09:51:18,152
They might look the same to\nthe human but to the computer
10148
09:51:18,152 --> 09:51:22,052
they are different chunks of memory,\n
10149
09:51:22,052 --> 09:51:25,542
And here, too, we can reveal\nwhat is GetString returning?
10150
09:51:25,542 --> 09:51:29,522
Well, up until today it was\nreturning a string, so to speak.
10151
09:51:31,021 --> 09:51:33,361
Technically, what\nGetString has always been
10152
09:51:33,362 --> 09:51:38,732
doing is returning the address\nof the first char in a string
10153
09:51:38,732 --> 09:51:42,542
and trusting that we put a backslash\n
10154
09:51:42,542 --> 09:51:46,772
typed in, and that's enough now\nfor printf, for strlen, for you
10155
09:51:46,771 --> 09:51:49,322
to know where a string begins and ends.
10156
09:51:49,322 --> 09:51:53,072
So GetString has actually\nalways returned a pointer.
10157
09:51:53,072 --> 09:51:56,461
It has not returned a quote\nunquote string per se
10158
09:51:56,461 --> 09:51:59,762
but there are functions that can\nsolve this comparison for us.
10159
09:51:59,762 --> 09:52:02,862
Recall that I could do\nsomething like this.
10160
09:52:02,862 --> 09:52:05,792
I could actually go\nin here and I could--
10161
09:52:07,002 --> 09:52:14,341
So if I include str compare here and\n
10162
09:52:14,341 --> 09:52:18,061
let's see now what happens\nwhen I make compare.
10163
09:52:18,061 --> 09:52:21,572
Implicitly declaring library\n
10164
09:52:22,682 --> 09:52:26,162
So you might have seen this error before\n
10165
09:52:26,161 --> 09:52:30,641
but there's some evidence of\nstars or pointers going on here.
10166
09:52:30,641 --> 09:52:33,131
It looks like I didn't include\nthe string.h header file
10167
09:52:34,322 --> 09:52:38,911
Include string.h which, despite its\n
10168
09:52:38,911 --> 09:52:41,791
called string, it just has\nstring-related functions in it
10169
09:52:46,591 --> 09:52:50,371
Now let's type in hi, exclamation\n
10170
09:52:50,372 --> 09:52:54,002
These are now-- oh, I used it wrong.
10171
09:52:55,724 --> 09:52:58,141
That was supposed to be\nimpressive, but it's the opposite.
10172
09:53:07,618 --> 09:53:09,951
DAVID J. MALAN: Yeah, it\nreturns three different values.
10173
09:53:09,951 --> 09:53:13,731
Zero if they're the same, positive if\none comes before the other
10174
09:53:13,732 --> 09:53:15,422
negative if the opposite is true.
10175
09:53:15,421 --> 09:53:18,621
I just forgot that, so like\nI did last week correctly
10176
09:53:18,622 --> 09:53:22,102
if I want to compare them for\nequality per the manual page
10177
09:53:22,101 --> 09:53:24,781
I should be checking for\nzero as the return value.
10178
09:53:24,781 --> 09:53:27,951
Now make compare, ./compare, Enter.
10179
09:53:27,951 --> 09:53:30,621
Let's try it one last time-- hi and hi.
10180
09:53:30,622 --> 09:53:32,182
OK now, they're in fact the same.
10181
09:53:37,232 --> 09:53:40,112
And indeed, not that it's\nreturning same all the time.
10182
09:53:40,112 --> 09:53:42,332
If I type in hi and\nthen bye, it's indeed
10183
09:53:42,332 --> 09:53:44,622
noticing that difference as well.
10184
09:53:44,622 --> 09:53:48,612
Well, let me go ahead and\ndo one other thing here.
10185
09:53:50,862 --> 09:53:54,362
Let me go ahead now and just reveal\n
10186
09:53:54,362 --> 09:53:57,692
Let's get rid of the string comparison\n
10187
09:53:57,692 --> 09:54:01,472
The simple way to print this out would\n
10188
09:54:02,521 --> 09:54:05,701
taking an address and start\nthere, print every character up
10189
09:54:05,701 --> 09:54:09,101
until the backslash zero, so let's\njust hand it s and do that.
10190
09:54:09,101 --> 09:54:12,271
And then let's do one more, %s, t.
10191
09:54:12,271 --> 09:54:17,111
This is, again, sort of a\nmix of week 1 and this week
10192
09:54:17,112 --> 09:54:18,932
because I got rid of the word string.
10193
09:54:18,932 --> 09:54:24,072
I'm using char star, but I'm still\n
10194
09:54:24,072 --> 09:54:27,692
Let me go ahead and run compare\nnow, and if I type hi and hi
10195
09:54:27,692 --> 09:54:29,652
I should see the same thing twice.
10196
09:54:29,652 --> 09:54:33,272
So they look the same, but here\nnow we have the syntax today
10197
09:54:33,271 --> 09:54:35,651
to print out the actual\naddresses of these things.
10198
09:54:35,652 --> 09:54:40,082
So let me just change the s to a p,\n
10199
09:54:40,082 --> 09:54:44,012
and print it, it means just\nprint the address as a pointer.
10200
09:54:44,012 --> 09:54:48,781
So make compare, ./compare, and now\n
10201
09:54:48,781 --> 09:54:53,192
and I should see, indeed, two\nslightly different addresses given
10202
09:54:54,002 --> 09:54:56,311
One's got a B at the end,\none's got an F at the end
10203
09:54:56,311 --> 09:54:58,841
and they are indeed a few bytes apart.
10204
09:54:58,841 --> 09:55:02,066
So this is just confirming what\n
10205
09:55:02,067 --> 09:55:04,442
So what does this mean, perhaps\nin the computer's memory?
10206
09:55:05,942 --> 09:55:09,872
I've zoomed out so I have a little\n
10207
09:55:09,872 --> 09:55:16,262
Here might be s in memory when I do\n
10208
09:55:16,262 --> 09:55:19,742
I get a variable that's of size\n
10209
09:55:19,741 --> 09:55:23,311
claimed earlier that on modern systems,\n
10210
09:55:23,311 --> 09:55:25,621
nowadays so they can count even higher.
10211
09:55:25,622 --> 09:55:28,607
And inside of the computer's\nmemory, also, might be hi.
10212
09:55:28,607 --> 09:55:31,232
And I don't know where it ends\nup so for the sake of discussion
10213
09:55:32,161 --> 09:55:35,121
That's what was free\nwhen I ran the program.
10214
09:55:35,122 --> 09:55:36,961
h-i exclamation point, backslash zero.
10215
09:55:36,961 --> 09:55:42,122
Maybe it ended up, for the sake of\n
10216
09:55:42,122 --> 09:55:47,162
So to be clear, what is s\nstoring once the assignment
10217
09:55:47,161 --> 09:55:50,072
operator copies from right to left?
10218
09:55:50,072 --> 09:55:54,692
What is s storing if I\nadvance one more slide?
10219
09:55:56,811 --> 09:56:00,621
0x123, the presumption\nbeing that if a string is
10220
09:56:00,622 --> 09:56:04,597
defined by the address of its first\n
10221
09:56:04,597 --> 09:56:09,052
is 0x123, then that's indeed\nwhat should be in the variable s.
10222
09:56:09,052 --> 09:56:12,112
And so technically, that's what's\n
10223
09:56:13,612 --> 09:56:16,762
GetString indeed returns\na string, so to speak
10224
09:56:16,762 --> 09:56:20,601
but more properly it returns\nthe address of a char.
10225
09:56:20,601 --> 09:56:24,082
What's been then copied from right to\n
10226
09:56:24,082 --> 09:56:26,961
all these weeks is indeed that address.
10227
09:56:26,961 --> 09:56:31,461
Now technically, we don't really need\n
10228
09:56:31,461 --> 09:56:34,311
It suffices to just think about\nthem referentially, but let's
10229
09:56:34,311 --> 09:56:38,151
first consider where t might be.\n
10230
09:56:38,152 --> 09:56:39,802
created on my second line of code.
10231
09:56:39,802 --> 09:56:41,422
Maybe it ends up there,\nmaybe somewhere else.
10232
09:56:41,421 --> 09:56:43,713
For the sake of discussion\nI'll draw it left and right.
10233
09:56:43,713 --> 09:56:47,131
Where did the second word\nend up that I typed in?
10234
09:56:47,131 --> 09:56:53,031
Well, suppose the second copy of\nhi ended up at 0x456, 0x457, 0x458, 0x459.
10235
09:56:54,322 --> 09:56:55,911
I'll pluck this one off myself.
10236
09:56:57,982 --> 09:57:01,432
And so this is now a pictorial\nrepresentation of why
10237
09:57:01,432 --> 09:57:03,112
and let's abstract away everything else.
10238
09:57:03,112 --> 09:57:08,421
When I compared s against t using\n
10239
09:57:08,421 --> 09:57:09,951
they're obviously not the same.
10240
09:57:09,951 --> 09:57:12,112
One is over here, one is over here.
10241
09:57:12,112 --> 09:57:16,641
And per a moment ago, one is\n0x123, the other is 0x456.
10242
09:57:16,641 --> 09:57:19,851
Yes, technically they're pointing\nat something that's the same
10243
09:57:19,851 --> 09:57:23,332
but that just reveals\nhow str compare works.
10244
09:57:23,332 --> 09:57:26,002
str compare is apparently\na function that
10245
09:57:26,002 --> 09:57:29,241
takes in the address of\na string as its argument
10246
09:57:29,241 --> 09:57:31,761
and the address of another\nstring as its argument
10247
09:57:31,762 --> 09:57:36,682
it goes to the first character in\n
10248
09:57:36,682 --> 09:57:38,872
and probably has a for\nloop or a while loop
10249
09:57:38,872 --> 09:57:41,782
and just goes from left to\nright, comparing, looking
10250
09:57:41,781 --> 09:57:45,502
for the same chars left and right, and\n
10251
09:57:47,482 --> 09:57:51,842
If it does notice a difference it\n
10252
09:57:51,841 --> 09:57:55,681
And that's very similar, recall, to how\n
10253
09:57:56,182 --> 09:57:59,092
I used a for loop, I was\nlooking for a backslash zero.
10254
09:57:59,091 --> 09:58:04,881
str compare is probably a little similar\n
10255
09:58:04,881 --> 09:58:08,362
but comparing, this\ntime not just counting.
10256
09:58:08,362 --> 09:58:11,092
Are any questions then,\non string comparison
10257
09:58:11,091 --> 09:58:14,181
and why it is that we use str\ncompare and not equals equals?
10258
09:58:15,374 --> 09:58:17,610
AUDIENCE: Do pointers have addresses?
10259
09:58:17,610 --> 09:58:19,402
DAVID J. MALAN: Do\npointers have addresses?
10260
09:58:19,902 --> 09:58:24,652
So we won't do that today, but I could\n
10261
09:58:26,182 --> 09:58:29,781
That would give me the\nequivalent of a char star star
10262
09:58:29,781 --> 09:58:31,966
that itself could be\nstored elsewhere in memory.
10263
09:58:32,841 --> 09:58:35,031
We don't do that recursively forever.
10264
09:58:35,031 --> 09:58:37,972
There's star and there's star\nstar, but yes, that is a thing
10265
09:58:37,972 --> 09:58:41,272
and it's very often useful in the\n
10266
09:58:41,271 --> 09:58:44,541
which we haven't really talked about,\n
10267
09:58:47,582 --> 09:58:50,632
All right, so what might we now\ndo to take things up a notch?
10268
09:58:50,631 --> 09:58:53,151
Well let's go ahead and implement\na different program here
10269
09:58:53,152 --> 09:58:56,702
that maybe tries copying some\nvalues, just to demonstrate this.
10270
09:58:56,701 --> 09:59:00,441
Let me open up a file\ncalled, how about copy.c
10271
09:59:00,442 --> 09:59:02,872
and I'm going to start\noff with a few includes.
10272
09:59:02,872 --> 09:59:06,652
So let's include the CS50 library just\n
10273
09:59:06,652 --> 09:59:11,302
Let's include-- how about stdio\nas always, let's preemptively
10274
09:59:11,302 --> 09:59:14,072
include string.h and maybe\none other in a moment.
10275
09:59:14,072 --> 09:59:17,072
Let's do int main(void) as before.
10276
09:59:17,072 --> 09:59:20,601
And then in here, let's get a\nstring from the user and just
10277
09:59:23,031 --> 09:59:26,722
And heck, we can actually just\ncall this char star if we want
10278
09:59:26,722 --> 09:59:28,834
or string, since we're\nusing the CS50 library.
10279
09:59:28,834 --> 09:59:30,002
But we'll come back to that.
10280
09:59:30,002 --> 09:59:33,591
Let's now make a copy\nof s and do s equals t
10281
09:59:33,591 --> 09:59:38,252
using a single assignment operator and\n
10282
09:59:38,252 --> 09:59:43,192
Let's go into the first character\nof t, which is t bracket zero
10283
09:59:43,192 --> 09:59:45,592
and then let's uppercase\nit using that function
10284
09:59:45,591 --> 09:59:50,931
that we've used in the past of\n
10285
09:59:50,932 --> 09:59:52,592
And actually, I should go back up here.
10286
09:59:52,591 --> 09:59:56,828
If I'm using toupper or if you use\n
10287
09:59:56,828 --> 09:59:59,661
I might not remember this offhand,\n
10288
10:00:01,521 --> 10:00:04,651
There was a bunch of helpful\nfunctions in that library as well.
10289
10:00:04,652 --> 10:00:09,457
Now at the very last line of the program\n
10290
10:00:09,457 --> 10:00:16,882
are by simply printing out %s for each\n
10291
10:00:16,881 --> 10:00:20,042
of course, and let's\nsee what happens here.
10292
10:00:21,832 --> 10:00:23,242
oh my God, so many mistakes.
10293
10:00:26,661 --> 10:00:30,211
String t equals s, sorry, so\nI'm creating two variables
10294
10:00:30,211 --> 10:00:33,141
s and t respectively,\nand I'm copying s into t.
10295
10:00:34,822 --> 10:00:40,012
There we go. ./copy, and let's\nnow type in, for instance
10296
10:00:40,012 --> 10:00:43,882
how about hi exclamation point\nin all lowercase this time
10297
10:00:47,451 --> 10:00:51,561
I don't think that's what I\nintended, so to speak, here.
10298
10:00:51,561 --> 10:00:55,381
Because notice that I got s from\nthe user, so that checks out.
10299
10:00:55,381 --> 10:00:59,063
I then copied s into\nt, which looks correct.
10300
10:00:59,063 --> 10:01:00,771
That's what we always\nuse assignment for.
10301
10:01:00,771 --> 10:01:04,551
Then I uppercase the first\nletter in t, but not s--
10302
10:01:05,692 --> 10:01:09,412
then I printed s and t and then\nnoticed, apparently, both s
10303
10:01:13,281 --> 10:01:15,881
So if you're starting to get a\nlittle comfortable with what's
10304
10:01:15,881 --> 10:01:19,781
going on underneath the hood,\n
10305
10:01:19,781 --> 10:01:23,584
Why did both get capitalized?
10306
10:01:23,584 --> 10:01:24,792
Why did both get capitalized?
10307
10:01:25,482 --> 10:01:27,962
AUDIENCE: Could it be they're\nreferencing the same address?
10308
10:01:27,961 --> 10:01:29,372
DAVID J. MALAN: Yeah, they're\nreferencing the same address.
10309
10:01:31,232 --> 10:01:34,622
If you create another variable called\n
10310
10:01:34,622 --> 10:01:37,232
you are literally assigning\nit the value in s
10311
10:01:37,232 --> 10:01:40,122
which is 0x123 or something like that.
10312
10:01:40,122 --> 10:01:43,742
And so at that point in the\nstory both s and t presumably
10313
10:01:43,741 --> 10:01:47,311
have a value of 0x123,\nwhich means they technically
10314
10:01:47,311 --> 10:01:51,421
point to the same h-i\nexclamation point in memory.
10315
10:01:51,421 --> 10:01:56,252
Nowhere did I tell the computer to give\n
10316
10:01:56,252 --> 10:01:59,491
per se, I literally said just copy s.
10317
10:01:59,491 --> 10:02:03,752
So here's where an understanding of what\n
10318
10:02:03,752 --> 10:02:06,122
I'm only copying the pointers.
10319
10:02:06,122 --> 10:02:07,961
So what actually went on in memory?
10320
10:02:07,961 --> 10:02:09,601
Let's take a look here at this grid.
10321
10:02:09,601 --> 10:02:12,451
If I created s initially,\nmaybe it ends up here.
10322
10:02:12,451 --> 10:02:15,961
And I created hi in lowercase,\nand it ended up down here.
10323
10:02:15,961 --> 10:02:22,112
Then the address was, again, like\n
10324
10:02:22,112 --> 10:02:24,811
If then I create a\nsecond variable called t
10325
10:02:24,811 --> 10:02:29,042
and I call it a string, a.k.a. char\n
10326
10:02:29,042 --> 10:02:34,622
But when I copy s into t by\ndoing t equals s semicolon
10327
10:02:34,622 --> 10:02:40,227
that literally just copies s into\n
10328
10:02:40,226 --> 10:02:43,351
So if we now abstract away all these\n
10329
10:02:43,351 --> 10:02:47,731
with arrows, what we've drawn in\nthe computer's memory is this.
10330
10:02:47,732 --> 10:02:52,232
Two different pointers but storing\nthe same address, which means
10331
10:02:52,232 --> 10:02:55,122
the breadcrumbs lead to the same place.
10332
10:02:55,122 --> 10:02:58,202
And so if you follow the t breadcrumb\n
10333
10:02:58,201 --> 10:03:02,191
it is functionally the\nsame as copying the--
10334
10:03:02,192 --> 10:03:07,832
changing the first letter\nin the version s as well.
10335
10:03:07,832 --> 10:03:12,671
So what's the solution, then,\nto this kind of problem?
10336
10:03:12,671 --> 10:03:14,741
Even if you have no idea\nhow to do it in code
10337
10:03:14,741 --> 10:03:17,307
what's the gist of what I\nreally intended, which is
10338
10:03:17,307 --> 10:03:21,461
I want a genuine copy of s, called t.
10339
10:03:21,461 --> 10:03:25,574
I want a new h-i exclamation\npoint backslash zero.
10340
10:03:25,574 --> 10:03:27,281
What do I need to do\nto make that happen?
10341
10:03:28,249 --> 10:03:30,992
AUDIENCE: I think there's\na function called str copy.
10342
10:03:30,991 --> 10:03:34,322
DAVID J. MALAN: So there is a\nfunction called str copy, strcpy
10343
10:03:34,322 --> 10:03:36,872
which is a possible\nanswer to this question.
10344
10:03:36,872 --> 10:03:41,042
The catch with str copy is that you\n
10345
10:03:41,042 --> 10:03:43,592
what the source string is--\nthe one you want to copy--
10346
10:03:43,591 --> 10:03:46,322
you also need to pass in the\naddress of a chunk of memory
10347
10:03:46,322 --> 10:03:50,911
into which you can copy the string, and\n
10348
10:03:50,911 --> 10:03:53,311
and we need one more building\nblock today, if you will.
10349
10:03:53,311 --> 10:03:57,722
We haven't yet seen a way to\ncreate new chunks of memory
10350
10:03:57,722 --> 10:04:00,641
and then let some other\nfunction copy into them.
10351
10:04:00,641 --> 10:04:04,021
And for this, we're going to introduce\n
10352
10:04:04,932 --> 10:04:07,652
And this is the last and most\npowerful feature perhaps, today
10353
10:04:07,652 --> 10:04:11,612
whereby we're going to introduce two\n
10354
10:04:11,612 --> 10:04:14,851
malloc means memory allocate,\nwhich literally does just that.
10355
10:04:14,851 --> 10:04:18,002
It's a function that takes a number\n
10356
10:04:18,002 --> 10:04:21,394
do you want the operating system to\n
10357
10:04:21,394 --> 10:04:23,311
It's going to find it\nand it's going to return
10358
10:04:23,311 --> 10:04:26,915
to you the address of the first byte of\n
10359
10:04:26,915 --> 10:04:29,582
and then you can do anything you\nwant with that chunk of memory.
10360
10:04:29,582 --> 10:04:31,112
free is going to do the opposite.
10361
10:04:31,112 --> 10:04:33,932
When you're done using a chunk of\n
10362
10:04:33,932 --> 10:04:37,561
you can say free it, and that means you\n
10363
10:04:37,561 --> 10:04:40,781
and then the operating system can\n
10364
10:04:40,781 --> 10:04:44,222
So this is actually evidence of\na common problem in programming.
10365
10:04:44,222 --> 10:04:48,671
If your Mac or your PC has ever been in\n
10366
10:04:48,671 --> 10:04:53,281
really slow, or it's slowing to a\n
10367
10:04:53,281 --> 10:04:56,281
one of the possible\nexplanations could be
10368
10:04:56,281 --> 10:04:59,161
that the program you're\nrunning by Apple or Microsoft
10369
10:04:59,161 --> 10:05:02,401
or whoever, maybe they're using\nmalloc or some equivalent
10370
10:05:02,402 --> 10:05:03,707
asking the operating system--
10371
10:05:03,707 --> 10:05:05,582
Mac OS or Windows-- for,\ngive me more memory.
10372
10:05:06,362 --> 10:05:07,741
The user is creating more images.
10373
10:05:07,741 --> 10:05:09,182
The user is typing a longer essay.
10374
10:05:09,182 --> 10:05:10,802
Give me more memory, more memory.
10375
10:05:10,802 --> 10:05:15,362
If the program has a bug and never\n
10376
10:05:15,362 --> 10:05:18,061
your computer might end up using\nall of the available memory
10377
10:05:18,061 --> 10:05:21,932
and honestly, humans are not very good\n
10378
10:05:21,932 --> 10:05:24,811
Very often programs, computers\njust freeze at that point
10379
10:05:24,811 --> 10:05:28,951
or get really, really slow because\n
10380
10:05:28,951 --> 10:05:31,112
when there's not enough memory left.
10381
10:05:31,112 --> 10:05:33,722
So one of the reasons for a\ncomputer really slowing down
10382
10:05:33,722 --> 10:05:37,995
might be calling for malloc a lot, or\n
10383
10:05:37,995 --> 10:05:40,412
Which is to say, you should\nalways use these two functions
10384
10:05:40,411 --> 10:05:43,991
in concert and free memory\nonce you are done with it.
10385
10:05:43,991 --> 10:05:48,121
So let me go ahead and do this in\n
10386
10:05:48,122 --> 10:05:50,162
Let me go ahead and do this.
10387
10:05:50,161 --> 10:05:53,851
Before I copy s into t using\nsomething like str copy
10388
10:05:53,851 --> 10:05:56,487
I first need to get a bunch\nof memory from the computer.
10389
10:05:56,487 --> 10:05:59,612
So to do that, let's make this super\n
10390
10:05:59,612 --> 10:06:03,182
so I'm going to change my strings\n
10391
10:06:03,182 --> 10:06:05,641
and what I technically\nam going to store in t
10392
10:06:05,641 --> 10:06:09,692
is the address of an\navailable chunk of memory.
10393
10:06:09,692 --> 10:06:13,891
To do that, I can ask the computer\nto allocate memory for me
10394
10:06:15,302 --> 10:06:18,542
If I want to create a copy\nof h-i exclamation point
10395
10:06:22,991 --> 10:06:27,252
Because I need the h, the i, the\n
10396
10:06:28,362 --> 10:06:30,522
It's up to me to understand\nthat and ask for it.
10397
10:06:30,521 --> 10:06:32,051
It's not going to happen magically.
10398
10:06:32,052 --> 10:06:35,962
Nothing does in C. So I could\njust naively type four there
10399
10:06:35,961 --> 10:06:38,862
and that would be correct\nif I type in h-i exclamation
10400
10:06:38,862 --> 10:06:42,792
point or any other three letter word\n
10401
10:06:42,792 --> 10:06:46,122
I should probably do\nsomething like strlen of s
10402
10:06:46,122 --> 10:06:49,692
plus 1 for the additional\nnull character.
10403
10:06:49,692 --> 10:06:52,182
Recall that string length\ndoes it in the English sense--
10404
10:06:52,182 --> 10:06:56,351
it returns the length of the string\n
10405
10:06:56,351 --> 10:06:58,601
the fact that I'm going\nto need that backslash zero.
10406
10:06:58,601 --> 10:07:00,972
Now let me do this old\nschool style first.
10407
10:07:00,972 --> 10:07:05,711
Let me go ahead and manually\ncopy the string s into t first.
10408
10:07:05,711 --> 10:07:13,572
So for int i equals 0, i is less than\n
10409
10:07:13,572 --> 10:07:18,521
Then inside my for loop, I'm going\n
10410
10:07:18,521 --> 10:07:22,572
i, but actually I want\nthe null character too
10411
10:07:22,572 --> 10:07:25,362
so I want to do the length\nof the string plus 1 more
10412
10:07:25,362 --> 10:07:28,031
and heck, I think I learned\nan optimization last time.
10413
10:07:28,031 --> 10:07:30,491
If I'm doing this again\nand again, I could really
10414
10:07:30,491 --> 10:07:36,222
do n equals strlen of s plus 1\nand then do i is less than n
10415
10:07:36,222 --> 10:07:38,722
just as a nice design optimization.
10416
10:07:38,722 --> 10:07:41,891
I think this for loop will\nactually handle the process, then
10417
10:07:41,891 --> 10:07:48,701
of copying every character from s into\n
10418
10:07:48,701 --> 10:07:52,031
Or I could get rid of all of that\n
10419
10:07:52,031 --> 10:07:56,201
is to use str copy, which takes as\n
10420
10:07:56,201 --> 10:07:58,661
and its second argument the source.
10421
10:07:58,661 --> 10:08:03,641
So copy from right to left in this case,\n
10422
10:08:03,641 --> 10:08:06,591
automatically for me as well.
10423
10:08:08,781 --> 10:08:10,762
I can now capitalize safely.
10424
10:08:10,762 --> 10:08:14,802
The first character in t, which\n
10425
10:08:14,802 --> 10:08:18,802
than s, and then I can print them both\n
10426
10:08:19,811 --> 10:08:22,691
So make copy-- all right,\nwhat did I do wrong?
10427
10:08:22,692 --> 10:08:25,781
Implicitly declaring library\nfunction malloc dot, dot, dot.
10428
10:08:25,781 --> 10:08:28,421
So we've seen this kind of error before.
10429
10:08:28,421 --> 10:08:31,511
What is-- even if you don't\nknow quite how to solve it
10430
10:08:31,512 --> 10:08:33,042
what's the essence of the solution?
10431
10:08:33,042 --> 10:08:36,072
What do I need to do to fix this\n
10432
10:08:36,072 --> 10:08:38,631
declaring a library function?
10433
10:08:41,572 --> 10:08:42,921
I need to include the library.
10434
10:08:42,921 --> 10:08:46,911
And I could look this up in the manual,\n
10435
10:08:47,722 --> 10:08:49,822
There's another library\nwe'll occasionally
10436
10:08:49,822 --> 10:08:51,921
need now called standard lib--
10437
10:08:51,921 --> 10:08:56,031
standard library-- that contains\nmalloc and free prototypes
10438
10:08:57,381 --> 10:09:00,421
All right, let me just clear this\n
10439
10:09:00,421 --> 10:09:06,322
Now I'm good. ./copy, Enter, All right.\n
10440
10:09:06,322 --> 10:09:10,131
t and s now come back as intended.
10441
10:09:10,131 --> 10:09:15,322
s is untouched, it would seem,\nbut t is now capitalized.
10442
10:09:15,322 --> 10:09:18,711
Any questions, then, on\nwhat we just did in code?
10443
10:09:20,533 --> 10:09:23,942
AUDIENCE: You said that\nmalloc and free go together.
10444
10:09:28,411 --> 10:09:30,453
There's a few improvements\nI want to make, so let
10445
10:09:30,453 --> 10:09:32,011
me actually do those right now.
10446
10:09:32,012 --> 10:09:35,042
Technically, I should practice what\n
10447
10:09:35,042 --> 10:09:37,459
when I'm done with t, free t.
10448
10:09:37,459 --> 10:09:39,542
Fortunately, I don't have\nto worry about how big t
10449
10:09:39,542 --> 10:09:43,052
was-- the computer remembers how many\n
10450
10:09:43,052 --> 10:09:44,732
all of them, not just the first.
10451
10:09:46,442 --> 10:09:49,112
I don't need to do free\ns, and I shouldn't
10452
10:09:49,112 --> 10:09:52,052
because that is handled\nautomatically by the CS50 library.
10453
10:09:52,052 --> 10:09:54,452
s, recall, came from\nget_string, and we actually
10454
10:09:54,451 --> 10:09:56,829
have some fancy code in\nplace that makes sure
10455
10:09:56,830 --> 10:09:58,622
that at the end of your\nprogram's execution
10456
10:09:58,622 --> 10:10:01,682
we free any memory that we\nallocated so we don't actually
10457
10:10:01,682 --> 10:10:03,616
waste memory like I described earlier.
10458
10:10:03,616 --> 10:10:05,491
But there's actually a\ncouple of other things
10459
10:10:05,491 --> 10:10:07,991
if I really want to be\npedantic I should put in here.
10460
10:10:07,991 --> 10:10:11,432
It turns out that\nsometimes malloc can fail
10461
10:10:11,432 --> 10:10:14,169
and sometimes malloc doesn't\nhave enough memory available
10462
10:10:14,169 --> 10:10:15,961
because maybe your\ncomputer's doing so much
10463
10:10:15,961 --> 10:10:18,061
stuff there's just no\nmore RAM available.
10464
10:10:18,061 --> 10:10:20,341
So technically, I should\ndo something like this--
10465
10:10:20,341 --> 10:10:24,901
if t equals equals NULL,\nwith two L's today
10466
10:10:24,902 --> 10:10:28,112
then I should just return 1 or something\n
10467
10:10:28,112 --> 10:10:29,987
I should probably print\nan error message too
10468
10:10:29,987 --> 10:10:31,662
but for now I'm going to keep it simple.
10469
10:10:31,661 --> 10:10:33,886
I should also probably check this.
10470
10:10:33,887 --> 10:10:36,211
This is a little risky of me.
10471
10:10:36,211 --> 10:10:40,872
If I'm doing t bracket zero, this is\n
10472
10:10:40,872 --> 10:10:43,592
But what if the human just\nhit Enter at the prompt
10473
10:10:43,591 --> 10:10:46,752
and didn't even type h, let\nalone h-i exclamation point?
10474
10:10:46,752 --> 10:10:48,991
What if there is no t bracket zero?
10475
10:10:48,991 --> 10:10:54,542
So technically, what I should probably\n
10476
10:10:54,542 --> 10:11:00,482
is at least greater than zero,\n
10477
10:11:01,802 --> 10:11:04,092
And then at the very\nend if all goes well
10478
10:11:04,091 --> 10:11:08,201
I can return zero, thereby signifying\n
10479
10:11:08,201 --> 10:11:12,072
So yes, these two functions, malloc\n
10480
10:11:12,072 --> 10:11:17,012
And so if you call malloc you\nshould call free eventually.
10481
10:11:17,012 --> 10:11:22,617
But you did not call malloc for s,\n
10482
10:11:23,491 --> 10:11:24,658
AUDIENCE: Here's a question.
10483
10:11:24,658 --> 10:11:26,940
Why do we do malloc plus 1?
10484
10:11:26,940 --> 10:11:28,732
DAVID J. MALAN: Why\ndid I do malloc plus 1?
10485
10:11:28,732 --> 10:11:31,642
So malloc-- sorry, malloc\nof string length of s
10486
10:11:31,641 --> 10:11:35,264
plus 1-- the string length is the\n
10487
10:11:35,264 --> 10:11:36,472
would perceive it in English.
10488
10:11:36,472 --> 10:11:39,472
So h-i exclamation\npoint-- strlen gives me 3
10489
10:11:39,472 --> 10:11:43,162
but I know now as of last week and\n
10490
10:11:43,161 --> 10:11:45,111
and a string always has an extra byte.
10491
10:11:45,112 --> 10:11:47,662
The onus is on me to\nunderstand and apply
10492
10:11:47,661 --> 10:11:52,371
that lesson learned so that I actually\n
10493
10:11:53,991 --> 10:11:59,661
And here's just an annoying thing when\n
10494
10:11:59,661 --> 10:12:03,711
week, it turns out that\nN-U-L-L is the same idea.
10495
10:12:03,711 --> 10:12:06,891
It's also zero, but it's zero\nin the context of pointers.
10496
10:12:06,891 --> 10:12:11,122
So long story short, you never\n
10497
10:12:11,122 --> 10:12:12,412
and we saw it on the screen.
10498
10:12:12,411 --> 10:12:17,991
You will start writing N-U-L-L when you\n
10499
10:12:19,042 --> 10:12:20,452
And what I mean by that is this.
10500
10:12:20,451 --> 10:12:23,331
If malloc fails and there's just\nnot enough memory left inside
10501
10:12:23,332 --> 10:12:26,632
of the computer for you, it's\ngot to return a special value
10502
10:12:26,631 --> 10:12:30,561
and that special value is\nN-U-L-L in all capital letters.
10503
10:12:30,561 --> 10:12:32,182
That signifies something went wrong.
10504
10:12:32,182 --> 10:12:37,131
Do not trust that I'm giving\nyou a useful return value.
10505
10:12:37,131 --> 10:12:40,752
Other questions on\nthese copies thus far?
10506
10:12:46,841 --> 10:12:48,091
DAVID J. MALAN: Good question.
10507
10:12:48,091 --> 10:12:49,981
Will str copy not work without malloc?
10508
10:12:49,982 --> 10:12:53,252
You kind of need both in\nthis case because str copy
10509
10:12:53,252 --> 10:12:56,641
by definition-- if I pull up its\n
10510
10:12:56,641 --> 10:12:58,622
to put the copied characters.
10511
10:12:58,622 --> 10:13:01,682
It's not sufficient just to\nsay char star t semicolon.
10512
10:13:01,682 --> 10:13:03,122
That only gives you a pointer.
10513
10:13:03,122 --> 10:13:06,062
But I need another\nchunk of memory that's
10514
10:13:06,061 --> 10:13:10,171
just as big as h-i exclamation\npoint backslash zero
10515
10:13:10,171 --> 10:13:12,631
so malloc gives me a\nwhole bunch of memory
10516
10:13:12,631 --> 10:13:16,921
and then str copy fills it with h-i\n
10517
10:13:16,921 --> 10:13:19,381
So again, that's why we're\ngoing down to this lower level
10518
10:13:19,381 --> 10:13:21,423
because once you understand\nwhat needs to be done
10519
10:13:21,423 --> 10:13:23,292
you now have the functions to do it.
10520
10:13:23,292 --> 10:13:25,332
So let's actually consider\nwhat we just solved.
10521
10:13:25,332 --> 10:13:29,192
So in this next version of the program\n
10522
10:13:29,192 --> 10:13:32,702
t was initialized to the\nreturn value of malloc
10523
10:13:32,701 --> 10:13:34,741
and maybe the memory that\nI got back was here--
10524
10:13:38,341 --> 10:13:40,651
I've left it blank\ninitially because nothing
10525
10:13:40,652 --> 10:13:42,362
is put there automatically by malloc.
10526
10:13:42,362 --> 10:13:46,472
I just get a chunk of memory that\n
10527
10:13:46,472 --> 10:13:51,391
I then assign t to that return value,\n
10528
10:13:51,391 --> 10:13:53,222
Notice there's no backslash zero.
10529
10:13:53,222 --> 10:13:56,101
This is not yet a string\nit's just a chunk of memory--
10530
10:13:56,101 --> 10:13:58,231
four bytes-- an array of four bytes.
10531
10:13:58,232 --> 10:14:01,802
What str copy eventually did\nfor me was it copied the h over
10532
10:14:01,802 --> 10:14:06,032
the i over, the exclamation point\nover, and the backslash zero.
10533
10:14:06,031 --> 10:14:09,902
And if I didn't want to use str copy or\n
10534
10:14:09,902 --> 10:14:14,062
would have done exactly the same thing.
10535
10:14:14,061 --> 10:14:19,178
Any questions, then,\non these examples here?
10536
10:14:28,491 --> 10:14:29,741
DAVID J. MALAN: Good question.
10537
10:14:29,741 --> 10:14:34,091
After malloc, if I had then\nstill done just t equals s
10538
10:14:34,091 --> 10:14:37,211
it actually would have recreated\nthe same original problem
10539
10:14:37,211 --> 10:14:40,932
by just copying 0x123 from s into t.
10540
10:14:40,932 --> 10:14:44,112
So then I would have been left with\n
10541
10:14:44,112 --> 10:14:48,072
steps ago, I would have-- and\nI can't quite do it live--
10542
10:14:48,072 --> 10:14:50,381
this arrow, if I did\nwhat you just described
10543
10:14:50,381 --> 10:14:54,358
would now be pointing over here and so\n
10544
10:14:54,358 --> 10:14:56,441
the problem, I would have\njust additionally wasted
10545
10:14:56,442 --> 10:14:59,502
four bytes temporarily that\nI'm not actually using.
10546
10:15:06,222 --> 10:15:08,180
do you always use malloc\nand str copy together?
10547
10:15:08,955 --> 10:15:10,872
These are both solving\ntwo different problems.
10548
10:15:10,872 --> 10:15:15,132
malloc's giving me enough memory to\n
10549
10:15:15,131 --> 10:15:18,941
However, you could actually use an\n
10550
10:15:18,942 --> 10:15:22,272
and you could use str copy on that, and\n
10551
10:15:22,271 --> 10:15:24,432
But thus far, it's a\nreasonable mental model
10552
10:15:24,432 --> 10:15:26,652
to have that if you\nwant to copy strings
10553
10:15:26,652 --> 10:15:30,281
you use malloc and then str\ncopy, or your own homegrown loop.
10554
10:15:42,531 --> 10:15:44,730
DAVID J. MALAN: Say that once more.
10555
10:15:55,531 --> 10:15:58,801
str copy, per its documentation,\nwill copy the whole string
10556
10:15:58,802 --> 10:16:01,022
plus the null character at the end.
10557
10:16:01,021 --> 10:16:03,481
It just assumes there will be one there.
10558
10:16:03,482 --> 10:16:07,652
It's therefore up to you to pass str\n
10559
10:16:08,641 --> 10:16:10,832
If I only ask malloc\nfor three bytes, that
10560
10:16:10,832 --> 10:16:12,902
could have potentially\ncreated a memory problem
10561
10:16:12,902 --> 10:16:16,262
whereby str copy would just still\nblindly copy one, two, three
10562
10:16:16,262 --> 10:16:19,802
four bytes, but technically it should\n
10563
10:16:19,802 --> 10:16:22,652
You do not yet have access to the\n
10564
10:16:22,652 --> 10:16:24,902
because you never asked malloc for it.
10565
10:16:26,822 --> 10:16:29,822
AUDIENCE: So the number inside\n
10566
10:16:30,182 --> 10:16:32,057
The number inside malloc--\nit's one argument.
10567
10:16:32,057 --> 10:16:35,084
It's the number of bytes you want back.
10568
10:16:35,084 --> 10:16:38,402
AUDIENCE: Does that mean you\nhave to remember [INAUDIBLE]?
10569
10:16:41,158 --> 10:16:43,491
DAVID J. MALAN: Yes, the onus\nis on you, the programmer
10570
10:16:43,491 --> 10:16:45,658
to remember or frankly, use\na function to figure out
10571
10:16:45,658 --> 10:16:47,182
how many bytes you actually need.
10572
10:16:47,182 --> 10:16:50,031
That's why I did not ultimately\ntype in four manually
10573
10:16:51,802 --> 10:16:55,192
So the plus 1 is necessary if you\n
10574
10:16:55,192 --> 10:16:57,832
but using strlen means\nthat I can actually
10575
10:16:57,832 --> 10:17:01,012
play around with any types of\ninputs and it will dynamically
10576
10:17:02,902 --> 10:17:05,182
So suffice it to say,\nthere's so many ways
10577
10:17:05,182 --> 10:17:07,292
already where you can\nstart to break programs.
10578
10:17:07,292 --> 10:17:10,747
Let's give you at least one tool for\n
10579
10:17:10,747 --> 10:17:12,622
And indeed, in upcoming\nproblem sets you will
10580
10:17:12,622 --> 10:17:14,722
use this to find bugs in your own code.
10581
10:17:14,722 --> 10:17:18,351
Not just using printf, not just using\n
10582
10:17:19,561 --> 10:17:22,731
So let me go ahead and deliberately\n
10583
10:17:22,732 --> 10:17:24,872
that has some memory-related errors.
10584
10:17:24,872 --> 10:17:30,262
Let me include stdio.h at the top and\n
10585
10:17:30,262 --> 10:17:31,912
so I have access to malloc now.
10586
10:17:31,911 --> 10:17:36,531
Let me do int main(void) and then\n
10587
10:17:36,531 --> 10:17:39,711
I want to allocate\nmaybe how about three--
10588
10:17:41,572 --> 10:17:43,552
Just for the sake of discussion.
10589
10:17:43,552 --> 10:17:48,082
So I'm going to go ahead and do malloc\n
10590
10:17:48,082 --> 10:17:51,368
I want three integers and\nan integer is four bytes
10591
10:17:51,368 --> 10:17:52,701
so technically I could do this--
10592
10:17:52,701 --> 10:17:57,211
3 times 4, or I could do 12 but again,\n
10593
10:17:57,211 --> 10:17:59,701
and if I run this program on\na slightly different computer
10594
10:17:59,701 --> 10:18:01,221
int might be a different size.
10595
10:18:01,222 --> 10:18:05,682
So the better way to do this would be\n
10596
10:18:05,682 --> 10:18:08,932
And this is just an operator you can use\n
10597
10:18:08,932 --> 10:18:10,972
on this computer, how big is an int?
10598
10:18:10,972 --> 10:18:13,652
How big is a float, or something else?
10599
10:18:13,652 --> 10:18:15,772
So that's going to give me that many--
10600
10:18:15,771 --> 10:18:18,171
that much memory for three ints.
10601
10:18:18,171 --> 10:18:20,182
What do I want to assign this to?
10602
10:18:20,182 --> 10:18:22,372
Well, malloc returns an address.
10603
10:18:22,372 --> 10:18:27,652
Pointers are addresses, so I'm going\n
10604
10:18:31,101 --> 10:18:33,682
This is a little less obvious,\nbut again go back to basics.
10605
10:18:33,682 --> 10:18:38,451
The right hand side here gives me a\n
10606
10:18:38,451 --> 10:18:42,021
malloc returns the address of\nthe first byte of that chunk.
10607
10:18:42,021 --> 10:18:44,151
How do I store the address of anything?
10608
10:18:45,052 --> 10:18:48,922
The syntax for today\nis type of data, star
10609
10:18:48,921 --> 10:18:53,991
where the type of data in question\n
10610
10:18:53,991 --> 10:18:57,891
Again, it's kind of purposeless, only\n
10611
10:18:57,891 --> 10:19:03,262
here, but this is equivalent now to\n
10612
10:19:03,262 --> 10:19:06,711
in total, presumably, so I\ncan technically now do this.
10613
10:19:06,711 --> 10:19:10,851
I can go into maybe the first\n
10614
10:19:12,271 --> 10:19:20,061
Second location, the number 73, and the\n
10615
10:19:20,061 --> 10:19:22,911
Now I've deliberately\nmade two mistakes here
10616
10:19:22,911 --> 10:19:26,061
because I'm trying to trip\nover my newfound understanding
10617
10:19:26,061 --> 10:19:28,641
or my greenness with\nunderstanding pointers.
10618
10:19:28,641 --> 10:19:32,002
One, I didn't remember that I\n
10619
10:19:33,112 --> 10:19:36,502
malloc essentially returns an array,\n
10620
10:19:36,502 --> 10:19:38,902
An array of three ints,\nor more technically
10621
10:19:38,902 --> 10:19:42,741
the address of a chunk of memory\nthat could fit three ints.
10622
10:19:42,741 --> 10:19:46,042
So I can use my square bracket\n
10623
10:19:46,042 --> 10:19:48,991
and use pointer arithmetic, but\n
10624
10:19:48,991 --> 10:19:50,841
But I have made two mistakes.
10625
10:19:50,841 --> 10:19:54,441
I did not start indexing\nat zero, so line seven
10626
10:19:54,442 --> 10:19:56,302
should have been x bracket zero.
10627
10:19:56,302 --> 10:19:59,174
Line eight should have been x\nbracket 1, and then line nine
10628
10:19:59,173 --> 10:20:00,381
should have been x bracket 2.
10629
10:20:01,591 --> 10:20:04,521
The second mistake that\nI've made as a side effect
10630
10:20:04,521 --> 10:20:07,581
is I'm also touching\nmemory that I shouldn't.
10631
10:20:07,582 --> 10:20:12,531
x bracket 3 would mean go to the\n
10632
10:20:13,341 --> 10:20:15,861
I only asked for enough\nmemory for three ints
10633
10:20:15,862 --> 10:20:19,101
not four, so this is what's\ncalled a buffer overflow.
10634
10:20:19,101 --> 10:20:22,191
I am accidentally, but\ndeliberately at the moment
10635
10:20:22,192 --> 10:20:26,312
going beyond the boundaries of\nthis array, this chunk of memory.
10636
10:20:26,311 --> 10:20:28,671
So bad things happen,\nbut not necessarily
10637
10:20:28,671 --> 10:20:30,002
by just running your program.
10638
10:20:30,002 --> 10:20:31,552
Let me go ahead and just try this.
10639
10:20:31,552 --> 10:20:37,372
Make memory, and you'll see here\nthat it compiles OK. ./memory
10640
10:20:37,372 --> 10:20:39,500
and it actually does\nnot segmentation fault
10641
10:20:39,500 --> 10:20:41,542
which comes back to that\npoint of nondeterminism.
10642
10:20:41,542 --> 10:20:43,912
Sometimes it does, sometimes it\ndoesn't-- it depends on how bad
10643
10:20:45,052 --> 10:20:48,219
But there's a program that can\nspot these kinds of mistakes
10644
10:20:48,218 --> 10:20:51,051
and I'm going to go ahead and expand\n
10645
10:20:51,052 --> 10:20:56,512
and I'm going to run not just ./memory,\n
10646
10:20:56,512 --> 10:20:59,362
This is a command that comes\nwith a lot of computer systems
10647
10:20:59,362 --> 10:21:02,432
that's designed to find\nmemory-related bugs in code.
10648
10:21:02,432 --> 10:21:04,372
So it's a new tool in\nyour toolkit today
10649
10:21:04,372 --> 10:21:06,472
and you'll use it with\nthe coming problem sets.
10650
10:21:07,671 --> 10:21:09,951
Its output, honestly, it's hideous.
10651
10:21:09,951 --> 10:21:13,341
But there's a few things\nthat will start to jump out
10652
10:21:13,341 --> 10:21:15,741
and will help you with\ntools and the problem
10653
10:21:15,741 --> 10:21:17,311
sets to see these kinds of things.
10654
10:21:21,832 --> 10:21:25,822
That's on memory.c line\nnine, per my highlights.
10655
10:21:25,822 --> 10:21:27,711
So let me go look at line nine.
10656
10:21:27,711 --> 10:21:31,372
In what sense is this an\ninvalid write of size four?
10657
10:21:31,372 --> 10:21:33,952
Well, I'm touching memory\nthat I shouldn't, and I'm
10658
10:21:33,951 --> 10:21:35,421
touching it as though it's an int.
10659
10:21:35,421 --> 10:21:37,911
And an int is four bytes-- size four.
10660
10:21:37,911 --> 10:21:41,191
So again, this takes some practice to\n
10661
10:21:41,192 --> 10:21:44,132
but this is now a clue\nfor me, the programmer
10662
10:21:44,131 --> 10:21:47,591
that not only did I screw up, but\nI screwed up related to memory
10663
10:21:47,591 --> 10:21:50,109
and so this is just a hint, if you will.
10664
10:21:50,110 --> 10:21:52,652
It's not going to necessarily\ntell you exactly how to fix it
10665
10:21:52,652 --> 10:21:56,491
you have to wrestle with\nthe semantics, but invalid
10666
10:21:56,491 --> 10:21:58,322
write of size four-- oh, OK.
10667
10:21:58,322 --> 10:22:02,682
So I should not have indexed\npast the boundary here.
10668
10:22:02,682 --> 10:22:05,381
All right, so I\nshouldn't have done that.
10669
10:22:05,381 --> 10:22:11,125
So let me go ahead then and change this\n
10670
10:22:11,125 --> 10:22:13,292
All right, so let me go\nahead and recompile my code.
10671
10:22:13,292 --> 10:22:19,622
Make memory, ./memory, still doesn't\n
10672
10:22:20,252 --> 10:22:26,461
Let me go ahead and run Valgrind\n
10673
10:22:26,461 --> 10:22:28,682
And now there's fewer scary--
10674
10:22:28,682 --> 10:22:32,201
less scary output now, but\nthere's still something in there.
10675
10:22:32,201 --> 10:22:35,728
Notice this-- 12 bytes in one blocks--
10676
10:22:35,728 --> 10:22:37,561
no regard for grammar\nthere-- are definitely
10677
10:22:37,561 --> 10:22:39,331
lost in lost record one of one.
10678
10:22:39,332 --> 10:22:42,972
Super cryptic, but this is hinting\nat a so-called memory leak.
10679
10:22:42,972 --> 10:22:46,802
The blocks of memory are lost in\n
10680
10:22:46,802 --> 10:22:48,242
I asked for them but I never--
10681
10:22:51,368 --> 10:22:53,701
And this is the arcane way\nof saying, you've screwed up.
10682
10:22:54,911 --> 10:22:57,181
So this is an easy fix, fortunately.
10683
10:22:57,182 --> 10:23:01,572
Once I'm done with this memory I\n
10684
10:23:01,572 --> 10:23:03,991
So now let me go ahead\nand rerun make memory
10685
10:23:03,991 --> 10:23:07,801
it still runs fine so all the while\n
10686
10:23:08,942 --> 10:23:10,622
But let me run Valgrind one more time.
10687
10:23:14,701 --> 10:23:16,891
All heap blocks were\nfreed, whatever that means.
10688
10:23:18,732 --> 10:23:21,842
And even though it's still a little\n
10689
10:23:21,841 --> 10:23:25,345
and in fact, it's pretty explicit--\n
10690
10:23:27,002 --> 10:23:30,192
So even though this is one of\nthe most arcane tools we'll use
10691
10:23:30,192 --> 10:23:32,702
it's also one of the most\npowerful because it can see things
10692
10:23:32,701 --> 10:23:36,031
that you, the human, might not, and\n
10693
10:23:36,031 --> 10:23:38,101
It does a much closer\nreading of your code
10694
10:23:38,101 --> 10:23:43,862
while it's running to figure\nout exactly what is going on.
10695
10:23:43,862 --> 10:23:46,141
Any questions, then, on this tool?
10696
10:23:46,141 --> 10:23:50,042
And we'll guide you after today\nwith actually using this, too.
10697
10:23:50,042 --> 10:23:52,561
Just helps you find\nmemory-related mistakes
10698
10:23:52,561 --> 10:23:55,381
that you might now be capable of making.
10699
10:23:55,381 --> 10:23:57,542
All right, let's do one\nother memory-related thing.
10700
10:23:57,542 --> 10:23:59,531
Let me shrink my terminal window here.
10701
10:23:59,531 --> 10:24:03,271
Let me create one other\nfile here called garbage.c.
10702
10:24:03,271 --> 10:24:06,781
It turns out there's a term of art\n
10703
10:24:06,781 --> 10:24:08,292
that we can reveal as follows.
10704
10:24:08,292 --> 10:24:11,281
Let me include stdio.h,\nand let me include--
10705
10:24:11,281 --> 10:24:14,822
how about stdlib.h, and\nthen let me give myself int
10706
10:24:14,822 --> 10:24:17,921
main(void), and then in this\nrelatively short program
10707
10:24:17,921 --> 10:24:20,822
let me give myself three\nints using last week's
10708
10:24:20,822 --> 10:24:24,781
notation, just int scores bracket\n
10709
10:24:24,781 --> 10:24:28,801
Then let me go ahead and do for\n
10710
10:24:28,802 --> 10:24:34,052
i plus plus, then let me go ahead\nand print out, %i backslash n
10711
10:24:38,851 --> 10:24:44,141
This code, pretty sure is going\n
10712
10:24:46,531 --> 10:24:51,061
I've forgotten a step even though the\n
10713
10:24:53,792 --> 10:24:56,281
Yeah, I didn't provide the\nscores, so I didn't actually
10714
10:24:56,281 --> 10:25:00,211
initialize the array called scores\n
10715
10:25:00,211 --> 10:25:03,752
What's curious about this, though,\n
10716
10:25:04,442 --> 10:25:08,402
Let me go ahead and playfully\nmake garbage, Enter
10717
10:25:08,402 --> 10:25:10,982
and it's an apt description\nbecause what I'm about to see
10718
10:25:10,982 --> 10:25:13,592
are so-called garbage values.
10719
10:25:13,591 --> 10:25:18,421
When you, the programmer, do not\n
10720
10:25:18,421 --> 10:25:21,239
values, sometimes, who knows\nwhat's going to be there.
10721
10:25:21,239 --> 10:25:23,072
The computer's been\ndoing some other things
10722
10:25:23,072 --> 10:25:26,521
there's a bit of work that happens even\n
10723
10:25:26,521 --> 10:25:29,761
so there might be remnants\nof past ints, chars, strings
10724
10:25:29,762 --> 10:25:32,402
floats-- anything else in\nthere and what you're seeing
10725
10:25:32,402 --> 10:25:38,022
is those garbage values, which is\n
10726
10:25:38,021 --> 10:25:40,961
as I just did, to initialize\nthe value of some variable.
10727
10:25:40,961 --> 10:25:42,961
And this is actually\npretty dangerous, and there
10728
10:25:42,961 --> 10:25:46,442
have been many examples of\nsoftware being compromised
10729
10:25:46,442 --> 10:25:49,622
because of one of these issues\n
10730
10:25:49,622 --> 10:25:53,972
and all of a sudden users, maybe people\n
10731
10:25:53,972 --> 10:25:57,842
applications, could suddenly see the\n
10732
10:25:58,951 --> 10:26:01,411
Maybe someone's password that\nhad been previously typed in
10733
10:26:01,411 --> 10:26:03,391
or some other value like\na credit card number
10734
10:26:03,391 --> 10:26:04,951
that had been previously typed in.
10735
10:26:04,951 --> 10:26:06,932
There are different\ndefense mechanisms in place
10736
10:26:06,932 --> 10:26:10,472
to generally make this not\nso likely, but it's certainly
10737
10:26:10,472 --> 10:26:13,531
very possible, at least\nin this kind of context
10738
10:26:13,531 --> 10:26:17,461
to see values that you\nprobably shouldn't because they
10739
10:26:17,461 --> 10:26:20,981
might be remnants from\nsomething else that used them.
10740
10:26:20,982 --> 10:26:25,062
So this is to say again, you have this\n
10741
10:26:25,061 --> 10:26:28,381
but also now you have this great\nhacking ability to poke around
10742
10:26:28,381 --> 10:26:31,801
the contents of memory, and this is\n
10743
10:26:31,802 --> 10:26:35,792
trying to find ways to exploit systems.
10744
10:26:40,432 --> 10:26:42,472
All right, let's go ahead and\ntake a quick five minute break
10745
10:26:42,472 --> 10:26:44,872
and when we come back, we'll\nbuild on these final topics.
10746
10:26:47,031 --> 10:26:50,841
First, just a little programmer\n
10747
10:26:50,841 --> 10:26:53,211
will make a little bit of sense to you.
10748
10:26:53,211 --> 10:26:57,682
And what we'll also do next to take a\n
10749
10:26:57,682 --> 10:27:00,862
animates with claymation, if you\n
10750
10:27:00,862 --> 10:27:03,862
exactly what happens now if you have\n
10751
10:27:03,862 --> 10:27:07,364
values are and how they get there, and\n
10752
10:27:07,364 --> 10:27:09,531
It's one thing just to print\nthem out as I just did
10753
10:27:09,531 --> 10:27:13,792
it's another if you actually mistake\n
10754
10:27:13,792 --> 10:27:17,241
because garbage values are just zeros\n
10755
10:27:17,241 --> 10:27:20,121
But if you use that new\ndereference operator, the star
10756
10:27:20,122 --> 10:27:24,472
and try to go to a garbage value\nthinking incorrectly that it's
10757
10:27:24,472 --> 10:27:26,872
a valid pointer, bad things can happen.
10758
10:27:26,872 --> 10:27:31,792
Computers can crash or more familiarly,\n
10759
10:27:31,792 --> 10:27:34,762
So allow me to introduce, if we\n
10760
10:27:34,762 --> 10:27:36,472
our friend Binky from Stanford.
10761
10:27:40,311 --> 10:27:41,901
SPEAKER 1: Hey Binky, wake up.
10762
10:27:48,544 --> 10:27:50,461
SPEAKER 1: Well, to get\nstarted, I guess we're
10763
10:27:50,461 --> 10:27:52,082
going to need a couple of pointers.
10764
10:27:52,082 --> 10:27:56,359
BINKY: OK, this code allocates two\n
10765
10:27:56,942 --> 10:28:00,549
Well, I see the two pointers, but they\n
10766
10:28:01,381 --> 10:28:03,511
Initially, pointers\ndon't point to anything.
10767
10:28:03,512 --> 10:28:06,542
The things they point to are called\n
10768
10:28:07,535 --> 10:28:08,702
SPEAKER 1: Oh, right, right.
10769
10:28:11,381 --> 10:28:13,711
So how do you allocate a pointee?
10770
10:28:13,711 --> 10:28:17,281
BINKY: OK, well this code\nallocates a new integer pointee
10771
10:28:17,281 --> 10:28:20,354
and this part sets x to point to it.
10772
10:28:20,355 --> 10:28:21,772
SPEAKER 1: Hey, that looks better.
10773
10:28:23,381 --> 10:28:26,771
BINKY: OK, I'll dereference the\npointer x to store the number
10774
10:28:28,902 --> 10:28:32,562
For this trick, I'll need my\nmagic wand of dereferencing.
10775
10:28:32,561 --> 10:28:35,951
SPEAKER 1: Your magic\nwand of dereferencing?
10776
10:28:37,802 --> 10:28:39,512
BINKY: This is what the code looks like.
10777
10:28:39,512 --> 10:28:42,307
I'll just set up the number and--
10778
10:28:44,531 --> 10:28:49,451
So doing a dereference on x follows\n
10779
10:28:49,451 --> 10:28:51,491
in this case to store 42 in there.
10780
10:28:51,491 --> 10:28:56,112
Hey, try using it to store the number\n
10781
10:28:57,252 --> 10:29:01,631
I'll just go over here to y\nand get the number 13 set up
10782
10:29:01,631 --> 10:29:06,161
and then take the wand of\ndereferencing and just--
10783
10:29:07,241 --> 10:29:09,461
SPEAKER 1: Oh hey, that didn't work.
10784
10:29:09,461 --> 10:29:13,182
Say, Binky, I don't think\ndereferencing y is a good idea
10785
10:29:13,182 --> 10:29:16,377
because setting up the\npointee is a separate step
10786
10:29:16,377 --> 10:29:18,912
and I don't think we ever did it.
10787
10:29:19,961 --> 10:29:22,391
SPEAKER 1: Yeah, we\nallocated the pointer y
10788
10:29:22,391 --> 10:29:25,631
but we never set it\nto point to a pointee.
10789
10:29:26,800 --> 10:29:28,842
SPEAKER 1: Hey, you're\nlooking good there, Binky.
10790
10:29:28,841 --> 10:29:31,721
Can you fix it so that y points\nto the same pointee as x?
10791
10:29:31,722 --> 10:29:35,082
BINKY: Sure, I'll use my magic\nwand of pointer assignment.
10792
10:29:35,082 --> 10:29:37,332
SPEAKER 1: Is that going to\nbe a problem, like before?
10793
10:29:37,332 --> 10:29:39,222
BINKY: No, this doesn't\ntouch the pointees
10794
10:29:39,222 --> 10:29:42,851
it just changes one pointer to\n
10795
10:29:43,872 --> 10:29:46,542
Now y points to the same place as x.
10796
10:29:48,432 --> 10:29:51,491
It has a pointee so you can try\nthe wand of dereferencing again
10797
10:29:56,434 --> 10:29:57,641
SPEAKER 1: Hey, look at that.
10798
10:29:57,641 --> 10:29:59,472
Now dereferencing works on y.
10799
10:29:59,472 --> 10:30:03,522
And because the pointers are sharing\n
10800
10:30:05,232 --> 10:30:07,272
So are we going to switch places now?
10801
10:30:07,271 --> 10:30:09,191
SPEAKER 1: Oh look, we're out of time.
10802
10:30:10,311 --> 10:30:12,531
That's from our friend\nNick Parlante at Stanford.
10803
10:30:12,531 --> 10:30:14,871
So let's consider what\nNick did here as Binky.
10804
10:30:14,872 --> 10:30:16,942
So here is all the code together.
10805
10:30:16,942 --> 10:30:20,619
These first couple of lines were not\n
10806
10:30:20,618 --> 10:30:21,951
they move the stars to the left.
10807
10:30:22,701 --> 10:30:25,612
Again, more conventional\nmight be this syntax here.
10808
10:30:26,822 --> 10:30:30,141
It's OK to create\nvariables, even pointers
10809
10:30:30,141 --> 10:30:33,771
and not assign them a value initially\n
10810
10:30:33,771 --> 10:30:36,291
So we eventually do\nhere, with this line.
10811
10:30:36,292 --> 10:30:39,351
We assign to x the return\nvalue of malloc, which
10812
10:30:39,351 --> 10:30:41,182
is presumably the address of something.
10813
10:30:41,182 --> 10:30:44,432
To be fair, we should really\nbe checking for null as well
10814
10:30:44,432 --> 10:30:46,351
but that's not the biggest problem here.
10815
10:30:46,351 --> 10:30:48,841
The biggest problem is\nnot even this next line
10816
10:30:48,841 --> 10:30:54,591
which means go to the memory location\n
10817
10:30:54,591 --> 10:30:56,811
That's fine, because\nagain, malloc returns
10818
10:30:56,811 --> 10:30:59,061
the address of some chunk of memory.
10819
10:30:59,061 --> 10:31:01,161
This chunk of memory is\nbig enough for an int.
10820
10:31:01,161 --> 10:31:04,072
x is therefore going to store\nthe address of that chunk that's
10821
10:31:05,031 --> 10:31:08,902
Star x recalls the dereference\n
10822
10:31:10,701 --> 10:31:13,822
It's like going to the mailbox\nand putting the number 42 in it
10823
10:31:13,822 --> 10:31:16,732
instead of taking the number\n50 out, like we did before.
10824
10:31:18,411 --> 10:31:21,651
This is where Binky lost\nhis head, so to speak.
10825
10:31:24,042 --> 10:31:26,042
AUDIENCE: We haven't yet\nallocated space for it.
10826
10:31:26,591 --> 10:31:28,502
We haven't yet allocated space for y.
10827
10:31:28,502 --> 10:31:31,412
There's no mention of malloc,\nthere's no assignment of y
10828
10:31:32,951 --> 10:31:35,801
So this would be, go\nto the address in y
10829
10:31:35,802 --> 10:31:39,192
but if there is no known address in\n
10830
10:31:39,192 --> 10:31:42,122
which means go to some random address\n
10831
10:31:42,932 --> 10:31:47,582
that might cause what we've seen in the\n
10832
10:31:47,582 --> 10:31:49,472
Now this, fortunately,\nis the kind of thing
10833
10:31:49,472 --> 10:31:53,402
that if you don't quite have the eye\n
10834
10:31:53,402 --> 10:31:55,272
could help you find as well.
10835
10:31:55,271 --> 10:31:59,041
But it's just another example of\n
10836
10:31:59,042 --> 10:32:02,472
of having control now\nover memory at this level.
10837
10:32:02,972 --> 10:32:04,805
Well, let's go ahead\nand do one other thing.
10838
10:32:04,805 --> 10:32:07,947
Considering from last week\nthat this notion of swapping
10839
10:32:07,947 --> 10:32:09,572
was actually a really common operation.
10840
10:32:09,572 --> 10:32:12,572
We had all of our volunteers come\n
10841
10:32:12,572 --> 10:32:14,942
during bubble sort and\neven selection sort
10842
10:32:14,942 --> 10:32:17,042
and we just took for\ngranted that the two
10843
10:32:17,042 --> 10:32:18,974
humans would swap themselves just fine.
10844
10:32:18,974 --> 10:32:21,182
But there needs to be code\nto do that if you actually
10845
10:32:21,182 --> 10:32:24,999
implement bubble sort, selection sort,\n
10846
10:32:24,999 --> 10:32:26,582
So let's consider some code like this.
10847
10:32:26,582 --> 10:32:28,652
We'll keep it simple\nlike last week, and where
10848
10:32:28,652 --> 10:32:35,700
we wanted to swap some values like\n
10849
10:32:35,699 --> 10:32:38,491
Void because I'm not going to return\n
10850
10:32:39,391 --> 10:32:44,701
So here, for instance,\nmight be some code for this.
10851
10:32:44,701 --> 10:32:45,909
But why is it so complicated?
10852
10:32:45,910 --> 10:32:47,494
Here, let's actually take a step back.
10853
10:32:48,661 --> 10:32:50,281
I think we have time\nfor one more volunteer.
10854
10:32:50,281 --> 10:32:51,739
Could we get someone to come on up?
10855
10:32:51,739 --> 10:32:54,031
You have to be comfy\non camera and you're
10856
10:32:54,031 --> 10:32:57,061
being asked to help with your-- oh,\n
10857
10:32:57,061 --> 10:33:01,002
So whoever has their\nfriend doing this here--
10858
10:33:01,982 --> 10:33:03,872
Now they're pointing it over here.
10859
10:33:03,872 --> 10:33:05,612
Now, literally an arm is being twisted.
10860
10:33:25,002 --> 10:33:27,078
Who were you trying to volunteer?
10861
10:33:29,332 --> 10:33:33,652
So here we have for Mariana two\n
10862
10:33:33,652 --> 10:33:35,182
just so that they're super obvious.
10863
10:33:35,182 --> 10:33:37,586
And suppose that the problem\nat hand, like last week
10864
10:33:37,586 --> 10:33:40,461
is just to swap two values, as\n
10865
10:33:40,461 --> 10:33:42,472
two people and we want to swap them.
10866
10:33:42,472 --> 10:33:45,862
But let's consider these glasses\n
10867
10:33:45,862 --> 10:33:47,572
in an array, and you know what?
10868
10:33:47,572 --> 10:33:50,042
I'd really like you to swap the values.
10869
10:33:50,042 --> 10:33:53,601
So orange has to go in there,\nand purple has to go in there.
10870
10:33:54,555 --> 10:33:56,722
And we'll see if we can\nthen translate that to code.
10871
10:34:02,472 --> 10:34:04,932
So presumably, you're\nstruggling mentally
10872
10:34:04,932 --> 10:34:08,141
with how you would do this without\n
10873
10:34:08,682 --> 10:34:11,552
Let me go ahead and we do have a\n
10874
10:34:11,552 --> 10:34:14,052
So if I hand you this, how would\nyou now solve this problem?
10875
10:34:16,542 --> 10:34:18,292
AUDIENCE: I would go\nlike that, but it's--
10876
10:34:18,292 --> 10:34:18,942
DAVID J. MALAN: No, that's--
10877
10:34:20,232 --> 10:34:23,342
Go do it-- go with your instincts.
10878
10:34:26,042 --> 10:34:28,171
Go to whatever your instincts are.
10879
10:34:34,561 --> 10:34:37,188
Yeah, so a little-- so\nstrictly speaking, probably
10880
10:34:37,188 --> 10:34:39,271
shouldn't have moved the\nglasses just because that
10881
10:34:39,271 --> 10:34:41,291
would be like moving\nthe array locations
10882
10:34:41,292 --> 10:34:43,972
so let's actually do it one\nmore time but the glasses now
10883
10:34:43,972 --> 10:34:45,722
have to go back where\nthey originally are.
10884
10:34:45,722 --> 10:34:50,412
So how would you swap these now,\nusing this temporary variable?
10885
10:34:51,836 --> 10:34:54,461
Otherwise we'd be completely\nuprooting the array, for instance
10886
10:34:54,461 --> 10:34:56,442
by just physically moving it around.
10887
10:34:56,442 --> 10:34:58,932
So you moved the orange into\nthis temporary variable
10888
10:34:58,932 --> 10:35:01,271
then you copied the purple\ninto where the orange was
10889
10:35:01,271 --> 10:35:03,641
and now, presumably, excellent.
10890
10:35:03,641 --> 10:35:06,461
The orange is going to end\nup where the purple once was
10891
10:35:06,461 --> 10:35:08,981
and this temporary variable,\nit stored up some extra memory.
10892
10:35:08,982 --> 10:35:11,802
It was necessary at the time,\nbut not necessary, ultimately.
10893
10:35:11,802 --> 10:35:17,492
But a round of applause if we could,\n
10894
10:35:17,491 --> 10:35:21,671
So the fact that it\ninstantly occurred to Mariana
10895
10:35:21,671 --> 10:35:25,072
that you need some temporary variable\n
10896
10:35:25,072 --> 10:35:28,311
and in fact this code here,\nthat we might glimpse now
10897
10:35:28,311 --> 10:35:30,398
is reminiscent of\nexactly that algorithm
10898
10:35:30,398 --> 10:35:33,231
where A and B, at the end of the\n
10899
10:35:33,232 --> 10:35:35,242
Just like the second\ntime, the two glasses
10900
10:35:35,241 --> 10:35:37,641
have to kind of stay put, even\n
10901
10:35:37,641 --> 10:35:39,391
but they're going back\nto where they were
10902
10:35:39,391 --> 10:35:41,391
is kind of like having\ntwo values, A and B
10903
10:35:41,391 --> 10:35:44,451
and you just have a temporary\nvariable into which you copy A
10904
10:35:44,451 --> 10:35:47,691
then you change A with\nB, then you go and change
10905
10:35:47,692 --> 10:35:50,632
B with whatever the\noriginal value of A was
10906
10:35:50,631 --> 10:35:55,281
because you temporarily stored it\n
10907
10:35:55,281 --> 10:35:59,521
Unfortunately, this code doesn't\nnecessarily work as intended.
10908
10:35:59,521 --> 10:36:02,752
So let me go over to my\nVS Code here and open up
10909
10:36:02,752 --> 10:36:06,021
a program called swap.c,\nand in swap.c, let
10910
10:36:06,021 --> 10:36:11,002
me whip up something really quickly\n
10911
10:36:12,921 --> 10:36:18,112
Inside of main let me do something\nlike x gets 1 and y gets 2.
10912
10:36:18,112 --> 10:36:23,241
Let me just print out as a\nvisual confirmation that x is %i
10913
10:36:23,241 --> 10:36:28,252
y is %i backslash n, plugging\nin x and y, respectively.
10914
10:36:28,252 --> 10:36:31,432
Then let me call a swap function\n
10915
10:36:31,432 --> 10:36:38,122
Swap x and y. And then let me print out\n
10916
10:36:38,122 --> 10:36:41,692
just to print out again what they are,\n
10917
10:36:41,692 --> 10:36:44,855
2 first, then 2, 1 the second time.
10918
10:36:44,855 --> 10:36:46,522
Now how is swap going to be implemented?
10919
10:36:46,521 --> 10:36:49,951
Let me implement it exactly\nas on the screen a moment ago.
10920
10:36:52,372 --> 10:36:54,862
or let's call it int A\nfor consistency, int B.
10921
10:36:54,862 --> 10:36:57,022
But I could always call\nthose anything I want.
10922
10:36:57,021 --> 10:37:01,252
Int tmp gets A, A gets B, B gets tmp.
10923
10:37:01,252 --> 10:37:04,341
So exactly as I proposed\na moment ago, and exactly
10924
10:37:04,341 --> 10:37:08,121
as Mariana really implemented\nit using these glasses of water.
10925
10:37:08,122 --> 10:37:11,932
I need to now include my prototype,\n
10926
10:37:11,932 --> 10:37:15,622
And I'll just copy/paste that up here,\n
10927
10:37:15,622 --> 10:37:18,832
So make swap-- so far, so good-- swap--
10928
10:37:18,832 --> 10:37:23,692
x is now 1, y is 2, x is 1, y is 2.
10929
10:37:23,692 --> 10:37:29,452
So there seems to be a bit of a\nbug here, but why might this be?
10930
10:37:29,451 --> 10:37:33,291
This code does not in fact work, even\n
10931
10:37:35,086 --> 10:37:41,600
AUDIENCE: Because A and B have different\n
10932
10:37:41,599 --> 10:37:43,391
DAVID J. MALAN: Good,\nand let me summarize.
10933
10:37:43,391 --> 10:37:46,722
A and B do indeed have\ndifferent addresses than x and y
10934
10:37:46,722 --> 10:37:50,322
and in fact what happens when you\n
10935
10:37:50,322 --> 10:37:54,582
calling swap, passing in x and\ny, you are calling a function
10936
10:37:56,211 --> 10:37:57,972
And this is a term of\nart that just means
10937
10:37:57,972 --> 10:38:02,682
you are passing in copies of x and\n
10938
10:38:02,682 --> 10:38:06,912
A and B in the context of this\n
10939
10:38:06,911 --> 10:38:10,811
Now technically, these\nnames are local only.
10940
10:38:10,811 --> 10:38:13,572
I could have called this x,\nI could have called this y
10941
10:38:13,572 --> 10:38:17,891
I could have changed this to x,\n
10942
10:38:17,891 --> 10:38:19,391
The problem would still remain.
10943
10:38:19,391 --> 10:38:23,322
Just because you use the same names\n
10944
10:38:23,322 --> 10:38:24,911
that doesn't mean they're the same.
10945
10:38:24,911 --> 10:38:26,481
They just look the same to you.
10946
10:38:26,482 --> 10:38:31,182
But indeed, swap is going to get copies\n
10947
10:38:33,822 --> 10:38:36,161
x and y will be copies of the original.
10948
10:38:36,161 --> 10:38:38,501
So for clarity, let me\nrevert this back to A and B
10949
10:38:38,502 --> 10:38:42,311
just to make super clear that they're\n
10950
10:38:42,311 --> 10:38:44,261
but there's indeed a problem there.
10951
10:38:44,262 --> 10:38:46,402
This function actually works fine.
10952
10:38:47,722 --> 10:38:52,281
Let me go ahead and print out\ninside of this. printf A is %i
10953
10:38:52,281 --> 10:38:56,351
B is %i backslash n, and\nthen I'll print A and B.
10954
10:38:56,351 --> 10:38:59,561
And let me do that same thing at the\n
10955
10:39:02,112 --> 10:39:06,101
Make swap, ./swap,\nand this is promising.
10956
10:39:06,101 --> 10:39:12,731
Initially, x is 1, y is 2, A\nis 1, B is 2, A is 2, B is 1
10957
10:39:12,732 --> 10:39:14,959
but then nope-- x is 1, y is 2.
10958
10:39:14,959 --> 10:39:17,292
So if anything, I've confirmed\nthat the logic is right--
10959
10:39:17,292 --> 10:39:20,412
Mariana's logic is right, but\nthere's something about C.
10960
10:39:20,411 --> 10:39:24,281
There's something about using one\n
10961
10:39:26,031 --> 10:39:30,381
The fact that I'm passing in copies of\n
10962
10:39:30,381 --> 10:39:31,752
So what in fact is going on?
10963
10:39:31,752 --> 10:39:34,572
Well again, inside of your computer's\n
10964
10:39:34,572 --> 10:39:36,447
and we've been talking\nabout them abstractly
10965
10:39:36,447 --> 10:39:38,502
it's just this grid of memory locations.
10966
10:39:38,502 --> 10:39:41,703
It turns out that your\ncomputer uses this memory
10967
10:39:41,703 --> 10:39:42,911
in a pretty conventional way.
10968
10:39:42,911 --> 10:39:46,991
It's not just random, where it just\n
10969
10:39:46,991 --> 10:39:50,951
it actually uses different parts of\n
10970
10:39:50,951 --> 10:39:54,341
And you have control over a lot of\n
10971
10:39:55,184 --> 10:39:56,891
And let's go ahead\nand zoom out from this
10972
10:39:56,891 --> 10:40:00,942
and consider that within your computer's\n
10973
10:40:00,942 --> 10:40:04,362
do is actually store initially,\nall of the zeros and ones
10974
10:40:04,362 --> 10:40:08,362
that you compiled in the top of\n
10975
10:40:08,362 --> 10:40:11,592
So when you compile a program and\n
10976
10:40:11,591 --> 10:40:15,011
or on a Mac or PC you double\nclick on it, the computer first--
10977
10:40:15,012 --> 10:40:20,141
the operating system first-- loads all\n
10978
10:40:20,141 --> 10:40:24,731
Machine code, into just one big chunk\n
10979
10:40:24,732 --> 10:40:28,662
Below that it stores global\nvariables-- any variables
10980
10:40:28,661 --> 10:40:32,543
you have created in your program\n
10981
10:40:33,252 --> 10:40:35,052
Generally, the top of your file.
10982
10:40:35,052 --> 10:40:36,995
Globals tend to go at the top there.
10983
10:40:36,995 --> 10:40:39,912
Then there's this chunk of memory\n
10984
10:40:39,911 --> 10:40:42,311
and we saw that word\nbriefly in Valgrind's output
10985
10:40:42,311 --> 10:40:45,941
and then there's this other\nchunk of memory called the stack.
10986
10:40:45,942 --> 10:40:51,072
And it turns out that up until this\n
10987
10:40:51,072 --> 10:40:56,322
Any time you use local variables in\n
10988
10:40:56,322 --> 10:41:00,042
Any time you use malloc, that\nmemory ends up on the heap.
10989
10:41:00,042 --> 10:41:02,112
Now as the arrow suggests,\nthis actually looks
10990
10:41:02,112 --> 10:41:05,194
like a problem waiting to happen because\n
10991
10:41:05,194 --> 10:41:07,031
heap, and more and more\nand more stack, it's
10992
10:41:07,031 --> 10:41:09,762
like two things barreling down the\n
10993
10:41:10,252 --> 10:41:11,502
And that's actually a problem.
10994
10:41:11,502 --> 10:41:14,841
If you've ever heard the phrase\n
10995
10:41:14,841 --> 10:41:16,631
this is the origin of its name.
10996
10:41:16,631 --> 10:41:18,881
When you start to use\nmore and more and more
10997
10:41:18,881 --> 10:41:21,161
memory by calling lots\nand lots of functions
10998
10:41:21,161 --> 10:41:23,621
or using lots and lots\nof local variables
10999
10:41:23,622 --> 10:41:25,872
you use a lot of this stack memory.
11000
10:41:25,872 --> 10:41:29,322
Or if you use malloc a lot and keep\n
11001
10:41:29,322 --> 10:41:33,042
and never really, or rarely calling\n
11002
10:41:33,042 --> 10:41:36,881
and eventually these two things might\n
11003
10:41:37,932 --> 10:41:40,552
The program will crash or\nsomething bad will happen.
11004
10:41:40,552 --> 10:41:43,332
So the onus is on you\njust not to do that.
11005
10:41:43,332 --> 10:41:45,582
But this is the design,\ngenerally, of what's
11006
10:41:45,582 --> 10:41:47,472
going on inside of\nyour computer's memory.
11007
10:41:47,472 --> 10:41:51,072
Now within that memory, though,\nthere are certain conventions
11008
10:41:51,072 --> 10:41:52,932
focusing on here, the stack.
11009
10:41:52,932 --> 10:41:55,391
And in fact, let me go\nover here with a marker
11010
10:41:55,391 --> 10:41:58,881
and say that this represents the\n
11011
10:41:58,881 --> 10:42:03,161
And so here we have a whole bunch of\n
11012
10:42:03,161 --> 10:42:05,451
represents a byte of memory\nand this, for instance
11013
10:42:05,451 --> 10:42:08,141
might represent four bytes\naltogether-- good enough for an int
11014
10:42:09,472 --> 10:42:13,811
So in my original code that I wrote\n
11015
10:42:13,811 --> 10:42:16,211
what is in fact going on\ninside the swap function?
11016
10:42:16,211 --> 10:42:20,262
We can visualize it like this-- when\n
11017
10:42:20,262 --> 10:42:23,862
matter, main is the first function\n
11018
10:42:23,862 --> 10:42:27,372
and so I'm just going to label\n
11019
10:42:27,372 --> 10:42:31,742
And what were the two variables I\n
11020
10:42:33,561 --> 10:42:35,761
And each of those was an\nint, so that's four bytes
11021
10:42:35,762 --> 10:42:38,482
so it's deliberate\nthat I reserved four--
11022
10:42:38,482 --> 10:42:41,312
a chunk of wood here that's four bytes.
11023
10:42:41,311 --> 10:42:45,261
So let me just call this x, and I'm just\n
11024
10:42:45,771 --> 10:42:49,791
And then I had my other variable y, and\n
11025
10:42:49,792 --> 10:42:54,002
What happens when main calls swap\n
11026
10:42:54,002 --> 10:43:00,292
Well, it has two variables of its\n
11027
10:43:00,292 --> 10:43:04,702
and B is initially 2, but it\nhas a third variable, tmp
11028
10:43:04,701 --> 10:43:07,731
which is a local variable in\naddition to the arguments A and B
11029
10:43:07,732 --> 10:43:12,292
that are passed in, so I'm going\n
11030
10:43:12,292 --> 10:43:13,516
And what is the value of tmp?
11031
10:43:13,516 --> 10:43:15,141
Well, we have to look back at the code.
11032
10:43:15,141 --> 10:43:19,792
tmp initially gets the value of A.\n
11033
10:43:21,502 --> 10:43:23,961
That's step one in my\nthree line program.
11034
10:43:23,961 --> 10:43:27,981
OK, A equals B. So that is assigned\n
11035
10:43:27,982 --> 10:43:31,612
into A. So B is 2, A is\nthis, so let me go ahead
11036
10:43:31,612 --> 10:43:33,722
and erase this and just overwrite that.
11037
10:43:33,722 --> 10:43:37,252
So at this moment in the story\nyou have two copies of two
11038
10:43:37,252 --> 10:43:40,072
so that's OK though, because\nthe third line of code
11039
10:43:40,072 --> 10:43:43,101
says tmp gets copied\ninto B. So what's tmp--
11040
10:43:43,101 --> 10:43:48,531
1, gets copied into B, so let\nme overwrite this 2 with a 1
11041
10:43:50,182 --> 10:43:53,302
Now unfortunately, the code ends.
11042
10:43:53,302 --> 10:43:56,872
swap doesn't actually do anything\n
11043
10:43:56,872 --> 10:43:58,882
is that I could have had a return value.
11044
10:43:58,881 --> 10:44:01,101
I could go in there\nand change void to int
11045
10:44:01,101 --> 10:44:02,871
but which one am I going to return?
11046
10:44:04,582 --> 10:44:06,992
The whole goal is to\nswap two values, and it
11047
10:44:06,991 --> 10:44:08,991
seems kind of lame if you\ncan't write a function
11048
10:44:08,991 --> 10:44:12,021
to do something as common per\nlast week's sorting algorithms
11049
10:44:14,902 --> 10:44:18,112
Well, even though when this\nprogram starts running
11050
10:44:18,112 --> 10:44:21,351
main is using this chunk of memory\n
11051
10:44:21,351 --> 10:44:24,021
and the stack is just like\na cafeteria stack of trays--
11052
10:44:25,561 --> 10:44:27,651
Here's main's memory on the stack.
11053
10:44:27,652 --> 10:44:29,932
Here's the swap function's\nmemory on the stack.
11054
10:44:29,932 --> 10:44:32,601
It's using three ints instead of two--
11055
10:44:34,311 --> 10:44:37,822
What happens when the function\n
11056
10:44:37,822 --> 10:44:41,061
The sort of recollection that\nthis is swap's memory goes away
11057
10:44:41,061 --> 10:44:42,651
and garbage values are left.
11058
10:44:42,652 --> 10:44:46,891
So, adorably, we get rid\nof these values here
11059
10:44:46,891 --> 10:44:51,351
and there's still data there--\n
11060
10:44:51,351 --> 10:44:54,951
are still there in the computer's\n
11061
10:44:54,951 --> 10:44:56,701
because the function has now returned.
11062
10:44:56,701 --> 10:44:59,781
So they're still in there and this\n
11063
10:44:59,781 --> 10:45:03,141
of why there's other stuff in memory\n
11064
10:45:03,982 --> 10:45:06,432
Sometimes you did put\nit there, but now once
11065
10:45:06,432 --> 10:45:10,072
swap returns you only should be\ntouching memory inside of main.
11066
10:45:10,072 --> 10:45:14,362
But we've never actually\ncopied one value into main.
11067
10:45:14,362 --> 10:45:18,022
We haven't returned anything and we\n
11068
10:45:19,652 --> 10:45:23,662
Well, what if we instead passed\ninto swap not copies of x and y
11069
10:45:23,661 --> 10:45:28,041
calling them A and B. What if they\n
11070
10:45:28,042 --> 10:45:31,222
sort of a treasure map that\nwill lead swap to the actual x
11071
10:45:32,601 --> 10:45:36,411
Today we have that\ncapability using pointers.
11072
10:45:36,411 --> 10:45:40,281
So suppose that we\nuse this code instead.
11073
10:45:40,281 --> 10:45:43,192
There's a lot of stars going on\nhere, which is a bit annoying
11074
10:45:43,192 --> 10:45:45,862
but let's consider what it\nis we're trying to achieve.
11075
10:45:45,862 --> 10:45:50,752
What if we pass in not x and y, but\n
11076
10:45:50,752 --> 10:45:52,862
respectively--\nbreadcrumbs, if you will--
11077
10:45:52,862 --> 10:45:55,881
that will lead swap to\nthe original values.
11078
10:45:55,881 --> 10:45:59,691
Then what we do is we still\ngive ourselves a tmp variable
11079
10:46:00,711 --> 10:46:03,051
It's still a glass, so\nwe still call it an int
11080
10:46:03,052 --> 10:46:05,432
but what do we want to put\ninto that temporary variable?
11081
10:46:05,432 --> 10:46:08,014
We don't want to put A into it,\nbecause that's an address now.
11082
10:46:08,014 --> 10:46:10,731
We want to go to that\naddress per the star
11083
10:46:10,732 --> 10:46:12,502
and put whatever's at that address.
11084
10:46:13,741 --> 10:46:17,481
Well, we want to then copy\ninto whatever's at location A
11085
10:46:17,482 --> 10:46:20,272
we want to copy over to\nlocation A's contents
11086
10:46:20,271 --> 10:46:24,471
whatever is at location B's\ncontents and then lastly, we
11087
10:46:24,472 --> 10:46:27,622
want to copy tmp into\nwhatever's at location B.
11088
10:46:27,622 --> 10:46:31,510
So again, we're very deliberately\nintroducing all of these stars
11089
10:46:31,510 --> 10:46:33,802
because we don't want to\nchange any of these addresses
11090
10:46:33,802 --> 10:46:37,222
we want to go to these addresses\nper the dereference operator
11091
10:46:37,222 --> 10:46:41,582
and put values there,\nor get values from.
11092
10:46:41,582 --> 10:46:43,052
So what does this actually mean?
11093
10:46:43,052 --> 10:46:47,362
Well, if I kind of rewind in this story\n
11094
10:46:47,362 --> 10:46:53,031
although I'm going to delete its\n
11095
10:46:53,031 --> 10:46:56,481
and I still have A, but\nwhat's going to be different
11096
10:46:56,482 --> 10:47:00,412
this time is how I use A and B.\nSo let me finish erasing those.
11097
10:47:00,411 --> 10:47:02,541
That's A on the left,\nthis is B on the right.
11098
10:47:02,542 --> 10:47:05,061
At this point in the\nstory, we're rerunning swap
11099
10:47:05,061 --> 10:47:08,511
with this new and improved version,\nand let's see what happens.
11100
10:47:08,512 --> 10:47:12,232
Well, x is presumably at some address.
11101
10:47:12,232 --> 10:47:15,712
Maybe it's like 0x123, as always.
11102
10:47:15,711 --> 10:47:18,832
What then does A get\nwhen I'm using this code?
11103
10:47:28,012 --> 10:47:33,641
Well, I'm going to put 0x456,\nand then what am I going to do?
11104
10:47:33,641 --> 10:47:35,832
Based on these three\nlines of code, I'm going
11105
10:47:35,832 --> 10:47:40,031
to store in tmp whatever is at the\n
11106
10:47:40,031 --> 10:47:43,061
That's this thing here, so\nI'm going to put 1 in tmp.
11107
10:47:43,061 --> 10:47:45,612
Line two-- I'm going to go to B--
11108
10:47:45,612 --> 10:47:48,491
all right, B is 456, so\nI'm going to B and I'm
11109
10:47:48,491 --> 10:47:53,292
going to store 2 at whatever is\nat location A, and at location A
11110
10:47:53,292 --> 10:47:56,572
is 123, so that's this,\nso what am I going to do?
11111
10:47:56,572 --> 10:47:59,262
I'm going to change this 1 to a 2.
11112
10:47:59,262 --> 10:48:01,992
Last line of code-- get the\nvalue of tmp, which is 1
11113
10:48:01,991 --> 10:48:07,091
and then put it at whatever the\n
11114
10:48:07,091 --> 10:48:11,651
and change it to be the value\nof tmp, which puts 1 here.
11115
10:48:12,881 --> 10:48:14,441
There's still no return value.
11116
10:48:14,442 --> 10:48:17,742
swap returns, which means\nthese three temporary variables
11117
10:48:19,451 --> 10:48:21,831
They can be reused by\nsubsequent function calls
11118
10:48:21,832 --> 10:48:26,452
but now, I've actually\nswapped the values of x and y.
11119
10:48:26,451 --> 10:48:30,401
Which is to say what came as naturally\n
11120
10:48:30,402 --> 10:48:33,882
is not quite as simply\ndone in C because again
11121
10:48:33,881 --> 10:48:36,221
functions are isolated from each other.
11122
10:48:36,222 --> 10:48:39,502
You can pass in values but you\nget copies of those values.
11123
10:48:39,502 --> 10:48:44,052
If you want one function to affect the\n
11124
10:48:44,052 --> 10:48:47,382
you have to 1, understand\nwhat's going on but 2
11125
10:48:47,381 --> 10:48:50,331
pass things in by way of a pointer here.
11126
10:48:50,332 --> 10:48:53,921
So if I go back to my code here,\n
11127
10:48:53,921 --> 10:48:56,021
Let me get rid of these extra printf's.
11128
10:48:56,021 --> 10:48:58,752
Let me go in and add all these stars.
11129
10:48:58,752 --> 10:49:02,771
So I'm dereferencing these\nactual addresses here and here
11130
10:49:02,771 --> 10:49:05,182
and I've got to make one more change.
11131
10:49:05,182 --> 10:49:11,741
How do I now call swap if swap is\n
11132
10:49:11,741 --> 10:49:14,801
That is, the address of an int\nand the address of another int.
11133
10:49:14,802 --> 10:49:17,292
What do I change on line 11 here?
11134
10:49:25,591 --> 10:49:28,411
DAVID J. MALAN: Sorry,\nthe address of operator.
11135
10:49:28,411 --> 10:49:33,091
So up here on line 11, we do\nampersand x and ampersand y.
11136
10:49:33,091 --> 10:49:36,361
So that yes, we're technically\npassing in a copy of a value
11137
10:49:36,362 --> 10:49:39,241
but this time the copy we're passing\n
11138
10:49:39,241 --> 10:49:42,631
and as soon as we have an address, just\n
11139
10:49:42,631 --> 10:49:45,932
the foamy finger-- I can point at\n
11140
10:49:45,932 --> 10:49:49,921
and actually get a value from the\n
11141
10:49:52,182 --> 10:49:56,912
So let's cross our fingers\nnow and do make swap, Enter.
11142
10:49:56,911 --> 10:49:58,081
Oh my God, so many mistakes.
11143
10:49:58,082 --> 10:50:00,242
Oh, I didn't remember\nto change my prototype
11144
10:50:00,241 --> 10:50:03,781
so let me go way up here and\nadd two more stars because I
11145
10:50:05,161 --> 10:50:10,322
Make swap, ./swap, and voila--\nnow I have actually swapped.
11146
10:50:17,021 --> 10:50:19,851
All right, so what more can we do here?
11147
10:50:19,851 --> 10:50:24,822
Well, let me consider\nthat all this time we've
11148
10:50:24,822 --> 10:50:29,052
been deliberately using\nGetString and GetInt and GetFloat
11149
10:50:29,052 --> 10:50:30,472
and so forth, but for a reason.
11150
10:50:30,472 --> 10:50:33,430
These aren't just training wheels\n
11151
10:50:33,430 --> 10:50:36,432
they're actually in place\nto make your code safer.
11152
10:50:36,432 --> 10:50:40,872
And to illustrate this, let me go\n
11153
10:50:40,872 --> 10:50:45,222
How about a file called scanf.c.
11154
10:50:45,222 --> 10:50:48,252
It turns out that the old\nschool way-- the way in C
11155
10:50:48,252 --> 10:50:52,512
really, of getting user input,\nis via functions like scanf
11156
10:50:52,512 --> 10:50:56,112
and let me go ahead and include\nstdio.h, int main(void)
11157
10:50:56,112 --> 10:50:59,802
and without using the CS50 library at\n
11158
10:51:00,972 --> 10:51:03,522
Let me give myself an int called x.
11159
10:51:03,521 --> 10:51:07,436
Let me just print out what the value of\n
11160
10:51:07,436 --> 10:51:10,722
or rather, ask the user for\nthe value by asking them for x.
11161
10:51:10,722 --> 10:51:14,141
And I'm going to use a function\n
11162
10:51:14,141 --> 10:51:20,711
in an integer using %i, and I'm going\n
11163
10:51:22,667 --> 10:51:25,542
And then I'm going to go ahead and,\n
11164
10:51:25,542 --> 10:51:29,592
I'm going to print out with %i\n
11165
10:51:29,591 --> 10:51:32,681
All right, so line eight\nis week 1 style code.
11166
10:51:32,682 --> 10:51:36,351
Lines five and six are week 1 style code.
11167
10:51:36,351 --> 10:51:41,771
So the curiosity today is this new line.\n
11168
10:51:43,332 --> 10:51:46,031
I'm using the same syntax\nthat I use for printf
11169
10:51:46,031 --> 10:51:49,451
which is kind of a little clue-- a\n
11170
10:51:49,451 --> 10:51:52,391
want to scan in, that is, read\nfrom the human's keyboard--
11171
10:51:52,391 --> 10:51:55,932
and I'm telling it where to put\nwhatever the human typed in.
11172
10:51:55,932 --> 10:51:59,682
I can't just say x, because we run into\n
11173
10:51:59,682 --> 10:52:02,171
I have to give a little\nbreadcrumb to the variable
11174
10:52:02,171 --> 10:52:05,472
where I want scanf to\nput the human's integer.
11175
10:52:05,472 --> 10:52:08,902
And so this just tells the\ncomputer to get an int.
11176
10:52:08,902 --> 10:52:11,141
This is what you would have\nhad to type, essentially
11177
10:52:11,141 --> 10:52:14,052
in week 1 just to get\nan int from the user
11178
10:52:14,052 --> 10:52:16,902
and there's a whole bunch of\nthings that can go wrong still
11179
10:52:16,902 --> 10:52:20,292
but that's the cryptic syntax we\n
11180
10:52:20,292 --> 10:52:22,241
Let me go ahead and make scanf here--
11181
10:52:25,302 --> 10:52:27,252
Put the semicolon in the wrong place.
11182
10:52:30,641 --> 10:52:32,036
Non void doesn't return a value.
11183
10:52:42,332 --> 10:52:45,312
I'm going to type in a number like\n
11184
10:52:45,311 --> 10:52:49,542
So that is the traditional way of\n
11185
10:52:49,542 --> 10:52:53,012
The problem, though, is when you\n
11186
10:52:54,482 --> 10:52:56,650
Let me delete all of\nthis and give myself
11187
10:52:56,650 --> 10:52:59,192
a string s, although wait a\nminute-- we don't call them strings
11188
10:52:59,192 --> 10:53:02,252
anymore-- char star to store a string.
11189
10:53:02,252 --> 10:53:06,091
Then let me go ahead and just prompt the\n
11190
10:53:06,091 --> 10:53:10,891
Then let me go ahead and use scanf, ask\n
11191
10:53:10,891 --> 10:53:13,572
and store it at that address.
11192
10:53:13,572 --> 10:53:16,112
Then let me go ahead and print\nout whatever the human typed
11193
10:53:16,112 --> 10:53:19,002
in just by using the same notation.
11194
10:53:19,002 --> 10:53:24,152
So here, line five is the same thing\n
11195
10:53:24,152 --> 10:53:26,552
that layer today so it's char star s.
11196
10:53:26,552 --> 10:53:31,352
This is just week one, this is\njust week one, line seven is new.
11197
10:53:31,351 --> 10:53:37,171
scanf will also read from the human's\n
11198
10:53:37,171 --> 10:53:39,002
But that's OK, because s is an address.
11199
10:53:39,002 --> 10:53:41,912
It's correct not to do the ampersand.
11200
10:53:42,811 --> 10:53:47,432
A string is and has always\nbeen a char star, a.k.a string.
11201
10:53:47,432 --> 10:53:49,451
The problem, though, arises as follows--
11202
10:53:51,771 --> 10:53:53,271
oh my God, what did I do wrong--
11203
10:53:53,271 --> 10:53:55,791
I can't-- OK, we have certain\ndefenses in place with make.
11204
10:53:55,792 --> 10:54:02,241
Let me do clang of scanf.c, and\noutput a program called scanf.
11205
10:54:02,241 --> 10:54:05,199
All right, so I'm overriding\nsome of our pedagogical defenses
11206
10:54:05,199 --> 10:54:06,531
that we have in place with make.
11207
10:54:06,531 --> 10:54:11,121
Let me now run scanf of this version,\n
11208
10:54:15,701 --> 10:54:18,521
So it didn't even store something\n
11209
10:54:18,521 --> 10:54:22,182
This time it's in lowercase,\nbut that is somewhat related.
11210
10:54:22,182 --> 10:54:26,921
What did I fundamentally\ndo wrong though, here?
11211
10:54:26,921 --> 10:54:29,051
Why is this getting\nmore and more dangerous?
11212
10:54:29,052 --> 10:54:30,832
And let me illustrate\nthe point even more.
11213
10:54:30,832 --> 10:54:34,101
What if I type in not just something\n
11214
10:54:34,101 --> 10:54:39,941
What if I do like, hellooooo and\n
11215
10:54:45,451 --> 10:54:48,631
Right, a really long,\nunexpectedly long string.
11216
10:54:48,631 --> 10:54:50,491
This is the nondeterminism kicking in.
11217
10:54:51,781 --> 10:54:53,614
I was trying to trigger\na segmentation fault
11218
10:54:53,614 --> 10:54:56,851
but it wouldn't, but\nthe point still remains.
11219
10:54:56,851 --> 10:55:01,542
It's still not working, but what's\n
11220
10:55:01,542 --> 10:55:03,211
and it's not storing my actual input?
11221
10:55:04,091 --> 10:55:06,026
AUDIENCE: Do you have to make a space?
11222
10:55:06,027 --> 10:55:07,902
DAVID J. MALAN: We have\nto make space for it.
11223
10:55:07,902 --> 10:55:11,141
So what we're missing here is\nmalloc, or something like that.
11224
10:55:11,141 --> 10:55:14,101
So I could do that, I could\ndo something like this.
11225
10:55:14,101 --> 10:55:16,801
Well, let the human type in\nat least a three letter word
11226
10:55:16,802 --> 10:55:20,942
so I could do malloc of 3\nplus 1 for the null character.
11227
10:55:20,942 --> 10:55:25,322
So let me give them four characters,\n
11228
10:55:26,281 --> 10:55:28,442
Nope, sorry. clang, I have to--
11229
10:55:29,582 --> 10:55:36,171
Oh, include stdlib.h-- there we go.
11230
10:55:36,171 --> 10:55:39,197
That gives me malloc, now I'm\n
11231
10:55:39,197 --> 10:55:42,322
now I'm going to rerun it, and now I'm\n
11232
10:55:43,701 --> 10:55:47,421
And let me get a little aggressive now\n
11233
10:55:47,421 --> 10:55:49,461
Still works, but I'm getting lucky.
11234
10:55:53,031 --> 10:55:55,355
Damn it, that still works, too.
11235
10:55:56,451 --> 10:55:58,650
But it actually-- not quite.
11236
10:55:58,650 --> 10:56:00,771
There's some weirdness\ngoing on there already.
11237
10:56:00,771 --> 10:56:02,371
It turns out I can also do this.
11238
10:56:02,372 --> 10:56:05,751
I could actually just say\nchar s[4] and give myself
11239
10:56:05,750 --> 10:56:07,041
an array of four characters.
11240
10:56:07,042 --> 10:56:08,461
Let me try this one more time.
11241
10:56:08,461 --> 10:56:12,021
So let me rerun clang ./scanf.
11242
10:56:12,021 --> 10:56:16,820
Hellooooooo, clearly exceeding\nthe four characters--
11243
10:56:22,182 --> 10:56:24,703
So the point here, though, is\nif we hadn't given you GetInt
11244
10:56:24,703 --> 10:56:27,161
you would have had to use the\nscanf thing-- not a huge deal
11245
10:56:28,432 --> 10:56:31,682
But if we hadn't given you GetString you\n
11246
10:56:31,682 --> 10:56:34,842
knowing about malloc already or\n
11247
10:56:34,841 --> 10:56:36,911
and even now there's a danger.
11248
10:56:36,911 --> 10:56:41,112
If the human types in five letters,\n
11249
10:56:41,112 --> 10:56:44,862
like with the Hello input, will\n
11250
10:56:44,862 --> 10:56:46,842
So GetString also has\nthis functionality built
11251
10:56:46,841 --> 10:56:49,150
in where we have a\nfancy loop inside such
11252
10:56:49,150 --> 10:56:53,682
that we allocate using malloc as\n
11253
10:56:53,682 --> 10:56:55,631
and we use malloc\nessentially every keystroke.
11254
10:56:55,631 --> 10:57:00,461
The moment you type in h-e-l-l-o, we're\n
11255
10:57:00,461 --> 10:57:04,932
allocating more and more memory so that\n
11256
10:57:04,932 --> 10:57:07,661
GetString even though\nit's this easy to crack--
11257
10:57:07,661 --> 10:57:10,811
this easy to crash your code\nusing scanf if you again
11258
10:57:10,811 --> 10:57:13,481
did it without the help of a library.
11259
10:57:13,482 --> 10:57:15,539
So where are we all going with this?
11260
10:57:15,538 --> 10:57:17,621
Well, let me show you a\nfew final examples that'll
11261
10:57:17,622 --> 10:57:19,961
pave the way for what\nwill be problem set four.
11262
10:57:19,961 --> 10:57:23,122
Let me go ahead and open\nup from today's code--
11263
10:57:23,122 --> 10:57:25,241
which is available on\nthe course's website--
11264
10:57:25,241 --> 10:57:32,202
for instance, a program like\nthis, called phonebook.c
11265
10:57:32,201 --> 10:57:34,900
and I'm just going to give\nyou a quick tour of it
11266
10:57:34,900 --> 10:57:37,862
that you'll see more details on in\n
11267
10:57:37,862 --> 10:57:40,570
We're going to introduce a few\n
11268
10:57:40,571 --> 10:57:43,812
You're going to see a function called\n
11269
10:57:43,811 --> 10:57:47,203
and it takes two arguments-- the\n
11270
10:57:47,203 --> 10:57:50,411
that you might manipulate in Excel or\n
11271
10:57:50,411 --> 10:57:55,211
separated values, and then something\n
11272
10:57:55,211 --> 10:57:58,150
W for write, depending on whether\nyou want to add to the file
11273
10:57:58,150 --> 10:58:00,682
just open it up, or change it.
11274
10:58:00,682 --> 10:58:03,192
We're going to introduce\nyou to a file pointer.
11275
10:58:03,192 --> 10:58:05,031
You'll see that capital FILE--
11276
10:58:05,031 --> 10:58:07,631
which is a little bit\nunconventional-- capital FILE is
11277
10:58:07,631 --> 10:58:10,481
a pointer to an actual file\non the computer's hard drive
11278
10:58:10,482 --> 10:58:13,001
so that you can actually access\nsomething like a CSV file
11279
10:58:14,351 --> 10:58:16,661
And we're going to see\ndown below that you're also
11280
10:58:16,661 --> 10:58:20,411
going to have the ability to write\n
11281
10:58:20,411 --> 10:58:24,341
You'll see functions like\nfprintf, for file printf.
11282
10:58:24,341 --> 10:58:29,471
Or fwrite-- file write-- which now that\n
11283
10:58:29,472 --> 10:58:33,311
you'll have the ability to\nactually not only read files--
11284
10:58:33,311 --> 10:58:36,830
text files, images, other\nthings-- but also write them out.
11285
10:58:36,830 --> 10:58:42,281
In fact for instance, just as a teaser\n
11286
10:58:42,281 --> 10:58:44,682
we focus on this week where\nwe give you a forensic image
11287
10:58:44,682 --> 10:58:47,351
and your goal is to\nrecover as many photographs
11288
10:58:47,351 --> 10:58:51,012
from this forensic image of a\n
11289
10:58:51,012 --> 10:58:54,432
And the way you're going to do\nthat is by knowing in advance
11290
10:58:54,432 --> 10:58:58,932
that every JPEG in the world starts\n
11291
10:58:58,932 --> 10:59:01,161
in hexadecimal, but these three numbers.
11292
10:59:01,161 --> 10:59:03,881
And so in fact, just as\na teaser, let me open up
11293
10:59:03,881 --> 10:59:07,061
an example you'll see on the\ncourse's website for today.
11294
10:59:07,061 --> 10:59:09,796
If I scroll through here,\nyou'll see a program
11295
10:59:09,796 --> 10:59:11,421
that does a little something like this.
11296
10:59:13,572 --> 10:59:15,762
if we could hit the button--
11297
10:59:16,402 --> 10:59:21,582
So here we have the notion of a byte\n
11298
10:59:21,582 --> 10:59:24,461
We'll see a data type called byte,\nwhich is a common convention.
11299
10:59:25,701 --> 10:59:28,034
And you're going to learn\nabout a function called fread
11300
10:59:28,035 --> 10:59:31,932
which reads from a file some number\n
11301
10:59:31,932 --> 10:59:33,701
We might then use code like this.
11302
10:59:33,701 --> 10:59:37,362
If bytes bracket zero\nequals equals 0xFF and bytes
11303
10:59:37,362 --> 10:59:43,122
bracket 1 equals 0xD8 and bytes bracket\n
11304
10:59:43,122 --> 10:59:47,842
bytes I just claimed represent a\n
11305
10:59:47,841 --> 10:59:51,171
Let me go ahead and run\nthis program as follows.
11306
10:59:51,171 --> 10:59:55,281
Let me copy jpeg.c into my\ndirectory from today's distribution.
11307
10:59:55,281 --> 11:00:03,432
Let me do make jpeg, and let me run\n
11308
11:00:03,432 --> 11:00:07,201
called lecture.jpeg, and I\nclaim yes, it's possibly a JPEG.
11309
11:00:08,201 --> 11:00:11,841
Let me open it up for us, called\n
11310
11:00:11,841 --> 11:00:15,941
is that same photo with which we began\n
11311
11:00:15,942 --> 11:00:18,072
But what we're also\ngoing to do this week
11312
11:00:18,072 --> 11:00:22,991
is start to implement our own sort\n
11313
11:00:22,991 --> 11:00:26,261
we might take images and actually\n
11314
11:00:26,262 --> 11:00:28,279
creates different versions thereof.
11315
11:00:28,279 --> 11:00:30,072
For instance, using a\ndifferent file format
11316
11:00:30,072 --> 11:00:33,862
called BMP, which essentially lays out\n
11317
11:00:35,262 --> 11:00:36,822
You're going to see a struct--
11318
11:00:36,822 --> 11:00:38,862
a data struct in C that's\nway more complicated
11319
11:00:38,862 --> 11:00:40,991
than the candidate\nstructure from the past
11320
11:00:40,991 --> 11:00:43,226
or the person structure\nfrom the past, that
11321
11:00:43,226 --> 11:00:45,851
looks like this, which is just\na whole bunch more values in it
11322
11:00:45,851 --> 11:00:47,769
but we'll walk you through\nthese in the p-set.
11323
11:00:47,769 --> 11:00:49,781
And we might take a\nphotograph like this and ask
11324
11:00:49,781 --> 11:00:52,241
you to run a few different\nfilters on it a la Instagram
11325
11:00:52,241 --> 11:00:55,871
like a black and white filter,\nor grayscale, a sepia filter
11326
11:00:55,872 --> 11:00:59,891
to give it some old school feel, or\n
11327
11:00:59,891 --> 11:01:02,481
or blur it, even in this way.
11328
11:01:02,482 --> 11:01:05,472
And just to end on a note\nhere, I have a version
11329
11:01:05,472 --> 11:01:08,982
of this code ready to go that doesn't\n
11330
11:01:08,982 --> 11:01:11,712
it just implements one filter initially.
11331
11:01:11,711 --> 11:01:14,411
Let me go ahead and just ready\nthis on my computer here.
11332
11:01:14,411 --> 11:01:16,466
I'm going to go into my\nown version of filter
11333
11:01:16,466 --> 11:01:18,341
and you'll see a few\nfiles that will give you
11334
11:01:18,341 --> 11:01:21,981
a tour of this coming week\nin bitmap.h, for instance
11335
11:01:21,982 --> 11:01:26,872
is a version of this structure that\n
11336
11:01:26,872 --> 11:01:34,722
And let me show you this file here,\n
11337
11:01:34,722 --> 11:01:38,412
called filter that I've already\nimplemented in advance today.
11338
11:01:38,411 --> 11:01:41,471
But the ones we give you for the piece\n
11339
11:01:41,472 --> 11:01:43,847
this function called filter\ntakes the height of an image
11340
11:01:43,847 --> 11:01:46,942
the width of an image, and\na two dimensional array.
11341
11:01:46,942 --> 11:01:49,932
So rows and columns\nof pixels, and then I
11342
11:01:49,932 --> 11:01:53,771
have a loop like this that iterates over\n
11343
11:01:55,402 --> 11:01:57,372
And then notice what\nI'm going to do here.
11344
11:01:57,372 --> 11:02:00,552
I'm going to change the blue\nvalue to be zero in this case
11345
11:02:00,552 --> 11:02:02,962
and the green value to\nbe zero in this case.
11346
11:02:03,701 --> 11:02:07,451
Well, the image I have\nhere in mind is this one
11347
11:02:07,451 --> 11:02:10,241
whereby we have this\nhidden image that simply
11348
11:02:10,241 --> 11:02:13,511
has old school style-- a\nsecret message embedded in it.
11349
11:02:13,512 --> 11:02:16,722
And if you don't happen to have in\n
11350
11:02:16,722 --> 11:02:18,942
glasses that essentially\nmake everything red--
11351
11:02:18,942 --> 11:02:21,817
getting rid of the green in the\n
11352
11:02:21,817 --> 11:02:24,192
you can actually-- I'm actually\nprobably the only one who
11353
11:02:24,192 --> 11:02:26,472
can read this right\nnow-- see what message
11354
11:02:26,472 --> 11:02:28,752
is hidden behind all of this red noise.
11355
11:02:28,752 --> 11:02:34,482
But if using my code written here in\n
11356
11:02:34,482 --> 11:02:37,182
in the picture and I get rid of\nall the green in the picture
11357
11:02:37,182 --> 11:02:39,792
essentially implementing\nthe idea of this filter--
11358
11:02:39,792 --> 11:02:42,612
this red filter where you only see red--
11359
11:02:42,612 --> 11:02:45,862
well, let's go ahead and\ncompile this program.
11360
11:02:45,862 --> 11:02:50,832
Make filter, run ./filter\non this hidden message.bmp.
11361
11:02:50,832 --> 11:02:53,891
I'm going to save it in a\nnew file called message.bmp
11362
11:02:53,891 --> 11:02:56,832
and with one final flourish\nwe're going to open up
11363
11:02:56,832 --> 11:03:00,732
message.bmp, which is the result\nof having put on these glasses
11364
11:03:00,732 --> 11:03:03,882
and hopefully now you\ntoo will see what I see.
11365
11:03:12,891 --> 11:03:14,292
All right, that's it for CS50!
11366
11:04:39,311 --> 11:04:42,222
And this is already week 5,\nwhich means this is actually
11367
11:04:44,722 --> 11:04:48,552
In fact, in just a few days'\ntime, what has looked like this
11368
11:04:48,552 --> 11:04:50,972
and much more cryptic\nthan this perhaps, is
11369
11:04:50,972 --> 11:04:53,472
going to be distilled into\nsomething much simpler next week.
11370
11:04:53,472 --> 11:04:55,632
When we transition to a\nlanguage called Python.
11371
11:04:55,631 --> 11:04:59,951
And with Python, we'll still have our\n
11372
11:05:00,654 --> 11:05:03,822
But a lot of the low-level plumbing\n
11373
11:05:03,822 --> 11:05:06,502
struggling with, frustrated by,\nover the past couple of weeks
11374
11:05:06,502 --> 11:05:08,802
especially, now that\nwe've introduced pointers.
11375
11:05:08,802 --> 11:05:11,682
And it feels like you probably\nhave to do everything yourself.
11376
11:05:11,682 --> 11:05:14,542
In Python, and in a lot\nof higher level languages
11377
11:05:14,542 --> 11:05:16,932
so to speak-- more modern,\nmore recent languages
11378
11:05:16,932 --> 11:05:20,021
you'll be able to do so much more\n
11379
11:05:20,021 --> 11:05:23,021
And indeed, we're going to start\n
11380
11:05:24,461 --> 11:05:27,641
Frameworks, which are collections of\n
11381
11:05:27,641 --> 11:05:31,091
And on top of all that, you will be\n
11382
11:05:31,091 --> 11:05:34,691
impressive projects, that actually solve\n
11383
11:05:34,692 --> 11:05:37,582
Particularly, by way of\nyour own final project.
11384
11:05:37,582 --> 11:05:41,082
So last week though, in week 4,\n
11385
11:05:41,082 --> 11:05:43,692
And we've been treating this\nmemory inside of your computer
11386
11:05:45,042 --> 11:05:48,252
At the end of the day, it's just\n
11387
11:05:48,252 --> 11:05:51,381
And it's really up to you\nwhat you do with those bytes.
11388
11:05:51,381 --> 11:05:54,881
And how you interconnect them, how\n
11389
11:05:54,881 --> 11:05:56,959
And arrays were like\none of the simplest ways
11390
11:05:56,959 --> 11:05:58,752
we started playing\naround with that memory.
11391
11:05:58,752 --> 11:06:00,641
Just contiguous chunks of memory.
11392
11:06:01,781 --> 11:06:04,512
But let's consider, for a\nmoment, some of the problems that
11393
11:06:04,512 --> 11:06:06,101
pretty quickly arise with arrays.
11394
11:06:06,101 --> 11:06:09,671
And then, today focus on what more\n
11395
11:06:09,671 --> 11:06:14,591
Using your computer's memory as\na much more versatile canvas
11396
11:06:14,591 --> 11:06:16,861
to create even\ntwo-dimensional structures.
11397
11:06:16,862 --> 11:06:18,612
To represent information,\nand, ultimately
11398
11:06:18,612 --> 11:06:20,692
to solve more interesting problems.
11399
11:06:20,692 --> 11:06:22,272
So here's an array of size 3.
11400
11:06:22,271 --> 11:06:24,072
Maybe, the size of 3 integers.
11401
11:06:24,072 --> 11:06:26,319
And suppose that this\nis inside of a program.
11402
11:06:26,319 --> 11:06:29,112
And at this point in the story,\n
11403
11:06:30,521 --> 11:06:34,558
And suppose, whatever the context,\n
11404
11:06:36,432 --> 11:06:39,449
Well, instinctively, where\nshould the number 4 go?
11405
11:06:39,449 --> 11:06:41,531
If this is your computer's\nmemory and we currently
11406
11:06:41,531 --> 11:06:43,241
have this array 1, 2, 3, from what.
11407
11:06:44,591 --> 11:06:47,822
Where should the number 4\njust, perhaps, naively go?
11408
11:06:52,061 --> 11:06:53,502
So you could replace number 1.
11409
11:06:53,502 --> 11:06:55,377
I don't really like\nthat, though, because I'd
11410
11:06:55,377 --> 11:06:56,771
like to keep number 1 around.
11411
11:06:58,061 --> 11:06:59,811
But I'm losing, of course, information.
11412
11:06:59,811 --> 11:07:02,271
So what else could I do if\nI want to add the number 4?
11413
11:07:02,771 --> 11:07:04,146
AUDIENCE: On the right side of 3.
11414
11:07:04,813 --> 11:07:06,953
So, I mean, it feels like\nif there's some ordering
11415
11:07:06,953 --> 11:07:09,161
to these, which seems kind\nof a reasonable inference
11416
11:07:09,161 --> 11:07:11,261
that it probably belongs\nsomewhere over here.
11417
11:07:11,262 --> 11:07:14,742
But recall last week, as we started\n
11418
11:07:14,741 --> 11:07:16,612
there's other stuff\npotentially going on.
11419
11:07:16,612 --> 11:07:20,232
And if I fill that in, ideally, we'd\n
11420
11:07:20,232 --> 11:07:22,062
If we're maintaining this kind of order.
11421
11:07:22,061 --> 11:07:24,461
But recall in the context\nof your computer's memory
11422
11:07:24,461 --> 11:07:25,902
there might be other stuff there.
11423
11:07:25,902 --> 11:07:28,414
Some of these garbage\nvalues that might be usable
11424
11:07:28,413 --> 11:07:30,371
but we don't really know\nor care what they are.
11425
11:07:30,372 --> 11:07:31,961
As represented by Oscar here.
11426
11:07:31,961 --> 11:07:34,991
But there might actually\nbe useful data in use.
11427
11:07:34,991 --> 11:07:38,381
Like, if your program has not\njust a few integers in this array
11428
11:07:38,381 --> 11:07:40,511
but also a string that\nsays like, "Hello, world."
11429
11:07:40,512 --> 11:07:46,572
It could be that your computer has\n
11430
11:07:48,192 --> 11:07:50,442
Well, maybe, you created the\narray in one line of code
11431
11:07:52,091 --> 11:07:54,491
Maybe the next line of\ncode used GetString.
11432
11:07:54,491 --> 11:07:57,711
Or maybe just hard coded a string\n
11433
11:07:57,711 --> 11:08:00,459
And so you painted yourself\ninto a corner, so to speak.
11434
11:08:00,459 --> 11:08:03,042
Now I think you might claim,\nwell, let's just overwrite the H.
11435
11:08:03,042 --> 11:08:04,991
But that's problematic\nfor the same reasons.
11436
11:08:06,711 --> 11:08:09,612
So where else could the 4 go?
11437
11:08:09,612 --> 11:08:12,851
Or how do we solve this problem\nif we want to add a number
11438
11:08:12,851 --> 11:08:14,561
and there's clearly memory available.
11439
11:08:14,561 --> 11:08:17,951
Because those garbage values are junk\n
11440
11:08:17,951 --> 11:08:20,081
So we could certainly reuse those.
11441
11:08:20,082 --> 11:08:23,722
Where could the 4, and\nperhaps this whole array, go?
11442
11:08:24,222 --> 11:08:26,052
So I'm hearing we could\nmove it somewhere.
11443
11:08:26,052 --> 11:08:27,885
Maybe, replace some of\nthose garbage values.
11444
11:08:27,885 --> 11:08:29,902
And honestly, we have a lot of options.
11445
11:08:29,902 --> 11:08:32,141
We could use any of these\ngarbage values up here.
11446
11:08:32,141 --> 11:08:34,881
We could use any of these down\nhere, or even further down.
11447
11:08:34,881 --> 11:08:38,441
The point is there is plenty\nof memory available as
11448
11:08:38,442 --> 11:08:41,891
indicated by these Oscars, where\nwe could put 4, maybe even, 5
11449
11:08:43,271 --> 11:08:46,451
The catch is that we\nchose poorly early on.
11450
11:08:47,531 --> 11:08:51,167
And 1, 2, 3 ended up back-to-back with\n
11451
11:08:52,250 --> 11:08:55,060
Let's go ahead and assume that\n
11452
11:08:55,061 --> 11:08:58,226
And we'll plop the new\narray in this location here.
11453
11:08:58,226 --> 11:09:00,101
So I'm going to go ahead\nand copy the 1 over.
11454
11:09:01,902 --> 11:09:04,634
And then, ultimately, once\nI'm ready to fill the 4
11455
11:09:04,633 --> 11:09:07,091
I can throw away, essentially,\nthe old array at this point.
11456
11:09:07,091 --> 11:09:09,101
Because I have it now\nentirely in duplicate.
11457
11:09:09,101 --> 11:09:11,241
And I can populate it with the number 4.
11458
11:09:12,612 --> 11:09:15,582
That is a correct potential\nsolution to this problem.
11459
11:09:16,665 --> 11:09:19,624
And this is something we're going to\n
11460
11:09:19,624 --> 11:09:22,302
What's the downside of having\nsolved this problem in this way?
11461
11:09:23,896 --> 11:09:25,271
I'm adding a lot of running time.
11462
11:09:25,271 --> 11:09:28,061
It took me a lot of effort to\ncopy those additional numbers.
11463
11:09:28,061 --> 11:09:29,502
Now, granted, it's a small array.
11464
11:09:30,502 --> 11:09:32,377
It's going to be over\nin the blink of an eye.
11465
11:09:32,377 --> 11:09:35,061
But if we start talking\nabout interesting data sets
11466
11:09:35,061 --> 11:09:37,671
web application data sets,\nmobile app data sets.
11467
11:09:37,671 --> 11:09:41,152
Where you have not just a few, but\n
11468
11:09:41,152 --> 11:09:43,112
a few million pieces of data.
11469
11:09:43,112 --> 11:09:46,252
This is probably a suboptimal\nsolution to just, oh
11470
11:09:46,252 --> 11:09:48,234
move all your data from\none place to another.
11471
11:09:48,233 --> 11:09:49,941
Because who's to say\nthat we're not going
11472
11:09:49,942 --> 11:09:51,531
to paint ourselves into a new corner.
11473
11:09:51,531 --> 11:09:54,741
And it would feel like you're wasting\n
11474
11:09:54,741 --> 11:09:58,591
And, ultimately, just costing\nyourself a huge amount of time.
11475
11:09:58,591 --> 11:10:01,611
In fact, if we put this now into\n
11476
11:10:01,612 --> 11:10:06,531
from a few weeks back, what might\nthe running time now of Search
11477
11:10:08,752 --> 11:10:10,912
A throwback a couple of weeks ago.
11478
11:10:10,911 --> 11:10:14,061
If you're using an array, to\nrecap, what was the running time
11479
11:10:14,061 --> 11:10:17,072
of a Search algorithm in Big O notation?
11480
11:10:17,072 --> 11:10:19,252
So, maybe, in the worst case.
11481
11:10:19,252 --> 11:10:23,031
If you've got n numbers, 3 in this\n
11482
11:10:28,582 --> 11:10:30,202
And what's your intuition for that?
11483
11:10:36,792 --> 11:10:39,584
So if we go through each element,\n
11484
11:10:39,584 --> 11:10:42,972
then Search is going to take\nthis a Big O running time.
11485
11:10:42,972 --> 11:10:46,002
If, though, we're talking about\nthese numbers, specifically.
11486
11:10:46,002 --> 11:10:48,972
And now I'll explicitly stipulate\nthat, yeah, they're sorted.
11487
11:10:50,141 --> 11:10:54,432
What would the Big O notation be\n
11488
11:10:54,432 --> 11:10:56,921
be it of size 3, or 4,\nor n, more generally.
11489
11:10:57,972 --> 11:10:59,772
SPEAKER 1: Big O of, not n, but rather?
11490
11:11:01,182 --> 11:11:05,190
Because we could use per week zero\n
11491
11:11:05,190 --> 11:11:06,732
we'd have to deal with some rounding.
11492
11:11:06,732 --> 11:11:08,922
Because there's not a perfect\nnumber of elements at the moment.
11493
11:11:08,921 --> 11:11:10,332
But you could use binary search.
11494
11:11:11,652 --> 11:11:13,391
And then go left or\nright, left or right
11495
11:11:13,391 --> 11:11:15,141
until you find the\nelement you care about.
11496
11:11:15,141 --> 11:11:19,302
So Search remains in Big O\nof log n when using arrays.
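As a recap of the week zero idea mentioned here, this is a minimal sketch of binary search over a sorted array of ints. The function name `binary_search` and its parameters are illustrative assumptions, not code from the lecture.

```c
#include <stddef.h>

// Returns the index of target in a sorted array of n ints,
// or -1 if target is not present. Each pass halves the range
// still under consideration, hence the O(log n) running time.
int binary_search(const int *values, int n, int target)
{
    int low = 0;
    int high = n - 1;

    while (low <= high)
    {
        // Middle of the current range (rounds down when the count is even)
        int mid = low + (high - low) / 2;

        if (values[mid] == target)
        {
            return mid;
        }
        else if (values[mid] < target)
        {
            low = mid + 1;   // go right
        }
        else
        {
            high = mid - 1;  // go left
        }
    }
    return -1;
}
```

The rounding the speaker mentions shows up in the integer division that computes `mid`.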
11497
11:11:19,302 --> 11:11:21,132
But what about insertion, now?
11498
11:11:21,131 --> 11:11:23,171
If we start to think\nabout other operations.
11499
11:11:23,171 --> 11:11:26,862
Like, adding a number to this array,\n
11500
11:11:26,862 --> 11:11:29,531
app, or Google finding\nanother page on the internet.
11501
11:11:29,531 --> 11:11:31,991
So insertion happens all the time.
11502
11:11:31,991 --> 11:11:34,811
What's the running time of Insert?
11503
11:11:34,811 --> 11:11:38,112
When it comes to inserting into\nan existing array of size n.
11504
11:11:38,112 --> 11:11:40,781
How many steps might that take?
11505
11:11:43,201 --> 11:11:46,061
Because in the worst case,\nwhere you're out of space
11506
11:11:46,061 --> 11:11:48,629
you have to allocate, it\nwould seem, a new array.
11507
11:11:48,629 --> 11:11:50,921
Maybe, taking over some of\nthe previous garbage values.
11508
11:11:50,921 --> 11:11:52,661
But the catch is, even\nthough you're only
11509
11:11:52,661 --> 11:11:55,031
inserting one new number,\nlike the number 4
11510
11:11:55,031 --> 11:11:58,551
you have to copy over all the darn\n
11511
11:11:58,552 --> 11:12:01,542
So if your original array of\nsize n, the copying of that
11512
11:12:01,542 --> 11:12:03,412
is going to take Big O of n plus 1.
11513
11:12:03,411 --> 11:12:06,411
But we can throw away the plus 1\n
11514
11:12:06,411 --> 11:12:09,341
So Insert now becomes Big O of n.
11515
11:12:09,341 --> 11:12:11,201
And that might not be ideal.
11516
11:12:11,201 --> 11:12:13,991
Because if you're in the habit\nof inserting things frequently
11517
11:12:13,991 --> 11:12:16,362
that could start to add\nup, and add up, and add up.
11518
11:12:16,362 --> 11:12:19,302
And this is why computer programs,\nand websites, and mobile apps
11519
11:12:20,472 --> 11:12:23,482
If you're not being mindful\nof these trade-offs.
11520
11:12:23,482 --> 11:12:27,492
So what about, just for good\nmeasure, Omega notation.
11521
11:12:28,752 --> 11:12:31,241
Well just to recap\nhere, we could get lucky
11522
11:12:31,241 --> 11:12:33,533
and Search could just take one step.
11523
11:12:33,533 --> 11:12:35,741
Because you might just get\nlucky, and boom the number
11524
11:12:35,741 --> 11:12:38,292
you're looking for is right there in\n
11525
11:12:38,292 --> 11:12:40,152
Or even linear search, for that matter.
11526
11:12:41,201 --> 11:12:45,191
If there's enough room, and we didn't\n
11527
11:12:45,192 --> 11:12:46,728
1, 2, and 3, to a new location.
11528
11:12:47,561 --> 11:12:49,722
And we could have, as\nsomeone suggested, just
11529
11:12:49,722 --> 11:12:51,520
put the number 4 right there at the end.
11530
11:12:51,519 --> 11:12:53,561
And if we don't get lucky,\nit might take n steps.
11531
11:12:53,561 --> 11:12:57,441
If we do get lucky, it might just take\n
11532
11:12:57,442 --> 11:12:59,152
In fact, let me go ahead and do this.
11533
11:12:59,152 --> 11:13:00,802
How about we do something like this?
11534
11:13:00,802 --> 11:13:02,502
Let me switch over to some code here.
11535
11:13:02,502 --> 11:13:05,591
Let me start to make a\nprogram called list.c.
11536
11:13:05,591 --> 11:13:08,270
And in list.c, let's\nstart with the old way.
11537
11:13:08,271 --> 11:13:11,512
So we follow the breadcrumbs we've\n
11538
11:13:11,512 --> 11:13:14,952
So in this list.c, I'm going\nto include stdio.h.
11539
11:13:16,932 --> 11:13:20,262
Then inside of my code here, I'm\n
11540
11:13:20,262 --> 11:13:22,072
the first version of memory.
11541
11:13:22,072 --> 11:13:26,811
So int list[3] is now implemented,\nat the moment, in an array.
11542
11:13:26,811 --> 11:13:29,169
So we're rewinding for\nnow to week 2 style code.
11543
11:13:29,169 --> 11:13:31,002
And then, let me just\ninitialize this thing.
11544
11:13:31,002 --> 11:13:32,682
At the first location will be 1.
11545
11:13:32,682 --> 11:13:34,722
At the next location will be 2.
11546
11:13:34,722 --> 11:13:37,391
And at the last location will be 3.
11547
11:13:37,391 --> 11:13:39,722
So the array is zero indexed always.
11548
11:13:39,722 --> 11:13:41,472
I, for just the sake\nof discussion though
11549
11:13:41,472 --> 11:13:44,902
am putting in the numbers 1, 2,\n3, like a normal person might.
11550
11:13:45,402 --> 11:13:46,819
So now let's just print these out.
11551
11:13:50,322 --> 11:13:53,232
Let's go ahead now and\nprint out using printf.
11552
11:13:56,141 --> 11:13:59,771
So very simple program, inspired\nby what we did in week 2.
11553
11:13:59,771 --> 11:14:03,682
Just to create and then print\nout the contents of an array.
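The week 2 style version described here can be sketched roughly as follows. The exact code on screen is not in the transcript, so this is a reconstruction under that assumption; it is wrapped in a hypothetical function `print_fixed_list` rather than `main` so it can be called from anywhere.

```c
#include <stdio.h>

// A fixed-size array of 3 ints, allocated automatically on the stack.
// Prints each value on its own line and returns how many were printed.
int print_fixed_list(void)
{
    int list[3];

    // Arrays are zero-indexed, but the values stored are 1, 2, 3
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;

    for (int i = 0; i < 3; i++)
    {
        printf("%i\n", list[i]);
    }
    return 3;
}
```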
11554
11:14:05,862 --> 11:14:09,942
So far, so good. ./list\nAnd voila, we see 1, 2, 3.
11555
11:14:09,942 --> 11:14:14,952
Now let's start to practice some of what\n
11556
11:14:14,951 --> 11:14:19,541
So let me go in now and get\nrid of the array version.
11557
11:14:19,542 --> 11:14:22,391
And let me zoom out a little bit\n
11558
11:14:22,391 --> 11:14:25,932
And now let's begin to\ncreate a list of size 3.
11559
11:14:25,932 --> 11:14:29,112
So if I'm going to do\nthis now, dynamically
11560
11:14:29,112 --> 11:14:33,262
so that I'm allocating these\nthings again and again
11561
11:14:33,262 --> 11:14:34,912
let me go ahead and do this.
11562
11:14:34,911 --> 11:14:41,951
Let me give myself a list that's of type\n
11563
11:14:41,951 --> 11:14:48,971
of 3 times the size of an int, so what\n
11564
11:14:48,972 --> 11:14:51,972
enough memory for that very first\npicture we drew on the board.
11565
11:14:51,972 --> 11:14:54,641
Which was the array\ncontaining 1, 2, and 3.
11566
11:14:54,641 --> 11:14:57,472
But laying the foundation\nto be able to resize it
11567
11:14:57,472 --> 11:14:59,061
which was ultimately the goal.
11568
11:14:59,061 --> 11:15:01,131
So my syntax is a little different here.
11569
11:15:01,131 --> 11:15:04,572
I'm going to use malloc and get memory\n
11570
11:15:05,482 --> 11:15:09,372
Instead of using the stack by just\n
11571
11:15:12,161 --> 11:15:16,572
That is to say this line of code from\n
11572
11:15:16,572 --> 11:15:20,112
identical to this line of\ncode in the second version.
11573
11:15:20,112 --> 11:15:22,211
But the first line of\ncode puts the memory
11574
11:15:22,211 --> 11:15:24,372
on the stack, automatically, for me.
11575
11:15:24,372 --> 11:15:27,282
The second line of code,\nthat I've left here now
11576
11:15:27,281 --> 11:15:30,762
is creating an array of size 3,\nbut it's putting it on the heap.
11577
11:15:30,762 --> 11:15:34,382
And that's important because it was only\n
11578
11:15:35,311 --> 11:15:38,341
That you can actually ask for more\n
11579
11:15:38,341 --> 11:15:42,241
When you just use the\nfirst notation int list[3]
11580
11:15:42,241 --> 11:15:45,631
you have permanently given\nyourself an array of size 3.
11581
11:15:45,631 --> 11:15:48,612
You cannot add to that in code.
11582
11:15:48,612 --> 11:15:50,491
So let me go ahead and do this.
11583
11:15:50,491 --> 11:15:53,625
If list == NULL, something went wrong.
11584
11:15:53,625 --> 11:15:54,792
The computer's out of memory.
11585
11:15:54,792 --> 11:15:56,985
So let's just return 1 and\nquit out of this program.
11586
11:15:56,985 --> 11:15:58,152
There's nothing to see here.
11587
11:15:58,152 --> 11:16:00,002
So just a good error check there.
11588
11:16:00,002 --> 11:16:02,252
Now let me go ahead and\ninitialize this list.
11589
11:16:02,252 --> 11:16:04,201
So list[0] will be 1 again.
11590
11:16:07,921 --> 11:16:10,292
So that's the same kind\nof syntax as before.
11591
11:16:10,292 --> 11:16:13,412
And notice this equivalence.
11592
11:16:13,411 --> 11:16:18,211
Recall that there's this relationship\n
11593
11:16:18,211 --> 11:16:21,031
And arrays are really just doing\npointer arithmetic for you
11594
11:16:21,031 --> 11:16:22,741
where the square bracket notation is.
11595
11:16:22,741 --> 11:16:27,511
So if I've asked myself here, in line\n
11596
11:16:27,512 --> 11:16:32,732
it is perfectly OK to treat it now like\n
11597
11:16:32,732 --> 11:16:35,222
Because the computer will\ndo the arithmetic for me
11598
11:16:35,222 --> 11:16:37,921
and find the first location,\nthe second, and the third.
11599
11:16:37,921 --> 11:16:42,031
If you really want to be\ncool and hacker-like, well
11600
11:16:42,031 --> 11:16:48,781
you could say *list = 1,\n*(list + 1) = 2, *(list + 2) = 3.
11601
11:16:51,362 --> 11:16:53,701
That's the same thing\nusing very explicit
11602
11:16:53,701 --> 11:16:56,311
pointer arithmetic, which we\nlooked at briefly last week.
11603
11:16:56,311 --> 11:16:58,651
But this is atrocious to\nlook at for most people.
11604
11:16:58,652 --> 11:17:00,342
It's just not very user friendly.
11605
11:17:00,341 --> 11:17:03,271
It's longer to type, so\nmost people, even when
11606
11:17:03,271 --> 11:17:06,151
allocating memory dynamically\nas I did a second ago
11607
11:17:06,152 --> 11:17:10,112
would just use the more\nfamiliar notation of an array.
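To make the equivalence concrete, here is a small sketch that fills a heap-allocated array both ways. The function names `make_list_brackets` and `make_list_pointers` are hypothetical; only `malloc`, the NULL check, and the two notations come from the lecture.

```c
#include <stdlib.h>

// Allocate 3 ints on the heap and fill them with square-bracket notation.
// Returns NULL if malloc fails; the caller is responsible for free.
int *make_list_brackets(void)
{
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return NULL;
    }
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;
    return list;
}

// The same thing with explicit pointer arithmetic:
// *(list + i) means exactly what list[i] means.
int *make_list_pointers(void)
{
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return NULL;
    }
    *list = 1;
    *(list + 1) = 2;
    *(list + 2) = 3;
    return list;
}
```

Both functions produce the same three values in memory; the bracket form is simply easier to read.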
11608
11:17:11,792 --> 11:17:16,322
Now suppose time passes\nand I realize, oh shoot
11609
11:17:16,322 --> 11:17:21,302
I really wanted this array to\nbe of size 4 instead of size 3.
11610
11:17:21,302 --> 11:17:23,844
Now, obviously, I could just\nrewind and like fix the program.
11611
11:17:23,843 --> 11:17:25,801
But suppose that this is\na much larger program.
11612
11:17:25,802 --> 11:17:28,172
And I've realized, at\nthis point, that I need
11613
11:17:28,171 --> 11:17:31,561
to be able to dynamically add more\n
11614
11:17:32,222 --> 11:17:33,762
Well let me go ahead and do this.
11615
11:17:33,762 --> 11:17:36,152
Let me just say, all\nright, list should actually
11616
11:17:36,152 --> 11:17:42,182
be the result of asking for 4\nchunks of memory from malloc.
11617
11:17:42,182 --> 11:17:46,217
And then, I could do something\nlike this, list[3] = 4.
11618
11:17:49,171 --> 11:17:52,182
Now this is buggy, potentially,\nin a couple of ways.
11619
11:17:52,182 --> 11:17:59,012
But let me ask first, what's really\n
11620
11:17:59,012 --> 11:18:03,332
The goal at hand is to start with\n
11621
11:18:03,332 --> 11:18:05,141
And I want to add a number 4 to it.
11622
11:18:05,141 --> 11:18:10,862
So at the moment, in line 17, I've asked\n
11623
11:18:12,421 --> 11:18:14,612
And then I'm adding the number 4 to it.
11624
11:18:14,612 --> 11:18:18,092
But I have skipped a few\nsteps and broken this somehow.
11625
11:18:19,375 --> 11:18:21,504
AUDIENCE: You don't know\nexactly [INAUDIBLE]..
11626
11:18:22,171 --> 11:18:24,542
I don't necessarily know where\n
11627
11:18:24,542 --> 11:18:26,042
It's probably not\ngoing to be immediately
11628
11:18:26,042 --> 11:18:27,391
adjacent to the previous chunk.
11629
11:18:27,391 --> 11:18:30,222
And so, yes, even though I'm\nputting the number 4 there
11630
11:18:30,222 --> 11:18:34,182
I haven't copied the 1, the 2, or\n
11631
11:18:35,881 --> 11:18:40,112
well, that's actually, indeed,\n
11632
11:18:40,112 --> 11:18:43,561
I am orphaning the\noriginal chunk of memory.
11633
11:18:43,561 --> 11:18:46,741
If you think of the picture that\n
11634
11:18:46,741 --> 11:18:52,981
up here on line 5 that allocates\n
11635
11:18:55,752 --> 11:18:59,131
But as soon as I do this, I'm\nclobbering the value of list.
11636
11:18:59,131 --> 11:19:01,441
And saying no, don't point\nat this chunk of memory.
11637
11:19:01,442 --> 11:19:05,382
Point at this chunk of memory, at\n
11638
11:19:05,381 --> 11:19:07,711
where the original chunk of memory is.
11639
11:19:07,711 --> 11:19:12,301
So the right way to do something like\n
11640
11:19:12,302 --> 11:19:14,880
Let me go ahead and give\nmyself a temporary variable.
11641
11:19:14,879 --> 11:19:16,171
And I'll literally call it tmp.
11642
11:19:16,171 --> 11:19:18,301
T-M-P, like I did last week.
11643
11:19:18,302 --> 11:19:21,602
So that I can now ask the computer for\n
11644
11:19:22,771 --> 11:19:25,711
I'm going to again say\nif tmp equals NULL
11645
11:19:25,711 --> 11:19:27,851
I'm going to say bad\nthings happened here.
11646
11:19:29,042 --> 11:19:31,322
And you know what,\njust to be tidy, let me
11647
11:19:31,322 --> 11:19:34,023
free the original list before I quit.
11648
11:19:34,023 --> 11:19:35,731
Because remember from\nlast week, any time
11649
11:19:35,732 --> 11:19:38,132
you use malloc you\neventually have to use free.
11650
11:19:38,131 --> 11:19:41,521
But this chunk of code here\nis just a safety check.
11651
11:19:41,521 --> 11:19:43,921
If there's no more memory,\nthere's nothing to see here.
11652
11:19:43,921 --> 11:19:46,981
I'm just going to clean\nup my state and quit.
11653
11:19:46,982 --> 11:19:50,322
But now, if I have asked\nfor this chunk of memory
11654
11:19:50,322 --> 11:19:55,682
now I can do this: for int i gets 0.
11655
11:19:58,082 --> 11:20:00,002
What if I do something like this?
11656
11:20:04,021 --> 11:20:08,461
That would seem to have the effect\n
11657
11:20:09,281 --> 11:20:12,991
And then, I think I need to\ndo one last thing: tmp[3]
11658
11:20:12,991 --> 11:20:14,941
gets the number 4, for instance.
11659
11:20:14,942 --> 11:20:18,961
Again, I'm hard coding the numbers\nfor the sake of discussion.
11660
11:20:18,961 --> 11:20:23,942
After I've done this,\nwhat could I now do?
11661
11:20:23,942 --> 11:20:28,472
I could now set list equal to tmp.
11662
11:20:28,472 --> 11:20:31,529
And now, I have updated\nmy list properly.
11663
11:20:31,529 --> 11:20:32,822
So let me go ahead and do this.
11664
11:20:36,961 --> 11:20:42,301
Let me go ahead and print each of these\n
11665
11:20:42,302 --> 11:20:45,372
And then, I'm going to return 0 just\n
11666
11:20:45,372 --> 11:20:49,472
Now so to recap, we\ninitialize the original array
11667
11:20:49,472 --> 11:20:52,622
of size 3 and plug in\nthe values 1, 2, 3.
11668
11:20:53,442 --> 11:20:55,692
And then, I realize, wait a\nminute, I need more space.
11669
11:20:55,692 --> 11:20:58,067
And so I asked the computer\nfor a second chunk of memory.
11670
11:20:59,281 --> 11:21:01,949
Just as a safety check, I make\nsure that tmp doesn't equal NULL.
11671
11:21:01,949 --> 11:21:03,489
Because if it does I'm out of memory.
11672
11:21:03,489 --> 11:21:05,072
So I should just quit altogether.
11673
11:21:05,072 --> 11:21:07,591
But once I'm sure that\nit's not null, I'm
11674
11:21:07,591 --> 11:21:12,931
going to copy all the values from\n
11675
11:21:12,932 --> 11:21:16,391
And then, I'm going to add my new\n
11676
11:21:16,391 --> 11:21:19,891
And then, now that I'm done playing\n
11677
11:21:19,891 --> 11:21:23,341
I'm going to remember\nin my list variable what
11678
11:21:23,341 --> 11:21:25,381
the address is of this\nnew chunk of memory.
11679
11:21:25,381 --> 11:21:28,051
And then, I'm going to print\nall of those values out.
11680
11:21:28,052 --> 11:21:31,832
So at least, aesthetically, when I\n
11681
11:21:31,832 --> 11:21:34,141
except for my missing semicolon.
11682
11:21:38,101 --> 11:21:40,771
Implicitly declaring a\nlibrary function malloc.
11683
11:21:40,771 --> 11:21:45,230
What's my mistake any time\nyou see that kind of error?
11684
11:21:46,862 --> 11:21:52,182
So up here, I forgot to do include\n
11685
11:21:52,182 --> 11:21:53,972
Let me go ahead and,\nagain, do make list.
11686
11:21:56,432 --> 11:21:59,311
And I should see 1, 2, 3, 4.
11687
11:21:59,311 --> 11:22:03,122
But there's still a bug here.
11688
11:22:03,122 --> 11:22:05,792
Does anyone see\nthe-- bug or question?
11689
11:22:05,792 --> 11:22:07,582
AUDIENCE: You forgot to free them.
11690
11:22:07,582 --> 11:22:08,272
SPEAKER 1: I'm sorry, say again.
11691
11:22:08,271 --> 11:22:09,951
AUDIENCE: You forgot to free them.
11692
11:22:09,951 --> 11:22:12,051
SPEAKER 1: I forgot to\nfree the original list.
11693
11:22:12,052 --> 11:22:15,652
And we could see this, even if not\n
11694
11:22:15,652 --> 11:22:18,329
If I do something like\nvalgrind of ./list
11695
11:22:18,328 --> 11:22:19,911
remember our tool from this past week.
11696
11:22:19,911 --> 11:22:22,791
Let me increase the size of my\nterminal window, temporarily.
11697
11:22:22,792 --> 11:22:25,022
The output is crazy cryptic at first.
11698
11:22:25,021 --> 11:22:30,261
But, notice that I have definitely\n
11699
11:22:30,262 --> 11:22:32,632
And indeed, it's even\npointing at the line number
11700
11:22:32,631 --> 11:22:34,411
in which some of those bytes were lost.
11701
11:22:34,411 --> 11:22:36,411
So let me go ahead and back to my code.
11702
11:22:36,411 --> 11:22:41,091
And indeed, I think what I need to do\n
11703
11:22:41,091 --> 11:22:44,631
pointing it at this new chunk\nof memory instead of the old
11704
11:22:44,631 --> 11:22:47,391
I think I now need to\nfirst, proactively
11705
11:22:47,391 --> 11:22:49,942
say free the old list of memory.
11706
11:22:51,961 --> 11:22:56,731
So if I now do Make List and do dot\n
11707
11:22:56,732 --> 11:22:59,932
And, if I cross my fingers\nand run Valgrind again
11708
11:22:59,932 --> 11:23:03,921
after increasing my window\nsize, hopefully here.
11709
11:23:06,561 --> 11:23:09,502
It seems like less memory is lost.
11710
11:23:09,502 --> 11:23:11,932
What have I now forgotten to do?
11711
11:23:11,932 --> 11:23:13,912
AUDIENCE: You forgot to free the end.
11712
11:23:13,911 --> 11:23:16,221
SPEAKER 1: I forgot to free\nit at the very end, too.
11713
11:23:16,222 --> 11:23:19,042
Because I still have a chunk of\nmemory that I got from malloc.
11714
11:23:19,042 --> 11:23:21,682
So let me go to the very\nbottom of the program now.
11715
11:23:21,682 --> 11:23:26,811
And after I'm done senselessly\njust printing this thing out
11716
11:23:29,932 --> 11:23:33,262
And now let me do make list, ./list.
11717
11:23:35,152 --> 11:23:39,682
Now let's do valgrind\nof ./list, Enter.
11718
11:23:39,682 --> 11:23:43,012
And now, hopefully, all\nheap blocks were freed.
11719
11:23:44,500 --> 11:23:47,542
So this is perhaps the best output\n
11720
11:23:47,542 --> 11:23:50,432
I used the heap, but I freed\nall the memory as well.
11721
11:23:50,432 --> 11:23:52,112
So there were 2 fixes needed there.
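Putting the walkthrough together, including both fixes, the grow-by-copying step might look like the sketch below. The lecture hard-codes sizes 3 and 4 inside `main`; the function `append_by_copy` and its parameters `n` and `value` are assumptions made here to keep the example reusable.

```c
#include <stdlib.h>

// Grow a heap-allocated array of n ints by one slot, the manual way:
// ask for a bigger chunk, copy the old values over, add the new value,
// then free the old chunk. Returns the new array, or NULL on failure
// (in which case the old array is freed too, as in the lecture's
// "clean up and quit" check).
int *append_by_copy(int *list, int n, int value)
{
    int *tmp = malloc((n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        free(list);
        return NULL;
    }

    // Copy every existing value: this loop is the O(n) cost of Insert
    for (int i = 0; i < n; i++)
    {
        tmp[i] = list[i];
    }

    // Add the new value at the end
    tmp[n] = value;

    // First fix: free the old chunk before forgetting its address
    free(list);

    // The caller stores this in its list variable, and (second fix)
    // must eventually free it as well
    return tmp;
}
```

A caller would write something like `list = append_by_copy(list, 3, 4);` and later `free(list);`, which should leave valgrind with no lost bytes to report.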
11722
11:23:52,612 --> 11:23:56,391
Any questions then on this array-based\n
11723
11:23:56,391 --> 11:23:59,012
is statically allocating\nan array, so to speak.
11724
11:23:59,012 --> 11:24:00,711
By just hard coding the number 3.
11725
11:24:00,711 --> 11:24:04,671
The second version now is\ndynamically allocating the array
11726
11:24:04,671 --> 11:24:06,862
using not the stack but the heap.
11727
11:24:06,862 --> 11:24:10,281
But, it too, suffers from the\nslowness we described earlier
11728
11:24:10,281 --> 11:24:12,771
of having to copy all those\nvalues from one to the other.
11729
11:24:14,665 --> 11:24:17,340
AUDIENCE: Why do you not\nhave to free the tmp?
11730
11:24:18,381 --> 11:24:20,301
Why did I not have to free the tmp?
11731
11:24:20,302 --> 11:24:22,612
I essentially did eventually.
11732
11:24:22,612 --> 11:24:27,842
Because tmp was pointing\nat the chunk of 4 integers.
11733
11:24:27,841 --> 11:24:33,291
But on line 33 here,\nI assigned list to be
11734
11:24:33,292 --> 11:24:36,061
identical to what tmp was pointing at.
11735
11:24:36,061 --> 11:24:40,654
And so, when I finally freed the list,\n
11736
11:24:40,654 --> 11:24:43,822
In fact, if I wanted to, I could say\n
11737
11:24:43,822 --> 11:24:45,561
But conceptually, it's wrong.
11738
11:24:45,561 --> 11:24:49,612
Because at this point in the story, I\n
11739
11:24:50,722 --> 11:24:52,822
But they were the same at\nthat point in the story.
11740
11:24:53,322 --> 11:24:55,360
AUDIENCE: Is [? the line ?] part of it?
11741
11:24:56,402 --> 11:24:58,832
And long story short,\neverything we're doing thus far
11742
11:24:58,832 --> 11:25:00,302
is still in the world of arrays.
11743
11:25:00,302 --> 11:25:02,192
The only distinction\nwe're making is that
11744
11:25:02,192 --> 11:25:08,702
in version 1, when I said int list\n
11745
11:25:08,701 --> 11:25:12,631
So-called statically allocated\non the stack, as per last week.
11746
11:25:12,631 --> 11:25:16,381
This version now is still dealing with\n
11747
11:25:16,381 --> 11:25:18,461
and using dynamic memory allocation.
11748
11:25:18,461 --> 11:25:20,979
So that I can still use an\narray per the first pictures
11749
11:25:22,021 --> 11:25:24,551
But I can at least grow\nthe array if I want.
11750
11:25:24,552 --> 11:25:28,472
So we haven't even now solved this, even\n
11751
11:25:30,061 --> 11:25:34,411
AUDIENCE: How are you able to free\n
11752
11:25:34,411 --> 11:25:37,201
SPEAKER 1: How am I able to free list?
11753
11:25:37,201 --> 11:25:41,791
I freed the original address of list.
11754
11:25:41,792 --> 11:25:44,702
I, then, changed what list is storing.
11755
11:25:44,701 --> 11:25:47,551
I'm moving its arrow to\na new chunk of memory.
11756
11:25:47,552 --> 11:25:51,032
And that is perfectly reasonable\nfor me to now manipulate
11757
11:25:51,031 --> 11:25:54,661
because now list is pointing\nat the same value as tmp.
11758
11:25:54,661 --> 11:26:00,091
And tmp is what was given the return\n
11759
11:26:00,091 --> 11:26:02,261
So that chunk of memory is valid.
11760
11:26:02,262 --> 11:26:05,702
So these are just squares\non the board, right.
11761
11:26:05,701 --> 11:26:07,451
There's just pointers inside of them.
11762
11:26:07,451 --> 11:26:09,368
So what I'm technically\nsaying is, and I'm not
11763
11:26:09,368 --> 11:26:11,521
pointing-- I'm not freeing\nlist per se, I am
11764
11:26:11,521 --> 11:26:16,141
freeing the chunk of memory that begins\n
11765
11:26:16,141 --> 11:26:21,542
Therefore, if a few lines later, I\n
11766
11:26:21,542 --> 11:26:25,561
Totally reasonable to then touch that\n
11767
11:26:25,561 --> 11:26:27,871
Because you're not freeing\nthe variable per se
11768
11:26:27,872 --> 11:26:30,272
you're freeing the\naddress in the variable.
11769
11:26:31,622 --> 11:26:37,232
So let me back up here and\nnow make one final edit.
11770
11:26:37,232 --> 11:26:41,672
So let's finish this with\none final improvement here.
11771
11:26:41,671 --> 11:26:44,641
Because it turns out,\nthere's a somewhat better way
11772
11:26:44,641 --> 11:26:48,091
to actually resize an array\nas we've been doing here.
11773
11:26:48,091 --> 11:26:52,509
And there's another function in stdlib\n
11774
11:26:52,510 --> 11:26:55,052
And I'm just going to go in and\nmake a little bit of a change
11775
11:26:55,052 --> 11:26:58,060
here so that I can do the following.
11776
11:26:58,059 --> 11:26:59,851
Let me go ahead and\nfirst comment this now
11777
11:26:59,851 --> 11:27:02,801
just so we can keep track of what's\n
11778
11:27:02,802 --> 11:27:09,452
So dynamically allocate\nan array of size 3.
11779
11:27:09,451 --> 11:27:14,131
Assign 3 numbers to that array.
11780
11:27:15,811 --> 11:27:21,121
Allocate new array of size 4.
11781
11:27:21,122 --> 11:27:26,942
Copy numbers from old\narray into new array.
11782
11:27:26,942 --> 11:27:31,652
And add fourth number to new array.
11783
11:27:36,332 --> 11:27:41,942
Remember, if you will, new array\nusing my same list variable.
11784
11:27:49,741 --> 11:27:53,011
And we'll post this code online after\n
11785
11:27:53,012 --> 11:27:56,702
So it turns out that we can reduce\n
11786
11:27:56,701 --> 11:27:59,461
Not so much with the printing\nhere, but with this copying.
11787
11:27:59,461 --> 11:28:01,741
Turns out C does have a\nfunction called realloc
11788
11:28:01,741 --> 11:28:07,061
that can actually handle the resizing\n
11789
11:28:07,061 --> 11:28:09,182
I'm going to scroll up\nto where I previously
11790
11:28:09,182 --> 11:28:12,302
allocated a new array of size 4.
11791
11:28:12,302 --> 11:28:19,502
And I'm instead going to say this,\n
11792
11:28:19,502 --> 11:28:21,959
Now, previously this wasn't\nnecessarily possible.
11793
11:28:21,959 --> 11:28:23,792
Because recall that we\nhad painted ourselves
11794
11:28:23,792 --> 11:28:25,625
into a corner with the\nexample on the screen
11795
11:28:25,625 --> 11:28:28,472
where "Hello, world" happened to\n
11796
11:28:29,891 --> 11:28:32,822
Let me use realloc, for re-allocate.
11797
11:28:32,822 --> 11:28:36,122
And pass in not just the size\nof memory we want this time
11798
11:28:36,122 --> 11:28:39,812
but also the address\nthat we want to resize.
11799
11:28:39,811 --> 11:28:43,421
Which, again, is this array called list.
11800
11:28:43,921 --> 11:28:46,811
The code thereafter is\npretty much the same.
11801
11:28:46,811 --> 11:28:50,682
But what I don't need to do is this.
11802
11:28:50,682 --> 11:28:54,002
So realloc is a pretty handy\n
11803
11:28:54,002 --> 11:28:57,152
If at the very beginning of class,\n
11804
11:28:57,152 --> 11:29:00,491
And someone's instinct was to just plop\n
11805
11:29:00,491 --> 11:29:03,241
If there's available memory,\nrealloc will just do that.
11806
11:29:03,241 --> 11:29:07,682
And boom, it will just grow the array\n
11807
11:29:07,682 --> 11:29:11,641
If, though, it realizes, sorry, there's\n
11808
11:29:11,641 --> 11:29:14,521
or something else there,\nrealloc will handle
11809
11:29:14,521 --> 11:29:18,211
the trouble of moving that whole\narray from one chunk of memory
11810
11:29:18,211 --> 11:29:20,491
originally, to a new chunk of memory.
11811
11:29:20,491 --> 11:29:26,881
And then realloc will return to you,\n
11812
11:29:26,881 --> 11:29:31,031
And it will handle the process\nof freeing the old chunk for you.
11813
11:29:31,031 --> 11:29:33,281
So you do not need to do this yourself.
11814
11:29:33,281 --> 11:29:36,612
So in fact, let me go ahead\nand get rid of this as well.
11815
11:29:36,612 --> 11:29:41,582
So realloc just condenses, a lot of what\n
11816
11:29:41,582 --> 11:29:45,592
Whereby, realloc handles it for you.
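The realloc version of the same step can be sketched as below. As before, the function name `append_with_realloc` and its parameters are hypothetical; `realloc` itself takes the address to resize and the new size in bytes, and returns the address of the resized chunk, or NULL on failure.

```c
#include <stdlib.h>

// Grow a heap-allocated array of n ints by one slot using realloc.
// realloc either extends the chunk in place or moves the contents to
// a new chunk and frees the old one, so no copy loop and no free of
// the old chunk is needed on success.
int *append_with_realloc(int *list, int n, int value)
{
    int *tmp = realloc(list, (n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        // On failure realloc leaves the original chunk untouched,
        // so it still has to be freed before giving up
        free(list);
        return NULL;
    }

    // The first n values are already in place; add the new one
    tmp[n] = value;
    return tmp;
}
```

Assigning the result to a temporary first, rather than straight back to `list`, keeps the original address available if realloc fails.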
11817
11:29:46,091 --> 11:29:49,151
So that's the final improvement\non this array-based approach.
11818
11:29:49,152 --> 11:29:51,932
So what now, knowing\nwhat your memory is
11819
11:29:51,932 --> 11:29:54,881
what can we now do with it that\nsolves that kind of problem?
11820
11:29:54,881 --> 11:29:56,801
Because the world is\ngoing to get really slow.
11821
11:29:56,802 --> 11:29:59,802
And our apps, and our phones, and our\n
11822
11:29:59,802 --> 11:30:04,032
if we're just constantly wasting\n
11823
11:30:04,031 --> 11:30:05,891
What could we perhaps do instead?
11824
11:30:05,891 --> 11:30:07,961
Well there's one new\npiece of syntax today
11825
11:30:07,961 --> 11:30:11,322
that builds on these 3 pieces\nof syntax from the past.
11826
11:30:11,322 --> 11:30:13,182
Recall, that we've\nlooked at struct, which
11827
11:30:13,182 --> 11:30:16,302
is a keyword in C, that just lets\nyou invent your own structure.
11828
11:30:16,302 --> 11:30:19,542
Your own variable, if you will,\nin conjunction with typedef.
11829
11:30:19,542 --> 11:30:23,682
Which lets you say a person has a name\n
11830
11:30:23,682 --> 11:30:26,141
Or a candidate has a name\nand some number of votes.
11831
11:30:26,141 --> 11:30:30,521
You can encapsulate multiple pieces of\n
11832
11:30:30,521 --> 11:30:34,641
What did we use the Dot Notation\nfor now, a couple of times?
11833
11:30:34,641 --> 11:30:37,949
What does the Dot operator do in C?
11834
11:30:37,949 --> 11:30:39,241
AUDIENCE: Access the structure.
11835
11:30:39,631 --> 11:30:41,682
To access the field\ninside of a structure.
11836
11:30:41,682 --> 11:30:43,807
So if you've got a person\nwith a name and a number
11837
11:30:43,807 --> 11:30:46,832
you could say something like\nperson.name or person.number
11838
11:30:46,832 --> 11:30:48,992
if person is the name\nof one such variable.
11839
11:30:48,991 --> 11:30:51,331
Star, of course, we've\nseen now in a few ways.
11840
11:30:51,332 --> 11:30:55,022
Like way back in week 1, we\nsaw it as like, multiplication.
11841
11:30:55,021 --> 11:30:58,231
Last week, we began to see it\nin the context of pointers
11842
11:30:58,232 --> 11:31:00,452
whereby, you use it\nto declare a pointer.
11843
11:31:00,451 --> 11:31:03,041
Like, int* p, or something like that.
11844
11:31:03,042 --> 11:31:05,522
But we also saw it in\none other context, which
11845
11:31:05,521 --> 11:31:08,861
was like the opposite, which\nwas the dereference operator.
11846
11:31:08,862 --> 11:31:10,754
Which says if this is\nan address, that is
11847
11:31:10,754 --> 11:31:13,711
if this is a variable like a pointer,\n
11848
11:31:13,711 --> 11:31:17,461
then with no int or no char,\nno data type in front of it.
11849
11:31:17,461 --> 11:31:19,351
That means go to that address.
11850
11:31:19,351 --> 11:31:22,781
And it dereferences the pointer\nand goes to that location.
11851
11:31:22,781 --> 11:31:25,201
So it turns out that using\nthese 3 building blocks
11852
11:31:25,201 --> 11:31:28,241
you can actually start to now use\n
11853
11:31:28,741 --> 11:31:31,201
And even next week, when\nwe transition to Python
11854
11:31:31,201 --> 11:31:33,841
and you start to get a\nlot of features for free.
11855
11:31:33,841 --> 11:31:36,031
Like a single line of\ncode will just do so much
11856
11:31:36,031 --> 11:31:40,652
more in Python than it does in C. It\n
11857
11:31:40,652 --> 11:31:42,542
And just so you've seen it already.
11858
11:31:42,542 --> 11:31:47,252
It turns out that it's so\ncommon in C to use this operator
11859
11:31:47,252 --> 11:31:51,271
to go inside of a structure and\n
11860
11:31:51,271 --> 11:31:53,731
that there's shorthand\nnotation for it, a.k.a.
11861
11:31:54,932 --> 11:31:56,576
That literally looks like an arrow.
11862
11:31:56,576 --> 11:31:58,951
So recall last week, I was in\nthe habit of pointing, even
11863
11:32:00,152 --> 11:32:04,502
This arrow notation, a\nhyphen and an angled bracket
11864
11:32:04,502 --> 11:32:11,432
denotes going to an address and\nlooking at a field inside of it.
11865
11:32:11,432 --> 11:32:13,722
But we'll see this in\npractice in just a bit.
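As a small preview of that shorthand, the sketch below reads the same field two ways. The `person` struct follows the name-and-number example recalled above; the field types and the two function names are assumptions for illustration.

```c
#include <stddef.h>

// A person has a name and a number, as in the earlier typedef example
typedef struct
{
    const char *name;
    const char *number;
} person;

// Long form: dereference the pointer with *, then use the dot operator
const char *name_via_star_dot(const person *p)
{
    return (*p).name;
}

// Shorthand: the arrow goes to the address and looks at the field
const char *name_via_arrow(const person *p)
{
    return p->name;
}
```

Both functions return the same pointer; `p->name` is just a more readable spelling of `(*p).name`.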
11866
11:32:13,722 --> 11:32:16,592
So what might be the\nsolution, now, to this problem
11867
11:32:16,591 --> 11:32:20,101
we saw a moment ago whereby, we had\n
11868
11:32:20,101 --> 11:32:23,381
And our memory, a few moments\nago, looked like this.
11869
11:32:23,381 --> 11:32:28,201
We could just copy the whole existing\n
11870
11:32:29,491 --> 11:32:33,331
What would another, perhaps\nbetter solution longer term
11871
11:32:33,332 --> 11:32:38,627
be, that doesn't require\nconstantly moving stuff around?
11872
11:32:38,627 --> 11:32:40,502
Maybe hang in there for\nyour instincts if you
11873
11:32:40,502 --> 11:32:44,682
know the buzz phrase we're looking for\n
11874
11:32:44,682 --> 11:32:47,281
But if we want to avoid\nmoving the 1, 2, and the 3
11875
11:32:47,281 --> 11:32:49,981
but we still want to be able\nto add endless amounts of data.
11876
11:32:51,961 --> 11:32:54,872
So maybe create some kind\nof list using pointers that
11877
11:32:54,872 --> 11:32:56,852
just point at a new location, right.
11878
11:32:56,851 --> 11:32:59,972
In an ideal world, even\nthough this piece of memory
11879
11:32:59,972 --> 11:33:02,912
is being used by this h in\nthe string "Hello, world"
11880
11:33:02,911 --> 11:33:05,461
maybe we could somehow use\na pointer from last week.
11881
11:33:05,461 --> 11:33:09,811
Like an arrow, that says after the\n
11882
11:33:11,521 --> 11:33:15,791
And you just stitch together\nthese integers in memory
11883
11:33:15,792 --> 11:33:17,822
so that each one leads to the next.
11884
11:33:17,822 --> 11:33:21,182
It's not necessarily the case\nthat it's literally back-to-back.
11885
11:33:21,182 --> 11:33:23,432
That would have the\ndownside, it would seem
11886
11:33:23,432 --> 11:33:24,991
of costing us a little bit of space.
11887
11:33:24,991 --> 11:33:27,601
Like a pointer, which recall,\ntakes up some amount of space.
11888
11:33:27,601 --> 11:33:29,881
Typically 8 bytes or 64 bits.
11889
11:33:29,881 --> 11:33:33,481
But I don't have to copy potentially\na huge amount of data just
11890
11:33:34,921 --> 11:33:36,760
And so these things do have a name.
11891
11:33:36,760 --> 11:33:38,552
And indeed, these things\nare what generally
11892
11:33:38,552 --> 11:33:42,302
would be called a linked list.
11893
11:33:42,302 --> 11:33:44,822
A linked list captures\nexactly that intuition
11894
11:33:44,822 --> 11:33:46,542
of linking together things in memory.
11895
11:33:46,542 --> 11:33:48,012
So let's take a look at an example.
11896
11:33:48,012 --> 11:33:49,804
Here's a computer's\nmemory in the abstract.
11897
11:33:49,803 --> 11:33:52,621
Suppose that I'm trying\nto create an array.
11898
11:33:52,622 --> 11:33:55,682
Let's generalize it as\na list, now, of numbers.
11899
11:33:55,682 --> 11:33:57,362
An array has a very specific meaning.
11900
11:33:57,362 --> 11:34:00,092
It's memory that's contiguous,\nback to back to back.
11901
11:34:00,091 --> 11:34:03,721
At the end of the day, I as the\n
11902
11:34:05,822 --> 11:34:09,781
I don't really care how it's stored.
11903
11:34:09,781 --> 11:34:12,091
I don't care how it's stored\nwhen I'm writing the code
11904
11:34:12,091 --> 11:34:13,924
I just want it to work\nat the end of the day.
11905
11:34:13,925 --> 11:34:16,052
So suppose that I first\ninsert my number 1.
11906
11:34:16,052 --> 11:34:19,592
And, who knows, it ends up\nup there at location 0x123
11907
11:34:21,302 --> 11:34:23,552
Maybe there's something already here.
11908
11:34:23,552 --> 11:34:25,592
And heck, maybe there's\nsomething already here
11909
11:34:25,591 --> 11:34:28,576
but there's plenty of other options\nfor where this thing can go.
11910
11:34:28,576 --> 11:34:30,451
And suppose that, for\nthe sake of discussion
11911
11:34:30,451 --> 11:34:32,284
the first available\nspot for the next number
11912
11:34:32,285 --> 11:34:38,094
happens to be over here at location\n
11913
11:34:38,093 --> 11:34:40,051
So that's where I'm going\nto plop the number 2.
11914
11:34:40,052 --> 11:34:41,552
And where might the number 3 end up?
11915
11:34:41,552 --> 11:34:44,342
Oh, I don't know, maybe\ndown over there at 0x789.
11916
11:34:44,341 --> 11:34:48,511
The point being, I don't know\nwhat is, or really care about
11917
11:34:48,512 --> 11:34:50,672
everything else that's\nin the computer's memory.
11918
11:34:50,671 --> 11:34:54,722
I just care that there are at\nleast 3 locations available where
11919
11:34:54,722 --> 11:34:57,781
I can put my 1, my 2, and my 3.
11920
11:34:57,781 --> 11:35:01,502
But the catch is, now that\nwe're not using an array
11921
11:35:01,502 --> 11:35:05,851
we can't just naively assume that\n
11922
11:35:06,991 --> 11:35:10,441
Add 2 to an index, and boom\nyou're at the next, next number.
11923
11:35:10,442 --> 11:35:14,852
Now you have to leave these little\n
11924
11:35:14,851 --> 11:35:17,161
to lead from one to the other.
11925
11:35:17,161 --> 11:35:19,351
And sometimes, it might be\nclose, a few bytes away.
11926
11:35:19,351 --> 11:35:23,292
Maybe, it's a whole gigabyte away\n
11927
11:35:25,021 --> 11:35:30,252
Like where do these pointers\ngo, as you proposed?
11928
11:35:30,752 --> 11:35:32,822
All I have access to here are bytes.
11929
11:35:32,822 --> 11:35:34,891
I've already stored the\n1, the 2, and the 3.
11930
11:35:37,961 --> 11:35:40,851
So let me, you put the pointers\nright next to these numbers.
11931
11:35:40,851 --> 11:35:44,891
So let me at least plan ahead, so that\n
11932
11:35:44,891 --> 11:35:47,951
recall from last week, for some\nmemory, I don't just ask it now
11933
11:35:47,951 --> 11:35:49,856
for space for just the number.
11934
11:35:49,857 --> 11:35:51,732
Let me start getting\ninto the habit of asking
11935
11:35:51,732 --> 11:35:56,832
malloc for enough space for the number\n
11936
11:35:56,832 --> 11:35:59,542
So it's a little more aggressive\nof me to ask for more memory.
11937
11:36:00,822 --> 11:36:02,622
And here is an example of a trade off.
11938
11:36:02,622 --> 11:36:06,402
Almost any time in CS, when you start\n
11939
11:36:06,402 --> 11:36:10,662
Or if you try to conserve space,\nyou might have to lose time.
11940
11:36:10,661 --> 11:36:12,161
It's being that trade off there.
11941
11:36:14,391 --> 11:36:15,942
Well let me abstract this away.
11942
11:36:15,942 --> 11:36:19,057
And either next to or below, I'm\n
11943
11:36:19,057 --> 11:36:20,182
for the sake of discussion.
11944
11:36:20,182 --> 11:36:22,152
So the arrows are a bit prettier.
11945
11:36:22,152 --> 11:36:25,062
I've asked malloc for\nnow twice as much space
11946
11:36:25,061 --> 11:36:27,072
it would seem, than I previously needed.
11947
11:36:27,072 --> 11:36:31,016
But I'm going to use this second chunk\n
11948
11:36:31,016 --> 11:36:33,641
And I'm going to use this chunk\nof memory to refer to the next
11949
11:36:33,641 --> 11:36:35,451
essentially, stitching\nthis thing together.
11950
11:36:35,451 --> 11:36:37,511
So what should go in this first box?
11951
11:36:37,512 --> 11:36:41,082
Well, I claim the number 0x456.
11952
11:36:41,082 --> 11:36:43,781
And it's written in hex because\nit represents a memory address.
11953
11:36:43,781 --> 11:36:47,801
But this is the equivalent of drawing\n
11954
11:36:47,802 --> 11:36:51,552
As a little check here, what\nshould go in this second box
11955
11:36:51,552 --> 11:36:55,422
if the goal is to stitch these\ntogether in order 1, 2, 3?
11956
11:36:55,421 --> 11:36:57,593
Feel free to just shout this out.
11957
11:36:59,052 --> 11:37:00,472
SPEAKER 1: OK, that worked well.
11958
11:37:01,396 --> 11:37:04,271
And you can't do that with the hands\n
11959
11:37:04,271 --> 11:37:08,511
So 0x789 should go here because that's\n
11960
11:37:08,512 --> 11:37:11,772
And then, we don't really have\nterribly many possibilities here.
11961
11:37:11,771 --> 11:37:14,441
This has to have a value, right.
11962
11:37:14,442 --> 11:37:19,312
Because at the end of the day, it's\n
11963
11:37:19,311 --> 11:37:22,651
So what value should go here,\nif this is the end of this list?
11964
11:37:23,652 --> 11:37:25,752
SPEAKER 1: So it could be 0x123.
11965
11:37:25,752 --> 11:37:29,531
The implication being that\nit would be a cyclical list.
11966
11:37:29,531 --> 11:37:32,051
Which is OK, but\npotentially problematic.
11967
11:37:32,052 --> 11:37:36,102
If any of you have accidentally\n
11968
11:37:36,101 --> 11:37:39,161
because you had an infinite loop,\n
11969
11:37:39,161 --> 11:37:43,811
to give yourself the accidental\nprobability of an infinite loop.
11970
11:37:43,811 --> 11:37:46,398
What might be simpler than\nthat and ward that off?
11971
11:37:48,612 --> 11:37:50,322
SPEAKER 1: So just the null character.
11972
11:37:50,322 --> 11:37:53,021
Not N-U-L, confusingly, which\nis at the end of strings.
11973
11:37:53,021 --> 11:37:56,031
But N-U-L-L, as we\nintroduced it last week.
11974
11:37:58,061 --> 11:38:00,881
So this is just a special value\nthat programmers decades ago
11975
11:38:00,881 --> 11:38:04,991
decided that if you store the address\n
11976
11:38:04,991 --> 11:38:07,902
There's never going to be\nanything useful at 0x0.
11977
11:38:07,902 --> 11:38:11,082
Therefore, it's a sentinel\nvalue, just a special value
11978
11:38:12,281 --> 11:38:14,351
There's nowhere further to go.
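A minimal sketch of how that NULL sentinel gets used in code, assuming the node layout the lecture defines a few minutes later (an int plus a pointer to the next node). The function name list_length is hypothetical and is here only to illustrate stopping at NULL.

```c
#include <stddef.h>

// Assumed node layout: one number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
}
node;

// Follow the next pointers from the first node until the NULL sentinel,
// counting how many nodes were visited along the way.
size_t list_length(const node *list)
{
    size_t count = 0;
    for (const node *ptr = list; ptr != NULL; ptr = ptr->next)
    {
        count++;
    }
    return count;
}
```

For a list stitched together as 1, 2, 3 with NULL in the last node, the loop visits three nodes and then stops. If the last node pointed back at the first instead, this loop would never end.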
11979
11:38:14,351 --> 11:38:17,951
It's OK to come back to your\n
11980
11:38:17,951 --> 11:38:19,871
But we'd better be\nsmart enough to, maybe
11981
11:38:19,872 --> 11:38:23,862
remember where did the list start\nso that you can detect cycles.
11982
11:38:23,862 --> 11:38:26,421
If you start looping around\nin this structure, otherwise.
11983
11:38:26,921 --> 11:38:29,121
But these addresses, who really\ncares at the end of the day
11984
11:38:30,402 --> 11:38:32,302
It really just now looks like this.
11985
11:38:32,302 --> 11:38:35,260
And indeed, this is how most anyone\n
11986
11:38:35,260 --> 11:38:36,552
if having a discussion at work.
11987
11:38:36,552 --> 11:38:38,344
Talking about what data\nstructure we should
11988
11:38:38,343 --> 11:38:40,271
use to solve some problem\nin the real world.
11989
11:38:40,271 --> 11:38:42,521
We don't care generally\nabout the addresses.
11990
11:38:42,521 --> 11:38:45,111
We care that in code we can access them.
11991
11:38:45,112 --> 11:38:48,072
But in terms of the concept\nalone this would be, perhaps
11992
11:38:48,072 --> 11:38:49,720
the right way to think about this.
11993
11:38:49,720 --> 11:38:51,678
All right, let me pause\nhere and see if there's
11994
11:38:51,678 --> 11:38:55,901
any questions on this idea of creating\n
11995
11:38:55,902 --> 11:39:00,022
not just the numbers like 1,\n2, 3, but twice as much data.
11996
11:39:00,021 --> 11:39:02,591
So that you have little\nbreadcrumbs in the form of pointers
11997
11:39:02,591 --> 11:39:05,991
that can lead you from one to the next.
11998
11:39:05,991 --> 11:39:08,155
Any questions on these linked lists?
11999
11:39:14,913 --> 11:39:19,506
AUDIENCE: So does this take\nmore memory than an array?
12000
11:39:19,506 --> 11:39:21,631
SPEAKER 1: This does take\nmore memory than an array
12001
11:39:21,631 --> 11:39:24,180
because I now need space\nfor these pointers.
12002
11:39:24,180 --> 11:39:28,151
And to be clear, I technically\ndidn't really draw this to scale.
12003
11:39:28,152 --> 11:39:31,082
Thus far, in the class, we've\ngenerally thought about integers
12004
11:39:31,082 --> 11:39:33,992
like, 1, 2 and 3, as\nbeing 4 bytes, or 32 bits.
12005
11:39:33,991 --> 11:39:37,021
I made the claim last week that\non modern computers, pointers
12006
11:39:37,021 --> 11:39:40,051
tend to be 8 bytes or 64 bits.
12007
11:39:40,052 --> 11:39:42,762
So, technically, this box should\nactually be a little bigger.
12008
11:39:42,762 --> 11:39:44,461
It was just going to look a\nlittle stupid in the picture.
12009
11:39:45,811 --> 11:39:48,811
But, indeed, you're using\nmore space as a result.
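A rough sketch of that space cost, assuming the int-plus-pointer node described in this lecture. The helper names array_bytes and list_bytes are hypothetical, and the actual sizes depend on the platform, so nothing here hard-codes 4 or 8.

```c
#include <stddef.h>

// Assumed node layout: one number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
}
node;

// Bytes needed to store n ints contiguously in an array.
size_t array_bytes(size_t n)
{
    return n * sizeof(int);
}

// Bytes needed to store the same n ints as linked-list nodes,
// not counting any bookkeeping malloc itself may add per allocation.
size_t list_bytes(size_t n)
{
    return n * sizeof(node);
}
```

On a typical 64-bit system with 4-byte ints and 8-byte pointers, a node commonly occupies 16 bytes rather than 12 because of alignment padding, so the list can cost even more than the drawing suggests.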
12010
11:39:50,269 --> 11:39:51,601
SPEAKER 1: Oh, how does-- sorry.
12011
11:39:51,601 --> 11:39:55,451
How does the computer identify\nuseful data from used data?
12012
11:39:55,451 --> 11:39:58,261
So, for instance, garbage\nvalues or non-garbage values.
12013
11:39:58,262 --> 11:40:00,902
For now, think of that\nas the job of malloc.
12014
11:40:00,902 --> 11:40:04,292
So when you ask malloc for memory,\nas we started to last week
12015
11:40:04,292 --> 11:40:07,472
malloc keeps track of the\naddresses of the memory
12016
11:40:07,472 --> 11:40:10,442
it has handed to you as valid values.
12017
11:40:10,442 --> 11:40:12,932
The other type of memory you\nuse, not just from the heap.
12018
11:40:12,932 --> 11:40:15,872
Because recall we briefly\ndiscussed that malloc uses space
12019
11:40:15,872 --> 11:40:18,872
from the heap, which was drawn at the\n
12020
11:40:18,872 --> 11:40:22,702
There's also stack memory, which is\n
12021
11:40:22,701 --> 11:40:25,201
And where all of the memory\nused by individual functions goes.
12022
11:40:25,201 --> 11:40:27,534
And that was drawn in the\npicture as working its way up.
12023
11:40:27,535 --> 11:40:30,302
That's just an artist's\nrendition of direction.
12024
11:40:30,302 --> 11:40:33,662
The compiler, essentially,\nwill also help
12025
11:40:33,661 --> 11:40:37,349
keep track of which values are\nvalid or not inside of the stack.
12026
11:40:37,349 --> 11:40:39,391
Or really the underlying\ncode that you've written
12027
11:40:39,391 --> 11:40:40,724
will keep track of that for you.
12028
11:40:40,724 --> 11:40:43,691
So it's managed for you at that point.
12029
11:40:44,792 --> 11:40:46,522
Sorry it took me a bit to catch on.
12030
11:40:46,521 --> 11:40:48,691
So let's now translate\nthis to actual code.
12031
11:40:48,692 --> 11:40:52,262
How could we implement this idea\n
12032
11:40:52,262 --> 11:40:53,641
And that's a term of art in CS.
12033
11:40:53,641 --> 11:40:57,692
Whenever you have some data structure\n
12034
11:40:57,692 --> 11:41:00,429
N-O-D-E, is the generic term for that.
12035
11:41:00,428 --> 11:41:02,261
So each of these might\nbe said to be a node.
12036
11:41:03,311 --> 11:41:06,103
Well a couple of weeks ago, we saw\n
12037
11:41:06,103 --> 11:41:07,741
like a student or a candidate.
12038
11:41:07,741 --> 11:41:12,421
And a student, or rather a person,\n
12039
11:41:12,421 --> 11:41:14,161
And we used a few pieces of syntax here.
12040
11:41:14,161 --> 11:41:17,371
One, we use the struct keyword,\nwhich gives us a data structure.
12041
11:41:17,372 --> 11:41:21,902
We use typedef, which defines the\nname person to be our new data
12042
11:41:21,902 --> 11:41:24,332
type representing that whole structure.
12043
11:41:24,332 --> 11:41:26,432
So we probably have the\nright ingredients here
12044
11:41:26,432 --> 11:41:28,982
to build up this thing called a node.
12045
11:41:28,982 --> 11:41:32,102
And just to be clear, what should\n
12046
11:41:32,917 --> 11:41:35,042
It's not going to be a name\nor a number, obviously.
12047
11:41:35,042 --> 11:41:39,732
But what should a node have in\nterms of those fields, perhaps?
12048
11:41:41,107 --> 11:41:44,082
SPEAKER 1: So a number like a\nnumber and a pointer in some form.
12049
11:41:44,082 --> 11:41:46,332
So let's translate this to actual code.
12050
11:41:46,332 --> 11:41:51,092
So let's rename person to node\nto capture this notion here.
12051
11:41:52,347 --> 11:41:54,222
If it's just going to\nbe an int, that's fine.
12052
11:41:54,222 --> 11:41:56,461
We can just say int number,\nor int n, or whatever
12053
11:41:56,461 --> 11:41:58,862
you want to call that particular field.
12054
11:41:58,862 --> 11:42:00,554
The next one is a little non-obvious.
12055
11:42:00,553 --> 11:42:02,761
And this is where things\nget a little weird at first
12056
11:42:02,762 --> 11:42:05,312
but, in retrospect, it\nshould all fit together.
12057
11:42:05,311 --> 11:42:11,112
Let me propose that, ideally, we\n
12058
11:42:11,112 --> 11:42:13,412
And I could call the word\nnext anything I want.
12059
11:42:13,411 --> 11:42:17,591
Next just means what comes after\n
12060
11:42:17,591 --> 11:42:19,981
So a lot of CS people would\njust use next to represent
12061
11:42:22,741 --> 11:42:25,921
C and C compilers are\npretty naive, recall.
12062
11:42:25,921 --> 11:42:29,141
They only look at code top\nto bottom, left to right.
12063
11:42:29,141 --> 11:42:31,322
And any time they encounter\na word they have never
12064
11:42:31,322 --> 11:42:32,995
seen before, bad things happen.
12065
11:42:32,995 --> 11:42:34,412
Like, you can't compile your code.
12066
11:42:34,411 --> 11:42:36,401
You get some cryptic\nerror message or the like.
12067
11:42:36,402 --> 11:42:39,391
And that seems to be\nabout to happen here.
12068
11:42:39,391 --> 11:42:42,451
Because if the compiler is reading\nthis code from top to bottom
12069
11:42:42,451 --> 11:42:44,822
it's going to say, oh,\ninside of this struct
12070
11:42:44,822 --> 11:42:46,622
should be a variable called next.
12071
11:42:49,682 --> 11:42:52,951
Because it literally does\nnot find out until 2 lines
12072
11:42:52,951 --> 11:42:55,201
later, after that semicolon.
12073
11:42:55,201 --> 11:42:57,811
So the way to avoid this, which\nwe haven't quite seen before
12074
11:42:57,811 --> 11:43:02,701
is that you can temporarily name this\n
12075
11:43:02,701 --> 11:43:08,041
And then, down here inside of the\n
12076
11:43:08,042 --> 11:43:09,692
And then, you leave the rest alone.
12077
11:43:09,692 --> 11:43:14,102
This is a workaround. This is\npossible because now you're
12078
11:43:14,101 --> 11:43:17,222
teaching the compiler, from\nthe first line, that here comes
12079
11:43:17,222 --> 11:43:19,442
a data structure called struct node.
12080
11:43:19,442 --> 11:43:22,902
Down here, you're shortening the name\n
12081
11:43:23,402 --> 11:43:26,485
It's just a little more convenient\n
12082
11:43:26,485 --> 11:43:30,241
But you do have to write struct\n
12083
11:43:30,241 --> 11:43:33,211
But that's OK because it's\nalready come into existence
12084
11:43:33,211 --> 11:43:35,373
now, as of that first line of code.
12085
11:43:35,374 --> 11:43:37,082
So that's the only\nfundamental difference
12086
11:43:37,082 --> 11:43:40,382
between what we did last week\nwith a person or a candidate.
12087
11:43:40,381 --> 11:43:45,371
We just now have to use this\nstruct workaround, syntactically.
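Written out, the definition described here would look roughly like the sketch below. The field names number and next follow the lecture. The brace placement is only a style assumption.

```c
// Naming the struct on the first line ("struct node") lets the compiler
// accept a pointer to it inside its own definition.
typedef struct node
{
    int number;          // the value stored in this node
    struct node *next;   // address of the following node, or NULL at the end
}
node;                    // shorter alias, usable after this line
```

Inside the braces the type has to be spelled struct node, because the alias node does not exist until the final line. Everywhere after the definition, plain node works.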
12088
11:43:46,652 --> 11:43:50,491
AUDIENCE: So [INAUDIBLE] have like\n
12089
11:43:51,451 --> 11:43:56,551
SPEAKER 1: Why is the next variable\n
12090
11:43:58,631 --> 11:44:01,351
So think about the picture\nwe are trying to draw.
12091
11:44:01,351 --> 11:44:05,222
Technically, yes, each of these\narrows I deliberately drew
12092
11:44:07,982 --> 11:44:10,802
They need to point at the\nwhole data structure in memory.
12093
11:44:10,802 --> 11:44:13,082
Because the computer,\nultimately, and the compiler
12094
11:44:13,082 --> 11:44:16,952
in turn, needs to know that this\n
12095
11:44:18,521 --> 11:44:21,851
Inside of a node is a number\nand also another pointer.
12096
11:44:21,851 --> 11:44:24,252
So when you draw these\narrows, it would be
12097
11:44:24,252 --> 11:44:26,862
incorrect to point at just the number.
12098
11:44:26,862 --> 11:44:29,239
Because that throws\naway information that
12099
11:44:29,239 --> 11:44:31,572
would leave the compiler\nwondering, OK, I'm at a number.
12100
11:44:31,572 --> 11:44:32,682
Where the heck is the pointer?
12101
11:44:32,682 --> 11:44:34,932
You have to tell it that\nit's pointing at a whole node
12102
11:44:34,932 --> 11:44:38,338
so it knows a few bytes away\nis that corresponding pointer.
12103
11:44:40,665 --> 11:44:42,112
AUDIENCE: How do you [INAUDIBLE].
12104
11:44:42,112 --> 11:44:43,444
SPEAKER 1: Really good question.
12105
11:44:43,444 --> 11:44:46,731
It would seem that just as\ncopying the array earlier
12106
11:44:46,732 --> 11:44:49,942
required twice as much memory,\n
12107
11:44:49,942 --> 11:44:52,612
So, technically, twice as much\nplus 1 for the new number.
12108
11:44:52,612 --> 11:44:56,002
Here, too, it looks like we're\nusing twice as much memory, also.
12109
11:44:56,002 --> 11:44:58,881
And to my comment earlier, it's\n
12110
11:44:58,881 --> 11:45:02,752
because these pointers are 8 bytes, and\n
12111
11:45:04,762 --> 11:45:08,391
In the context of the array, you\n
12112
11:45:08,391 --> 11:45:10,231
So, yes, you needed\ntwice as much memory.
12113
11:45:10,232 --> 11:45:13,082
But then you were quickly\nfreeing the original array.
12114
11:45:13,082 --> 11:45:16,372
So you weren't consuming long-term,\n
12115
11:45:16,372 --> 11:45:19,772
The difference here, too, is\nthat, as we'll see in a moment
12116
11:45:19,771 --> 11:45:23,151
it turns out it's going to be\n
12117
11:45:23,152 --> 11:45:25,101
to insert new numbers in here.
12118
11:45:25,101 --> 11:45:28,101
Because I'm not going to have\nto do a huge amount of copying.
12119
11:45:28,101 --> 11:45:31,281
And even though I might still have\n
12120
11:45:31,281 --> 11:45:33,561
is going to take some\namount of time, I'm
12121
11:45:33,561 --> 11:45:36,951
not going to have to be asking for\n
12122
11:45:36,951 --> 11:45:40,671
And certain operations in the computer,\n
12123
11:45:40,671 --> 11:45:42,481
back memory, tends to be slower.
12124
11:45:42,482 --> 11:45:44,340
So we get to avoid\nthat situation as well.
12125
11:45:44,339 --> 11:45:46,131
There's going to be\nsome downsides, though.
12126
11:45:47,182 --> 11:45:51,241
But we'll see in a bit just what some\n
12127
11:45:51,741 --> 11:45:56,222
So from here, if we go back to the\n
12128
11:45:56,222 --> 11:45:59,302
let's start to now build up a\nlinked list with some actual code.
12129
11:45:59,302 --> 11:46:03,682
How do you go about, in C,\nrepresenting a linked list in code?
12130
11:46:03,682 --> 11:46:06,262
Well, at the moment, it would\nactually be as simple as this.
12131
11:46:06,262 --> 11:46:09,412
You declare a variable,\ncalled list, for instance.
12132
11:46:09,411 --> 11:46:12,451
That itself stores\nthe address of a node.
12133
11:46:14,701 --> 11:46:17,362
So if you want to store\na linked list in memory
12134
11:46:17,362 --> 11:46:19,879
you just create a variable\ncalled list, or whatever else.
12135
11:46:19,879 --> 11:46:21,711
And you just say that\nthis variable is going
12136
11:46:21,711 --> 11:46:25,911
to be pointing at the first node in a\n
12137
11:46:25,911 --> 11:46:29,751
Because malloc is ultimately going\n
12138
11:46:29,752 --> 11:46:33,752
get at any one particular\nnode in memory.
12139
11:46:34,252 --> 11:46:36,171
So let's actually do\nthis in pictorial form.
12140
11:46:36,171 --> 11:46:39,171
When you write a line of\ncode, like I just did here--
12141
11:46:39,171 --> 11:46:43,161
and I do not initialize it to\n
12142
11:46:44,211 --> 11:46:48,201
It does exist in memory as a box,\n
12143
11:46:48,201 --> 11:46:50,911
But I've deliberately\ndrawn Oscar inside of it.
12144
11:46:54,112 --> 11:46:55,444
SPEAKER 1: It's a garbage value.
12145
11:46:55,444 --> 11:46:59,881
I have been allocated the\nvariable in memory, called list.
12146
11:46:59,881 --> 11:47:03,951
Which is going to give me 64 bits\n
12147
11:47:04,951 --> 11:47:07,701
But if I myself have not\nused the assignment operator
12148
11:47:07,701 --> 11:47:11,311
it's not going to get magically\n
12149
11:47:11,811 --> 11:47:13,951
It's not going to even give me a node.
12150
11:47:13,951 --> 11:47:18,631
This is literally just going to be an\n
12151
11:47:18,631 --> 11:47:20,241
So what would be a solution here?
12152
11:47:20,241 --> 11:47:23,241
Suppose that I'm beginning\nto create my linked list
12153
11:47:23,241 --> 11:47:24,771
but I don't have any nodes yet.
12154
11:47:24,771 --> 11:47:28,783
What would be a sensible thing to\n
12155
11:47:31,319 --> 11:47:32,612
SPEAKER 1: So just null, right.
12156
11:47:32,612 --> 11:47:34,342
When in doubt with\npointers, generally it's
12157
11:47:34,341 --> 11:47:36,091
a good thing to\ninitialize things to null
12158
11:47:36,091 --> 11:47:37,641
so at least it's not a garbage value.
12159
11:47:39,900 --> 11:47:41,692
But it's a special\nvalue you can then check
12160
11:47:41,692 --> 11:47:43,622
for with a conditional, or the like.
12161
11:47:43,622 --> 11:47:47,602
So this might be a better\nway to create a linked list
12162
11:47:47,601 --> 11:47:51,601
even before you've inserted any\nnumbers into the thing itself.
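As a small sketch of that idea: an empty list is just a head pointer holding NULL, which a conditional can then check for. The helper name list_is_empty is hypothetical, and the node type is assumed to be the one defined earlier in the lecture.

```c
#include <stdbool.h>
#include <stddef.h>

// Assumed node layout: one number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
}
node;

// A list with no nodes yet is represented by a NULL head pointer.
bool list_is_empty(const node *list)
{
    return list == NULL;
}
```

Declaring the variable as node *list = NULL; rather than leaving it uninitialized means list_is_empty(list) reliably reports true until the first node is attached.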
12163
11:47:52,101 --> 11:47:55,316
So after that, how can we go about\n
12164
11:47:55,317 --> 11:47:56,692
So now the story looks like this.
12165
11:47:56,692 --> 11:47:59,632
Oscar is gone because inside\nof this box is all zero bits.
12166
11:47:59,631 --> 11:48:03,531
Just because it's nice and clean, and\n
12167
11:48:03,531 --> 11:48:08,072
Well, if I want to add the number 1\n
12168
11:48:08,072 --> 11:48:10,072
Well, perhaps I could\nstart with code like this.
12169
11:48:10,072 --> 11:48:11,781
Borrowing inspiration from last week.
12170
11:48:11,781 --> 11:48:16,402
Let's ask malloc for enough\nspace for the size of a node.
12171
11:48:16,402 --> 11:48:20,542
And this gets to your question earlier,\n
12172
11:48:20,542 --> 11:48:23,842
I don't just need space for an int and\n
12173
11:48:24,921 --> 11:48:27,631
And I gave that thing a name, node.
12174
11:48:27,631 --> 11:48:30,411
So size of node figures out\nand does the arithmetic for me.
12175
11:48:30,411 --> 11:48:32,871
And gives me back the\nright number of bytes.
12176
11:48:32,872 --> 11:48:36,412
This, then, stores the address\nof that chunk of memory
12177
11:48:36,411 --> 11:48:38,361
in what I'll temporarily call n.
12178
11:48:38,362 --> 11:48:40,641
Just to represent a generic new node.
12179
11:48:42,351 --> 11:48:45,561
Because just like last week when I\n
12180
11:48:45,561 --> 11:48:47,841
and I stored it in an int* pointer.
12181
11:48:47,841 --> 11:48:50,241
This week, if I'm asking\nfor memory for a node
12182
11:48:50,241 --> 11:48:52,822
I'm storing it in a node* pointer.
12183
11:48:52,822 --> 11:48:56,002
So technically, nothing new\nthere except for this new term
12184
11:48:56,002 --> 11:48:58,502
of art in data structure called node.
12185
11:48:59,002 --> 11:49:00,351
So what does that do for me?
12186
11:49:00,351 --> 11:49:03,141
It essentially draws a\npicture like this in memory.
12187
11:49:03,141 --> 11:49:07,171
I still have my list variable from\n
12188
11:49:07,671 --> 11:49:09,129
And that's why I've drawn it blank.
12189
11:49:09,129 --> 11:49:11,541
I also now have a\ntemporary variable called
12190
11:49:11,542 --> 11:49:15,052
n, which I initialize to\nthe return value of malloc.
12191
11:49:15,052 --> 11:49:17,132
Which gave me one of\nthese nodes in memory.
12192
11:49:17,131 --> 11:49:19,612
But I've drawn it having\ngarbage values, too
12193
11:49:19,612 --> 11:49:21,332
because I don't know what int is there.
12194
11:49:21,332 --> 11:49:22,790
I don't know what pointer is there.
12195
11:49:22,790 --> 11:49:27,082
It's garbage values because malloc does\n
12196
11:49:27,082 --> 11:49:28,732
There is another function for that.
12197
11:49:28,732 --> 11:49:31,582
But malloc alone just says,\nsure, use this chunk of memory.
12198
11:49:31,582 --> 11:49:33,391
Deal with whatever is there.
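The lecture does not name that other function at this point. The sketch below assumes it is the standard library's calloc, which hands back memory whose bytes are all zero. The helper name new_zeroed_node is hypothetical.

```c
#include <stdlib.h>

// Assumed node layout: one number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
}
node;

// Allocate one node with every byte set to zero, so number starts at 0
// instead of a garbage value. Returns NULL if the allocation fails.
node *new_zeroed_node(void)
{
    node *n = calloc(1, sizeof(node));
    if (n != NULL)
    {
        // All-zero bytes are a null pointer on mainstream platforms, but C
        // itself does not promise that, so set the field explicitly.
        n->next = NULL;
    }
    return n;
}
```

The caller is still responsible for checking the result against NULL and for calling free on the node eventually.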
12199
11:49:33,391 --> 11:49:36,381
So how can I go about\ninitializing this to known values?
12200
11:49:36,381 --> 11:49:40,921
Well, suppose I want to insert the\n
12201
11:49:40,921 --> 11:49:44,693
A list of size 1, I could\ndo something like this.
12202
11:49:44,694 --> 11:49:47,402
And this is where you have to\n
12203
11:49:47,402 --> 11:49:51,542
My conditional here is asking the\n
12204
11:49:51,542 --> 11:49:54,692
So that is, if malloc\ngave me valid memory
12205
11:49:54,692 --> 11:49:58,172
and I don't have to quit altogether\n
12206
11:49:58,171 --> 11:50:02,072
If n does not equal null, but\nis equal to a valid address
12207
11:50:02,072 --> 11:50:03,552
I'm going to go ahead and do this.
12208
11:50:03,552 --> 11:50:06,302
And this is cryptic looking syntax now.
12209
11:50:06,302 --> 11:50:09,632
But does someone want to take a stab\n
12210
11:50:13,862 --> 11:50:18,002
How might you explain what that\ninner line of code is doing? *n.
12211
11:50:27,283 --> 11:50:29,641
The place that n is pointing\nto, set it equal to 1.
12212
11:50:29,641 --> 11:50:33,542
Or using the vernacular of going\nthere, go to the address in n
12213
11:50:33,542 --> 11:50:35,961
and set its number field to 1.
12214
11:50:35,961 --> 11:50:37,961
However you want to think\nabout it, that's fine.
12215
11:50:37,961 --> 11:50:40,411
But the * again is the\ndereference operator here.
12216
11:50:40,411 --> 11:50:42,211
And we're doing the\nparentheses, which we
12217
11:50:42,211 --> 11:50:45,722
haven't needed to do before because we\n
12218
11:50:45,722 --> 11:50:47,491
structures together until today.
12219
11:50:47,491 --> 11:50:49,862
This just means go there first.
12220
11:50:49,862 --> 11:50:52,201
And then once you're\nthere, go access number.
12221
11:50:52,201 --> 11:50:54,311
You don't want to do one\nthing before the other.
12222
11:50:54,311 --> 11:50:56,371
So this is just enforcing\norder of operations.
12223
11:50:56,372 --> 11:50:58,782
The parentheses just like\nin grade school math.
12224
11:50:59,281 --> 11:51:00,692
So this line of code is cryptic.
12225
11:51:01,463 --> 11:51:03,421
It's not something most\npeople easily remember.
12226
11:51:03,421 --> 11:51:07,231
Thankfully, there's that syntactic\n
12227
11:51:08,338 --> 11:51:10,171
And this, even though\nit's new to you today
12228
11:51:10,171 --> 11:51:12,301
should eventually feel\na little more familiar.
12229
11:51:12,302 --> 11:51:15,692
Because this now is shorthand\nnotation for saying, start at n.
12230
11:51:15,692 --> 11:51:17,891
Go there as by following the arrow.
12231
11:51:17,891 --> 11:51:20,012
And when you get there,\nchange the number field.
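The two spellings can be put side by side as in the sketch below. The function names are hypothetical, and both functions assume n is a valid, non-NULL pointer.

```c
// Assumed node layout: one number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
}
node;

// Long form: dereference n first, then access the number field.
// The parentheses matter because . binds more tightly than *,
// so *n.number would be read as *(n.number) and fail to compile.
void set_number_explicit(node *n, int value)
{
    (*n).number = value;
}

// Shorthand: n->number means exactly the same as (*n).number.
void set_number_arrow(node *n, int value)
{
    n->number = value;
}
```

Both functions leave the node in the same state. The arrow form is simply the one most C code uses.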
12232
11:51:22,201 --> 11:51:24,721
So most people would not\nwrite code like this.
12233
11:51:25,512 --> 11:51:26,912
It's a couple extra keystrokes.
12234
11:51:26,911 --> 11:51:30,781
This just looks more like the artist's\n
12235
11:51:30,781 --> 11:51:35,012
And how most CS people would think about\n
12236
11:51:37,775 --> 11:51:42,132
The picture now, after setting number to\n
12237
11:51:42,131 --> 11:51:43,921
So there's still one step missing.
12238
11:51:43,921 --> 11:51:46,201
And that's, of course, to\ninitialize, it would seem
12239
11:51:46,201 --> 11:51:50,561
the pointer in this new node\nto something known like null.
12240
11:51:50,561 --> 11:51:52,216
So I bet we could do this like this.
12241
11:51:52,216 --> 11:51:54,091
With a different line\nof code, I'm just going
12242
11:51:54,091 --> 11:52:00,361
to say if n does not equal null,\n
12243
11:52:00,362 --> 11:52:04,022
Or more pedantically, go\nto n, follow the arrow
12244
11:52:04,021 --> 11:52:07,921
and then update the next field\n
12245
11:52:07,921 --> 11:52:10,171
And again, this is just\ndoing some nice bookkeeping.
12246
11:52:10,171 --> 11:52:13,351
Technically speaking,\nwe might not need to set
12247
11:52:13,351 --> 11:52:16,391
this to null if we're going to keep\n
12248
11:52:16,391 --> 11:52:19,591
But I'm doing it step-by-step so\n
12249
11:52:19,591 --> 11:52:23,281
And there's no bugs in\nmy code at this point.
12250
11:52:24,752 --> 11:52:27,211
There's one last thing I'm\ngoing to have to do here.
12251
11:52:27,211 --> 11:52:32,432
If the goal, ultimately, was to insert\n
12252
11:52:32,432 --> 11:52:36,342
what's the last step I\nshould, perhaps, do here?
12253
11:52:38,031 --> 11:52:40,741
AUDIENCE: Set the pointer value to null.
12254
11:52:41,491 --> 11:52:45,451
I now need to update the actual\n
12255
11:52:45,451 --> 11:52:48,511
list, to point at this brand new node.
12256
11:52:48,512 --> 11:52:52,798
That is now perfectly initialized as\n
12257
11:52:52,798 --> 11:52:54,881
Yeah, technically, this\nis already pointing there.
12258
11:52:54,881 --> 11:52:57,572
But I described this deliberately\nearlier as being temporary.
12259
11:52:57,572 --> 11:53:02,101
I just needed this to get it back from\n
12260
11:53:02,101 --> 11:53:04,711
This is the long term\nvariable I care about.
12261
11:53:04,711 --> 11:53:06,961
So I'm going to want to do\nsomething simple like this.
12262
11:53:09,002 --> 11:53:11,345
And this seems a little\nweird that list equals n.
12263
11:53:11,345 --> 11:53:13,262
But again, think about\nwhat's inside this box.
12264
11:53:13,262 --> 11:53:15,470
At the moment this is null\nbecause there is no linked
12265
11:53:15,470 --> 11:53:17,012
list at the beginning of our story.
12266
11:53:17,012 --> 11:53:21,391
N is the address of the beginning, and\n
12267
11:53:21,391 --> 11:53:24,781
So it stands to reason that\nif you set list equal to n
12268
11:53:24,781 --> 11:53:27,661
that has the effect of\ncopying this address up here.
12269
11:53:27,661 --> 11:53:30,764
Or really just copying the\narrow into that same location
12270
11:53:30,764 --> 11:53:32,432
so that now the picture looks like this.
12271
11:53:32,432 --> 11:53:35,822
And heck, if this was a temporary\n
12272
11:53:35,822 --> 11:53:37,351
And now, this is the picture.
12273
11:53:37,351 --> 11:53:39,512
So an annoying number\nof steps, certainly
12274
11:53:39,512 --> 11:53:42,002
to walk through verbally like this.
12275
11:53:42,002 --> 11:53:44,162
But it's just malloc to\ngive yourself a node
12276
11:53:44,161 --> 11:53:49,411
initialize the 2 fields inside of\n
12277
11:53:50,252 --> 11:53:52,391
I didn't have to copy anything.
12278
11:53:52,391 --> 11:53:55,614
I just had to insert\nsomething in this case.
12279
11:53:55,614 --> 11:53:58,322
Let me pause here to see if there's\n
12280
11:53:58,322 --> 11:54:02,271
And we'll see before long it all\n
12281
11:54:02,271 --> 11:54:06,447
AUDIENCE: So if the\nstatements [INAUDIBLE].
12282
11:54:07,072 --> 11:54:10,491
I drew them separately just\nfor the sake of the voiceover
12283
11:54:10,491 --> 11:54:12,502
of doing each thing very methodically.
12284
11:54:12,502 --> 11:54:14,572
In real code, as we'll\ntransition to now
12285
11:54:14,572 --> 11:54:16,701
I could have and should\nhave just done it
12286
11:54:16,701 --> 11:54:20,481
all inside of one conditional after\n
12287
11:54:20,482 --> 11:54:22,792
I could set number to a value like 1.
12288
11:54:22,792 --> 11:54:25,897
And I could set the pointer\nitself to something like null.
12289
11:54:26,512 --> 11:54:30,082
Well let's translate, then,\nthis into some similar code
12290
11:54:30,082 --> 11:54:34,822
that allows us to build up a linked\n
12291
11:54:35,631 --> 11:54:37,381
But now, using this new primitive.
12292
11:54:37,381 --> 11:54:39,621
So I'm going to go\nback into VS Code here.
12293
11:54:39,622 --> 11:54:42,952
I'm going to go ahead now and delete\n
12294
11:54:44,752 --> 11:54:49,951
And now, inside of my main function,\n
12295
11:54:49,951 --> 11:54:53,661
I'm going to first give\nmyself a list of size 0.
12296
11:54:53,661 --> 11:54:56,091
And I'm going to call that node* list.
12297
11:54:56,091 --> 11:54:59,091
And I'm going to initialize that\n
12298
11:54:59,091 --> 11:55:02,241
But I'm also now going to have to\n
12299
11:55:03,451 --> 11:55:06,981
So recall that I might do something\nlike typedef, struct node.
12300
11:55:06,982 --> 11:55:09,802
Inside of this struct node, I'm\ngoing to have a number, which
12301
11:55:09,802 --> 11:55:11,492
I'll call number of type int.
12302
11:55:11,491 --> 11:55:13,641
And I'm going to have\na structure called node
12303
11:55:13,641 --> 11:55:16,951
with a * that says the next\npointer is called next.
12304
11:55:16,951 --> 11:55:20,631
And I'm going to call this whole\nthing, more succinctly, node
12305
11:55:22,311 --> 11:55:25,401
Now as an aside, for those of you\n
12306
11:55:27,082 --> 11:55:29,932
Technically, I could\ndo something like this.
12307
11:55:29,932 --> 11:55:33,442
Not use typedef and not\nuse the word node alone.
12308
11:55:33,442 --> 11:55:37,162
This syntax here would actually\ncreate for me a new data
12309
11:55:37,161 --> 11:55:40,311
type called, verbosely, struct node.
12310
11:55:40,311 --> 11:55:42,921
And I could use this throughout\nmy code saying struct node.
12311
11:55:43,942 --> 11:55:45,322
That just gets a little tedious.
12312
11:55:45,322 --> 11:55:48,197
And it would be nicer just to refer\n
12313
11:55:49,232 --> 11:55:51,712
So what typedef has\nbeen doing for us is it
12314
11:55:51,711 --> 11:55:55,252
again, lets us invent our own\nword that's even more succinct.
12315
11:55:55,252 --> 11:55:58,521
And this just has the effect\nnow of calling this whole thing
12316
11:55:58,521 --> 11:56:02,241
node without the need, subsequently, to\n
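As a minimal sketch of the two spellings just described: the definition below uses typedef so the type can be written simply as node, while struct node still works. The helper same_type_demo is hypothetical and is only there to show both names in use.

```c
#include <stddef.h>

// Without typedef, the type would have to be spelled "struct node"
// everywhere. With typedef, the shorter name "node" works as well.
typedef struct node
{
    int number;         // the value stored in this node
    struct node *next;  // address of the next node, or NULL at the end
} node;

// Hypothetical helper: shows that both spellings name the same type.
int same_type_demo(void)
{
    struct node a = {1, NULL};  // verbose spelling
    node b = {2, &a};           // succinct spelling, pointing at a
    return b.next->number;      // follow the arrow from b to a
}
```

Note that inside the struct the field still has to say struct node, because the short name node does not exist until the typedef is finished.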
12317
11:56:04,161 --> 11:56:07,531
So now that this thing exists in\n
12318
11:56:10,252 --> 11:56:12,921
And to do this, I'm going to\ngive myself a temporary variable.
12319
11:56:12,921 --> 11:56:14,822
I'll call it n for consistency.
12320
11:56:14,822 --> 11:56:18,021
I'm going to use malloc to\ngive myself the size of a node
12321
11:56:19,561 --> 11:56:21,021
And then, I'm going to\ndo a little safety check.
12322
11:56:21,021 --> 11:56:23,951
If n equals equals null, I'm going\n
12323
11:56:23,951 --> 11:56:25,701
I'm just going to quit\nout of this program
12324
11:56:25,701 --> 11:56:28,441
because there's nothing useful\nto be done at this point.
12325
11:56:28,442 --> 11:56:31,052
But most likely my computer is\nnot going to run out of memory.
12326
11:56:31,052 --> 11:56:34,232
So I'm going to assume we can keep\n
12327
11:56:34,232 --> 11:56:38,872
If n does not equal null, and that\n
12328
11:56:40,851 --> 11:56:42,411
I'm going to build this up backwards.
12329
11:56:44,188 --> 11:56:45,771
That's OK, let's go ahead and do this.
12330
11:56:48,082 --> 11:56:52,972
And then n [arrow next] equals null.
12331
11:56:52,972 --> 11:56:59,902
And now, update list to point\nto new node, list equals n.
12332
11:56:59,902 --> 11:57:02,062
So at this point in the\nstory, we've essentially
12333
11:57:02,061 --> 11:57:06,811
constructed what was that first\npicture, which looks like this.
12334
11:57:06,811 --> 11:57:11,362
This is the corresponding code via\n
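Roughly, the steps so far can be sketched as follows. This is an approximation of the code described, not a copy of what is on screen: the lecture writes these lines inside main and returns 1 on failure, whereas the hypothetical helper list_of_one below returns NULL instead so it can stand alone.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: returns a list holding only `number`,
// or NULL if malloc fails.
node *list_of_one(int number)
{
    node *list = NULL;               // a list of size 0

    node *n = malloc(sizeof(node));  // temporary pointer to the new node
    if (n == NULL)
    {
        return NULL;                 // out of memory, nothing useful to do
    }
    n->number = number;              // go to n, set its number field
    n->next = NULL;                  // go to n, set its next field

    list = n;                        // update list to point to new node
    return list;
}
```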
12335
11:57:11,362 --> 11:57:14,342
Suppose now, we want to add\nthe number 2 to the list.
12336
11:57:21,391 --> 11:57:23,811
Well, I don't need to\nredeclare n because I can use
12337
11:57:23,811 --> 11:57:25,591
the same temporary variable as before.
12338
11:57:25,591 --> 11:57:30,791
So this time, I'm just going to say n\n
12339
11:57:30,792 --> 11:57:32,542
I'm, again, going to\nhave my safety check.
12340
11:57:32,542 --> 11:57:36,772
So if n equals equals null, then let's\n
12341
11:57:36,771 --> 11:57:41,301
But, I have to be a\nlittle more careful now.
12342
11:57:41,302 --> 11:57:43,641
Technically speaking,\nwhat do I still need
12343
11:57:43,641 --> 11:57:48,021
to do before I quit out of my\nprogram to be really proper?
12344
11:57:48,021 --> 11:57:51,361
Free the memory that did\nsucceed a little higher up.
12345
11:57:51,362 --> 11:57:56,762
So I think it suffices to free what\n
12346
11:57:57,262 --> 11:58:03,742
Now, if all was well, though, let's\n
12347
11:58:03,741 --> 11:58:09,322
And now, n [arrow next] equals null.
12348
11:58:09,322 --> 11:58:12,381
And now, let's go ahead\nand add it to the list.
12349
11:58:12,381 --> 11:58:20,391
If I go ahead and do\nlist arrow next equals n
12350
11:58:20,391 --> 11:58:24,141
I think what we've just done is\nbuild up the equivalent, now
12351
11:58:24,141 --> 11:58:27,141
of this in the computer's memory.
12352
11:58:27,141 --> 11:58:29,661
By going to the list\nfield's next field, which
12353
11:58:29,661 --> 11:58:33,561
is synonymous with the 1\nnode's bottom-most box.
12354
11:58:33,561 --> 11:58:37,021
And store the address of what was n,\n
12355
11:58:37,021 --> 11:58:39,871
And I'm just throwing away, in the\n
12356
11:58:42,362 --> 11:58:47,569
Let me go down here and say, add\n
12357
11:58:49,822 --> 11:58:52,762
And clearly, in a real program, we\n
12358
11:58:52,762 --> 11:58:56,542
And do this dynamically or a function\n
12359
11:58:56,542 --> 11:58:59,601
But just to go through the\nsyntax here, this is fine.
12360
11:58:59,601 --> 11:59:03,182
If n equals equals null, out\nof memory for some reason.
12361
11:59:03,182 --> 11:59:09,131
Let's return 1, but we\nshould free the list itself
12362
11:59:09,131 --> 11:59:12,932
and even the second node, list [arrow next].
12363
11:59:12,932 --> 11:59:16,211
But I've deliberately done this poorly.
12364
11:59:16,711 --> 11:59:18,722
This is a little more subtle now.
12365
11:59:18,722 --> 11:59:22,052
And let me get rid of the highlighting\n
12366
11:59:22,052 --> 11:59:26,372
If n happens to equal equal\nnull, and something really just
12367
11:59:26,372 --> 11:59:32,522
went wrong there, out of memory,\n
12368
11:59:32,521 --> 11:59:35,252
And again, it's not that I'm\nfreeing those variables per se.
12369
11:59:35,252 --> 11:59:39,101
I'm freeing the addresses\nin those variables.
12370
11:59:39,101 --> 11:59:41,371
But there's also a\nbug with my code here.
12371
11:59:45,061 --> 11:59:49,165
This line here, 43, what is\nthat freeing specifically?
12372
11:59:49,832 --> 11:59:52,382
AUDIENCE: You're freeing list 2 times.
12373
11:59:52,381 --> 11:59:54,121
SPEAKER 1: I'm freeing, not so.
12374
11:59:54,631 --> 11:59:56,221
I'm not freeing list 2 times.
12375
11:59:56,222 --> 11:59:59,012
Technically, I'm freeing\nlist once and list next once.
12376
11:59:59,012 --> 12:00:01,082
But let me just ask the\nmore explicit question.
12377
12:00:01,082 --> 12:00:03,902
What am I freeing with\nline 43 at the moment?
12378
12:00:08,911 --> 12:00:10,921
Because if 1 is at the\nbeginning of the list
12379
12:00:10,921 --> 12:00:14,011
list contains the address\nof that number 1 node.
12380
12:00:14,012 --> 12:00:15,762
And so this frees that node.
12381
12:00:15,762 --> 12:00:18,732
This line of code, you might\nthink now intuitively, OK
12382
12:00:18,732 --> 12:00:21,092
it's probably freeing the node number 2.
12383
12:00:22,891 --> 12:00:24,601
Valgrind might help you catch this.
12384
12:00:24,601 --> 12:00:27,002
But by eyeing it, it's\nnot necessarily obvious.
12385
12:00:27,002 --> 12:00:31,472
You should never touch memory\nthat you have already freed.
12386
12:00:31,472 --> 12:00:34,412
And so, the fact that I did it\nin this order, very bad.
12387
12:00:34,411 --> 12:00:37,111
Because I'm telling the\noperating system, I don't know.
12388
12:00:37,112 --> 12:00:39,631
I don't need the list address anymore.
12389
12:00:40,891 --> 12:00:43,141
And then, literally one line later,\n
12390
12:00:43,141 --> 12:00:45,211
Let me actually go to\nthat address for a moment
12391
12:00:45,211 --> 12:00:47,881
and look at the next\nfield of that first node.
12392
12:00:48,701 --> 12:00:51,191
You've already given up\ncontrol over the node.
12393
12:00:51,192 --> 12:00:54,211
So it's an easy fix in\nthis case, logically.
12394
12:00:54,211 --> 12:00:56,851
But we should be freeing\nthe second node first
12395
12:00:56,851 --> 12:01:00,542
and then the first one\nso that we're doing it
12396
12:01:00,542 --> 12:01:02,522
in, essentially, reverse order.
12397
12:01:02,521 --> 12:01:04,438
And again, Valgrind would\nhelp you catch that.
12398
12:01:04,438 --> 12:01:07,063
But that's the kind of thing one\nneeds to be careful about when
12399
12:01:08,082 --> 12:01:10,592
You cannot touch memory\nafter you freed it.
12400
12:01:12,451 --> 12:01:17,971
Let me go ahead and update\nthe number field of n to be 3.
12401
12:01:17,972 --> 12:01:20,982
The next field of n to be null.
12402
12:01:20,982 --> 12:01:22,772
And then, just like\nin the slide earlier
12403
12:01:22,771 --> 12:01:28,502
I think I can do list [arrow next]\n[arrow next] equals n.
12404
12:01:28,502 --> 12:01:32,372
And that has the effect now of\n
12405
12:01:32,372 --> 12:01:34,472
essentially, this data structure.
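Here is a sketch of that third step, including the corrected freeing order. The function add_third is hypothetical: it assumes list already holds the 1 and 2 nodes, and that n is whatever a malloc call made by the caller returned (possibly NULL).

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: appends n as the third node of a two-node list.
// Returns 0 on success, 1 if n is NULL (the malloc failed).
int add_third(node *list, node *n)
{
    if (n == NULL)
    {
        // The buggy order was free(list) and then free(list->next),
        // which reads list->next out of memory already freed.
        free(list->next);  // second node first, while list is still valid
        free(list);        // then the first node
        return 1;
    }
    n->number = 3;
    n->next = NULL;
    list->next->next = n;  // follow two arrows, then store n's address
    return 0;
}
```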
12406
12:01:36,302 --> 12:01:38,342
Like, in a better world, we'd\nhave a loop and some functions
12407
12:01:38,341 --> 12:01:39,901
that are automating this process.
12408
12:01:39,902 --> 12:01:44,162
But, for now, we're doing it just\n
12409
12:01:44,161 --> 12:01:48,901
So at this point, unfortunately,\n
12410
12:01:48,902 --> 12:01:53,671
It's no longer as easy as int\ni equals 0, i less than 3, i++.
12411
12:01:53,671 --> 12:02:00,902
Because you cannot just\ndo something like this.
12412
12:02:00,902 --> 12:02:06,002
Because pointer arithmetic\nno longer comes into play
12413
12:02:06,002 --> 12:02:10,232
when it's you, who are stitching\n
12414
12:02:10,232 --> 12:02:12,932
In all of our past examples\nwith arrays, you've
12415
12:02:12,932 --> 12:02:16,302
been trusting that all of the bytes in\n
12416
12:02:16,302 --> 12:02:19,015
So it's perfectly reasonable for\nthe compiler and the computer
12417
12:02:19,014 --> 12:02:21,932
to just figure out, oh, well if you\n
12418
12:02:21,932 --> 12:02:23,612
[1], it's one location over.
12419
12:02:23,612 --> 12:02:25,592
[2], it's one location over.
12420
12:02:25,591 --> 12:02:28,511
This is way less obvious now.
12421
12:02:28,512 --> 12:02:32,132
Because even though you might want to\n
12422
12:02:32,131 --> 12:02:36,752
list, or the second, or the third, you\n
12423
12:02:38,072 --> 12:02:41,521
Instead, you have to\nfollow all of those arrows.
12424
12:02:41,521 --> 12:02:44,822
So with linked lists, you can't use\n
12425
12:02:44,822 --> 12:02:47,792
because one node might be here,\nover here, over here, over here.
12426
12:02:47,792 --> 12:02:51,031
You can't just use some simple offset.
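To make the contrast with arrays concrete, here is a small sketch. With an array, element i is just an offset from the start; with a linked list, reaching the i-th node means following i arrows. The helper nth is hypothetical and is not part of the lecture's code.

```c
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: returns the node i arrows past the first one,
// or NULL if the list is shorter than that.
node *nth(node *list, int i)
{
    node *tmp = list;
    while (tmp != NULL && i > 0)
    {
        tmp = tmp->next;  // one arrow per step, no arithmetic shortcut
        i--;
    }
    return tmp;
}
```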
12427
12:02:51,031 --> 12:02:53,822
So I think our code is going\nto have to be a little fancier.
12428
12:02:53,822 --> 12:02:57,302
And this might look scary at\nfirst, but it's just an application
12429
12:02:57,302 --> 12:02:59,641
of some of the basic definitions here.
12430
12:02:59,641 --> 12:03:06,961
Let me do a for-loop that actually\n
12431
12:03:08,612 --> 12:03:13,262
I'm going to keep doing this, so\n
12432
12:03:13,262 --> 12:03:15,842
And on each iteration\nof this loop, I'm going
12433
12:03:15,841 --> 12:03:20,581
to update TMP to be\nwhatever TMP arrow next is.
12434
12:03:20,582 --> 12:03:23,192
And I'll remind you in a moment\nand explain in more detail.
12435
12:03:23,192 --> 12:03:27,211
But when I print something here\nwith printf, I can still use %i.
12436
12:03:27,211 --> 12:03:29,521
Because it's still a number\nat the end of the day.
12437
12:03:29,521 --> 12:03:34,121
But what I want to print out is the\n
12438
12:03:34,122 --> 12:03:36,514
So maybe the ugliest\nfor-loop we've ever seen.
12439
12:03:36,514 --> 12:03:38,972
Because it's mixing, not just\nthe idea of a for-loop, which
12440
12:03:38,972 --> 12:03:40,982
itself was a bit cryptic weeks ago.
12441
12:03:40,982 --> 12:03:43,507
But now, I'm using pointers\ninstead of integers.
12442
12:03:43,506 --> 12:03:45,631
But I'm not violating the\ndefinition of a for-loop.
12443
12:03:45,631 --> 12:03:48,421
Recall that a for-loop has 3\nmain things in parentheses.
12444
12:03:48,421 --> 12:03:50,281
What do you want to initialize first?
12445
12:03:50,281 --> 12:03:53,222
What condition do you want to\nkeep checking again and again?
12446
12:03:53,222 --> 12:03:56,921
And what update do you want to make\n
12447
12:03:56,921 --> 12:03:59,341
So with that basic\ndefinition in mind, this
12448
12:03:59,341 --> 12:04:01,831
is giving me a temporary\nvariable called TMP
12449
12:04:01,832 --> 12:04:04,002
that is initialized to\nthe beginning of the list.
12450
12:04:04,002 --> 12:04:07,591
So it's like pointing my\nfinger at the number 1 node.
12451
12:04:07,591 --> 12:04:11,011
Then, I'm asking the question,\ndoes TMP not equal null?
12452
12:04:11,012 --> 12:04:13,652
Well, hopefully, not because\nI'm pointing at a valid node
12453
12:04:15,192 --> 12:04:17,012
So, of course, it\ndoesn't equal null yet.
12454
12:04:17,012 --> 12:04:19,512
Null won't be until we get\nto the end of the list.
12455
12:04:21,012 --> 12:04:22,742
I started this TMP variable.
12456
12:04:22,741 --> 12:04:27,752
I follow the arrow and go to\nthe number field therein.
12457
12:04:28,832 --> 12:04:32,492
The for-loop says,\nchange TMP to be whatever
12458
12:04:32,491 --> 12:04:36,572
is at TMP, by following the arrow\nand grabbing the next field.
12459
12:04:36,572 --> 12:04:39,741
That, then, has the result of being\n
12460
12:04:39,741 --> 12:04:42,241
No, of course, it doesn't equal\nnull because the second node
12461
12:04:43,531 --> 12:04:45,402
Null is still at the very end.
12462
12:04:45,402 --> 12:04:47,192
So I print out the number 2.
12463
12:04:47,192 --> 12:04:51,152
Next step, I update TMP one more\ntime to be whatever is next.
12464
12:04:51,152 --> 12:04:53,711
That, then, does not yet equal null.
12465
12:04:53,711 --> 12:04:55,951
So I go ahead and print\nout the number 3 node.
12466
12:04:55,951 --> 12:05:01,601
Then one last time, I update TMP to\n
12467
12:05:01,601 --> 12:05:05,461
But after 1, 2, 3, that\nlast next field is null.
12468
12:05:05,461 --> 12:05:09,271
And so, I break out of\nthis for-loop altogether.
12469
12:05:09,271 --> 12:05:12,211
So if I do this in\npictorial form, all we're
12470
12:05:12,211 --> 12:05:15,781
doing, if I now use my finger\nto represent the TMP variable.
12471
12:05:15,781 --> 12:05:19,561
I initialize TMP to be whatever\nlist is, so it points here.
12472
12:05:19,561 --> 12:05:22,261
That's obviously not null\nso I print out whatever
12473
12:05:22,262 --> 12:05:26,582
is at TMP, follow the arrow\nto number, and I print that out.
12474
12:05:26,582 --> 12:05:28,772
Then I update TMP to point here.
12475
12:05:28,771 --> 12:05:30,558
Then I update TMP to point here.
12476
12:05:30,559 --> 12:05:31,891
Then I update TMP to point here.
12477
12:05:34,961 --> 12:05:39,152
So, again, admittedly much more cryptic\n
12478
12:05:40,091 --> 12:05:46,336
But it's just a different\nutilization of the for-loop syntax.
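The loop just walked through can be sketched like this. Wrapping it in a function called print_list, and having it return a count, are assumptions made so the sketch stands alone; in the lecture the loop sits directly in main.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: prints each number in the list and
// returns how many nodes it visited.
int print_list(node *list)
{
    int count = 0;
    // Initialize tmp to the first node; keep going while tmp is not NULL;
    // after each iteration, follow the arrow to the next node.
    for (node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        printf("%i\n", tmp->number);
        count++;
    }
    return count;
}
```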
12479
12:05:46,836 --> 12:05:50,622
AUDIENCE: How does it happen that\n
12480
12:05:50,622 --> 12:05:52,500
Because it seems to me that addresses-
12481
12:05:53,542 --> 12:05:56,542
How is it that I'm actually printing\n
12482
12:05:57,921 --> 12:05:59,601
The compiler is helping me here.
12483
12:05:59,601 --> 12:06:02,211
Because I taught it, in the\nvery beginning of my program
12484
12:06:05,211 --> 12:06:08,991
The compiler knows that a node has\n
12485
12:06:10,911 --> 12:06:16,891
Because I'm iterating using a node*\n
12486
12:06:16,891 --> 12:06:19,641
the compiler knows that any\ntime I'm pointing at something
12487
12:06:19,641 --> 12:06:21,421
I'm pointing at the whole node.
12488
12:06:21,421 --> 12:06:24,502
Doesn't matter where specifically in\n
12489
12:06:24,502 --> 12:06:26,692
It's, ultimately, pointing\nat the whole node itself.
12490
12:06:26,692 --> 12:06:30,802
And the fact that I, then, use\nTMP arrow number means, OK
12491
12:06:30,802 --> 12:06:31,972
adjust your finger slightly.
12492
12:06:31,972 --> 12:06:35,991
So you're literally pointing at the\n
12493
12:06:35,991 --> 12:06:40,402
So that's sufficient information for\n
12494
12:06:41,042 --> 12:06:44,211
Other questions then\non this approach here.
12495
12:06:46,762 --> 12:06:51,322
SPEAKER 1: How would I use a for-loop\n
12496
12:06:51,322 --> 12:06:56,122
You will do something like this,\nif I may, in problem set 5.
12497
12:06:56,122 --> 12:06:59,211
We will give you some of the\nscaffolding for doing this.
12498
12:06:59,211 --> 12:07:02,182
But in this coming week's materials\nwe will guide you to that.
12499
12:07:02,182 --> 12:07:04,775
But let me not spoil it just yet.
12500
12:07:06,192 --> 12:07:08,559
AUDIENCE: So I had a\nquestion about line 49.
12501
12:07:09,141 --> 12:07:11,159
AUDIENCE: Is line 49\npossible in line 43?
12502
12:07:12,201 --> 12:07:15,381
Is line 49 acceptable, even\nif we freed it earlier.
12503
12:07:15,381 --> 12:07:18,081
We didn't free it in line\n43, in this case, right.
12504
12:07:18,082 --> 12:07:22,281
You can only reach line 49,\nif n does not equal null.
12505
12:07:22,281 --> 12:07:24,472
And you do not return on line 45.
12506
12:07:25,341 --> 12:07:29,661
I was only doing those freeing, if I\n
12507
12:07:32,512 --> 12:07:33,887
AUDIENCE: I had a quick question.
12508
12:07:36,862 --> 12:07:40,131
SPEAKER 1: Correct. You're asking\n
12509
12:07:40,131 --> 12:07:41,839
does that mean you\ndon't have to free it?
12510
12:07:41,839 --> 12:07:44,241
You never have to free pointers, per se.
12511
12:07:44,241 --> 12:07:49,042
You should only free addresses that\n
12512
12:07:49,042 --> 12:07:51,412
So I haven't finished\nthe program, to be fair.
12513
12:07:51,411 --> 12:07:53,361
But you're not freeing variables.
12514
12:07:53,362 --> 12:07:55,222
You're not freeing like, fields.
12515
12:07:55,222 --> 12:07:58,351
You are freeing specific\naddresses, whatever they may be.
12516
12:07:58,351 --> 12:08:01,252
So the last thing, and I\nwas stalling on showing this
12517
12:08:01,252 --> 12:08:02,932
because it too is a little cryptic.
12518
12:08:02,932 --> 12:08:06,052
Here is how you can free,\nnow, a whole linked list.
12519
12:08:06,052 --> 12:08:08,724
In the world of arrays,\nrecall, it was so easy.
12520
12:08:09,682 --> 12:08:11,402
You return 0 and you're done.
12521
12:08:12,622 --> 12:08:14,482
Because, again, the\ncomputer doesn't know
12522
12:08:14,482 --> 12:08:17,182
what you have stitched together\nusing all of these pointers
12523
12:08:17,182 --> 12:08:18,622
all over the computer's memory.
12524
12:08:18,622 --> 12:08:20,662
You need to follow those arrows.
12525
12:08:20,661 --> 12:08:23,401
So one way to do this\nwould be as follows.
12526
12:08:23,402 --> 12:08:28,402
While the list itself is not null,\n
12527
12:08:29,722 --> 12:08:32,454
I'm going to give myself a\ntemporary variable called TMP again.
12528
12:08:32,453 --> 12:08:34,911
And it's a different TMP because\nit's in a different scope.
12529
12:08:34,911 --> 12:08:38,691
It's inside of the while loop instead\n
12530
12:08:38,692 --> 12:08:44,122
I am going to initialize TMP to\nbe the address of the next node.
12531
12:08:44,122 --> 12:08:46,641
Just so I can get one\nstep ahead of things.
12532
12:08:47,932 --> 12:08:51,811
Because now, I can boldly\nfree the list itself
12533
12:08:51,811 --> 12:08:53,451
which does not mean the whole list.
12534
12:08:53,451 --> 12:08:56,151
Again, I'm freeing the\naddress in list, which
12535
12:08:56,152 --> 12:08:58,891
is the address of the number 1 node.
12536
12:08:59,872 --> 12:09:02,461
It's just the address\nof the number 1 node.
12537
12:09:02,461 --> 12:09:05,362
So if I first use TMP\nto point at the number
12538
12:09:05,362 --> 12:09:10,792
2 slightly in the middle of the picture,\n
12539
12:09:10,792 --> 12:09:12,772
at the moment, to free list.
12540
12:09:12,771 --> 12:09:15,351
That is the address of the first node.
12541
12:09:15,351 --> 12:09:19,641
Now I'm going to say, all right, once\n
12542
12:09:19,641 --> 12:09:24,561
I can update the list\nitself to be literally TMP.
12543
12:09:27,932 --> 12:09:33,622
If you think about this picture, TMP\n
12544
12:09:35,031 --> 12:09:38,421
So TMP, represented by my right hand\n
12545
12:09:38,421 --> 12:09:43,011
Totally safe and reasonable to\nfree now the list itself a.k.a.
12546
12:09:43,012 --> 12:09:44,632
the address of the number 1 node.
12547
12:09:44,631 --> 12:09:47,362
That has the effect of just\nthrowing away the number 1 node
12548
12:09:47,362 --> 12:09:50,152
telling the computer you can\nreuse that memory for you.
12549
12:09:50,152 --> 12:09:53,632
The last line of code I wrote\n
12550
12:09:53,631 --> 12:09:58,042
2, at which point my loop proceeded\n
12551
12:09:58,042 --> 12:10:01,072
And only once my finger is\nliterally pointing at nowhere
12552
12:10:01,072 --> 12:10:03,832
the null symbol, will the\nloop, by nature of a while
12553
12:10:03,832 --> 12:10:06,472
loop as I'll toggle back to, break out.
12554
12:10:06,472 --> 12:10:09,112
And there's nothing more to be freed.
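A sketch of that freeing loop follows. The function name free_list and the returned count are assumptions added so the behavior can be checked; the lecture's version is a bare while loop in main.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: frees every node and returns how many were freed.
int free_list(node *list)
{
    int freed = 0;
    while (list != NULL)
    {
        node *tmp = list->next;  // get one step ahead before freeing
        free(list);              // frees only the node list points at
        list = tmp;              // move list on to the next node
        freed++;
    }
    return freed;
}
```

Saving list->next in tmp before calling free is what avoids touching memory after it has been freed.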
12555
12:10:09,112 --> 12:10:12,171
So again, what you'll see,\nultimately, in problem set 5
12556
12:10:12,171 --> 12:10:16,171
more on that later, is an opportunity\n
12557
12:10:17,211 --> 12:10:20,061
But again, even though the syntax\nis admittedly pretty cryptic
12558
12:10:20,061 --> 12:10:23,781
we're still using basics like\nthese for-loops or while loops.
12559
12:10:23,781 --> 12:10:27,442
We're just starting to now\nfollow explicit addresses rather
12560
12:10:27,442 --> 12:10:31,222
than letting the computer do\nall of the arithmetic for us
12561
12:10:31,222 --> 12:10:33,116
as we previously benefited from.
12562
12:10:33,116 --> 12:10:36,241
At the very end of this thing, I'm\n
12563
12:10:36,241 --> 12:10:39,722
And I think, then, we're good to go.
12564
12:10:40,222 --> 12:10:43,442
Questions on this linked list code now?
12565
12:10:43,442 --> 12:10:46,192
And again, we'll walk through this\n
12566
12:10:46,692 --> 12:10:51,095
AUDIENCE: Can you explain the while\n
12567
12:10:51,762 --> 12:10:55,432
Can we explain this while loop\nhere for freeing the list.
12568
12:10:55,432 --> 12:10:58,061
So notice that, first, I'm just\nasking the obvious question.
12569
12:10:58,902 --> 12:11:02,872
Because if it is, there's\nno work to be done.
12570
12:11:02,872 --> 12:11:06,942
However, while the list is not\nnull, according to line 58
12571
12:11:08,021 --> 12:11:12,401
I want to create a temporary variable\n
12572
12:11:12,402 --> 12:11:15,022
that list arrow next is pointing at.
12573
12:11:17,741 --> 12:11:21,171
List arrow next is whatever\nthis thing is here.
12574
12:11:21,171 --> 12:11:23,951
So if my right hand represents\nthe temporary variable
12575
12:11:23,951 --> 12:11:27,951
I'm literally pointing at the\nsame thing as the list is itself.
12576
12:11:27,951 --> 12:11:31,121
The next line of code,\nrecall, was free the list.
12577
12:11:31,122 --> 12:11:33,882
And unlike, in our world of\narrays, like half an hour
12578
12:11:33,881 --> 12:11:36,581
ago where that just meant\nfree the whole darn list
12579
12:11:36,582 --> 12:11:41,171
you now have taken over control over the\n
12580
12:11:41,171 --> 12:11:43,031
in ways that you didn't with the array.
12581
12:11:43,031 --> 12:11:46,332
The computer knew how to free\nthe whole array because you
12582
12:11:46,332 --> 12:11:48,162
malloc the whole thing at once.
12583
12:11:48,161 --> 12:11:52,061
You are now mallocing the\nlinked list one node at a time.
12584
12:11:52,061 --> 12:11:54,911
And the operating system does\nnot keep track of for you
12585
12:11:56,292 --> 12:11:59,952
So when you free list,\nyou are literally freeing
12586
12:11:59,951 --> 12:12:03,911
the value of the list variable,\n
12587
12:12:03,911 --> 12:12:07,301
Then my last line of code, which I'll\n
12588
12:12:07,302 --> 12:12:11,982
list to now ignore the\nfreed memory and point at 2.
12589
12:12:14,561 --> 12:12:17,981
So, again, it's just a\nvery pedantic way of using
12590
12:12:17,982 --> 12:12:21,942
this new syntax of star notation,\n
12591
12:12:21,942 --> 12:12:25,902
to do the equivalent of walking\ndown all of these arrows.
12592
12:12:25,902 --> 12:12:28,122
Following all of these breadcrumbs.
12593
12:12:28,122 --> 12:12:31,422
But it does take admittedly\nsome getting used to.
12594
12:12:31,421 --> 12:12:33,926
Syntax you only have to do for one week.
12595
12:12:33,927 --> 12:12:35,802
But, again, next week\nin Python we will begin
12596
12:12:35,802 --> 12:12:37,632
to abstract a lot of\nthis complexity away.
12597
12:12:37,631 --> 12:12:39,502
But none of this\ncomplexity is going away.
12598
12:12:39,502 --> 12:12:42,252
It's just that someone else, the\nauthors of Python for instance
12599
12:12:42,252 --> 12:12:44,389
will have automated this stuff for us.
12600
12:12:44,389 --> 12:12:46,182
The goal this week is\nto understand what it
12601
12:12:46,182 --> 12:12:49,461
is we're going to get for\nfree, so to speak, next week.
12602
12:12:49,961 --> 12:12:54,292
Questions on these linked lists?
12603
12:12:55,932 --> 12:12:58,745
AUDIENCE: So are the while\nloops strictly necessary
12604
12:12:58,745 --> 12:13:00,209
for the freeing [INAUDIBLE].
12605
12:13:01,252 --> 12:13:03,834
Let me summarize as, could we\nhave freed this with a for-loop?
12606
12:13:04,760 --> 12:13:06,112
It just is a matter of style.
12607
12:13:06,112 --> 12:13:09,152
It's a little more elegant to do it\n
12608
12:13:09,152 --> 12:13:11,154
But other people will\nreasonably disagree.
12609
12:13:11,154 --> 12:13:13,862
Anything you can do with a while\n
12610
12:13:14,872 --> 12:13:17,211
Do while loops, recall,\nare a little different.
12611
12:13:17,211 --> 12:13:19,853
But they will always\ndo at least one thing.
12612
12:13:19,853 --> 12:13:22,311
But for-loops and while loops\nbehave the same in this case.
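For comparison, here is one way the same freeing could be written with a for-loop. It is a sketch under the same assumptions as before: the name free_list_for and the returned count are hypothetical additions.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: same behavior as the while-loop version,
// expressed with for-loop syntax.
int free_list_for(node *list)
{
    int freed = 0;
    // The update step moves list to the address saved in tmp.
    for (node *tmp = NULL; list != NULL; list = tmp)
    {
        tmp = list->next;  // remember the next node first
        free(list);        // then free the current one
        freed++;
    }
    return freed;
}
```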
12613
12:13:25,482 --> 12:13:27,881
All right, well let's just\nvary things a little bit here.
12614
12:13:27,881 --> 12:13:29,963
Just to see what some of\nthe pitfalls might now be
12615
12:13:29,963 --> 12:13:31,722
without getting into the weeds of code.
12616
12:13:31,722 --> 12:13:35,711
Indeed, we'll try to save some of\n
12617
12:13:35,711 --> 12:13:40,002
But instead, let's imagine that we\n
12618
12:13:40,002 --> 12:13:43,182
I can offer, in exchange for a\nfew volunteers, some foam fingers
12619
12:13:43,182 --> 12:13:45,099
to bring to the next game, perhaps.
12620
12:13:45,099 --> 12:13:46,932
Could we get maybe just\none volunteer first?
12621
12:13:47,591 --> 12:13:50,591
You will be our linked\nlist from the get go.
12622
12:13:52,061 --> 12:13:54,322
SPEAKER 1: Pedro, come on up.
12623
12:13:54,322 --> 12:13:55,572
All right, thank you to Pedro.
12624
12:13:58,661 --> 12:14:00,661
And if you want to just\nstand roughly over here.
12625
12:14:00,661 --> 12:14:03,210
But you are a null pointer so\njust point sort of at the ground
12626
12:14:03,211 --> 12:14:04,412
as though you're pointing at 0.
12627
12:14:04,911 --> 12:14:07,508
So Pedro is our linked list\nof size 0, which pictorially
12628
12:14:07,508 --> 12:14:10,800
might look a little something like this\n
12629
12:14:10,800 --> 12:14:15,481
Now suppose that we want to go ahead\n
12630
12:14:15,482 --> 12:14:17,682
Can we get a volunteer\nto be on camera here?
12631
12:14:18,182 --> 12:14:19,349
You jumped out of your seat.
12632
12:14:21,889 --> 12:14:23,682
OK, you really want\nthe foam finger, I say.
12633
12:14:36,752 --> 12:14:39,271
So here is your number\n2 for your number field.
12634
12:14:40,502 --> 12:14:43,597
And come on, let's say that there\n
12635
12:14:44,222 --> 12:14:46,961
So Caleb got malloced,\nif you will, over here.
12636
12:14:46,961 --> 12:14:51,286
So now if we want to insert Caleb and\n
12637
12:14:51,286 --> 12:14:52,411
well what do we need to do?
12638
12:14:52,411 --> 12:14:53,822
I already initialized you to 2.
12639
12:14:53,822 --> 12:14:55,802
And pointing as you\nare to the ground means
12640
12:14:55,802 --> 12:14:58,112
you're initialized to\nnull for your next field.
12641
12:14:58,112 --> 12:14:59,881
Pedro, what you should you-- perfect.
12642
12:15:02,101 --> 12:15:03,676
So Pedro is now pointing at the list.
12643
12:15:03,677 --> 12:15:05,802
So now our list looks a\nlittle something like this.
12644
12:15:07,652 --> 12:15:10,152
So the first couple of these\nwill be pretty straightforward.
12645
12:15:10,152 --> 12:15:13,662
Let's insert one more, if anyone\n
12646
12:15:13,661 --> 12:15:15,161
Here, how about right in the middle.
12647
12:15:16,351 --> 12:15:19,159
And just in anticipation, how\nabout let's malloc someone else.
12648
12:15:19,160 --> 12:15:20,702
OK, your friends are pointing at you.
12649
12:15:20,701 --> 12:15:22,831
Do you want to come\ndown too, preemptively?
12650
12:15:22,832 --> 12:15:25,334
This is a pool of memory, if you will.
12651
12:15:30,661 --> 12:15:32,291
And hang there for just a moment.
12652
12:15:32,792 --> 12:15:34,351
So we've just malloced Hannah.
12653
12:15:34,351 --> 12:15:37,621
And Hannah, how about Hannah,\nsuppose you ended up over there
12654
12:15:37,622 --> 12:15:39,282
in just some random location.
12655
12:15:39,781 --> 12:15:43,442
So what should we now do, if the\n
12656
12:15:44,042 --> 12:15:46,020
So Pedro, do you have\nto update yourself?
12657
12:15:47,391 --> 12:15:48,781
Caleb, what do you have to do?
12658
12:15:49,281 --> 12:15:52,173
And Hannah what should you be doing?
12659
12:15:52,173 --> 12:15:55,381
I would, it's just for you for now, so\n
12660
12:15:55,881 --> 12:15:58,771
So, again demonstrating the fact\n
12661
12:15:58,771 --> 12:16:01,291
we had our nice, clean array\nback, to back, to back
12662
12:16:01,292 --> 12:16:03,862
contiguously, these guys are\ndeliberately all over the stage.
12663
12:16:10,921 --> 12:16:12,737
And pick your favorite place in memory.
12664
12:16:16,802 --> 12:16:19,030
So Jonathan's now over there.
12665
12:16:20,072 --> 12:16:21,929
So 5, we want to point\nHannah at number 5.
12666
12:16:21,928 --> 12:16:23,761
So you, of course, are\ngoing to point there.
12667
12:16:23,762 --> 12:16:25,137
And where should you be pointing?
12668
12:16:25,137 --> 12:16:26,982
Down to represent null, as well.
12669
12:16:29,035 --> 12:16:30,702
But now things get a little interesting.
12670
12:16:30,701 --> 12:16:33,481
And here, we'll use a chance\nto, without the weeds of code
12671
12:16:33,482 --> 12:16:36,572
point out how order of operations\nis really going to matter.
12672
12:16:36,572 --> 12:16:40,802
Suppose that I next want to\nallocate say, the number 1.
12673
12:16:40,802 --> 12:16:42,992
And I want to insert the\nnumber 1 into this list.
12674
12:16:43,491 --> 12:16:45,101
This is what the code would look like.
12675
12:16:45,101 --> 12:16:48,661
But if we act this out-- could\nwe get one more volunteer?
12676
12:16:48,661 --> 12:16:50,471
How about on the end\nthere in the sweater.
12677
12:17:01,457 --> 12:17:03,332
And how about, Lauren,\nwhy don't you go right
12678
12:17:03,332 --> 12:17:04,952
in here in front, if you don't mind.
12679
12:17:07,262 --> 12:17:09,332
So I've initialized\nLauren to the number 1.
12680
12:17:09,332 --> 12:17:11,942
And your pointer will be\nnull, pointing at the ground.
12681
12:17:11,942 --> 12:17:14,485
Where do you belong if we're\nmaintaining sorted order?
12682
12:17:14,485 --> 12:17:15,902
Looks like right at the beginning.
12683
12:17:18,902 --> 12:17:23,582
So Pedro has presumed\nto point now at Lauren.
12684
12:17:23,582 --> 12:17:27,812
But how do you know where to point?
12685
12:17:28,982 --> 12:17:30,882
SPEAKER 1: Pedro's undoing\nwhat he did a moment ago.
12686
12:17:31,862 --> 12:17:35,232
And that was perfect that Pedro\n
12687
12:17:35,732 --> 12:17:39,432
You literally just orphaned all of these\n
12688
12:17:39,932 --> 12:17:44,281
Because if Pedro was our only variable\n
12689
12:17:44,281 --> 12:17:47,281
this is the danger of using pointers,\n
12690
12:17:47,281 --> 12:17:48,661
and building your own data structures.
12691
12:17:48,661 --> 12:17:50,619
The moment you point\ntemporarily, if you could
12692
12:17:50,620 --> 12:17:53,972
to Lauren, I have no idea\nwhere he's pointing to.
12693
12:17:53,972 --> 12:17:58,741
I have no idea how to get back to Caleb,\n
12694
12:18:01,771 --> 12:18:03,781
I think we need Lauren\nto make a decision first.
12695
12:18:05,131 --> 12:18:06,301
SPEAKER 1: So pointing at Caleb.
12696
12:18:06,601 --> 12:18:09,184
Because you're pointing at\nliterally who Pedro is pointing at.
12697
12:18:09,184 --> 12:18:10,971
Pedro, now what are you safe to do?
12698
12:18:11,472 --> 12:18:13,211
So order of operations there matters.
12699
12:18:13,211 --> 12:18:17,311
And if we had just done this line\n
12700
12:18:17,311 --> 12:18:20,222
That was like Pedro's first\ninstinct, bad things happen.
12701
12:18:20,222 --> 12:18:22,182
And we orphaned the rest of the list.
12702
12:18:22,182 --> 12:18:25,832
But if we think through it logically and\n
12703
12:18:25,832 --> 12:18:29,322
we've now updated the list to look\n
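[Editor's sketch, not the lecture's own code] The two-step order acted out on stage could look roughly like this in C. The names `node`, `prepend`, `number`, and `next` are assumptions chosen for illustration. The key point is that the new node copies the list's pointer before the list is repointed.

```c
#include <stdlib.h>

// Hypothetical node type: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Prepend a new node holding `number` to the list whose first node is `list`.
// Returns the new first node, or the old list unchanged if malloc fails.
node *prepend(node *list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return list;
    }
    n->number = number;

    // Step 1: the new node points at whatever the list currently points at
    // (Lauren points at Caleb), so nothing is orphaned.
    n->next = list;

    // Step 2: only now is it safe to repoint the list at the new node
    // (Pedro points at Lauren); the caller does this with the return value.
    return n;
}
```

A caller would write something like `list = prepend(list, 1);`. Doing the assignment to `list` before setting `n->next` is the mistake that orphans the rest of the list.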
12704
12:18:30,391 --> 12:18:32,966
We got one more foam finger\nhere for the number 3.
12705
12:18:49,216 --> 12:18:52,851
If you want to go maybe in the middle of\n
12706
12:18:52,851 --> 12:18:56,752
So here, too, the goal is\nto maintain sorted order.
12707
12:18:56,752 --> 12:19:01,881
So let's ask the audience, who or what\n
12708
12:19:01,881 --> 12:19:04,391
So we don't screw up and\norphan some of the memory.
12709
12:19:04,391 --> 12:19:07,722
And if we do orphan memory, this is\n
12710
12:19:08,591 --> 12:19:10,901
Your Mac, your PC, your\nphone can start to slow down
12711
12:19:10,902 --> 12:19:14,092
if you keep asking for memory but\n
12712
12:19:14,091 --> 12:19:15,911
So we want to get this right.
12713
12:19:20,502 --> 12:19:22,182
SPEAKER 1: 3 should point at 4.
12714
12:19:22,182 --> 12:19:25,572
So 3, do you want to point at 4.
12715
12:19:27,281 --> 12:19:32,442
And how did you know,\nMiriam, whom to point at?
12716
12:19:36,131 --> 12:19:39,701
Because if you look at where this\nlist is currently constructed
12717
12:19:39,701 --> 12:19:42,551
and you can cheat on the board\nhere, 2 is pointing to 4.
12718
12:19:42,552 --> 12:19:46,122
If you point at whoever Caleb,\nnumber 2, is pointing at
12719
12:19:46,122 --> 12:19:48,942
that, indeed, leads you\nto Hannah for number 4.
12720
12:19:48,942 --> 12:19:53,082
So now what's the next step\nto stitch this together?
12721
12:19:57,792 --> 12:20:00,385
So Caleb, I think it's now\nsafe for you to decouple.
12722
12:20:00,385 --> 12:20:02,302
Because someone is already\npointing at Hannah.
12723
12:20:02,302 --> 12:20:03,427
We haven't orphaned anyone.
12724
12:20:03,427 --> 12:20:05,322
So now, if we follow\nthe breadcrumbs, we've
12725
12:20:05,322 --> 12:20:10,351
got Pedro leading to 1,\nto 2, to 3, to 4, to 5.
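[Editor's sketch, not the lecture's own code] Inserting into the middle while keeping sorted order follows the same rule: the new node points at its successor first, and only then does its predecessor let go. The function name `insert_sorted` and the field names are assumptions for illustration.

```c
#include <stdlib.h>

// Hypothetical node type: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Insert `number` into a list kept in ascending order.
// Returns the (possibly new) first node; on malloc failure the list is unchanged.
node *insert_sorted(node *list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return list;
    }
    n->number = number;
    n->next = NULL;

    // The new node belongs at the very beginning.
    if (list == NULL || number < list->number)
    {
        n->next = list;
        return n;
    }

    // Walk to the last node whose number is smaller than the new one.
    node *prev = list;
    while (prev->next != NULL && prev->next->number < number)
    {
        prev = prev->next;
    }

    // First the new node points at its successor (3 points at 4) ...
    n->next = prev->next;
    // ... and only then does the predecessor repoint (2 points at 3).
    prev->next = n;
    return list;
}
```

Inserting 2, 4, 5, 1, 3 in that order, as on stage, should leave the list reading 1, 2, 3, 4, 5.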
12726
12:20:10,351 --> 12:20:12,851
We need the numbers back, but\nyou can keep the foam fingers.
12727
12:20:12,851 --> 12:20:15,019
Thank you to our volunteers here.
12728
12:20:17,741 --> 12:20:20,739
SPEAKER 1: You can just\nput the numbers here.
12729
12:20:21,572 --> 12:20:22,739
SPEAKER 1: Thank you to all.
12730
12:20:22,739 --> 12:20:26,682
So this is only to say that when you\n
12731
12:20:26,682 --> 12:20:29,245
and in the problem set, it's\ngoing to be very easy to lose
12732
12:20:29,245 --> 12:20:30,662
sight of the forest for the trees.
12733
12:20:30,661 --> 12:20:32,701
Because the code does get really dense.
12734
12:20:32,701 --> 12:20:37,721
But the idea is, again, really do bubble\n
12735
12:20:37,722 --> 12:20:40,781
And if you think about data\nstructures at this level.
12736
12:20:40,781 --> 12:20:42,898
If you go off and program\nafter a class like CS50
12737
12:20:42,898 --> 12:20:45,481
and you're whiteboarding something\nwith a friend or a colleague
12738
12:20:45,482 --> 12:20:48,512
most people think at\nand talk at this level.
12739
12:20:48,512 --> 12:20:51,031
And they just assume that,\nyeah, if we went back and looked
12740
12:20:51,031 --> 12:20:54,371
at our textbooks or class notes, we\n
12741
12:20:54,372 --> 12:20:56,222
But the important stuff\nis the conversation.
12742
12:20:57,601 --> 12:21:02,561
Even though, via this week, will we\n
12743
12:21:02,561 --> 12:21:06,572
So when it comes to analyzing\nan algorithm like this
12744
12:21:06,572 --> 12:21:08,641
let's consider the following.
12745
12:21:08,641 --> 12:21:15,961
What might be now the running time of\n
12746
12:21:17,582 --> 12:21:19,292
We talked about arrays earlier.
12747
12:21:19,292 --> 12:21:22,292
And we had some binary search\npossibilities still, as soon
12748
12:21:23,131 --> 12:21:26,311
But as soon as we have a linked list,\n
12749
12:21:27,661 --> 12:21:29,369
And so you can't just\nassume that you can
12750
12:21:29,370 --> 12:21:32,162
jump arithmetically to the middle\n
12751
12:21:32,982 --> 12:21:36,572
You pretty much have to follow all\n
12752
12:21:36,572 --> 12:21:39,362
So how might that inform what we see?
12753
12:21:41,076 --> 12:21:43,951
Even though I keep drawing all these\n
12754
12:21:44,461 --> 12:21:46,254
And all of us humans\nin the room can easily
12755
12:21:46,254 --> 12:21:49,842
spot where the 1 is, where the 2 is,\n
12756
12:21:49,841 --> 12:21:54,091
just like with our lockers and arrays,\n
12757
12:21:54,091 --> 12:21:57,991
And the key thing with a linked\nlist is that the only address
12758
12:21:57,991 --> 12:22:01,891
we've fundamentally been remembering\n
12759
12:22:01,891 --> 12:22:05,472
He was the link to all\nof the other nodes.
12760
12:22:05,472 --> 12:22:07,472
And, in turn, each\nperson led to the next.
12761
12:22:07,472 --> 12:22:12,132
But without Pedro, we would have lost\n
12762
12:22:12,131 --> 12:22:14,432
So when you start with\na linked list, if you
12763
12:22:14,432 --> 12:22:18,211
want to find an element as via\n
12764
12:22:18,211 --> 12:22:19,682
Following all of the arrows.
12765
12:22:19,682 --> 12:22:21,692
Following all of the\npointers on the stage
12766
12:22:21,692 --> 12:22:23,822
in order to get to the node in question.
12767
12:22:23,822 --> 12:22:27,182
And only once you hit null can\nyou conclude, yep, it was there.
12768
12:22:28,982 --> 12:22:31,922
So given that if a\ncomputer, essentially
12769
12:22:31,921 --> 12:22:36,451
can only see the number 1, or the number\n
12770
12:22:36,451 --> 12:22:39,752
or the number 5, one\nat a time, how might we
12771
12:22:39,752 --> 12:22:43,171
think about the running time of search?
12772
12:22:43,171 --> 12:22:45,091
And it is indeed Big O of n.
12773
12:22:45,891 --> 12:22:48,391
Well, in the worst case, the\nnumber you might be looking for
12774
12:22:49,961 --> 12:22:53,192
And so, obviously, you're going to\n
12775
12:22:53,192 --> 12:22:55,425
And I drew these things\nwith boxes on top of them.
12776
12:22:55,425 --> 12:22:57,842
Because, again, even though\nyou and I can immediately see
12777
12:22:57,841 --> 12:23:00,091
where the 5 is for\ninstance, the computer
12778
12:23:00,091 --> 12:23:03,961
can only figure that out by starting\n
12779
12:23:03,961 --> 12:23:05,881
So there, too, is another trade off.
12780
12:23:05,881 --> 12:23:09,511
It would seem that, overnight,\nwe have lost the ability
12781
12:23:09,512 --> 12:23:14,672
to do a very powerful algorithm from\n
12782
12:23:15,302 --> 12:23:19,292
Because there's no way in this\npicture to jump mathematically
12783
12:23:19,292 --> 12:23:21,857
to the middle node, unless\nyou remember where it is.
12784
12:23:21,857 --> 12:23:23,732
And then, remember where\nevery other node is.
12785
12:23:23,732 --> 12:23:25,524
And at that point,\nyou're back to an array.
12786
12:23:25,523 --> 12:23:29,862
Linked lists, by design, only\nremember the next node in the list.
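[Editor's sketch, not the lecture's own code] Searching a linked list means following the pointers one node at a time until the number turns up or null is reached, which is why it takes on the order of n steps. The names `node` and `search` are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical node type: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Return true if `number` appears anywhere in the list, false otherwise.
// In the worst case every node is visited once.
bool search(const node *list, int number)
{
    for (const node *p = list; p != NULL; p = p->next)
    {
        if (p->number == number)
        {
            return true;
        }
    }
    // We hit null without finding it, so it is not in the list.
    return false;
}
```

There is no way to jump to the middle here; the loop can only move from a node to its `next`.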
12787
12:23:30,362 --> 12:23:32,851
How about something like insert?
12788
12:23:32,851 --> 12:23:35,671
In the worst case,\nperhaps, how many steps
12789
12:23:35,671 --> 12:23:38,822
might it take to insert\nsomething into a linked list?
12790
12:23:44,372 --> 12:23:45,713
Fortunately, it's not that bad.
12791
12:23:45,713 --> 12:23:46,921
It's not as bad as n squared.
12792
12:23:46,921 --> 12:23:49,201
That typically means\ndoing n things, n times.
12793
12:23:49,201 --> 12:23:53,741
And I think we can stay under\nthat, but not a bad thought.
12794
12:23:55,313 --> 12:23:56,521
SPEAKER 1: Why would it be n?
12795
12:23:56,521 --> 12:24:00,269
AUDIENCE: Because the [INAUDIBLE].
12796
12:24:00,851 --> 12:24:03,131
So to summarize, you're proposing n.
12797
12:24:03,131 --> 12:24:04,994
Because to find where\nthe thing goes, you
12798
12:24:04,995 --> 12:24:06,912
have to traverse,\npotentially, the whole list.
12799
12:24:06,911 --> 12:24:09,701
Because if I'm inserting the\nnumber 6 or the number 99
12800
12:24:09,701 --> 12:24:12,252
that numerically\nbelongs at the very end
12801
12:24:12,252 --> 12:24:15,311
I can only find its location\nby looking for all of them.
12802
12:24:15,311 --> 12:24:16,849
At this point, though, in the term.
12803
12:24:16,849 --> 12:24:18,641
And really, at this\npoint in the story, you
12804
12:24:18,641 --> 12:24:22,072
should start to question these very\n
12805
12:24:22,072 --> 12:24:25,841
Because the answer is almost\nalways going to depend, right.
12806
12:24:25,841 --> 12:24:28,461
If I've just got a linked\nlist that looks like this
12807
12:24:28,461 --> 12:24:31,722
the first question back to\nsomeone asking this question
12808
12:24:31,722 --> 12:24:34,781
would be, well does the list\nneed to be sorted, right?
12809
12:24:34,781 --> 12:24:37,173
I've drawn it as sorted\nand it might imply as much.
12810
12:24:37,173 --> 12:24:39,131
So that's a reasonable\nassumption to have made.
12811
12:24:39,131 --> 12:24:41,801
But if I don't care about\nmaintaining sorted order
12812
12:24:41,802 --> 12:24:45,672
I could actually insert into a\nlinked list in constant time.
12813
12:24:46,211 --> 12:24:49,110
I could just keep inserting into\n
12814
12:24:49,902 --> 12:24:51,792
And even though the\nlist is getting longer
12815
12:24:51,792 --> 12:24:55,752
the number of steps required to insert\n
12816
12:25:00,222 --> 12:25:02,382
If you want to keep it\nsorted though, yes, it's
12817
12:25:02,381 --> 12:25:03,792
going to be, indeed, Big O of n.
12818
12:25:03,792 --> 12:25:05,322
But again, these kinds\nof, now, assumptions
12819
12:25:05,322 --> 12:25:06,529
are going to start to matter.
12820
12:25:06,529 --> 12:25:09,222
So let's for the sake of\ndiscussion say it's Big O of n
12821
12:25:09,222 --> 12:25:11,141
if we do want to maintain sorted order.
12822
12:25:11,141 --> 12:25:14,292
But what about in the\ncase of not caring.
12823
12:25:14,292 --> 12:25:16,110
It might indeed be Big O of 1.
12824
12:25:16,110 --> 12:25:19,152
And now these are the kinds of decisions\n
12825
12:25:19,152 --> 12:25:20,682
What about in the best case here?
12826
12:25:20,682 --> 12:25:22,722
If we're thinking about\nBig Omega notation
12827
12:25:22,722 --> 12:25:25,114
then, frankly, we could just\nget lucky in the best case.
12828
12:25:25,114 --> 12:25:27,822
And the element we're looking for\n
12829
12:25:27,822 --> 12:25:32,052
Or heck, we just blindly insert to the\n
12830
12:25:32,052 --> 12:25:33,982
that we want to keep things in.
12831
12:25:34,482 --> 12:25:39,900
So besides then, how can we\nimprove further on this design?
12832
12:25:39,900 --> 12:25:41,442
We don't need to stop at linked lists.
12833
12:25:41,442 --> 12:25:43,572
Because, honestly, it's\nnot been a clear win.
12834
12:25:43,572 --> 12:25:46,421
Like, linked lists allow us\nto use more of our memory
12835
12:25:46,421 --> 12:25:49,911
because we don't need massive\n
12836
12:25:50,781 --> 12:25:54,792
But they still require Big O of\nn time to find the end of it
12837
12:25:56,112 --> 12:25:59,351
We're using at least twice as\nmuch memory for the darn pointer.
12838
12:25:59,351 --> 12:26:01,601
So that seems like a sidestep.
12839
12:26:01,601 --> 12:26:03,582
It's not really a step forward.
12840
12:26:05,322 --> 12:26:09,639
Here's where we can now accelerate the\n
12841
12:26:09,639 --> 12:26:11,472
even if you haven't\nused this technique yet
12842
12:26:11,472 --> 12:26:15,612
we would seem to have an ability to\n
12843
12:26:16,601 --> 12:26:19,002
And anything you could\nimagine drawing with arrows
12844
12:26:19,002 --> 12:26:21,622
you can implement, it\nwould seem, in code.
12845
12:26:21,622 --> 12:26:24,102
So what if we leverage\na second dimension.
12846
12:26:24,101 --> 12:26:26,618
Instead of just stringing\ntogether things laterally
12847
12:26:26,618 --> 12:26:28,451
left to right, essentially,\neven though they
12848
12:26:28,451 --> 12:26:30,101
were bouncing around on the screen.
12849
12:26:30,101 --> 12:26:33,252
What if we start to leverage a\n
12850
12:26:33,252 --> 12:26:36,881
And build more interesting\nstructures in the computer's memory.
12851
12:26:36,881 --> 12:26:39,671
Well it turns out that\nin a computer's memory
12852
12:26:39,671 --> 12:26:42,612
we could create a tree,\nsimilar to a family tree.
12853
12:26:42,612 --> 12:26:46,362
If you've ever seen or drawn a family\n
12854
12:26:50,442 --> 12:26:53,652
So inverted branch of a\ntree that grows, typically
12855
12:26:53,652 --> 12:26:56,531
when it's drawn, downward instead\nof upward like a typical tree.
12856
12:26:56,531 --> 12:26:59,021
But that's something we could\ntranslate into code as well.
12857
12:26:59,021 --> 12:27:02,721
Specifically, let's do something\ncalled a binary search tree.
12858
12:27:04,601 --> 12:27:07,152
And what I mean by\nthis is the following.
12859
12:27:07,961 --> 12:27:10,841
This is an example of an\narray from like week 2
12860
12:27:10,841 --> 12:27:12,231
when we first talked about those.
12861
12:27:12,232 --> 12:27:13,932
And we had the lockers on stage.
12862
12:27:13,932 --> 12:27:19,961
And recall that what was nice\nabout an array, if 1, it's sorted.
12863
12:27:19,961 --> 12:27:23,021
And 2, all of its numbers\nare indeed contiguous
12864
12:27:23,021 --> 12:27:25,011
which is by definition an array.
12865
12:27:25,012 --> 12:27:26,752
We can just do some simple math.
12866
12:27:26,752 --> 12:27:31,461
For instance, if there are 7 elements\n
12867
12:27:31,961 --> 12:27:34,811
3 and 1/2, round down\nthrough truncation, that's 3.
12868
12:27:36,161 --> 12:27:39,414
That gives me the middle element,\narithmetically, in this thing.
12869
12:27:39,415 --> 12:27:41,832
And even though I have to be\ncareful about rounding, using
12870
12:27:41,832 --> 12:27:45,912
simple arithmetic, I can very quickly,\n
12871
12:27:45,911 --> 12:27:48,371
find for you the middle of the\nleft half, of the left half
12872
12:27:48,372 --> 12:27:49,664
of the right half, or whatever.
12873
12:27:49,663 --> 12:27:50,961
That's the power of arrays.
12874
12:27:50,961 --> 12:27:52,902
And that's what gave us binary search.
12875
12:27:52,902 --> 12:27:54,421
And how did binary search work?
12876
12:27:54,421 --> 12:27:55,671
Well, we looked at the middle.
12877
12:27:55,671 --> 12:27:57,311
And then, we went left or right.
12878
12:27:57,311 --> 12:28:02,561
And then, we went left or right again,\n
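[Editor's sketch, not the lecture's own code] The arithmetic described above, finding the middle of a sorted array by integer division and then going left or right, could be written along these lines. The function name `binary_search` and its parameters are assumptions for illustration.

```c
#include <stdbool.h>

// Return true if `target` is in the sorted array `numbers` of length `size`.
bool binary_search(const int numbers[], int size, int target)
{
    int low = 0;
    int high = size - 1;
    while (low <= high)
    {
        // Integer division truncates, so with 7 elements (indices 0 to 6)
        // the first middle index is 3.
        int middle = low + (high - low) / 2;
        if (numbers[middle] == target)
        {
            return true;
        }
        else if (numbers[middle] < target)
        {
            // The target can only be in the right half.
            low = middle + 1;
        }
        else
        {
            // The target can only be in the left half.
            high = middle - 1;
        }
    }
    return false;
}
```

Each pass discards half of what remains, which is where the log of n running time comes from. It only works because the array is contiguous and indexable.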
12879
12:28:02,561 --> 12:28:07,691
Wouldn't it be nice if we\nsomehow preserved the new upsides
12880
12:28:07,692 --> 12:28:10,520
today of dynamic memory\nallocation, giving ourselves
12881
12:28:10,519 --> 12:28:13,061
the ability to just add another\nelement, add another element
12882
12:28:14,232 --> 12:28:16,782
But retain the power of binary search.
12883
12:28:16,781 --> 12:28:21,582
Because log of n was much better than\n
12884
12:28:21,582 --> 12:28:24,461
Even the phone book\ndemonstrated as much weeks ago.
12885
12:28:24,461 --> 12:28:28,491
So what if I draw this same\npicture in 2 dimensions.
12886
12:28:28,491 --> 12:28:32,441
And I preserve the color scheme,\n
12887
12:28:32,442 --> 12:28:35,982
What do these things look like now?
12888
12:28:35,982 --> 12:28:38,532
Maybe, like, things we\nmight now call nodes, right.
12889
12:28:38,531 --> 12:28:42,512
A node is just a generic term\nfor like, storing some data.
12890
12:28:42,512 --> 12:28:45,682
What if the data these nodes\nare storing are numbers.
12891
12:28:47,211 --> 12:28:51,341
But what if we connected these\n
12892
12:28:51,341 --> 12:28:56,711
Whereby, every node has not one\npointer now, but as many as 2.
12893
12:28:56,711 --> 12:28:59,811
Maybe 0, like the leaves\nat the bottom in green.
12894
12:28:59,811 --> 12:29:02,932
But other nodes on the interior\nmight have as many as 2.
12895
12:29:02,932 --> 12:29:04,732
Like having 2 children, so to speak.
12896
12:29:04,732 --> 12:29:06,902
And indeed, the vernacular\nhere is exactly that.
12897
12:29:06,902 --> 12:29:08,812
This would be called\nthe root of the tree.
12898
12:29:08,811 --> 12:29:11,752
Or this would be a parent,\nwith respect to these children.
12899
12:29:11,752 --> 12:29:14,391
The green ones would be\ngrandchildren, with respect to these.
12900
12:29:14,391 --> 12:29:19,012
The green ones would be siblings\nwith respect to each other.
12901
12:29:19,851 --> 12:29:22,144
So all the same jargon you\nmight use in the real world
12902
12:29:22,144 --> 12:29:25,402
applies in the world of data\nstructures and CS trees.
12903
12:29:25,402 --> 12:29:30,292
But this is interesting because I think\n
12904
12:29:30,292 --> 12:29:32,781
structure in the computer's memory.
12905
12:29:33,322 --> 12:29:37,521
Well, suppose that we defined\na node to be no longer just
12906
12:29:37,521 --> 12:29:39,591
this, a number in a next field.
12907
12:29:39,591 --> 12:29:42,351
What if we give ourselves\na bit more room here?
12908
12:29:42,351 --> 12:29:47,211
And give ourselves a pointer called\n
12909
12:29:47,211 --> 12:29:49,561
Each of which is a\npointer to a struct node.
12910
12:29:49,561 --> 12:29:53,511
So same idea as before, but now we\n
12911
12:29:53,512 --> 12:29:56,692
as pointing this way and\nthis way, not just this way.
12912
12:29:56,692 --> 12:29:58,762
Not just a single direction, but 2.
12913
12:29:58,762 --> 12:30:02,662
So you could imagine, in code, building\n
12914
12:30:02,661 --> 12:30:06,051
That creates, in essence,\nthis diagram here.
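[Editor's sketch, not the lecture's own code] A node with two pointers instead of one might be declared like this. The field names `number`, `left`, and `right` follow the way the lecture refers to them, but the exact declaration in tree.c is not shown here, so treat this as an assumption.

```c
#include <stddef.h> // for NULL, used when a child is absent

// A tree node: one number plus pointers to up to two children.
typedef struct node
{
    int number;
    struct node *left;  // subtree of smaller numbers, or NULL
    struct node *right; // subtree of larger numbers, or NULL
} node;
```

A leaf is simply a node whose `left` and `right` are both `NULL`.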
12915
12:30:07,732 --> 12:30:09,772
Suppose I want to find the number 3.
12916
12:30:09,771 --> 12:30:12,322
I want to search for the\nnumber 3 in this tree.
12917
12:30:12,322 --> 12:30:15,682
It would seem, just like Pedro was\n
12918
12:30:15,682 --> 12:30:18,572
in the world of trees,\nthe root, so to speak
12919
12:30:18,572 --> 12:30:20,572
is the beginning of your data structure.
12920
12:30:20,572 --> 12:30:26,211
You can retain and remember this entire\n
12921
12:30:26,752 --> 12:30:29,811
One variable can hang\non to this whole tree.
12922
12:30:29,811 --> 12:30:32,002
So how can I find the number 3?
12923
12:30:32,002 --> 12:30:36,141
Well, if I look at the root node and\n
12924
12:30:37,732 --> 12:30:40,052
Or if it's greater\nthan, I can go this way.
12925
12:30:40,052 --> 12:30:42,232
So I preserve that\nproperty of the phone book
12926
12:30:42,232 --> 12:30:44,482
or just a sorted array in general.
12927
12:30:45,802 --> 12:30:48,810
If I'm looking for 3, I can\ngo to the right of the 2
12928
12:30:48,809 --> 12:30:50,601
because that number is\ngoing to be greater.
12929
12:30:50,601 --> 12:30:53,161
If I go left, it's going\nto be smaller instead.
12930
12:30:53,161 --> 12:30:55,911
And here's an example\nof actually recursion.
12931
12:30:55,911 --> 12:30:59,572
Recursion in a physical sense\nmuch like Mario's pyramid.
12932
12:30:59,572 --> 12:31:01,732
Which was recursively defined.
12933
12:31:02,781 --> 12:31:04,731
I claim this whole thing is a tree.
12934
12:31:04,732 --> 12:31:08,272
Specifically, a binary search\ntree, which means every node
12935
12:31:08,271 --> 12:31:11,361
has 2, or maybe 1, or maybe 0 children.
12936
12:31:14,211 --> 12:31:19,641
And it's the case that every left\n
12937
12:31:19,641 --> 12:31:22,612
And every right child\nis larger than the root.
12938
12:31:22,612 --> 12:31:25,582
That definition certainly\nworks for 2, 4, and 6.
12939
12:31:25,582 --> 12:31:30,412
But it also works recursively for\n
12940
12:31:30,411 --> 12:31:32,391
Notice, if you think\nof this as the root
12941
12:31:32,391 --> 12:31:34,461
it is indeed bigger\nthan this left child.
12942
12:31:34,461 --> 12:31:36,561
And it's smaller than this right child.
12943
12:31:36,561 --> 12:31:39,081
And if you look even at\nthe leaves, so to speak.
12944
12:31:40,491 --> 12:31:44,169
This root node is bigger than\nits left child, if it existed.
12945
12:31:44,169 --> 12:31:45,502
So it's a meaningless statement.
12946
12:31:45,502 --> 12:31:47,692
And it's less than its right child.
12947
12:31:47,692 --> 12:31:50,482
Or it's not greater than, certainly,\nso that's meaningless too.
12948
12:31:50,482 --> 12:31:54,242
So we haven't violated the definition\n
12949
12:31:54,241 --> 12:31:57,711
And so, now, how many steps does\n
12950
12:31:57,711 --> 12:32:02,061
any number in a binary\nsearch tree, it would seem?
12951
12:32:04,012 --> 12:32:05,882
And the height of this\nthing is actually 3.
12952
12:32:05,881 --> 12:32:08,631
And so long story short, especially,\n
12953
12:32:08,631 --> 12:32:10,792
with your logarithms from yesteryear.
12954
12:32:10,792 --> 12:32:14,601
Log base 2 is the number of times you\n
12955
12:32:14,601 --> 12:32:16,341
and half, until you get down to 1.
12956
12:32:16,341 --> 12:32:19,309
This is like a logarithm\nin the reverse direction.
12957
12:32:19,309 --> 12:32:20,601
Here's a whole lot of elements.
12958
12:32:20,601 --> 12:32:22,972
And we're halving, we're\nhalving until we get down to 1.
12959
12:32:22,972 --> 12:32:27,125
So the height of this tree, that\nis to say, is log base 2 of n.
12960
12:32:27,125 --> 12:32:30,292
Which means that even in the worst case,\n
12961
12:32:30,292 --> 12:32:32,167
it's all the way at the\nbottom in the leaves.
12962
12:32:32,811 --> 12:32:37,701
It's going to take log base 2\nof n steps, or log of n steps
12963
12:32:37,701 --> 12:32:41,311
to find, maximally, any\none of those numbers.
12964
12:32:41,311 --> 12:32:46,101
So, again, binary search is back.
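[Editor's sketch, not the lecture's own code] Searching a binary search tree can be written recursively: compare against the root, then search only the left or only the right subtree. The names `node` and `search` are assumptions for illustration. The log of n bound holds when the tree is balanced like the one drawn.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical tree node: a number plus left and right children.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Return true if `number` is in the binary search tree rooted at `tree`.
bool search(const node *tree, int number)
{
    if (tree == NULL)
    {
        // Base case: an empty tree contains nothing.
        return false;
    }
    else if (number < tree->number)
    {
        // Smaller numbers can only be in the left subtree.
        return search(tree->left, number);
    }
    else if (number > tree->number)
    {
        // Larger numbers can only be in the right subtree.
        return search(tree->right, number);
    }
    else
    {
        return true;
    }
}
```

Each call descends one level, so the number of steps is bounded by the height of the tree, which is 3 for the seven-node tree in the picture.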
12965
12:32:46,101 --> 12:32:48,116
But we've paid a price, right.
12966
12:32:48,116 --> 12:32:49,491
This isn't a linked list anymore.
12967
12:32:50,673 --> 12:32:53,631
But we've gained back binary search,\n
12968
12:32:53,631 --> 12:32:56,256
That's where the whole class\nbegan, on making that distinction.
12969
12:32:56,256 --> 12:33:01,502
But what price have we paid to retain\n
12970
12:33:04,552 --> 12:33:06,532
It's no longer sorted\nleft to right, but this
12971
12:33:06,531 --> 12:33:09,502
is, I claim, sorted according to\n
12972
12:33:09,502 --> 12:33:13,491
Where, again, left child\nis smaller than root.
12973
12:33:13,491 --> 12:33:15,921
And right child is greater than root.
12974
12:33:15,921 --> 12:33:19,341
So it is sorted, but it's sorted in\n
12975
12:33:22,741 --> 12:33:24,152
AUDIENCE: [INAUDIBLE] nodes now.
12976
12:33:24,944 --> 12:33:29,312
Every node now needs not one\nnumber, but 2, 3 pieces of data.
12977
12:33:31,112 --> 12:33:32,866
So, again, there's that trade off again.
12978
12:33:32,866 --> 12:33:34,741
Where, well, if you want\nto save time, you've
12979
12:33:34,741 --> 12:33:37,561
got to give something. If\nyou start giving space
12980
12:33:37,561 --> 12:33:40,028
and you start using more\nspace, you can speed up time.
12981
12:33:40,862 --> 12:33:42,122
There's always a price paid.
12982
12:33:42,122 --> 12:33:47,882
And it's very often in space, or time,\n
12983
12:33:47,881 --> 12:33:49,511
the number of bugs you have to solve.
12984
12:33:49,512 --> 12:33:51,542
I mean, all of these\nare finite resources
12985
12:33:51,542 --> 12:33:53,315
that you have to juggle among.
12986
12:33:53,315 --> 12:33:55,982
So if we consider now the code\nwith which we can implement this
12987
12:33:57,601 --> 12:34:00,551
And how might we actually\nuse something like this?
12988
12:34:00,552 --> 12:34:03,002
Well, let's take a look at,\nmaybe, one final program.
12989
12:34:03,002 --> 12:34:07,122
And see here, before we transition\n
12990
12:34:07,122 --> 12:34:11,552
Let me go ahead here and let me just\n
12991
12:34:11,552 --> 12:34:15,692
So let me, in a moment, copy\nover a file called tree.c.
12992
12:34:15,692 --> 12:34:18,550
Which we'll have on\nthe course's website.
12993
12:34:18,550 --> 12:34:20,342
And I'll walk you\nthrough some of the logic
12994
12:34:20,341 --> 12:34:25,271
here that I've written for tree.c.
12995
12:34:25,771 --> 12:34:27,281
So what do we have here first?
12996
12:34:27,281 --> 12:34:31,921
So here is an implementation of\n
12997
12:34:31,921 --> 12:34:36,341
And as before, I've played around and\n
12998
12:34:37,771 --> 12:34:41,611
Here is my definition of a node for a\n
12999
12:34:41,612 --> 12:34:44,491
from what I proposed on\nthe board a moment ago.
13000
12:34:44,491 --> 12:34:47,191
Here are 2 prototypes for\n2 functions, that I'll
13001
12:34:47,192 --> 12:34:49,262
show you in a moment,\nthat allow me to free
13002
12:34:49,262 --> 12:34:52,652
an entire tree, one node at a time.
13003
12:34:52,652 --> 12:34:55,382
And then, also allow me to\nprint the tree in order.
13004
12:34:55,381 --> 12:34:57,781
So even though they're\nnot sorted left to right
13005
12:34:57,781 --> 12:35:00,932
I bet if I'm clever about\nwhat child I print first
13006
12:35:00,932 --> 12:35:04,152
I can reconstruct the idea of\nprinting this tree properly.
13007
12:35:04,152 --> 12:35:06,632
So how might I implement\na binary search tree?
13008
12:35:07,921 --> 12:35:10,502
Here is how I might\nrepresent a tree of size 0.
13009
12:35:10,502 --> 12:35:13,442
It's just a null pointer called tree.
13010
12:35:13,442 --> 12:35:15,542
Here's how I might add\na number to that list.
13011
12:35:15,542 --> 12:35:19,561
So here, for instance, is me\nmallocing space for a node.
13012
12:35:19,561 --> 12:35:21,691
Storing it in a temporary\nvariable called n.
13013
12:35:21,692 --> 12:35:23,552
Here is me just doing a safety check.
13014
12:35:23,552 --> 12:35:25,262
Make sure n does not equal null.
13015
12:35:25,262 --> 12:35:29,612
And then, here is me initializing this\n
13016
12:35:29,612 --> 12:35:32,342
Then, initializing the left\nchild of that node to be null.
13017
12:35:32,341 --> 12:35:34,991
And the right child of\nthat node to be null.
13018
12:35:34,991 --> 12:35:40,152
And then, initializing the tree itself\n
13019
12:35:40,152 --> 12:35:43,322
So at this point in the story, there's\n
13020
12:35:43,322 --> 12:35:46,222
containing the number\n2 with no children.
13021
12:35:46,722 --> 12:35:49,112
Let's just add manually\nto this a little further.
13022
12:35:49,112 --> 12:35:52,262
Let's add another number to the\nlist, by mallocing another node.
13023
12:35:52,262 --> 12:35:55,622
I don't need to declare n as a node*\n
13024
12:35:56,262 --> 12:35:58,202
Here's a little safety check.
13025
12:35:58,201 --> 12:36:02,761
I'm going to not bother with my,\n
13026
12:36:07,285 --> 12:36:09,452
We'd want to free memory too,\nwhich I've not done here
13027
12:36:09,451 --> 12:36:11,131
but I'll save that for another time.
13028
12:36:11,131 --> 12:36:13,471
Here, I'm going to\ninitialize the number to 1.
13029
12:36:13,472 --> 12:36:17,582
I'm going to initialize the children\n
13030
12:36:17,582 --> 12:36:19,292
And now, I'm going to do this.
13031
12:36:19,292 --> 12:36:23,762
Initialize the tree's\nleft child to be n.
13032
12:36:23,762 --> 12:36:26,704
So what that's essentially\ndoing here is if this
13033
12:36:26,703 --> 12:36:29,911
is my root node, the single rectangle\n
13034
12:36:29,911 --> 12:36:32,011
has no children, neither left nor right.
13035
12:36:32,012 --> 12:36:33,961
Here's my new node with the number 1.
13036
12:36:33,961 --> 12:36:36,101
I want it to become the new left child.
13037
12:36:36,101 --> 12:36:39,631
So that line of code on the\nscreen there, tree left equals n
13038
12:36:39,631 --> 12:36:44,201
is like stitching these 2 together\n
13039
12:36:44,701 --> 12:36:47,581
The next lines of code,\nyou can probably guess
13040
12:36:47,582 --> 12:36:50,042
are me adding another\nnumber to the list.
13041
12:36:51,211 --> 12:36:56,682
So this is a simpler tree with\n2, 1, and 3, respectively.
13042
12:36:56,682 --> 12:36:59,192
And this code, let me wave\nmy hands, is almost the same.
13043
12:36:59,192 --> 12:37:02,492
Except for the fact that I'm\nupdating the tree's right child
13044
12:37:02,491 --> 12:37:04,472
to be this new and third node.
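[Editor's sketch, not the actual tree.c] The manual construction just described (malloc a node for 2, then hang 1 on its left and 3 on its right) might look roughly like this. The helper names `new_node`, `build_tree`, and the body of `free_tree` are assumptions. Unlike the lecture's walkthrough, this sketch frees what was already allocated when a later malloc fails.

```c
#include <stdlib.h>

// Hypothetical tree node: a number plus left and right children.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Free an entire tree, children first and then the root itself.
void free_tree(node *root)
{
    if (root == NULL)
    {
        return;
    }
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}

// Allocate a childless node holding `number`; returns NULL if malloc fails.
node *new_node(int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = number;
    n->left = NULL;
    n->right = NULL;
    return n;
}

// Build the three-node tree by hand: 2 at the root, 1 on the left, 3 on the right.
node *build_tree(void)
{
    node *tree = new_node(2);
    if (tree == NULL)
    {
        return NULL;
    }

    node *n = new_node(1);
    if (n == NULL)
    {
        free_tree(tree);
        return NULL;
    }
    tree->left = n; // stitch the new node in as the left child

    n = new_node(3);
    if (n == NULL)
    {
        free_tree(tree);
        return NULL;
    }
    tree->right = n; // stitch the new node in as the right child

    return tree;
}
```

The single pointer returned by `build_tree` is enough to reach, and later free, the whole structure.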
13045
12:37:04,472 --> 12:37:07,862
Let's now run the code before\nlooking at those 2 functions.
13046
12:37:12,991 --> 12:37:16,411
So it sounds like the data structure\n
13047
12:37:16,411 --> 12:37:18,181
But how did I actually print this?
13048
12:37:18,182 --> 12:37:20,072
And then, eventually,\nfree the whole thing?
13049
12:37:20,072 --> 12:37:23,461
Well let's look at the\ndefinition of first print tree.
13050
12:37:23,461 --> 12:37:26,432
And this is where\nthings get interesting.
13051
12:37:26,432 --> 12:37:30,271
Print tree returns nothing\nso it's a void function.
13052
12:37:30,271 --> 12:37:36,002
But it takes a pointer to a root element\n
13053
12:37:37,171 --> 12:37:39,271
If root equals equals\nnull, there's obviously
13054
12:37:39,271 --> 12:37:40,591
nothing to print, just return.
13055
12:37:42,451 --> 12:37:44,491
But here's where things\nget a little magical.
13056
12:37:44,491 --> 12:37:47,761
Otherwise, print your left child.
13057
12:37:50,491 --> 12:37:53,911
Then, print your right child.
13058
12:37:53,911 --> 12:37:59,181
What is this an example of, even\n
13059
12:37:59,182 --> 12:38:00,802
What programming technique here?
13060
12:38:02,398 --> 12:38:05,853
So this is actually perhaps the most\n
13061
12:38:05,853 --> 12:38:08,061
It wasn't really that\ncompelling with the Mario thing
13062
12:38:08,061 --> 12:38:10,191
because we had such an easy\nimplementation with a for loop.
13063
12:38:11,031 --> 12:38:15,652
But here is a perfect application of\n
13064
12:38:17,391 --> 12:38:19,701
If you take any snip\nof any branch, it all
13065
12:38:19,701 --> 12:38:22,072
still looks like a tree,\njust a smaller one.
13066
12:38:22,072 --> 12:38:23,911
That lends itself to recursion.
13067
12:38:23,911 --> 12:38:28,491
So here is this leap of faith where I\n
13068
12:38:28,491 --> 12:38:31,311
tree, if you will, via\nmy child at the left.
13069
12:38:31,311 --> 12:38:34,612
Then, I'll print my own root\nnode here in the middle.
13070
12:38:34,612 --> 12:38:37,222
Then, go ahead and\nprint my right sub tree.
13071
12:38:37,222 --> 12:38:41,662
And because we have this base case that\n
13072
12:38:41,661 --> 12:38:44,448
there's nothing to do, you're\nnot going to recurse infinitely.
13073
12:38:44,449 --> 12:38:47,031
You're not going to call yourself\nagain, and again, and again
13074
12:38:48,692 --> 12:38:52,882
So it works out and prints\nthe 1, the 2, and the 3.
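A minimal sketch of the recursive print function described here might look like this. It assumes a node with fields number, left, and right, and it assumes one number is printed per line; the exact output format on screen is not given in the transcript.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Print the tree in sorted order: left subtree, then root, then right subtree.
void print_tree(node *root)
{
    // Base case: an empty tree has nothing to print.
    if (root == NULL)
    {
        return;
    }
    print_tree(root->left);
    printf("%i\n", root->number);
    print_tree(root->right);
}
```

Swapping the two recursive calls, so the right subtree is printed before the left, would print the same numbers in reverse order.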
13075
12:38:52,881 --> 12:38:54,322
And notice what we could do, too.
13076
12:38:54,322 --> 12:38:57,741
If you wanted to print the tree in\n
13077
12:38:57,741 --> 12:39:00,531
Print your right tree\nfirst, the greater element.
13078
12:39:01,432 --> 12:39:02,811
Then, your smaller sub tree.
13079
12:39:02,811 --> 12:39:05,451
And if I do make tree\nhere and ./tree, well now
13080
12:39:05,451 --> 12:39:07,581
I've reversed the order of the list.
13081
12:39:08,671 --> 12:39:10,421
You can do it with a\nfor-loop in an array.
13082
12:39:10,421 --> 12:39:13,851
But you can also do it, even with\nthis 2-dimensional structure.
13083
12:39:13,851 --> 12:39:17,661
Let's lastly look at\nthis free tree function.
13084
12:39:17,661 --> 12:39:19,641
And this one's almost the same.
13085
12:39:19,641 --> 12:39:22,881
Order doesn't matter in quite the\n
13086
12:39:22,881 --> 12:39:24,502
Here's what I did with free tree.
13087
12:39:24,502 --> 12:39:27,459
Well, if the root of the tree is\n
13088
12:39:28,042 --> 12:39:32,582
Otherwise, go ahead and free your\n
13089
12:39:32,582 --> 12:39:35,572
Then free your right child\nand all of its descendants.
13090
12:39:37,381 --> 12:39:43,171
And again, free literally just\n
13091
12:39:43,171 --> 12:39:45,051
It doesn't free the whole darn thing.
13092
12:39:45,052 --> 12:39:47,332
It just frees literally\nwhat's at that address.
13093
12:39:47,332 --> 12:39:51,382
Why was it important that\nI did line 72 last, though?
13094
12:39:51,381 --> 12:39:53,932
Why did I free the left\nchild and the right child
13095
12:39:53,932 --> 12:39:57,455
before I freed myself, so to speak?
13096
12:39:59,163 --> 12:40:03,621
If you free yourself first, if I had\n
13097
12:40:03,622 --> 12:40:08,302
you're not allowed to touch the left\n
13098
12:40:08,302 --> 12:40:10,832
Because the memory address is\nno longer valid at that point.
13099
12:40:10,832 --> 12:40:12,772
You would get some\nmemory error, perhaps.
13100
12:40:13,792 --> 12:40:15,472
Valgrind definitely wouldn't like it.
13101
12:40:15,472 --> 12:40:17,542
Bad things would otherwise happen.
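The ordering argument above can be sketched as follows: free both subtrees first and the node itself last, so that the left and right pointers are only read while the node is still valid. This is an illustrative version under the same assumed node definition, not a copy of the code on screen.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Free every node in the tree, children before parent.
void free_tree(node *root)
{
    // Base case: nothing to free.
    if (root == NULL)
    {
        return;
    }
    free_tree(root->left);   // left child and all of its descendants
    free_tree(root->right);  // right child and all of its descendants
    free(root);              // only now is it safe to free this node
}
```

Calling free(root) first and then reading root->left would be a use of freed memory, which is the kind of error Valgrind reports.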
13102
12:40:17,542 --> 12:40:19,372
But here, then, is an\nexample of recursion.
13103
12:40:19,372 --> 12:40:23,842
And again, just a recursive use\nof an actual data structure.
13104
12:40:23,841 --> 12:40:26,601
And what's even cooler here\nis, relatively speaking
13105
12:40:26,601 --> 12:40:29,121
suppose we wanted to\nsearch something like this.
13106
12:40:29,122 --> 12:40:33,202
Binary search actually gets pretty\n
13107
12:40:33,891 --> 12:40:38,421
here might be the prototype for a search\n
13108
12:40:38,421 --> 12:40:43,402
You give me the root of a tree, and\n
13109
12:40:43,402 --> 12:40:47,362
and I can pretty easily now return true\n
13110
12:40:47,932 --> 12:40:49,912
Well, let's first ask a question.
13111
12:40:49,911 --> 12:40:52,876
If tree equals equals null,\nthen you just return false.
13112
12:40:52,877 --> 12:40:56,002
Because if there's no tree, there's no\n
13113
12:40:57,341 --> 12:41:04,041
Else if, the number you're looking for\n
13114
12:41:04,042 --> 12:41:06,052
which direction should we go?
13115
12:41:08,671 --> 12:41:11,781
Well, let's just return the\nanswer to this question.
13116
12:41:11,781 --> 12:41:15,921
Search the left sub tree,\nby way of my left child
13117
12:41:15,921 --> 12:41:17,451
looking for the same number.
13118
12:41:17,451 --> 12:41:19,731
And you just assume through\nthe beauty of recursion
13119
12:41:19,732 --> 12:41:22,882
that you're kicking the can\nand let yourself figure it out
13120
12:41:24,082 --> 12:41:26,542
Just that snipped left tree instead.
13121
12:41:26,542 --> 12:41:30,802
Else if, the number you're looking for\n
13122
12:41:30,802 --> 12:41:32,641
go to the right, as you might infer.
13123
12:41:32,641 --> 12:41:35,542
So I can just return the\nanswer to this question.
13124
12:41:35,542 --> 12:41:38,631
Search my right sub tree\nfor that same number.
13125
12:41:38,631 --> 12:41:40,502
And there's a fourth\nand final condition.
13126
12:41:40,502 --> 12:41:43,732
What's the fourth scenario we\nhave to consider, explicitly?
13127
12:41:45,262 --> 12:41:47,304
SPEAKER 1: If the number,\nitself, is right there.
13128
12:41:47,303 --> 12:41:50,961
So else if, the number I'm looking\n
13129
12:41:50,961 --> 12:41:53,731
then and only then,\nshould you return true.
13130
12:41:53,732 --> 12:41:55,972
And if you're thinking\nquickly here, there's
13131
12:41:55,972 --> 12:41:59,632
an optimization possible,\na better design opportunity.
13132
12:41:59,631 --> 12:42:01,131
Think back to even our Scratch days.
13133
12:42:01,131 --> 12:42:03,252
What could we do a little better here?
13134
12:42:06,622 --> 12:42:09,164
Because if there's logically\nonly 4 things that could happen
13135
12:42:09,163 --> 12:42:12,021
you're wasting your time by asking\na fourth gratuitous question.
13136
12:42:13,341 --> 12:42:16,981
So here too, more so than the\nMario example a few weeks ago
13137
12:42:16,982 --> 12:42:19,582
there's just this elegance\narguably to recursion.
13138
12:42:21,442 --> 12:42:25,432
This is the code for binary\nsearch on a binary search tree.
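Here is a sketch of the search function as walked through above, with the fourth condition written as a plain else rather than a fourth explicit comparison. It assumes the tree really is a binary search tree and uses the same assumed node definition with fields number, left, and right.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Return true if number is somewhere in the binary search tree.
bool search(node *tree, int number)
{
    if (tree == NULL)
    {
        // No tree, so the number cannot be in it.
        return false;
    }
    else if (number < tree->number)
    {
        // Smaller values can only be in the left subtree.
        return search(tree->left, number);
    }
    else if (number > tree->number)
    {
        // Larger values can only be in the right subtree.
        return search(tree->right, number);
    }
    else
    {
        // Neither smaller nor larger: this node holds the number.
        return true;
    }
}
```

Each call discards one subtree, which is why a balanced tree gives logarithmic search time.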
13139
12:42:25,432 --> 12:42:27,502
And so, recursion tends\nto work in lockstep
13140
12:42:27,502 --> 12:42:32,182
with these kinds of data structures\n
13141
12:42:34,161 --> 12:42:39,841
Any questions, then, on binary search\n
13142
12:42:40,709 --> 12:42:42,656
AUDIENCE: About like third years.
13143
12:42:48,211 --> 12:42:54,171
So when returning a Boolean value, true\n
13144
12:42:54,171 --> 12:42:57,832
in a library called Standard\nBool, S-T-D-B-O-O-L dot H.
13145
12:42:57,832 --> 12:42:59,961
With a header file that you can use.
13146
12:42:59,961 --> 12:43:06,739
It is the case that true is, it's\n
13147
12:43:06,739 --> 12:43:08,031
But they would map indeed, yes.
13148
12:43:09,442 --> 12:43:11,872
But you should not compare\nthem explicitly to 0 and 1.
13149
12:43:11,872 --> 12:43:14,872
When you're using true and false, you\n
13150
12:43:14,872 --> 12:43:18,857
AUDIENCE: I meant if\nit's in a code return.
13151
12:43:19,732 --> 12:43:23,332
So if I am in my own code from\nearlier, a void function
13152
12:43:23,332 --> 12:43:25,762
it is totally fine to return.
13153
12:43:25,762 --> 12:43:28,432
You just can't return\nsomething explicitly.
13154
12:43:28,432 --> 12:43:30,201
So return just means that's it.
13155
12:43:31,762 --> 12:43:33,632
You're not actually\nhanding back a value.
13156
12:43:33,631 --> 12:43:37,252
So it's a way of short\ncircuiting the execution.
13157
12:43:37,252 --> 12:43:39,531
If you don't like that,\nand some people do frown
13158
12:43:39,531 --> 12:43:44,241
upon having code return from functions\n
13159
12:43:45,531 --> 12:43:49,222
If the root does not equal\nnull, do all of these things.
13160
12:43:49,222 --> 12:43:51,502
And then, indent all three\nof these lines underneath.
13161
12:43:52,972 --> 12:43:54,772
I happen to write it\nthe other way just so
13162
12:43:54,771 --> 12:43:58,471
that there was explicitly a base case\n
13163
12:43:58,472 --> 12:44:01,402
Whereas, now, it's\nimplicitly there for us only.
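To make the two styles concrete, here is a sketch of the same void function written both ways. The function names print_tree_early_return and print_tree_nested are hypothetical labels chosen for this comparison; both are assumed to behave identically.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Style 1: explicit base case. A bare "return;" in a void function
// ends the call without handing back a value.
void print_tree_early_return(node *root)
{
    if (root == NULL)
    {
        return;
    }
    print_tree_early_return(root->left);
    printf("%i\n", root->number);
    print_tree_early_return(root->right);
}

// Style 2: invert the condition and indent the three lines underneath.
// The base case is now implicit: when root is NULL, nothing happens.
void print_tree_nested(node *root)
{
    if (root != NULL)
    {
        print_tree_nested(root->left);
        printf("%i\n", root->number);
        print_tree_nested(root->right);
    }
}
```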
13164
12:44:03,771 --> 12:44:07,441
So let's ask the question as\nbefore about running time of this.
13165
12:44:07,442 --> 12:44:09,412
It would look like\nbinary search is back.
13166
12:44:09,411 --> 12:44:15,081
And we can now do things in logarithmic\n
13167
12:44:15,082 --> 12:44:17,421
Is this a binary search tree?
13168
12:44:19,141 --> 12:44:21,862
And again, a binary\nsearch tree is a tree
13169
12:44:21,862 --> 12:44:28,599
where the root is greater than its left\n
13170
12:44:29,391 --> 12:44:30,862
So you're nodding your head.
13171
12:44:33,502 --> 12:44:35,512
So this is a binary search tree.
13172
12:44:35,512 --> 12:44:37,872
Is this a binary search tree?
13173
12:44:40,341 --> 12:44:43,191
Or I'm hearing just my delay\nchanging the vote it would seem.
13174
12:44:43,192 --> 12:44:45,562
So this is one of those trick questions.
13175
12:44:45,561 --> 12:44:47,961
This is a binary search\ntree because I've not
13176
12:44:47,961 --> 12:44:50,872
violated the definition\nof what I gave you, right.
13177
12:44:50,872 --> 12:44:56,961
Is there any example of a left child\n
13178
12:44:56,961 --> 12:44:59,961
Or is there any example of a right\n
13179
12:44:59,961 --> 12:45:02,379
That's just the opposite way\nof describing the same thing.
13180
12:45:02,379 --> 12:45:04,552
No, this is a binary search tree.
13181
12:45:04,552 --> 12:45:07,692
Unfortunately, it also looks like,\n
13182
12:45:09,381 --> 12:45:11,451
But you could imagine\nthis happening, right.
13183
12:45:11,451 --> 12:45:14,121
Suppose that I hadn't been as\nthoughtful as I was earlier
13184
12:45:14,122 --> 12:45:17,452
by inserting 2, and then 1, and then 3.
13185
12:45:17,451 --> 12:45:19,641
Which nicely balanced everything out.
13186
12:45:19,641 --> 12:45:22,341
Suppose that instead, because\nof what the user is typing in
13187
12:45:22,341 --> 12:45:25,461
or whatever you contrive in your\n
13188
12:45:27,741 --> 12:45:30,331
Like, you've created a\nproblem for yourself.
13189
12:45:30,332 --> 12:45:33,772
Because if we follow the same logic\n
13190
12:45:33,771 --> 12:45:38,511
this is how you might implement\n
13191
12:45:38,512 --> 12:45:42,232
if you just blindly keep\nfollowing that definition.
13192
12:45:42,232 --> 12:45:44,512
I mean, this would be\nbetter designed as what?
13193
12:45:44,512 --> 12:45:46,972
If we rotated the whole thing around.
13194
12:45:48,351 --> 12:45:50,542
And those kinds of trees\nactually have names.
13195
12:45:50,542 --> 12:45:52,881
There are trees called AVL\ntrees in computer science.
13196
12:45:52,881 --> 12:45:54,531
There are red-black\ntrees in computer science.
13197
12:45:54,531 --> 12:45:56,781
There are other types of\ntrees that, additionally
13198
12:45:56,781 --> 12:45:59,991
add some logic that tells you\nwhen you've got to pivot the thing
13199
12:45:59,991 --> 12:46:03,720
and rotate it, and snip off the\n
13200
12:46:03,720 --> 12:46:05,512
But a binary search\ntree, in and of itself
13201
12:46:05,512 --> 12:46:09,152
does not guarantee that it\nwill be balanced, so to speak.
13202
12:46:09,152 --> 12:46:11,722
And so, if you consider\nthe worst case scenario
13203
12:46:11,722 --> 12:46:13,342
of even using a binary search tree.
13204
12:46:13,341 --> 12:46:15,441
If you're not smart about\nthe code you're writing
13205
12:46:15,442 --> 12:46:17,662
and you just blindly\nfollow this definition
13206
12:46:17,661 --> 12:46:21,771
you might accidentally create a\n
13207
12:46:21,771 --> 12:46:24,531
tree that essentially\nlooks like a linked list.
13208
12:46:24,531 --> 12:46:26,991
Because you're not even using\nany of the left children.
13209
12:46:26,991 --> 12:46:30,231
So unfortunately, the literal\nanswer to the question
13210
12:46:30,232 --> 12:46:32,962
here is what's the\nrunning time of search?
13211
12:46:34,881 --> 12:46:37,461
But not if you don't maintain\nthe balance of the tree.
13212
12:46:37,461 --> 12:46:42,771
Both insert and search could actually\n
13213
12:46:44,434 --> 12:46:46,641
If you don't somehow take\ninto account, and we're not
13214
12:46:46,641 --> 12:46:48,201
going to do the code for that here.
13215
12:46:48,201 --> 12:46:51,621
It's a higher level thing you\nmight explore down the road.
13216
12:46:51,622 --> 12:46:55,412
It can devolve into something\nthat you might not have intended.
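The lecture does not show insertion code, so the following is only an illustrative sketch of a naive, non-balancing insert, with a hypothetical height function to measure the result. It is meant to show how the same three numbers produce a balanced tree in one insertion order and a linked-list-shaped tree in another.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Naive insert that follows the binary search tree definition blindly
// and never rebalances. Duplicates are ignored. If malloc fails, the
// value is silently not inserted.
node *insert(node *tree, int number)
{
    if (tree == NULL)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return NULL;
        }
        n->number = number;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (number < tree->number)
    {
        tree->left = insert(tree->left, number);
    }
    else if (number > tree->number)
    {
        tree->right = insert(tree->right, number);
    }
    return tree;
}

// Number of nodes on the longest path from the root down to a leaf.
int height(const node *tree)
{
    if (tree == NULL)
    {
        return 0;
    }
    int l = height(tree->left);
    int r = height(tree->right);
    return 1 + (l > r ? l : r);
}

// Free children before the node itself.
void free_tree(node *tree)
{
    if (tree == NULL)
    {
        return;
    }
    free_tree(tree->left);
    free_tree(tree->right);
    free(tree);
}
```

Inserting 2, 1, 3 gives a tree of height 2, while inserting 1, 2, 3 gives height 3 with no left children at all, so search has to walk every node in the worst case.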
13217
12:46:55,411 --> 12:46:57,503
And so, now that we're\ntalking about 2 dimensions
13218
12:46:57,504 --> 12:46:59,211
the onus\nis really on the programmer
13219
12:46:59,211 --> 12:47:01,972
to consider what kinds of\nperverse situations might happen.
13220
12:47:01,972 --> 12:47:04,342
Where the thing devolves\ninto a structure
13221
12:47:04,341 --> 12:47:07,831
that you don't actually\nwant it to devolve into.
13222
12:47:08,332 --> 12:47:09,842
We've got just a few structures to go.
13223
12:47:09,841 --> 12:47:11,421
Let's go ahead and take one\nmore 5 minute break here.
13224
12:47:11,421 --> 12:47:12,891
When we come back,\nwe'll talk at this level
13225
12:47:12,891 --> 12:47:14,512
about some final applications of this.
13226
12:47:19,341 --> 12:47:22,731
And as promised, we'll operate\nnow at this higher level.
13227
12:47:22,732 --> 12:47:26,002
Where if we take for granted that, even\n
13228
12:47:26,002 --> 12:47:28,794
to play with these techniques yet,\n
13229
12:47:30,262 --> 12:47:33,112
Both in one dimension\nand even 2 dimensions
13230
12:47:33,112 --> 12:47:35,451
to build things like lists and trees.
13231
12:47:35,451 --> 12:47:37,461
So if we have these building blocks.
13232
12:47:37,461 --> 12:47:40,161
Things like now arrays,\nand lists, and trees
13233
12:47:40,161 --> 12:47:44,271
what if we start to amalgamate\n
13234
12:47:44,271 --> 12:47:46,381
of multiple data structures?
13235
12:47:46,381 --> 12:47:49,841
Can we start to get some of the best\n
13236
12:47:49,841 --> 12:47:51,191
something called a hash table.
13237
12:47:51,192 --> 12:47:55,022
So a hash table is a Swiss\narmy knife of data structures
13238
12:47:55,021 --> 12:47:56,791
in that it's so commonly used.
13239
12:47:56,792 --> 12:48:01,482
Because it allows you to associate\nkeys with values, so to speak.
13240
12:48:01,482 --> 12:48:06,542
So, for instance, it allows you to\n
13241
12:48:08,552 --> 12:48:11,402
Or anything where you have\nto take something as input
13242
12:48:11,402 --> 12:48:13,781
and get as output a corresponding\npiece of information.
13243
12:48:13,781 --> 12:48:16,692
A hash table is often a\ndata structure of choice.
13244
12:48:16,692 --> 12:48:17,942
And here's what it looks like.
13245
12:48:17,942 --> 12:48:20,281
It actually looks like\nan array, at first glance.
13246
12:48:20,281 --> 12:48:23,472
But for discussion's sake, I've\ndrawn this array vertically
13247
12:48:26,141 --> 12:48:31,201
But it allows you, a hash table, to\n
13248
12:48:32,222 --> 12:48:35,612
So, for instance, there's actually\n26 locations in this array.
13249
12:48:35,612 --> 12:48:38,582
Because I want to, for\ninstance, store initially
13250
12:48:38,582 --> 12:48:41,461
names of people, for instance.
13251
12:48:41,461 --> 12:48:44,135
And wouldn't it be nice if the\nperson's name starts with A
13252
12:48:44,135 --> 12:48:45,302
I have a go to place for it.
13253
12:48:46,262 --> 12:48:48,345
And if it starts with Z,\nI put them at the bottom.
13254
12:48:48,345 --> 12:48:50,552
So that I can jump\ninstantly, arithmetically
13255
12:48:50,552 --> 12:48:52,952
using a little bit of\nASCII or Unicode fanciness
13256
12:48:52,951 --> 12:48:56,021
exactly to the location that\nthey need to go.
13257
12:48:56,021 --> 12:48:58,171
So, for instance, here's\nour array, 0 indexed.
13258
12:48:59,612 --> 12:49:01,982
If I think of this,\nthough, as A through Z
13259
12:49:01,982 --> 12:49:03,852
I'm going to think of\nthese 26 locations
13260
12:49:03,851 --> 12:49:07,112
now in the context of a hash table,\n
13261
12:49:07,112 --> 12:49:09,491
So buckets into which\nyou can put values.
13262
12:49:09,491 --> 12:49:13,862
So, for instance, suppose that we\n
13263
12:49:15,072 --> 12:49:16,741
And that name is say, Albus.
13264
12:49:16,741 --> 12:49:21,461
So Albus starting with A. Albus might\n
13265
12:49:21,961 --> 12:49:23,669
And then, we want to\ninsert another name.
13266
12:49:23,669 --> 12:49:25,112
This one happens to be Zacharias.
13267
12:49:25,112 --> 12:49:28,171
Starting with Z, so it goes all\nthe way at the end of this data
13268
12:49:28,171 --> 12:49:29,972
structure in location 25 a.k.a.
13269
12:49:30,872 --> 12:49:34,742
And then, maybe a third name like\n
13270
12:49:34,741 --> 12:49:36,792
according to that\nposition in the alphabet.
13271
12:49:36,792 --> 12:49:39,542
So this is great because\nin constant time
13272
12:49:39,542 --> 12:49:43,502
I can insert and conversely\nsearch for any of these names
13273
12:49:43,502 --> 12:49:45,182
based on the first letter of their name.
13274
12:49:45,182 --> 12:49:47,580
A, or Z, or H, in this case.
13275
12:49:47,580 --> 12:49:50,372
Let's fast forward and assume we\n
13276
12:49:50,372 --> 12:49:52,382
might look familiar,\ninto this hash table.
13277
12:49:52,381 --> 12:49:56,591
It's great because every\nname has its own location.
13278
12:49:56,591 --> 12:50:00,961
But if you're thinking of names\n
13279
12:50:00,961 --> 12:50:03,192
we eventually encounter a\nproblem with this, right.
13280
12:50:03,192 --> 12:50:06,961
When could something go wrong\nusing a hash table like this
13281
12:50:06,961 --> 12:50:09,572
if we wanted to insert even more names?
13282
12:50:09,572 --> 12:50:11,771
What's going to eventually happen?
13283
12:50:12,271 --> 12:50:14,479
There's already someone with\nthe first letter, right.
13284
12:50:14,480 --> 12:50:17,342
Like I haven't even mentioned\nHarry, for instance, or Hagrid.
13285
12:50:17,341 --> 12:50:19,231
And yet, Hermione's\nalready using that spot.
13286
12:50:19,232 --> 12:50:21,512
So that invites the\nquestion, well, what happens?
13287
12:50:21,512 --> 12:50:25,082
Maybe, if we want to insert Harry\n
13288
12:50:26,192 --> 12:50:28,805
But then if there's a location\nI, where do we put them?
13289
12:50:28,805 --> 12:50:31,472
And it just feels like the situation\ncould very quickly devolve.
13290
12:50:31,472 --> 12:50:34,412
But I've deliberately\ndrawn this data structure
13291
12:50:34,411 --> 12:50:37,471
that I claim as a hash\ntable, in 2 directions.
13292
12:50:39,601 --> 12:50:42,781
But what might this be hinting\nI'm using horizontally
13293
12:50:42,781 --> 12:50:45,781
even though I'm drawing the rectangles\n
13294
12:50:47,239 --> 12:50:48,572
Maybe another array, to be fair.
13295
12:50:48,572 --> 12:50:51,739
But, honestly, arrays are such a pain\n
13296
12:50:52,292 --> 12:50:56,082
These look like the beginnings\nof a linked list, if you will.
13297
12:50:56,082 --> 12:50:59,671
Where the name is where the number\n
13298
12:50:59,671 --> 12:51:01,682
horizontally now just\nfor discussion's sake.
13299
12:51:01,682 --> 12:51:05,281
And this seems to be a pointer\nthat isn't pointing anywhere yet.
13300
12:51:05,281 --> 12:51:10,561
But it looks like the array is 26\n
13301
12:51:11,402 --> 12:51:14,156
Some of which are pointing at\nthe first node in a linked list.
13302
12:51:14,156 --> 12:51:16,531
So that's really what a hash\ntable might be in your mind.
13303
12:51:16,531 --> 12:51:21,309
An amalgam of an array, whose\nelements are linked lists.
13304
12:51:21,309 --> 12:51:23,851
And in theory, this gives you\nthe best of both worlds, right.
13305
12:51:23,851 --> 12:51:26,911
You get random access with\nhigh probability, right.
13306
12:51:26,911 --> 12:51:30,101
You get to jump immediately to the\n
13307
12:51:30,101 --> 12:51:32,911
But, if you run into this perverse\n
13308
12:51:34,351 --> 12:51:37,832
It starts to devolve into a\nlinked list, but it's at least 26
13309
12:51:39,061 --> 12:51:42,151
Not one massive linked list,\nwhich would be Big O of n.
13310
12:51:43,961 --> 12:51:46,112
So if Harry gets inserted, and Hagrid.
13311
12:51:46,112 --> 12:51:50,262
Yeah, you have to chain them\ntogether, so to speak, in this way.
13312
12:51:50,262 --> 12:51:53,127
But, at least you've not\npainted yourself into a corner.
13313
12:51:53,127 --> 12:51:56,252
And in fact, if we fast forward and\n
13314
12:51:56,252 --> 12:51:58,601
the data structure\nstarts to look like this.
13315
12:51:58,601 --> 12:52:00,941
So the chains are not terribly long.
13316
12:52:00,942 --> 12:52:03,752
And some of them are actually\nof size 0 because there's just
13317
12:52:03,752 --> 12:52:06,631
some unpopular letters of the\nalphabet among these names.
13318
12:52:06,631 --> 12:52:08,581
But it seems better than\njust putting everyone
13319
12:52:08,582 --> 12:52:11,342
in one big array, or\none big linked list.
13320
12:52:11,341 --> 12:52:15,671
We're trying to balance these trade\n
13321
12:52:15,671 --> 12:52:17,891
Well, how might we represent\nsomething like this?
13322
12:52:17,891 --> 12:52:19,622
Here's how we could describe this thing.
13323
12:52:19,622 --> 12:52:22,802
A node in the context of a\nlinked list could be this.
13324
12:52:22,802 --> 12:52:26,342
I have an array called\nword of type char.
13325
12:52:26,341 --> 12:52:30,541
And it's big enough to fit the\n
13326
12:52:30,542 --> 12:52:32,372
And the plus 1 why, probably?
13327
12:52:33,241 --> 12:52:34,211
SPEAKER 1: The null character.
13328
12:52:34,211 --> 12:52:37,322
So I'm assuming that longest word\n
13329
12:52:37,951 --> 12:52:40,216
And it's something big\nlike 40, 100, whatever.
13330
12:52:40,216 --> 12:52:43,291
Whatever the longest word\nin the Harry Potter universe
13331
12:52:43,292 --> 12:52:45,921
is or the English dictionary is.
13332
12:52:45,921 --> 12:52:51,531
Longest word plus 1 should be sufficient\n
13333
12:52:51,531 --> 12:52:53,841
And then, what else does\neach of these nodes have?
13334
12:52:53,841 --> 12:52:57,541
Well it has a pointer to another node.
13335
12:52:57,542 --> 12:52:59,872
So here's how we might\nimplement the notion of a node
13336
12:52:59,872 --> 12:53:04,192
in the context of storing\nnot integers, but names.
13337
12:53:05,841 --> 12:53:08,841
But how do we decide what\nthe hash table itself is?
13338
12:53:08,841 --> 12:53:12,621
Well, if we now have a definition of a\n
13339
12:53:12,622 --> 12:53:14,992
or even globally, called hash table.
13340
12:53:14,991 --> 12:53:20,391
That itself is an array\nof node* pointers.
13341
12:53:20,391 --> 12:53:22,792
That is an array of pointers to nodes.
13342
12:53:22,792 --> 12:53:24,772
The beginnings of linked lists.
13343
12:53:26,432 --> 12:53:28,565
I proposed, verbally, that it be 26.
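Putting those two definitions together, a sketch might look like the following. The value 45 for LONGEST_WORD is an assumption (the lecture only says something big like 40 or 100), and insert_at is a hypothetical helper that chains a new name onto the front of a given bucket's list.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LONGEST_WORD 45  // assumed value; the lecture leaves it open
#define BUCKETS 26       // one bucket per letter, A through Z

// A linked-list node that stores a name instead of an integer.
typedef struct node
{
    char word[LONGEST_WORD + 1];  // + 1 for the terminating null character
    struct node *next;
} node;

// The hash table: an array of pointers to nodes, each one the
// beginning of a linked list. As a global, every entry starts as NULL.
node *hash_table[BUCKETS];

// Hypothetical helper: prepend word to the chain in the given bucket.
// Returns false if the bucket is out of range, the word is too long,
// or malloc fails.
bool insert_at(unsigned int bucket, const char *word)
{
    if (bucket >= BUCKETS || strlen(word) > LONGEST_WORD)
    {
        return false;
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, word);
    n->next = hash_table[bucket];  // new node points at the old first node
    hash_table[bucket] = n;        // bucket now points at the new node
    return true;
}
```

Prepending keeps insertion constant time even when several names collide in the same bucket; the cost shows up later, when searching a long chain.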
13344
12:53:28,565 --> 12:53:30,982
But honestly, if you get a lot\nof collisions, so to speak.
13345
12:53:30,982 --> 12:53:33,105
A lot of H names trying\nto go to the same place.
13346
12:53:33,105 --> 12:53:35,272
Well, maybe, we need to be\nsmarter and not just look
13347
12:53:35,271 --> 12:53:36,688
at the first letter of their name.
13348
12:53:36,688 --> 12:53:38,281
But, maybe, the first and the second.
13349
12:53:38,281 --> 12:53:42,381
So it's H-A and H-E. But wait, no,\n
13350
12:53:42,381 --> 12:53:45,322
But we start to at least make\nthe problem a little less
13351
12:53:45,322 --> 12:53:48,982
impactful by tinkering with\nsomething like the number of buckets
13352
12:53:50,362 --> 12:53:55,042
But how do we decide where someone\n
13353
12:53:55,042 --> 12:53:57,381
Well, it's an old school\nproblem of input and output.
13354
12:53:57,381 --> 12:54:00,741
The input to the problem is going\nto be something like the name.
13355
12:54:00,741 --> 12:54:02,781
And the algorithm in\nthe middle, as of today
13356
12:54:02,781 --> 12:54:05,211
is going to be something\ncalled a hash function.
13357
12:54:05,211 --> 12:54:07,101
A hash function is\ngenerally something that
13358
12:54:07,101 --> 12:54:10,851
takes as input, a string, a\nnumber, whatever, and produces
13359
12:54:10,851 --> 12:54:13,341
as output a location in our context.
13360
12:54:16,972 --> 12:54:19,671
Or whatever the number\nof buckets you want is
13361
12:54:19,671 --> 12:54:23,851
it's going to just tell you where to\n
13362
12:54:23,851 --> 12:54:27,682
So, for instance, Albus, according to\n
13363
12:54:30,052 --> 12:54:32,782
So the hash function, in the\nmiddle of that black box
13364
12:54:32,781 --> 12:54:35,241
is pretty simplistic in this story.
13365
12:54:35,241 --> 12:54:38,841
It's just looking at the ASCII\n
13366
12:54:39,591 --> 12:54:42,631
And then, subtracting\noff what capital A is, 65.
13367
12:54:42,631 --> 12:54:46,951
So like doing some math to get\nback a number between 0 and 25.
13368
12:54:46,951 --> 12:54:50,091
So that's how we got to\nthis point in the story.
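A minimal sketch of that first-letter hash function could look like this. Handling lowercase letters with toupper and sending anything that does not start with a letter to bucket 0 are assumptions added here so the function always returns a valid index; the lecture only describes subtracting capital A.

```c
#include <ctype.h>

// Map a name to a bucket from 0 to 25 based on its first letter.
unsigned int hash(const char *word)
{
    unsigned char first = (unsigned char) word[0];
    if (!isalpha(first))
    {
        // Assumption: empty strings and non-letters go to bucket 0.
        return 0;
    }
    // 'A' is 65 in ASCII, so 'A' - 'A' is 0 and 'Z' - 'A' is 25.
    return (unsigned int) (toupper(first) - 'A');
}
```

With this function, Albus lands in bucket 0, Hermione in bucket 7, and Zacharias in bucket 25.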
13369
12:54:50,091 --> 12:54:54,921
And how might we, then, resolve\nthe problem further and use
13370
12:54:54,921 --> 12:54:56,542
this notion of hashing more generally?
13371
12:54:56,542 --> 12:54:58,417
Well just for demonstration\nsake here, here's
13372
12:54:58,417 --> 12:55:00,772
actually some buckets, literally.
13373
12:55:00,771 --> 12:55:03,861
And we've labeled, in advance,\nthese buckets with the suits
13374
12:55:07,252 --> 12:55:12,082
And we've got diamonds here.
13375
12:55:12,082 --> 12:55:15,592
And we've got, what else here?
13376
12:55:19,372 --> 12:55:22,074
So we have a deck of cards\nhere, for instance, right.
13377
12:55:22,074 --> 12:55:24,531
And this is something you,\nyourself, might do instinctively
13378
12:55:24,531 --> 12:55:26,902
if you're getting ready to\nstart playing a game of cards.
13379
12:55:26,902 --> 12:55:29,069
You're just cleaning up or\nyou want things in order.
13380
12:55:29,069 --> 12:55:31,444
Like, here is literally\na jumbo deck of cards.
13381
12:55:31,444 --> 12:55:33,862
What would be the easiest way\nfor me to sort these things?
13382
12:55:33,862 --> 12:55:36,569
Well we've got a whole bunch of\n
13383
12:55:36,569 --> 12:55:39,112
So I could go through like,\nhere's the 3 of diamonds.
13384
12:55:39,112 --> 12:55:41,362
And I could, here let me\nthrow this up on the screen.
13385
12:55:41,362 --> 12:55:43,052
Just so, if you're far in back.
13386
12:55:47,991 --> 12:55:49,612
I could do this in order here.
13387
12:55:49,612 --> 12:55:52,022
But a lot of us, honestly,\nif given a deck of cards.
13388
12:55:52,021 --> 12:55:54,771
And you just want to clean\nit up and sort it in order
13389
12:55:54,771 --> 12:55:56,101
you might do things like this.
13390
12:55:56,101 --> 12:55:59,512
Well here's my input, 3 of diamonds,\n
13391
12:56:03,122 --> 12:56:06,982
And if you keep going through the cards,\n
13392
12:56:10,552 --> 12:56:12,502
And it's still going\nto take you 52 steps.
13393
12:56:12,502 --> 12:56:15,502
But at the end of it, you\nhave hashed all of the cards
13394
12:56:17,091 --> 12:56:19,971
And now you have problems\nof size 13, which
13395
12:56:19,972 --> 12:56:23,512
is a little more tenable than\ndoing one massive 52 card problem.
13396
12:56:23,512 --> 12:56:25,552
You can now do 4 problems of size 13.
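The card demonstration can be sketched in code as well. Everything named here (suit, card, bucket, hash_cards) is hypothetical, and the order of the suits in the enum is arbitrary; the point is only that one pass over the deck places each card into the pile for its suit.

```c
#include <stddef.h>

#define SUITS 4
#define RANKS 13

// Assumed ordering of suits; any order works as a bucket index.
typedef enum { SPADES, DIAMONDS, CLUBS, HEARTS } suit;

typedef struct
{
    suit s;
    int rank;  // 1 through 13
} card;

// One pile per suit, with room for all 13 ranks.
typedef struct
{
    card cards[RANKS];
    size_t count;
} bucket;

// Hash each card into the bucket for its suit.
// Returns how many cards were placed.
size_t hash_cards(const card *deck, size_t n, bucket buckets[SUITS])
{
    for (int i = 0; i < SUITS; i++)
    {
        buckets[i].count = 0;
    }
    size_t placed = 0;
    for (size_t i = 0; i < n; i++)
    {
        int s = (int) deck[i].s;
        if (s < 0 || s >= SUITS)
        {
            continue;  // skip anything that is not a valid suit
        }
        bucket *b = &buckets[s];
        if (b->count < RANKS)
        {
            b->cards[b->count] = deck[i];
            b->count++;
            placed++;
        }
    }
    return placed;
}
```

For a full deck this still takes 52 steps, but each of the four piles can then be sorted as its own size-13 problem.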
13397
12:56:25,552 --> 12:56:29,272
And so hashing is something that even\n
13398
12:56:29,271 --> 12:56:34,161
Taking as input some card, some name,\n
13399
12:56:34,161 --> 12:56:39,441
A temporary pile in which you\nwant to stage things, so to speak.
13400
12:56:39,442 --> 12:56:41,924
But these collisions are inevitable.
13401
12:56:41,923 --> 12:56:44,631
And honestly, if we kept going\n
13402
12:56:44,631 --> 12:56:47,432
some of these chains would get\nlonger, and longer and longer.
13403
12:56:47,432 --> 12:56:50,811
Which means that instead of\ngetting someone's name quickly
13404
12:56:50,811 --> 12:56:53,659
by searching for them\nor inserting them, it might
13405
12:56:53,660 --> 12:56:55,202
start taking a decent amount of time.
13406
12:56:55,201 --> 12:56:58,252
So what could we do instead to\nresolve situations like this?
13407
12:56:58,252 --> 12:57:01,851
If the problem, fundamentally, is\n
13408
12:57:01,851 --> 12:57:04,868
popular, H, we need\nto take in more input.
13409
12:57:04,868 --> 12:57:07,201
Not just the first letter but\nmaybe the first 2 letters.
13410
12:57:07,201 --> 12:57:10,252
So if we do that, we\ncan go from A through Z
13411
12:57:10,252 --> 12:57:16,682
to something more extreme like maybe\n
13412
12:57:16,682 --> 12:57:20,152
So that now Harry and Hermione\nend up at different locations.
13413
12:57:20,152 --> 12:57:23,072
But, darn it, Hagrid\nstill collides with Harry.
13414
12:57:24,862 --> 12:57:27,031
The chains aren't quite as long.
13415
12:57:27,031 --> 12:57:28,891
But the problem isn't\nfundamentally gone.
13416
12:57:28,891 --> 12:57:32,122
And in this case here, anyone\nknow how many buckets we just
13417
12:57:32,122 --> 12:57:40,312
increased to, if we now look at not just\n
13418
12:57:42,921 --> 12:57:46,461
So the easy answer to\n26 squared are 676.
13419
12:57:46,461 --> 12:57:48,051
So that's a lot more buckets.
13420
12:57:48,052 --> 12:57:50,522
And this is why I only showed\na few of them on the screen.
13421
12:57:51,411 --> 12:57:54,531
And it spreads things out in particular.
13422
12:57:54,531 --> 12:57:56,121
What if we take this one step further?
13423
12:57:56,122 --> 12:58:01,612
Instead of H-A, we do like H-A-A,\n
13424
12:58:01,612 --> 12:58:03,561
Well now, we have an\neven better situation.
13425
12:58:03,561 --> 12:58:05,961
Because Hermione has her own spot.
13426
12:58:09,322 --> 12:58:11,362
But there's a trade off here.
13427
12:58:11,362 --> 12:58:14,722
The upside is now, arithmetically,\nwe can find their locations
13428
12:58:17,512 --> 12:58:21,422
But 3 is constant, no matter how many\n
13429
12:58:21,421 --> 12:58:24,633
But what's the downside here?
13430
12:58:27,771 --> 12:58:33,322
We're now up to 17,576 buckets, which\n
13431
12:58:33,322 --> 12:58:35,222
Computers have a lot\nof memory these days.
13432
12:58:35,222 --> 12:58:38,932
But as you can infer,\nI can't really think
13433
12:58:38,932 --> 12:58:43,641
of someone whose name started with\n
13434
12:58:44,313 --> 12:58:46,521
And if we keep going,\ndefinitely don't know of anyone
13435
12:58:46,521 --> 12:58:49,521
whose name started with\nZ-Z-Z or A-A-A. There's
13436
12:58:49,521 --> 12:58:54,871
a lot of not useful combinations\n
13437
12:58:54,872 --> 12:58:58,522
so that you can do a bit of math\n
13438
12:58:59,773 --> 12:59:01,231
But they're just going to be empty.
13439
12:59:01,232 --> 12:59:04,862
So it's a very sparsely\npopulated array, so to speak.
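To see where 26, 676, and 17,576 come from, and how the first few letters turn into a single array index, here is an illustrative sketch. The names bucket_count and hash_prefix are hypothetical, and treating missing or non-letter characters as 'A' is an assumption made so that every input maps to a valid bucket.

```c
#include <ctype.h>

// Number of buckets when hashing on the first `letters` letters: 26^letters.
// Intended for small values of letters; larger ones would overflow.
unsigned int bucket_count(unsigned int letters)
{
    unsigned int total = 1;
    for (unsigned int i = 0; i < letters; i++)
    {
        total *= 26;
    }
    return total;
}

// Treat the first `letters` letters as digits of a base-26 number,
// so "AA" is 0, "AB" is 1, and so on.
unsigned int hash_prefix(const char *word, unsigned int letters)
{
    unsigned int index = 0;
    int ended = 0;  // becomes 1 once the end of the string is reached
    for (unsigned int i = 0; i < letters; i++)
    {
        unsigned int digit = 0;  // assumption: missing or non-letter counts as 'A'
        if (!ended && word[i] == '\0')
        {
            ended = 1;
        }
        if (!ended && isalpha((unsigned char) word[i]))
        {
            digit = (unsigned int) (toupper((unsigned char) word[i]) - 'A');
        }
        index = index * 26 + digit;
    }
    return index;
}
```

With two letters, Harry and Hagrid both map to bucket 182 while Hermione maps to 186; with three letters, all three names get distinct buckets, at the cost of 17,576 slots that are mostly empty.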
13440
12:59:04,862 --> 12:59:08,122
So what does that really mean\nfor performance, ultimately?
13441
12:59:08,122 --> 12:59:10,882
Well let's consider, again, in\n
13442
12:59:10,881 --> 12:59:14,271
It turns out that a hash\ntable, technically speaking
13443
12:59:14,271 --> 12:59:18,351
is still just going to give us\nBig O of n in the worst case.
13444
12:59:18,951 --> 12:59:21,921
If you have some crazy perverse\n
13445
12:59:21,921 --> 12:59:25,432
has a name that starts with A, or\n
13446
12:59:25,432 --> 12:59:26,722
you just get really unlucky.
13447
12:59:26,722 --> 12:59:28,599
And your chain is massively long.
13448
12:59:28,599 --> 12:59:30,682
Well then, at that point,\nit's just a linked list.
13449
12:59:31,599 --> 12:59:33,862
It's like the perverse\nsituation with the tree, where
13450
12:59:33,862 --> 12:59:39,682
if you insert it without any mind for\n
13451
12:59:39,682 --> 12:59:43,881
But there's a difference here\nbetween a theoretical performance
13452
12:59:45,502 --> 12:59:48,771
If you look back at\nthe hash table here
13453
12:59:48,771 --> 12:59:55,371
this is absolutely, in practice, going\n
13454
12:59:55,372 --> 12:59:58,342
Mathematically, asymptotically,\nbig O notation, sure.
13455
13:00:00,112 --> 13:00:03,982
But if what we're really caring about\n
13456
13:00:03,982 --> 13:00:06,472
there's something to be said\nfor crafting a data structure.
13457
13:00:06,472 --> 13:00:09,052
That technically, if this data\nwere uniformly distributed
13458
13:00:09,052 --> 13:00:12,932
is 26 times faster than\na linked list alone.
13459
13:00:12,932 --> 13:00:18,201
And so, there's this tension too\nbetween systems, types of CS
13460
13:00:19,328 --> 13:00:21,411
Where yeah, theoretically,\nthese are all the same.
13461
13:00:21,411 --> 13:00:24,141
But in practice, for\nmaking real-world software
13462
13:00:24,141 --> 13:00:29,872
improving this speed by a factor of 26\n
13463
13:00:29,872 --> 13:00:31,652
might actually make a big difference.
13464
13:00:31,652 --> 13:00:33,152
But there's going to be a trade off.
13465
13:00:33,152 --> 13:00:37,022
And that's typically some other\n
13466
13:00:37,521 --> 13:00:40,581
How about another data\nstructure we could build?
13467
13:00:40,582 --> 13:00:43,492
Let me fast forward to\nsomething here called a trie.
13468
13:00:43,491 --> 13:00:46,402
So a trie, a weird\nname in pronunciation.
13469
13:00:46,402 --> 13:00:49,432
Short for retrieval,\npronounced trie typically.
13470
13:00:49,432 --> 13:00:55,162
A trie is a tree that actually\ngives us constant time lookup
13471
13:00:59,572 --> 13:01:04,711
In the world of a trie, you\ncreate a tree out of arrays.
13472
13:01:04,711 --> 13:01:07,042
So we're really getting into\nthe Frankenstein territory
13473
13:01:07,042 --> 13:01:09,802
of just building things up with\nspare parts of data structures
13474
13:01:10,982 --> 13:01:13,942
But the root of a trie\nis, itself, an array.
13475
13:01:16,012 --> 13:01:22,281
Where each element in that\ntrie points to another node
13476
13:01:22,281 --> 13:01:23,991
which is to say another array.
13477
13:01:23,991 --> 13:01:26,961
And each of those locations in\nthe array represents a letter
13478
13:01:26,961 --> 13:01:28,402
of the alphabet like A through Z.
13479
13:01:28,402 --> 13:01:32,452
So for instance, if you wanted to store\n
13480
13:01:32,451 --> 13:01:36,531
not in a hash table, not in a linked\n
13481
13:01:36,531 --> 13:01:41,301
What you would do is hash on every\n
13482
13:01:42,122 --> 13:01:45,532
So a trie is like a multi-tier\nhash table, in a sense.
13483
13:01:45,531 --> 13:01:47,252
Where you first look\nat the first letter
13484
13:01:47,252 --> 13:01:49,959
then the second letter, then the\n
13485
13:01:49,959 --> 13:01:53,421
For instance, each of these\nlocations represents a letter A
13486
13:01:53,421 --> 13:01:56,932
through Z. Suppose I wanted to\ninsert someone's name into this
13487
13:01:56,932 --> 13:02:01,012
that starts with the letter\nH, like Hagrid for instance.
13488
13:02:01,012 --> 13:02:03,842
Well, I go to the location\nH. I see it's null
13489
13:02:03,841 --> 13:02:06,921
which means I need to malloc myself\n
13490
13:02:08,451 --> 13:02:12,291
Then, suppose I want to store the\n
13491
13:02:12,292 --> 13:02:14,914
an A. So I go to that\nlocation in the second node.
13492
13:02:14,913 --> 13:02:16,371
And I see, OK, it's currently null.
13493
13:02:17,413 --> 13:02:19,921
So I allocate another node\nusing malloc or the like.
13494
13:02:19,921 --> 13:02:24,171
And now I have H-A-G. And\nI continue this with R-I-D.
13495
13:02:24,171 --> 13:02:27,722
And then, when I get to the\nbottom of this person's name
13496
13:02:27,722 --> 13:02:30,322
I just have to indicate\nhere in color, but probably
13497
13:02:30,322 --> 13:02:31,762
with a Boolean value or something.
13498
13:02:31,762 --> 13:02:35,672
Like a true value that\nsays, a name stops here.
13499
13:02:35,671 --> 13:02:41,222
So that it's clear that the person's\n
13500
13:02:41,222 --> 13:02:45,752
or H-A-G-R-I. It's H-A-G-R-I-D.\nAnd the D is green
13501
13:02:45,752 --> 13:02:49,082
just to indicate there's like some\n
13502
13:02:49,082 --> 13:02:52,781
This is the node in\nwhich the name stops.
13503
13:02:52,781 --> 13:02:57,722
And if I continue this logic, here's\n
13504
13:02:57,722 --> 13:03:00,902
And here's how I might\ninsert someone like Hermione.
13505
13:03:00,902 --> 13:03:05,491
And what's interesting about the\n
13506
13:03:07,411 --> 13:03:10,471
Which starts to get compelling\nbecause you're reusing space.
13507
13:03:10,472 --> 13:03:15,391
You're using the same nodes\nfor names like H-A-G and H-A-R
13508
13:03:15,391 --> 13:03:17,851
because they share H and an A in common.
13509
13:03:17,851 --> 13:03:20,112
And they all share an H in common.
13510
13:03:20,112 --> 13:03:23,822
So you have this data structure\nnow that, itself, is a tree.
13511
13:03:23,822 --> 13:03:27,572
Each node in the tree\nis, itself, an array.
13512
13:03:27,572 --> 13:03:31,171
And we, therefore, might implement\n
13513
13:03:31,171 --> 13:03:36,676
Every node is containing, I'll\ndo it in reverse order, an array.
13514
13:03:36,677 --> 13:03:39,302
I'll call it children because\nthat's what it really represents.
13515
13:03:39,302 --> 13:03:41,612
Up to 26 children for\neach of these nodes.
13516
13:03:42,911 --> 13:03:45,841
So I might have used just\na constant for number 26
13517
13:03:45,841 --> 13:03:47,881
to give myself 26\nletters of the alphabet.
13518
13:03:47,881 --> 13:03:52,112
And each of those arrays\nstores that many node stars.
13519
13:03:52,112 --> 13:03:54,031
That many pointers to another node.
13520
13:03:54,031 --> 13:03:55,502
And here's an example of the bool.
13521
13:03:55,502 --> 13:03:58,232
This is what I represented in\ngreen on the slide a moment ago.
13522
13:03:58,232 --> 13:04:00,062
I also need another piece of data.
13523
13:04:00,061 --> 13:04:03,002
Just a 0 or 1, a true\nor false, that says yes.
13524
13:04:03,002 --> 13:04:08,292
A name stops in this node or it's just\n
13525
13:04:08,292 --> 13:04:12,572
But the upside of this is\nthat the height of this tree
13526
13:04:12,572 --> 13:04:15,572
is only as tall as the\nperson's longest name.
13527
13:04:15,572 --> 13:04:22,411
H-A-G-R-I-D or H-E-R-M-I-O-N-E. And\n
13528
13:04:22,411 --> 13:04:26,221
people are in this data structure,\nthere's 3 at the moment
13529
13:04:26,222 --> 13:04:30,632
if there were 3 million, it would\n
13530
13:04:31,982 --> 13:04:37,232
H-E-R-M-I-O-N-E. So, 8 steps total.
13531
13:04:37,232 --> 13:04:42,062
No matter if there's 2 other people,\n
13532
13:04:42,061 --> 13:04:46,141
Because the path to her name\nis always on the same path.
13533
13:04:46,141 --> 13:04:51,031
And if you assume that there's a\n
13534
13:04:51,902 --> 13:04:53,991
Maybe it's 40, 100, whatever.
13535
13:04:53,991 --> 13:04:55,741
Whatever the longest\nname in the world is.
13536
13:04:56,641 --> 13:04:59,112
Maybe it's 40, 100, but that's constant.
13537
13:04:59,112 --> 13:05:02,322
Which is to say that with a\ntrie, technically speaking
13538
13:05:02,322 --> 13:05:06,961
it is the case that your lookup\n
13539
13:05:09,002 --> 13:05:12,061
It's constant time, because\nunlike every other data structure
13540
13:05:12,061 --> 13:05:16,921
we've looked at, with a trie, the amount\n
13541
13:05:16,921 --> 13:05:20,402
or insert one person is\ncompletely independent of how
13542
13:05:20,402 --> 13:05:24,692
many other pieces of data are\nalready in the data structure.
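The trie described above can be sketched as follows. This is a Python rendering of the idea rather than the lecture's C version (which would use `malloc` and a struct); the names `Node`, `insert`, and `contains` are assumptions, and names are assumed to contain only the letters A through Z. The sample names are illustrative.

```python
ALPHABET_SIZE = 26


class Node:
    """One trie node: an array of 26 child slots plus an end-of-name flag."""

    def __init__(self):
        self.children = [None] * ALPHABET_SIZE  # one slot per letter A-Z
        self.is_name = False  # True if a name stops at this node


def insert(root, name):
    """Walk one node per letter, allocating a node for each empty slot."""
    node = root
    for c in name.upper():
        i = ord(c) - ord("A")
        if node.children[i] is None:
            node.children[i] = Node()
        node = node.children[i]
    node.is_name = True


def contains(root, name):
    """Take at most len(name) steps, however many names are stored."""
    node = root
    for c in name.upper():
        node = node.children[ord(c) - ord("A")]
        if node is None:
            return False
    return node.is_name


root = Node()
for person in ("Hagrid", "Harry", "Hermione"):
    insert(root, person)
```

The loop in `contains` runs once per letter of the name being looked up, so its cost depends on the length of that name and not on how many other names are in the trie. The price is the 26-slot array in every node, most of which stays empty.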
13543
13:05:24,692 --> 13:05:27,452
And this holds true even if one\nname is a prefix of another.
13544
13:05:27,451 --> 13:05:30,854
I don't think there was a Daniel or\n
13545
13:05:31,771 --> 13:05:35,881
But, D-A-N-I-E-L could be one name.
13546
13:05:35,881 --> 13:05:38,470
And, therefore, we have\na true there in green.
13547
13:05:38,470 --> 13:05:40,262
And if there's a longer\nname like Danielle.
13548
13:05:40,262 --> 13:05:42,242
Then, you keep going\nuntil you get to the E.
13549
13:05:42,241 --> 13:05:45,031
So you can still have with\na trie, one name that's
13550
13:05:45,031 --> 13:05:47,141
a substring of another name.
13551
13:05:47,141 --> 13:05:49,862
So it's not as though we've\ncreated a problem there.
13552
13:05:49,862 --> 13:05:51,533
That, too, is still possible.
13553
13:05:51,533 --> 13:05:54,241
But at the end of the day, it only\n
13554
13:05:54,241 --> 13:05:55,891
to find any of these people.
13555
13:05:55,891 --> 13:05:58,802
And again, that's what's\nparticularly compelling.
13556
13:05:58,802 --> 13:06:00,880
That you effectively have\nconstant time lookup.
13557
13:06:01,921 --> 13:06:05,635
We've gone through this whole story\n
13558
13:06:05,635 --> 13:06:07,052
And then, it went up to n squared.
13559
13:06:07,832 --> 13:06:12,912
And now constant time, what's the price\n
13560
13:06:19,021 --> 13:06:21,451
And in fact, tries are not\nactually used that often
13561
13:06:21,451 --> 13:06:24,981
amazing as they might sound\non some CS level here.
13562
13:06:28,216 --> 13:06:30,379
AUDIENCE: Much like a [INAUDIBLE].
13563
13:06:31,171 --> 13:06:33,091
If you're storing all\nof these darn arrays
13564
13:06:33,091 --> 13:06:36,351
it's, again, a sparsely\npopulated data structure.
13565
13:06:37,351 --> 13:06:41,281
Granted there's only 3 names, but most\n
13566
13:06:42,972 --> 13:06:46,022
So this is an incredibly wide\ndata structure, if you will.
13567
13:06:46,021 --> 13:06:48,521
It uses a huge amount of\nmemory to store the names.
13568
13:06:48,521 --> 13:06:50,341
But again, you've got to pick a lane.
13569
13:06:50,341 --> 13:06:53,461
Either you're going to minimize space\n
13570
13:06:53,461 --> 13:06:56,722
It's not really possible to get\ntruly the best of both worlds.
13571
13:06:56,722 --> 13:06:58,772
You have to decide where\nthe inflection point is
13572
13:06:58,771 --> 13:07:01,591
for the device you're writing\n
13573
13:07:02,942 --> 13:07:06,461
And again, taking all of\nthese things into account.
13574
13:07:06,461 --> 13:07:08,881
So lastly, let's do one\nfurther abstraction.
13575
13:07:08,881 --> 13:07:12,391
So even higher level to discuss\nsomething that is generally
13576
13:07:12,391 --> 13:07:14,444
known as abstract data structures.
13577
13:07:14,444 --> 13:07:16,152
It turns out we could\nspend like all day
13578
13:07:16,152 --> 13:07:17,732
all week, talking about\ndifferent things we
13579
13:07:17,732 --> 13:07:19,182
could build with these data structures.
13580
13:07:19,182 --> 13:07:21,139
But for the most part,\nnow that we have arrays.
13581
13:07:21,139 --> 13:07:23,911
Now that we have linked lists\nor their cousins, trees, which
13582
13:07:24,910 --> 13:07:26,702
And beyond that, there's\neven graphs, where
13583
13:07:26,701 --> 13:07:29,888
the arrows can go in multiple\n
13584
13:07:29,889 --> 13:07:32,222
Now that we have this ability\nto stitch things together
13585
13:07:32,222 --> 13:07:34,272
we can solve all different\ntypes of problems.
13586
13:07:34,271 --> 13:07:38,221
So, for instance, a very\ncommon type of data structure
13587
13:07:38,222 --> 13:07:42,211
to use in a program, or even our\n
13588
13:07:42,211 --> 13:07:46,262
A queue being a data structure\nlike a line outside of a store.
13589
13:07:46,262 --> 13:07:48,332
Where it has what's\ncalled a FIFO property.
13590
13:07:49,722 --> 13:07:52,141
Which is great for fairness,\nat least in the human world.
13591
13:07:52,141 --> 13:07:56,281
And if you've ever waited outside\n
13592
13:07:56,281 --> 13:07:58,472
or some other restaurant\nnearby, presumably
13593
13:07:58,472 --> 13:08:01,262
if you're queuing up at\nthe counter, you want
13594
13:08:01,262 --> 13:08:03,752
the store to maintain a FIFO system.
13595
13:08:05,012 --> 13:08:08,641
So that whoever's first in line gets\n
13596
13:08:08,641 --> 13:08:12,192
So a queue is actually a\ncomputer science term, too.
13597
13:08:12,192 --> 13:08:14,942
And even if you're still in the\n
13598
13:08:14,942 --> 13:08:17,192
there are things you might\nhave heard called printer
13599
13:08:17,192 --> 13:08:19,531
queues, which also do things in order.
13600
13:08:19,531 --> 13:08:21,949
The first person to send\ntheir essay to the printer
13601
13:08:21,949 --> 13:08:24,031
should, ideally, be printed\nbefore the last person
13602
13:08:24,031 --> 13:08:26,402
to send their essay to the printer.
13603
13:08:26,402 --> 13:08:28,202
Again, in the interest of fairness.
13604
13:08:28,201 --> 13:08:29,851
But how can you implement a queue?
13605
13:08:29,851 --> 13:08:32,731
Well, you typically have to\nimplement 2 fundamental operations
13606
13:08:34,292 --> 13:08:37,391
So adding something to it and\nremoving something from it.
13607
13:08:37,391 --> 13:08:41,131
And the interesting thing here is\n
13608
13:08:41,131 --> 13:08:44,131
Well in the human world, you would\n
13609
13:08:44,131 --> 13:08:46,771
for humans to line up from left\nto right, or right to left.
13610
13:08:47,815 --> 13:08:50,732
Like a printer queue, if you send a\n
13611
13:08:50,732 --> 13:08:52,832
a whole bunch of essays\nor documents, well, you
13612
13:08:52,832 --> 13:08:54,912
need a chunk of memory like an array.
13613
13:08:55,411 --> 13:08:57,631
Well, if you use an\narray, what's a problem
13614
13:08:57,631 --> 13:09:01,241
that could happen in the world\nof printing, for instance?
13615
13:09:01,241 --> 13:09:04,502
If you use an array to store all of\n
13616
13:09:04,502 --> 13:09:05,660
AUDIENCE: It can be filled.
13617
13:09:05,660 --> 13:09:07,202
SPEAKER 1: It could be filled, right.
13618
13:09:07,201 --> 13:09:10,502
So if the programmer decided, HP or\n
13619
13:09:10,502 --> 13:09:14,162
oh, you can send like a megabyte worth\n
13620
13:09:14,161 --> 13:09:16,211
At some point you might\nget an error message
13621
13:09:16,211 --> 13:09:17,582
which says, sorry out of memory.
13622
13:09:18,476 --> 13:09:20,851
Which is maybe a reasonable\nsolution, but a little annoying.
13623
13:09:20,851 --> 13:09:24,481
Or HP could write code that maybe\ndynamically resizes the array
13624
13:09:25,152 --> 13:09:27,722
But at that point, maybe they\nshould just use a linked list.
13625
13:09:28,652 --> 13:09:32,372
So there, too, you could\nimplement the notion of a queue
13626
13:09:32,372 --> 13:09:33,720
using a linked list instead.
13627
13:09:33,720 --> 13:09:35,762
You're going to spend more\nmemory, but you're not
13628
13:09:35,762 --> 13:09:38,132
going to run out of space in your array.
13629
13:09:38,131 --> 13:09:39,974
Which might be more compelling.
13630
13:09:39,974 --> 13:09:41,641
This happens even in the physical world.
13631
13:09:41,641 --> 13:09:45,122
You go to the store and you start having\n
13632
13:09:45,122 --> 13:09:49,408
And like, for a really busy store,\n
13633
13:09:49,408 --> 13:09:51,991
But in that case, it tends to\nbe more of an array just because
13634
13:09:51,991 --> 13:09:54,447
of the physical notion\nof humans lining up.
13635
13:09:54,447 --> 13:09:56,072
But there's other data structures, too.
13636
13:09:56,072 --> 13:09:59,197
If you've ever gone to the dining hall\n
13637
13:09:59,197 --> 13:10:04,351
tray, you're typically picking up\n
13638
13:10:04,351 --> 13:10:06,211
not the first tray that was cleaned.
13639
13:10:06,722 --> 13:10:10,652
Because these cafeteria trays\nstack up on top of each other.
13640
13:10:10,652 --> 13:10:13,891
And indeed a stack is another\ntype of abstract data structure.
13641
13:10:13,891 --> 13:10:16,351
In the physical world, it's\nliterally something physical
13642
13:10:18,512 --> 13:10:21,422
Which have what we would\ncall a LIFO property.
13643
13:10:22,942 --> 13:10:24,692
So as these things\ncome out of the washer
13644
13:10:24,692 --> 13:10:27,002
they're putting the most\nrecent ones on the top.
13645
13:10:27,002 --> 13:10:30,722
And then you, the human, are probably\n
13646
13:10:30,722 --> 13:10:33,182
Which means in the\nextreme, no one on campus
13647
13:10:33,182 --> 13:10:36,616
might ever use that very first tray.
13648
13:10:36,616 --> 13:10:38,491
Which is probably fine\nin the world of trays
13649
13:10:38,491 --> 13:10:42,451
but would really be bad in the world of\n
13650
13:10:42,451 --> 13:10:44,252
were the property being implemented.
13651
13:10:44,252 --> 13:10:46,322
But here, too, it could be an array.
13652
13:10:47,432 --> 13:10:49,014
And you see this, honestly, every day.
13653
13:10:49,014 --> 13:10:51,241
If you're using Gmail\nand your Gmail inbox.
13654
13:10:51,241 --> 13:10:53,761
That is actually a stack,\nat least by default
13655
13:10:53,762 --> 13:10:57,160
where your newest messages,\nlast in, are the first ones
13656
13:10:58,201 --> 13:11:00,061
That's a LIFO data structure.
13657
13:11:00,061 --> 13:11:02,191
And it means that you see\nyour most recent emails.
13658
13:11:02,192 --> 13:11:04,650
But if you have a busy day,\nyou're getting a lot of emails
13659
13:11:04,650 --> 13:11:05,912
it might not be a good thing.
13660
13:11:05,911 --> 13:11:08,311
Because now you're ignoring\nthe people who wrote you
13661
13:11:08,311 --> 13:11:10,621
way earlier in the day or the week.
13662
13:11:10,622 --> 13:11:13,082
So LIFO and FIFO are\njust properties that you
13663
13:11:13,082 --> 13:11:15,842
can achieve with these very\nspecific types of data structures.
13664
13:11:15,841 --> 13:11:17,591
And the parlance\nin the world of stacks
13665
13:11:17,591 --> 13:11:21,451
is to push something onto a\nstack or pop something out.
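The two properties can be compared side by side in a short sketch. This is an illustration only: the job and tray names are made up, and Python's `collections.deque` and a plain list stand in for whatever array or linked list an implementation might actually use.

```python
from collections import deque

# Queue (FIFO): enqueue at the back, dequeue from the front.
queue = deque()
for job in ("essay1", "essay2", "essay3"):
    queue.append(job)            # enqueue
first_printed = queue.popleft()  # dequeue: the earliest job comes out

# Stack (LIFO): push and pop at the same end.
stack = []
for tray in ("tray1", "tray2", "tray3"):
    stack.append(tray)           # push
first_taken = stack.pop()        # pop: the most recent tray comes out
```

With the queue, the first essay sent is the first one printed. With the stack, the last tray put down is the first one picked up, and the tray at the bottom may never be used.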
13666
13:11:21,451 --> 13:11:23,641
These are here, for instance,\nas an example of why
13667
13:11:23,641 --> 13:11:24,932
you might always wear the same color.
13668
13:11:24,932 --> 13:11:27,192
Well, if you're storing all\nof your clothes in a stack
13669
13:11:27,192 --> 13:11:29,012
you might not ever get\nto the different colored
13670
13:11:29,012 --> 13:11:30,452
clothes at the bottom of the list.
13671
13:11:30,451 --> 13:11:35,371
And in fact, to paint this picture,\n
13672
13:11:35,372 --> 13:11:38,372
Just to paint this here, made\nby a faculty member elsewhere.
13673
13:11:38,372 --> 13:11:41,312
Let's go ahead and dim the lights\nfor just a minute or 2 here.
13674
13:11:41,311 --> 13:11:45,466
So that we can take a look\nat Jack learning some facts.
13675
13:11:46,091 --> 13:11:48,841
SPEAKER 2: Once upon a time,\nthere was a guy named Jack.
13676
13:11:48,841 --> 13:11:52,231
When it came to making friends\nJack did not have the knack.
13677
13:11:52,232 --> 13:11:55,202
So Jack went to talk to the\nmost popular guy he knew.
13678
13:11:55,201 --> 13:11:57,871
He went up to Lou and\nasked, what do I do?
13679
13:11:57,872 --> 13:12:00,332
Lou saw that his friend\nwas really distressed.
13680
13:12:00,332 --> 13:12:03,042
Well, Lou began, just\nlook how you're dressed.
13681
13:12:03,042 --> 13:12:05,612
Don't you have any clothes\nwith a different look?
13682
13:12:08,012 --> 13:12:10,202
Come to my house and\nI'll show them to you.
13683
13:12:10,201 --> 13:12:11,491
So they went off to Jack's.
13684
13:12:11,491 --> 13:12:15,182
And Jack showed Lou the box, where he\n
13685
13:12:16,232 --> 13:12:19,202
Lou said, I see you have\nall your clothes in a pile.
13686
13:12:19,201 --> 13:12:21,781
Why don't you wear some\nothers once in a while?
13687
13:12:21,781 --> 13:12:24,932
Jack said, well, when I\nremove clothes and socks
13688
13:12:24,932 --> 13:12:27,662
I wash them and put\nthem away in the box.
13689
13:12:27,661 --> 13:12:30,151
Then comes the next\nmorning and up I hop.
13690
13:12:30,152 --> 13:12:33,391
I go to the box and get\nmy clothes off the top.
13691
13:12:33,391 --> 13:12:36,002
Lou quickly realized\nthe problem with Jack.
13692
13:12:36,002 --> 13:12:38,972
He kept clothes, CDs,\nand books in a stack.
13693
13:12:38,972 --> 13:12:41,402
When he'd reached for\nsomething to read or to wear
13694
13:12:41,402 --> 13:12:44,012
he chose the top book or underwear.
13695
13:12:44,012 --> 13:12:46,402
Then when he was done he\nwould put it right back.
13696
13:12:46,402 --> 13:12:48,982
Back it would go on top of the stack.
13697
13:12:48,982 --> 13:12:51,352
I know the solution,\nsaid a triumphant Lou.
13698
13:12:51,351 --> 13:12:53,991
You need to learn to\nstart using a queue.
13699
13:12:53,991 --> 13:12:56,781
Lou took Jack's clothes\nand hung them in a closet.
13700
13:12:56,781 --> 13:12:59,601
And when he had emptied\nthe box, he just tossed it.
13701
13:12:59,601 --> 13:13:03,472
Then he said, now Jack, at the end of\n
13702
13:13:04,951 --> 13:13:07,671
Then tomorrow morning when\nyou see the sunshine, get
13703
13:13:07,671 --> 13:13:10,402
your clothes from the right,\nfrom the end of the line.
13704
13:13:10,402 --> 13:13:13,281
Don't you see, said\nLou, it will be so nice.
13705
13:13:13,281 --> 13:13:16,612
You'll wear everything once\nbefore you wear something twice.
13706
13:13:16,612 --> 13:13:19,552
And with everything in queues\nin his closet and shelf
13707
13:13:19,552 --> 13:13:22,162
Jack started to feel\nquite sure of himself.
13708
13:13:22,161 --> 13:13:24,636
All thanks to Lou and\nhis wonderful queue.
13709
13:13:26,701 --> 13:13:29,701
SPEAKER 1: So just to help you realize\n
13710
13:13:33,862 --> 13:13:35,542
If you've ever lined up at this place.
13711
13:13:37,461 --> 13:13:40,281
OK, so sweetgreen, little\nsalad place in the square.
13712
13:13:40,281 --> 13:13:42,171
This is if you order\nonline or in advance
13713
13:13:42,171 --> 13:13:44,713
your food ends up according to\nthe first letter in your name.
13714
13:13:44,713 --> 13:13:46,963
Which actually sounds awfully\nreminiscent of something
13715
13:13:47,781 --> 13:13:50,841
And in fact, no matter whether\n
13716
13:13:50,841 --> 13:13:52,611
did, with an array and linked list.
13717
13:13:52,612 --> 13:13:54,817
Or with 3 shelves like this.
13718
13:13:54,817 --> 13:13:57,802
This is actually an abstract\ndata type called a dictionary.
13719
13:13:57,802 --> 13:14:01,162
And a dictionary, just like in our\n
13720
13:14:01,161 --> 13:14:02,871
Words and their definitions.
13721
13:14:02,872 --> 13:14:07,372
This just has letters of the\nalphabet and salads as their value.
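A dictionary of this kind can be sketched with Python's built-in `dict`. The function name `place_order` and the sample orders are hypothetical and serve only to show keys (first letters) mapping to values (the salads on that shelf).

```python
# Keys are first letters; values are the orders waiting on that shelf.
shelves = {}


def place_order(name, salad):
    """File an order under the first letter of the customer's name."""
    key = name[0].upper()
    shelves.setdefault(key, []).append(salad)


place_order("David", "kale caesar")
place_order("Daniel", "harvest bowl")
place_order("Emma", "garden salad")
```

Unlike a physical shelf, the list under each key here can keep growing, which sidesteps the overflow problem only as long as memory holds out.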
13722
13:14:07,372 --> 13:14:09,742
But here, too, there's\na real world constraint.
13723
13:14:09,741 --> 13:14:13,222
In what kind of scenario does\nthis system at sweetgreen
13724
13:14:13,222 --> 13:14:15,891
devolve into a problem, for instance?
13725
13:14:15,891 --> 13:14:19,582
Because they, too, are using only\nfinite space, finite storage.
13726
13:14:22,012 --> 13:14:23,391
If they run out of space\non the shelf and there's
13727
13:14:23,391 --> 13:14:25,862
a lot of people whose names\nstart with D, or E, or whatever.
13728
13:14:26,781 --> 13:14:29,362
And then, maybe, they kind of\noverflow into the E's or the F's.
13729
13:14:29,362 --> 13:14:31,281
And they probably don't\nreally care because any human
13730
13:14:31,281 --> 13:14:33,771
is going to come by, and just\n
13731
13:14:33,771 --> 13:14:36,261
But in the world of a\ncomputer, you're the one coding
13732
13:14:36,262 --> 13:14:38,152
and have to be ever so precise.
13733
13:14:38,152 --> 13:14:41,722
We thought we would lastly\ndo one final thing here.
13734
13:14:41,722 --> 13:14:45,527
In advance, we prepared a linked\nlist of sorts in the audience.
13735
13:14:45,527 --> 13:14:47,152
Since this has become a bit of a thing.
13736
13:14:47,152 --> 13:14:50,012
I am starting to represent the\nbeginning of this linked list.
13737
13:14:50,012 --> 13:14:54,592
And so far as I have a pointer\nhere with seat location G9.
13738
13:14:54,591 --> 13:14:57,981
Whoever is in G9, would\nyou mind standing up?
13739
13:14:57,982 --> 13:15:00,652
And what letter is on your sheet there?
13740
13:15:01,582 --> 13:15:04,132
SPEAKER 1: OK, so you\nhave S15 and your letter--
13741
13:15:07,161 --> 13:15:09,471
So I see you're holding\na C in your node.
13742
13:15:09,472 --> 13:15:12,982
You are pointing to, if\nyou could physically, F15.
13743
13:15:15,262 --> 13:15:17,872
SPEAKER 1: You have an S. And\nwho should you be pointing at?
13744
13:15:26,302 --> 13:15:30,502
F12, if you'd like to stand up holding\n
13745
13:16:54,512 --> 13:16:58,322
DAVID J. MALAN: All right, this is\n
13746
13:16:58,322 --> 13:17:00,980
And this is the week in which\nyou learn yet another language.
13747
13:17:00,980 --> 13:17:03,272
But the goal is not just to\nteach you another language
13748
13:17:03,271 --> 13:17:06,002
for language's sake,\nas we transition today
13749
13:17:06,002 --> 13:17:09,302
and in the coming weeks from C, where\n
13750
13:17:09,961 --> 13:17:14,051
The goal ultimately is to teach you all\n
13751
13:17:14,052 --> 13:17:16,542
so that by the end of this\ncourse, it's not in your mind
13752
13:17:16,542 --> 13:17:19,232
the fact that you learned\nhow to program in C
13753
13:17:19,232 --> 13:17:21,482
or learned some weeks back\nhow to program in Scratch
13754
13:17:21,482 --> 13:17:24,692
but really how you learned\nhow to program fundamentally
13755
13:17:24,692 --> 13:17:27,152
in a paradigm known as\nprocedural programming
13756
13:17:27,152 --> 13:17:29,972
as well as with some taste\ntoday, and in the weeks to come
13757
13:17:29,972 --> 13:17:31,832
of other aspects of\nprogramming languages
13758
13:17:31,832 --> 13:17:34,531
like object-oriented\nprogramming, and more.
13759
13:17:34,531 --> 13:17:36,701
So recall, though, back\nin week zero, Hello, world
13760
13:17:36,701 --> 13:17:38,201
looked a little something like this.
13761
13:17:38,201 --> 13:17:39,908
And the world was quite simple.
13762
13:17:39,908 --> 13:17:42,241
All you had to do was drag\nand drop these puzzle pieces.
13763
13:17:42,241 --> 13:17:45,481
But there were still functions and\n
13764
13:17:45,482 --> 13:17:47,552
and all of those kinds of primitives.
13765
13:17:47,552 --> 13:17:50,822
We then transitioned, of course,\n
13766
13:17:50,822 --> 13:17:52,362
looked a little something like this.
13767
13:17:52,362 --> 13:17:54,319
And even now, some weeks\nlater, you might still
13768
13:17:54,319 --> 13:17:56,991
be struggling with some of the\nsyntax or getting annoying bugs
13769
13:17:56,991 --> 13:17:59,491
when you try to compile your\ncode, and it just doesn't work.
13770
13:17:59,491 --> 13:18:01,322
But there, too, the\npast few weeks, we've
13771
13:18:01,322 --> 13:18:04,652
been focusing on functions and loops\n
13772
13:18:06,072 --> 13:18:10,232
And so what we begin to do today\n
13773
13:18:10,232 --> 13:18:15,362
we're using, transitioning from C now\n
13774
13:18:15,362 --> 13:18:18,722
program in Python, and look\nat its relative simplicity
13775
13:18:18,722 --> 13:18:20,461
but also transitioning\nto look at how you
13776
13:18:20,461 --> 13:18:22,322
can implement these\nsame kinds of features
13777
13:18:22,322 --> 13:18:23,951
just using a different language.
13778
13:18:23,951 --> 13:18:25,771
So we're going to see\na lot of code today.
13779
13:18:25,771 --> 13:18:29,671
And you won't have nearly as much\n
13780
13:18:29,671 --> 13:18:32,731
But that's because so many of the\n
13781
13:18:32,732 --> 13:18:35,102
And, really, it's going to be a\n
13782
13:18:35,934 --> 13:18:38,281
I know how to do it in C.\nHow do I do this in Python?
13783
13:18:38,281 --> 13:18:39,512
How do I do the same with conditionals?
13784
13:18:39,512 --> 13:18:41,232
How do I declare\nvariables, and the like
13785
13:18:41,232 --> 13:18:43,982
and moving forward, not just in\nCS50, but in life in general
13786
13:18:43,982 --> 13:18:47,282
if you continue programming and learn\n
13787
13:18:47,281 --> 13:18:50,792
if in 5-10 years, there's a new, more\n
13788
13:18:50,792 --> 13:18:53,042
it's just going to be a\nmatter of googling and looking
13789
13:18:53,042 --> 13:18:54,932
at websites like Stack\nOverflow and the like
13790
13:18:54,932 --> 13:18:57,872
to look at just basic building\nblocks of programming languages
13791
13:18:57,872 --> 13:19:01,202
because you already speak,\nafter these past 6 plus weeks
13792
13:19:01,201 --> 13:19:04,021
you already speak programming\nitself fundamentally.
13793
13:19:04,021 --> 13:19:07,591
All right, so let's do a few quick\n
13794
13:19:07,591 --> 13:19:09,481
something might have\nlooked like in Scratch
13795
13:19:09,482 --> 13:19:11,342
and what it then looked\nlike in C, but now
13796
13:19:11,341 --> 13:19:13,291
as of today, what it's going\nto look like in Python.
13797
13:19:13,292 --> 13:19:15,375
Then we'll turn our attention\nto the command line
13798
13:19:15,375 --> 13:19:19,031
ultimately, in order to\nimplement some actual programs.
13799
13:19:19,031 --> 13:19:22,262
So in Scratch, we had\nfunctions like this, say Hello
13800
13:19:23,792 --> 13:19:26,262
In C it looked a little\nsomething like this
13801
13:19:26,262 --> 13:19:29,672
and a bit of a cryptic mess the\nfirst week, you had the printf
13802
13:19:30,811 --> 13:19:32,502
You had the semicolon, the parentheses.
13803
13:19:32,502 --> 13:19:34,944
So there's a lot more syntax\njust to do the same thing.
13804
13:19:34,944 --> 13:19:37,862
We're not going to get rid of all\n
13805
13:19:37,862 --> 13:19:42,101
in Python, that same statement is going\n
13806
13:19:42,101 --> 13:19:44,161
And just to perhaps call\nout the obvious, what
13807
13:19:44,161 --> 13:19:48,572
is different or, now, simpler\nin Python versus C, even
13808
13:19:48,572 --> 13:19:50,161
in this simple example here?
13809
13:19:51,067 --> 13:19:53,942
AUDIENCE: Now print, instead of\n
13810
13:19:53,942 --> 13:19:56,359
DAVID J. MALAN: Good, so it's\nnow print instead of printf.
13811
13:19:56,358 --> 13:19:57,631
And there's also no semicolon.
13812
13:19:57,631 --> 13:19:59,625
And there's one other\nsubtlety, over here.
13813
13:20:00,542 --> 13:20:02,162
DAVID J. MALAN: Yeah,\nso no new line, and that
13814
13:20:02,161 --> 13:20:03,631
doesn't mean it's not\ngoing to be printed.
13815
13:20:03,631 --> 13:20:05,923
It just turns out that one\nof the differences we'll see
13816
13:20:05,923 --> 13:20:08,161
is that, with print, you\nget the new line for free.
13817
13:20:08,161 --> 13:20:11,471
It automatically gets outputted by\n
13818
13:20:11,472 --> 13:20:13,711
But you can override it,\nwe'll see, ultimately, too.
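A small sketch of the newline behavior just described. The `io.StringIO` buffer is there only to capture the output for inspection; it is not part of the lecture's example.

```python
import io

buffer = io.StringIO()

# print adds a trailing newline by default...
print("hello, world", file=buffer)

# ...and the end parameter overrides it (here: no newline at all).
print("hello, world", end="", file=buffer)

output = buffer.getvalue()
```

Dropping `file=buffer` sends the same text to the terminal, where the first call ends its line and the second leaves the cursor on the same line.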
13819
13:20:14,822 --> 13:20:18,603
We had multiple functions like\n
13820
13:20:18,603 --> 13:20:20,311
on the screen, but\nalso asked a question
13821
13:20:20,311 --> 13:20:23,822
thereby being another function that\n
13822
13:20:23,822 --> 13:20:26,252
In C we saw code that\nlooked a little something
13823
13:20:26,252 --> 13:20:29,942
like this, whereby that first line\n
13824
13:20:29,942 --> 13:20:32,312
sets it equal to the\nreturn value of get_string
13825
13:20:32,311 --> 13:20:34,261
one of the functions\nfrom the CS50 library
13826
13:20:34,262 --> 13:20:37,502
and then the same double quotes\nand parentheses and semicolon.
13827
13:20:37,502 --> 13:20:41,912
Then we had this format code\nin C that allowed us, with %s
13828
13:20:41,911 --> 13:20:44,281
to actually print out that same value.
13829
13:20:44,281 --> 13:20:46,921
In Python, this, too, is going\nto look a little bit simpler.
13830
13:20:46,921 --> 13:20:49,981
Instead, we're going to have\nanswer equals get_string
13831
13:20:49,982 --> 13:20:52,592
quote unquote "What's your\nname," and then print
13832
13:20:52,591 --> 13:20:55,391
with a plus sign and a\nlittle bit of new syntax.
13833
13:20:55,391 --> 13:20:58,171
But let's see if we can't just\ninfer from this example what
13834
13:20:59,381 --> 13:21:02,191
Well, first missing on the left is what?
13835
13:21:02,192 --> 13:21:05,141
To the left of the equal sign,\nthere's no what this time?
13836
13:21:05,141 --> 13:21:06,391
Feel free to just call it out.
13837
13:21:07,211 --> 13:21:07,981
DAVID J. MALAN: So there's no type.
13838
13:21:07,982 --> 13:21:10,292
There's no type, like\nthe word string, which
13839
13:21:10,292 --> 13:21:14,612
even though that was a type in\nCS50, every other variable in C
13840
13:21:14,612 --> 13:21:17,959
did we use int or string or\nfloat, or bool or something else.
13841
13:21:17,959 --> 13:21:20,042
In Python, there are still\ngoing to be data types
13842
13:21:20,042 --> 13:21:22,502
today onward, but you,\nthe programmer, don't
13843
13:21:22,502 --> 13:21:25,563
have to bother telling the\ncomputer what types you're using.
13844
13:21:25,563 --> 13:21:27,271
The computer is going\nto be smart enough
13845
13:21:27,271 --> 13:21:29,761
the language, really, is going to be\n
13846
13:21:30,781 --> 13:21:32,671
Meanwhile, on the right\nhand side, get_string
13847
13:21:32,671 --> 13:21:34,379
is going to be a\nfunction we'll use today
13848
13:21:34,379 --> 13:21:37,841
and this week, which comes from a\n
13849
13:21:37,841 --> 13:21:40,891
But we'll also start to take off\n
13850
13:21:40,891 --> 13:21:44,192
see how to do things without\nany CS50 library moving forward
13851
13:21:44,192 --> 13:21:45,812
using a different function instead.
13852
13:21:45,811 --> 13:21:49,441
As before, no semicolon, but the rest\n
13853
13:21:49,951 --> 13:21:52,534
This starts, of course, to get\na little bit different, though.
13854
13:21:52,535 --> 13:21:54,172
We're using print instead of printf.
13855
13:21:54,171 --> 13:21:57,381
But now, even though this\nlooks a little cryptic
13856
13:21:57,381 --> 13:21:59,631
perhaps, if you've never\nprogrammed before CS50
13857
13:21:59,631 --> 13:22:03,651
what might that plus be doing,\njust based on inference here.
13858
13:22:04,402 --> 13:22:08,241
AUDIENCE: Adding answer\nto the string Hello.
13859
13:22:08,241 --> 13:22:11,511
DAVID J. MALAN: Yeah, so adding\nanswer to the string Hello
13860
13:22:11,512 --> 13:22:13,552
and adding, so to speak,\nnot mathematically
13861
13:22:13,552 --> 13:22:16,102
but in the form of joining\nthem together, much like we
13862
13:22:16,101 --> 13:22:19,561
saw the join block in Scratch, or\n
13863
13:22:20,061 --> 13:22:23,331
This plus sign appends,\nif you will, whatever's
13864
13:22:23,332 --> 13:22:25,147
in answer to whatever is quoted here.
13865
13:22:25,146 --> 13:22:27,771
And I deliberately left a space\nthere, so that grammatically it
13866
13:22:27,771 --> 13:22:29,943
looks nice, after the comma as well.
13867
13:22:29,944 --> 13:22:31,402
Now there's another way to do this.
13868
13:22:31,402 --> 13:22:33,652
And it, too, is going to\nlook cryptic at first glance.
13869
13:22:33,652 --> 13:22:36,031
But it just gets easier and\nmore convenient over time.
13870
13:22:36,031 --> 13:22:41,101
You can also change this second\nline to be this, instead.
13871
13:22:42,292 --> 13:22:45,232
This is actually a relatively new\n
13872
13:22:45,232 --> 13:22:47,542
of years, where now what\nyou're seeing is, yes
13873
13:22:47,542 --> 13:22:50,101
a string, between these\nsame double quotes
13874
13:22:50,101 --> 13:22:53,597
but this is what Python would\ncall a format string, or f-string.
13875
13:22:53,597 --> 13:22:56,722
And it literally starts with the letter\n
13876
13:22:57,502 --> 13:23:01,222
But that just indicates\nthat Python should
13877
13:23:01,222 --> 13:23:05,632
assume that anything inside of\ncurly braces inside of the string
13878
13:23:05,631 --> 13:23:09,081
should be interpolated, so to\n
13879
13:23:09,082 --> 13:23:12,682
substitute the value of\nany variables therein.
13880
13:23:12,682 --> 13:23:14,552
And it can do some other things as well.
13881
13:23:14,552 --> 13:23:18,562
So answer is a variable, declared,\n
13882
13:23:18,561 --> 13:23:22,822
This f-string, then, says to Python,\n
13883
13:23:24,472 --> 13:23:28,912
If, by contrast, you\nomitted the curly braces
13884
13:23:28,911 --> 13:23:30,561
just take a guess, what would happen?
13885
13:23:30,561 --> 13:23:33,441
What would the symptom of that\nbug be, if you accidentally
13886
13:23:33,442 --> 13:23:36,531
forgot the curly braces, but\nmaybe still had the f there?
13887
13:23:36,531 --> 13:23:38,271
AUDIENCE: It would print below it, too.
13888
13:23:38,271 --> 13:23:40,822
DAVID J. MALAN: Yeah, it would literally\n
13889
13:23:41,722 --> 13:23:44,211
So the curly braces just kind\nof allow you to plug things in.
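[Editor's sketch: concatenation with plus next to an f-string, including the forgotten-braces bug just described. The name "David" is a hypothetical stand-in for whatever get_string would return.]

```python
answer = "David"  # hypothetical stand-in for get_string("What's your name? ")

joined = "hello, " + answer      # concatenation with +
formatted = f"hello, {answer}"   # f-string: {answer} is replaced by the variable's value
literal = f"hello, answer"       # braces omitted: the word "answer" is kept as-is

print(joined)
print(formatted)
print(literal)
```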
13890
13:23:44,211 --> 13:23:45,872
And, again, it looks\na little more cryptic
13891
13:23:45,872 --> 13:23:47,789
but it's just going to\nsave us time over time.
13892
13:23:47,788 --> 13:23:50,641
And if any of you programmed in\n
13893
13:23:50,641 --> 13:23:53,152
you saw plus in that context,\ntoo, for concatenation.
13894
13:23:53,152 --> 13:23:56,277
This just kind of makes your code a\n
13895
13:23:56,277 --> 13:23:58,252
So it's a convenient\nfeature now in Python.
13896
13:23:58,252 --> 13:24:00,711
All right, this was an example\nin Scratch of a variable
13897
13:24:00,711 --> 13:24:03,262
setting a variable like\ncounter equal to 0.
13898
13:24:03,262 --> 13:24:06,982
In C it looked like this, where\nyou specify the type, the name
13899
13:24:06,982 --> 13:24:08,752
and then the value, with a semicolon.
13900
13:24:08,752 --> 13:24:11,618
In Python, it's going to look like this.
13901
13:24:11,618 --> 13:24:12,951
And I'll state the obvious here.
13902
13:24:12,951 --> 13:24:15,862
You don't need to mention the\n
13903
13:24:15,862 --> 13:24:17,552
And you don't need a semicolon.
13904
13:24:18,652 --> 13:24:21,527
If you want a variable, just write\n
13905
13:24:21,527 --> 13:24:24,592
But the single equal sign\nstill behaves the same as in C.
13906
13:24:24,591 --> 13:24:26,961
Suppose we wanted to\nincrement counter by one.
13907
13:24:26,961 --> 13:24:29,271
In Scratch, we use\nthis puzzle piece here.
13908
13:24:29,271 --> 13:24:31,771
In C, we could do this, actually,\nin a few different ways.
13909
13:24:31,771 --> 13:24:33,921
There was this way, if\ncounter already exists
13910
13:24:33,921 --> 13:24:36,502
you just say counter\nequals counter plus 1.
13911
13:24:36,502 --> 13:24:41,362
There was the slightly less verbose\n
13912
13:24:41,362 --> 13:24:42,921
Let me do the first sentence first.
13913
13:24:42,921 --> 13:24:45,211
In Python, that same\nthing, as you might guess
13914
13:24:45,211 --> 13:24:48,682
is actually going to be almost the\n
13915
13:24:48,682 --> 13:24:51,891
And the mathematics are ultimately\n
13916
13:24:51,891 --> 13:24:53,811
via the assignment operator.
13917
13:24:53,811 --> 13:24:56,091
Now, recall, in C, that\nwe had this shorthand
13918
13:24:56,091 --> 13:24:58,521
notation, which did the same thing.
13919
13:24:58,521 --> 13:25:03,502
In Python, you can similarly do the same\n
13920
13:25:03,502 --> 13:25:05,811
The only step backwards\nwe're taking, if you
13921
13:25:05,811 --> 13:25:10,311
were a big fan of counter plus\n
13922
13:25:12,021 --> 13:25:16,731
You have to do the plus equals 1\nor minus equals 1
13923
13:25:16,732 --> 13:25:20,242
to achieve that same result. All\nright, how about in Python, too?
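[Editor's sketch: the increment forms described above, written out. Only the variable name counter comes from the lecture.]

```python
counter = 0            # no type and no semicolon

counter = counter + 1  # the verbose form, same as in C
counter += 1           # the shorthand form, also the same as in C
counter -= 1           # the decrement shorthand

# counter++ is not valid Python; writing it on its own line is a syntax error.
print(counter)
```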
13924
13:25:20,241 --> 13:25:22,881
Here in Scratch, recall,\nwas a conditional
13925
13:25:22,881 --> 13:25:26,511
asking a silly question like is x less\n
13926
13:25:26,512 --> 13:25:30,502
In C, that looked a little\nsomething like this, printf and if
13927
13:25:30,502 --> 13:25:33,832
with the parentheses, the curly\n
13928
13:25:33,832 --> 13:25:37,132
In Python, this is going to get a\n
13929
13:25:39,841 --> 13:25:42,981
And if someone wants to call out\n
13930
13:25:42,982 --> 13:25:46,887
what has been simplified now in Python\n
13931
13:25:46,887 --> 13:25:48,262
Yeah, what's missing, or changed?
13932
13:25:48,872 --> 13:25:49,927
DAVID J. MALAN: So no curly braces.
13933
13:25:51,891 --> 13:25:53,031
AUDIENCE: Using the colon instead.
13934
13:25:53,031 --> 13:25:55,114
DAVID J. MALAN: And we're\nusing the colon instead.
13935
13:25:55,114 --> 13:25:57,141
So I got rid of the\ncurly braces in Python.
13936
13:25:57,141 --> 13:25:58,714
But I'm using a colon instead.
13937
13:25:58,714 --> 13:26:00,631
And even though this is\na single line of code
13938
13:26:00,631 --> 13:26:04,971
so long as you indent subsequent\nlines along with the print
13939
13:26:04,972 --> 13:26:09,351
that's going to imply that everything,\n
13940
13:26:09,351 --> 13:26:13,491
should be executed below it, until you\n
13941
13:26:13,491 --> 13:26:14,991
a different line of code altogether.
13942
13:26:14,991 --> 13:26:17,521
So indentation in Python is important.
13943
13:26:17,521 --> 13:26:21,621
So this is among the reasons\nwe've emphasized axes like style
13944
13:26:21,622 --> 13:26:23,362
just how well styled your code is.
13945
13:26:23,362 --> 13:26:25,881
And honestly, we've seen,\ncertainly, in office hours
13946
13:26:25,881 --> 13:26:28,521
and you've seen in your own code,\nsort of a tendency sometimes
13947
13:26:28,521 --> 13:26:31,551
to be a little lax when it\ncomes to indentation, right?
13948
13:26:31,552 --> 13:26:34,192
If you're one of those folks\nwho likes to indent everything
13949
13:26:34,192 --> 13:26:37,732
on the left hand side of the window,\n
13950
13:26:37,732 --> 13:26:41,392
But it's not particularly\nreadable by you or anyone else.
13951
13:26:41,391 --> 13:26:45,112
Python actually addresses this\nby just requiring indentation
13952
13:26:46,311 --> 13:26:50,572
So Python is going to force you to start\n
13953
13:26:50,572 --> 13:26:53,201
perhaps, a tendency otherwise.
13954
13:26:54,141 --> 13:26:55,572
Well, we have no semicolon here.
13955
13:26:55,572 --> 13:26:57,671
Of course, it's print instead of printf.
13956
13:26:57,671 --> 13:27:00,341
But otherwise, those seem to\nbe the primary differences.
13957
13:27:00,341 --> 13:27:02,201
What about something larger in Scratch?
13958
13:27:02,201 --> 13:27:05,333
With an if-else block, like\nthis, you can perhaps
13959
13:27:05,334 --> 13:27:06,792
guess what it's going to look like.
13960
13:27:06,792 --> 13:27:10,061
In C it looks like this, curly\nbraces semicolons, and so forth.
13961
13:27:10,061 --> 13:27:14,051
In Python, it's going to now\nlook like this, almost the same
13962
13:27:14,052 --> 13:27:15,342
but indentation is important.
13963
13:27:16,482 --> 13:27:19,332
And there's one other difference\nthat's now again visible here
13964
13:27:19,332 --> 13:27:21,192
but we didn't call it out a second ago.
13965
13:27:21,192 --> 13:27:24,281
What else is different in Python\n
13966
13:27:24,993 --> 13:27:27,641
AUDIENCE: You don't have any\nparentheses around the condition.
13967
13:27:28,222 --> 13:27:30,612
We don't have any parentheses\naround the condition
13968
13:27:30,612 --> 13:27:32,232
the Boolean expression itself.
13969
13:27:33,088 --> 13:27:34,421
Well, it's just simpler to type.
13970
13:27:35,472 --> 13:27:36,972
You can still use parentheses.
13971
13:27:36,972 --> 13:27:39,072
And, in fact, you might\nwant to or need to
13972
13:27:39,072 --> 13:27:43,991
if you want to combine thoughts and\n
13973
13:27:43,991 --> 13:27:47,441
But by default, you no longer need\n
13974
13:27:48,671 --> 13:27:50,961
Lastly, with conditionals,\nwe had something like this
13975
13:27:50,961 --> 13:27:53,292
an if else if else statement.
13976
13:27:53,292 --> 13:27:55,362
In C, it looked a little\nsomething like this.
13977
13:27:55,362 --> 13:27:57,402
In Python, it's going to\nget really tighter now.
13978
13:27:57,402 --> 13:28:02,351
It's just if, and this is the\ncuriosity, elif x greater than y.
13979
13:28:02,351 --> 13:28:07,631
So it's not else if, it's literally\n
13980
13:28:07,631 --> 13:28:09,836
remain now on each of the three lines.
13981
13:28:09,836 --> 13:28:11,211
But the indentation is important.
13982
13:28:11,211 --> 13:28:13,002
And if we did want to\ndo multiple things
13983
13:28:13,002 --> 13:28:16,760
we could just indent below each\nof these conditionals, as well.
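[Editor's sketch: the if, elif, else shape just described. Wrapping it in a function named compare is an assumption made here so the three branches can be exercised; the lecture's version prints directly.]

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    if x < y:                 # no parentheses needed, colon instead of curly braces
        return "x is less than y"
    elif x > y:               # elif, not else if
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```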
13984
13:28:16,760 --> 13:28:18,552
All right, let me pause\nthere first, to see
13985
13:28:18,552 --> 13:28:21,012
if there's any questions on\nthese syntactic differences.
13986
13:28:21,769 --> 13:28:24,054
AUDIENCE: My thought is\nmaybe like, it's good
13987
13:28:24,053 --> 13:28:27,681
though, does it matter if there's\n
13988
13:28:28,692 --> 13:28:31,572
DAVID J. MALAN: In between,\nbetween what and what?
13989
13:28:31,572 --> 13:28:34,942
AUDIENCE: So like the left-hand\n
13990
13:28:34,942 --> 13:28:38,352
DAVID J. MALAN: Ah, good\nquestion, is Python sensitive
13991
13:28:38,351 --> 13:28:40,271
to spaces and where they go?
13992
13:28:40,271 --> 13:28:42,911
Sometimes no, sometimes\nyes, is the short answer.
13993
13:28:42,911 --> 13:28:46,601
Stylistically, though, you should be\n
13994
13:28:46,601 --> 13:28:50,786
whereby you do have spaces to the\n
13995
13:28:50,786 --> 13:28:52,661
that they're called,\nsomething like less than
13996
13:28:52,661 --> 13:28:54,869
or greater than is a binary\noperator, because there's
13997
13:28:54,870 --> 13:28:57,102
two operands to the left\nand to the right of them.
13998
13:28:57,101 --> 13:29:00,161
And in fact, in Python,\nmore so than the world of C
13999
13:29:00,161 --> 13:29:02,861
there's actually formal\nstyle conventions.
14000
13:29:02,862 --> 13:29:07,209
Not only within CS50 have we had a\n
14001
13:29:07,209 --> 13:29:10,542
for instance, that just dictates how you\n
14002
13:29:11,466 --> 13:29:13,841
In the Python community, they\ntake this one step further
14003
13:29:13,841 --> 13:29:17,781
and there's an actual standard whereby\n
14004
13:29:17,781 --> 13:29:20,832
but generally speaking, in the real\n
14005
13:29:20,832 --> 13:29:23,622
would reject your code, if you're trying\n
14006
13:29:23,622 --> 13:29:25,252
if you don't adhere to these standards.
14007
13:29:25,252 --> 13:29:28,211
So while you could be lax\nwith some of this white space
14008
13:29:29,381 --> 13:29:33,296
And that's Python theme, for the\n
14009
13:29:33,296 --> 13:29:35,921
All right, so let's take a look\nat a couple of other constructs
14010
13:29:35,921 --> 13:29:37,881
before transitioning\nto some actual code.
14011
13:29:37,881 --> 13:29:40,631
This, of course, in Scratch\nwas a loop, meowing forever.
14012
13:29:40,631 --> 13:29:44,862
In C, the closest we could get was\n
14013
13:29:45,622 --> 13:29:48,582
So it's sort of a simple way\nof just saying do this forever.
14014
13:29:48,582 --> 13:29:51,461
In Python, it's pretty\nmuch the same thing
14015
13:29:51,461 --> 13:29:53,262
but a couple of small differences here.
14016
13:29:57,161 --> 13:30:00,784
No semicolon, and there's\none other subtle difference.
14017
13:30:01,451 --> 13:30:02,441
AUDIENCE: True is capitalized?
14018
13:30:02,442 --> 13:30:04,525
DAVID J. MALAN: True is\ncapitalized, just because.
14019
13:30:04,525 --> 13:30:07,092
Both True and False are\nBoolean values in Python.
14020
13:30:07,091 --> 13:30:09,671
But you've got to start\ncapitalizing them, just because.
14021
13:30:09,671 --> 13:30:11,561
All right, how about a\nloop like this, where
14022
13:30:11,561 --> 13:30:14,981
you repeat something a finite number\n
14023
13:30:14,982 --> 13:30:17,572
In C, we could do this\na few different ways.
14024
13:30:17,572 --> 13:30:21,311
There's this very mechanical way,\n
14025
13:30:22,091 --> 13:30:25,871
You then use a while loop and\ncheck if i is less than 3
14026
13:30:25,872 --> 13:30:27,709
the total number of\ntimes you want to meow.
14027
13:30:27,709 --> 13:30:29,292
Then you print what you want to print.
14028
13:30:29,292 --> 13:30:32,891
You increment i using this syntax,\n
14029
13:30:32,891 --> 13:30:34,402
with plus equals or whatnot.
14030
13:30:34,402 --> 13:30:36,732
And then you do it again\nand again and again.
14031
13:30:36,732 --> 13:30:40,692
In Python, you can do it\nfunctionally the same way, same idea
14032
13:30:42,101 --> 13:30:44,711
You just don't bother saying\nwhat type of variable you want.
14033
13:30:44,711 --> 13:30:47,559
Python will infer from the fact\nthat there's a 0 right there.
14034
13:30:47,559 --> 13:30:48,851
You don't need the parentheses.
14035
13:30:51,281 --> 13:30:54,432
You can't do the i plus plus, but\n
14036
13:30:54,432 --> 13:30:56,622
as we could have done in C, as well.
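[Editor's sketch: the counting while loop described above. Collecting the output in a list named meows is an assumption added here so the result can be checked; the lecture's version simply prints.]

```python
meows = []   # hypothetical list, used only to record what gets printed

i = 0        # no type needed; Python infers an int from the 0
while i < 3:             # no parentheses, colon instead of curly braces
    print("meow")
    meows.append("meow")
    i += 1               # i++ does not exist in Python

print(len(meows))
```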
14037
13:30:56,622 --> 13:30:58,842
How else might we do this, though, too?
14038
13:30:58,841 --> 13:31:01,061
Well, it turns out in\nC, we could do something
14039
13:31:01,061 --> 13:31:04,752
like this, which, again, sort\nof cryptic at first glance
14040
13:31:04,752 --> 13:31:07,692
became perhaps more familiar,\nwhere you have initialization
14041
13:31:07,692 --> 13:31:11,442
a conditional, and then an update\n
14042
13:31:11,442 --> 13:31:14,472
In Python, there isn't really an analog.
14043
13:31:14,472 --> 13:31:17,022
There is no analog in\nPython, where you have
14044
13:31:17,021 --> 13:31:19,901
the parentheses and the multiple\nsemicolons in the same line.
14045
13:31:19,902 --> 13:31:23,531
Instead, there is a for loop, but\n
14046
13:31:23,531 --> 13:31:27,072
like English, for i in 0, 1, and 2.
14047
13:31:27,072 --> 13:31:31,302
So we'll see in a bit, these square\n
14048
13:31:31,302 --> 13:31:33,612
to be called a list in Python.
14049
13:31:33,612 --> 13:31:37,811
So lists in Python are more like\n
14050
13:31:38,902 --> 13:31:42,732
So this just means for i in the\nfollowing list of three values.
14051
13:31:42,732 --> 13:31:46,342
And on each iteration of this loop,\n
14052
13:31:49,362 --> 13:31:54,402
Then it sets i to two, so that you\n
14053
13:31:54,402 --> 13:31:57,972
But this doesn't necessarily scale,\n
14054
13:31:57,972 --> 13:32:01,662
Suppose you took this\nat face value as the way
14055
13:32:01,661 --> 13:32:05,501
you iterate some number of times\nin Python, using a for loop.
14056
13:32:05,502 --> 13:32:10,004
At what point does this approach\nperhaps get bad, or bad design?
14057
13:32:10,004 --> 13:32:11,711
Let me give folks just\na moment to think.
14058
13:32:12,936 --> 13:32:15,603
AUDIENCE: If you don't know how\nmany times, last time, you know
14059
13:32:15,603 --> 13:32:17,604
you've got the link in there.
14060
13:32:17,605 --> 13:32:20,022
DAVID J. MALAN: Sure, if you\ndon't know how many times you
14061
13:32:20,021 --> 13:32:23,981
want to loop or iterate, you can't\n
14062
13:32:26,845 --> 13:32:29,512
AUDIENCE: So you want to say raise\na large number of allowances.
14063
13:32:29,512 --> 13:32:32,262
DAVID J. MALAN: Yeah, if you're\n
14064
13:32:32,262 --> 13:32:34,162
this list is going to\nget longer and longer
14065
13:32:34,161 --> 13:32:36,453
and you're just kind of\nstupidly going to be typing out
14066
13:32:36,453 --> 13:32:40,182
like comma 3, comma 4, comma 5, comma\n
14067
13:32:40,182 --> 13:32:42,682
I mean, your code would start\nto look atrocious, eventually.
14068
13:32:44,031 --> 13:32:46,881
In Python, there is a function,\nor technically a type
14069
13:32:46,881 --> 13:32:51,052
called range, that essentially magically\n
14070
13:32:51,052 --> 13:32:54,121
from 0 on up to, but\nnot through a value.
14071
13:32:54,120 --> 13:32:58,131
So the effect of this line of\n
14072
13:32:58,131 --> 13:33:01,006
essentially hands you back\na list of three values
14073
13:33:01,006 --> 13:33:02,881
thereby letting you do\nsomething three times.
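[Editor's sketch: the hand-written list next to range(3). The names from_list and from_range are assumptions used only to show that both loops see the same values; range produces them on demand rather than storing a list.]

```python
# Iterating over a list typed out by hand.
from_list = [i for i in [0, 1, 2]]

# Iterating over range(3), which yields 0, 1, 2 (up to but not through 3).
from_range = [i for i in range(3)]

for i in range(3):
    print("meow")

print(from_list == from_range)
```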
14074
13:33:02,881 --> 13:33:05,588
And if you want to do something\n
14075
13:33:07,597 --> 13:33:11,612
AUDIENCE: Is there a way to start\n
14076
13:33:11,612 --> 13:33:15,932
at a number or an integer that's higher\n
14077
13:33:16,982 --> 13:33:18,062
DAVID J. MALAN: A really\ngood question, can
14078
13:33:18,061 --> 13:33:19,961
you start counting at a higher number.
14079
13:33:19,961 --> 13:33:23,432
So not 0, which is the implied default,\n
14080
13:33:23,432 --> 13:33:28,082
Yes, so it turns out the range function\n
14081
13:33:28,082 --> 13:33:31,520
but maybe two or even three, that\n
14082
13:33:31,519 --> 13:33:33,061
So you can customize where it begins.
14083
13:33:33,061 --> 13:33:34,441
You can customize the increment.
14084
13:33:34,442 --> 13:33:36,234
By default, it's one,\nbut if you want to do
14085
13:33:36,233 --> 13:33:39,103
every two values, for like evens\n
14086
13:33:40,061 --> 13:33:42,451
And before long, we'll take a\nlook at some Python documentation
14087
13:33:42,451 --> 13:33:45,331
that will become your authoritative\n
14088
13:33:45,332 --> 13:33:47,312
Like, what can this function do.
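[Editor's sketch: the optional start and step arguments mentioned in the answer above, using the built-in signature range(start, stop, step). The particular numbers are illustrative only.]

```python
# Start counting at 2 instead of the default 0; the stop value 5 is excluded.
from_two = list(range(2, 5))

# Step by 2 instead of the default 1 to get the even numbers below 10.
evens = list(range(0, 10, 2))

print(from_two)
print(evens)
```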
14089
13:33:47,311 --> 13:33:51,542
Other questions on this thus far?
14090
13:33:51,542 --> 13:33:56,502
Seeing none, so what else might\nwe compare and contrast here.
14091
13:33:56,502 --> 13:34:00,841
Well, in the world of C, recall that\n
14092
13:34:00,841 --> 13:34:04,831
types, like these here, Bool and char\n
14093
13:34:04,832 --> 13:34:08,192
string, which happened to\ncome from the CS50 library.
14094
13:34:08,192 --> 13:34:12,512
But the language C itself certainly\n
14095
13:34:12,512 --> 13:34:17,222
because the backslash 0, the support\n
14096
13:34:17,222 --> 13:34:19,891
built into C, not a CS50 simplification.
14097
13:34:19,891 --> 13:34:22,141
All we did, and revealed,\nas of a couple of weeks
14098
13:34:22,141 --> 13:34:24,572
ago, is that string,\nthis data type, is just
14099
13:34:24,572 --> 13:34:29,252
a synonym for a typedef for char star,\n
14100
13:34:29,252 --> 13:34:32,131
In Python now, this list actually\n
14101
13:34:32,131 --> 13:34:33,964
for these common primitive data types.
14102
13:34:33,964 --> 13:34:36,631
Still going to have bools, we're\ngoing to have floats, and ints
14103
13:34:36,631 --> 13:34:39,121
and we're going to have strings,\n
14104
13:34:39,122 --> 13:34:41,282
And this is not a CS50\nthing from the library
14105
13:34:41,281 --> 13:34:44,822
str, S-T-R, is, in fact,\na data type in Python
14106
13:34:44,822 --> 13:34:48,781
that's going to do a lot more than\n
14107
13:34:48,781 --> 13:34:53,654
Ints and floats, meanwhile, don't need\n
14108
13:34:53,654 --> 13:34:56,072
because, in fact, among the\nproblems Python solves for us
14109
13:34:56,072 --> 13:34:58,862
too, ints can get as big as you want.
14110
13:34:58,862 --> 13:35:01,741
Integer overflow is no\nlonger going to be an issue.
14111
13:35:01,741 --> 13:35:04,472
Per week 1, the language\nsolves that for us.
14112
13:35:04,472 --> 13:35:06,311
Floating point\nimprecision, unfortunately
14113
13:35:06,311 --> 13:35:07,711
is still a problem that remains.
14114
13:35:07,711 --> 13:35:11,252
But there are libraries, code that\n
14115
13:35:11,252 --> 13:35:13,531
discussed in weeks past,\nthat allow you to do
14116
13:35:13,531 --> 13:35:16,771
scientific or financial computing,\nusing libraries that build
14117
13:35:16,771 --> 13:35:19,146
on top of these data types, as well.
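[Editor's sketch: integers growing past 64 bits, and floating-point imprecision remaining. Using the standard library's decimal module as the example library is an assumption; the lecture does not name a specific one here.]

```python
from decimal import Decimal

# An int larger than any 64-bit value: no overflow, it just keeps growing.
big = 2 ** 64 + 1

# Floats are still imprecise: 0.1 + 0.2 is not exactly 0.3.
float_sum = 0.1 + 0.2

# Decimal, built from strings, does exact decimal arithmetic.
exact_sum = Decimal("0.1") + Decimal("0.2")

print(big)
print(float_sum == 0.3)
print(exact_sum == Decimal("0.3"))
```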
14118
13:35:19,146 --> 13:35:22,021
So there's other data types, too,\n
14119
13:35:22,021 --> 13:35:25,231
gives us a whole bunch of\nmore power and capability
14120
13:35:25,232 --> 13:35:28,022
things called ranges,\nlike we just saw, lists
14121
13:35:28,021 --> 13:35:30,601
like I called out verbally,\nwith the square brackets
14122
13:35:30,601 --> 13:35:33,421
things called tuples, for\nthings like x comma y
14123
13:35:33,421 --> 13:35:36,826
or latitude, longitude,\ndictionaries, or dicts
14124
13:35:36,826 --> 13:35:40,261
which allow you to store keys and\n
14125
13:35:40,262 --> 13:35:43,495
from last time, and then sets in the\n
14126
13:35:43,495 --> 13:35:46,412
out duplicates for you, and you can\n
14127
13:35:46,411 --> 13:35:50,431
a whole bunch of words or whatnot,\n
14128
13:35:50,432 --> 13:35:52,921
will filter out duplicates for you.
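[Editor's sketch: a tuple, a dict, and a set side by side. All of the values (the coordinates, the words, the phone number) are made up for illustration.]

```python
# A tuple groups related values, such as latitude and longitude.
point = (42.37, -71.12)          # hypothetical coordinates
latitude, longitude = point      # unpack into two variables

# A dict stores keys and values.
phone_book = {"David": "+1-617-495-1000"}   # hypothetical entry

# A set filters out duplicates automatically.
words = ["cat", "dog", "cat", "bird", "dog"]
unique = set(words)

print(latitude, longitude)
print(phone_book["David"])
print(len(unique))
```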
14129
13:35:52,921 --> 13:35:56,506
Now there's going to be a few functions\n
14130
13:35:56,506 --> 13:35:59,131
training wheels that we're then\ngoing to very quickly take off
14131
13:35:59,131 --> 13:36:02,581
just because, as we'll see today, they\n
14132
13:36:02,582 --> 13:36:05,726
user input correctly, without\naccidentally writing buggy code
14133
13:36:05,726 --> 13:36:08,851
just when you're trying to get Hello,\n
14134
13:36:08,851 --> 13:36:12,572
And we'll give you functions, not\n
14135
13:36:12,572 --> 13:36:15,152
but a subset of these,\nget_float, get_int
14136
13:36:15,152 --> 13:36:18,182
and get_string, that'll\nautomate the process of getting
14137
13:36:18,182 --> 13:36:21,932
user input in a way that's more\n
14138
13:36:21,932 --> 13:36:23,792
But we'll see what those bugs might be.
14139
13:36:23,792 --> 13:36:26,641
And the way we're going to do\nthis is similar in spirit to C.
14140
13:36:26,641 --> 13:36:30,902
Instead of doing include\ncs50.h, like we did in C
14141
13:36:30,902 --> 13:36:33,812
you're going to now\nstart saying import cs50.
14142
13:36:33,811 --> 13:36:37,081
Python supports,\nsimilar to C, libraries
14143
13:36:37,082 --> 13:36:38,822
but there aren't header files anymore.
14144
13:36:38,822 --> 13:36:41,612
You just use the name of\nthe library in Python.
14145
13:36:41,612 --> 13:36:44,972
And if you want to import CS50's\n
14146
13:36:44,972 --> 13:36:48,991
Or, if you want to be more precise, and\n
14147
13:36:48,991 --> 13:36:52,381
could be slow, if you've got a really\n
14148
13:36:52,381 --> 13:36:56,252
in it, you can be more precise and\n
14149
13:36:56,252 --> 13:37:00,002
from cs50 import get_int,\nfrom cs50 import get_string
14150
13:37:00,002 --> 13:37:02,792
or you can just separate\nthem by commas and import 3
14151
13:37:02,792 --> 13:37:07,072
and only 3 things from a\nparticular library, like ours.
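[Editor's sketch: the two import styles described above. The standard library's math module stands in for the cs50 library here, on the assumption that cs50 may not be installed; the syntax is the same.]

```python
# Import the whole library, then reach into it by name.
import math

# Or import only the specific functions needed, separated by commas.
from math import floor, sqrt

whole = math.sqrt(16)   # accessed through the library's name
same = sqrt(16)         # accessed directly after a from-import
rounded = floor(3.7)

print(whole, same, rounded)
```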
14152
13:37:07,072 --> 13:37:08,822
But starting today and\nonward, we're going
14153
13:37:08,822 --> 13:37:11,972
to start making much more\nheavy use of libraries, code
14154
13:37:11,972 --> 13:37:15,092
that other people wrote, so that\n
14155
13:37:15,091 --> 13:37:18,396
We're not making our own linked lists,\n
14156
13:37:18,396 --> 13:37:20,771
We're going to start standing\non the shoulders of others
14157
13:37:20,771 --> 13:37:23,641
so that you can get real work\ndone, so to speak, faster
14158
13:37:23,641 --> 13:37:28,231
by building your software on\ntop of others' code as well.
14159
13:37:28,232 --> 13:37:31,632
All right, so that's it for the\nsyntactic tour of the language
14160
13:37:31,631 --> 13:37:32,881
and the sort of core features.
14161
13:37:32,881 --> 13:37:34,841
Soon we'll transition\nto application thereof.
14162
13:37:34,841 --> 13:37:40,561
But let me pause here to see if there's\n
14163
13:37:48,726 --> 13:37:52,685
AUDIENCE: Why don't Python\nhave the increment operators.
14164
13:37:52,684 --> 13:37:54,851
DAVID J. MALAN: I'm sorry,\nsay it again, why doesn't
14165
13:37:54,851 --> 13:37:56,309
Python have what kind of operators?
14166
13:37:56,309 --> 13:37:59,099
AUDIENCE: Why doesn't Python\nhave the increment operator?
14167
13:37:59,099 --> 13:38:02,141
DAVID J. MALAN: Sorry, someone coughed\n
14168
13:38:03,470 --> 13:38:05,262
DAVID J. MALAN: Oh,\nthe increment operator?
14169
13:38:05,262 --> 13:38:06,929
I'd have to check the history, honestly.
14170
13:38:06,928 --> 13:38:09,431
Python has tended to be a\nfairly minimalist language.
14171
13:38:09,432 --> 13:38:12,612
And if you can do something one\nway, the community, arguably
14172
13:38:12,612 --> 13:38:16,667
has tended to not give you multiple\n
14173
13:38:16,667 --> 13:38:18,042
There's probably a better answer.
14174
13:38:18,042 --> 13:38:22,362
And I'll see if I can dig in and post\n
14175
13:38:22,362 --> 13:38:26,391
All right, so before we transition\n
14176
13:38:26,391 --> 13:38:31,391
let me go ahead and consider exactly\n
14177
13:38:31,391 --> 13:38:35,292
In the world of C, recall that it's\n
14178
13:38:35,292 --> 13:38:40,752
We create a file called like Hello.c,\n
14179
13:38:41,921 --> 13:38:44,652
Or, if you think back to week\ntwo, when we sort of peeled back
14180
13:38:44,652 --> 13:38:47,622
the layer of what Hello,\nof what make was doing
14181
13:38:47,622 --> 13:38:50,832
you could more verbosely type out\n
14182
13:38:50,832 --> 13:38:54,162
clang in our case, command line\narguments like dash o hello
14183
13:38:54,161 --> 13:38:56,361
to specify what name you want to create.
14184
13:38:56,362 --> 13:38:58,182
And then you can specify the file name.
14185
13:38:58,182 --> 13:39:01,572
And then you can specify what\nlibraries you want to link in.
14186
13:39:01,572 --> 13:39:03,072
So that was a very verbose approach.
14187
13:39:03,072 --> 13:39:05,451
But it was always a two-step approach.
14188
13:39:05,451 --> 13:39:08,201
And so, even as you've been\ndoing recent problem sets
14189
13:39:08,201 --> 13:39:11,921
odds are you've realized that, any time\n
14190
13:39:11,921 --> 13:39:16,182
or make a change to your code\nand try and test your code again
14191
13:39:16,182 --> 13:39:18,881
you're constantly doing those two steps.
14192
13:39:18,881 --> 13:39:22,362
Moving forward in Python,\nit's going to become simpler
14193
13:39:22,362 --> 13:39:24,131
and it's going to be just this.
14194
13:39:24,131 --> 13:39:26,981
The file name is going to change,\n
14195
13:39:26,982 --> 13:39:31,782
It's going to be something like\n
14196
13:39:31,781 --> 13:39:34,512
And that's just a convention,\nusing a different file extension.
14197
13:39:34,512 --> 13:39:37,302
But there's no compilation step per se.
14198
13:39:37,302 --> 13:39:40,692
You jump right to the\nexecution of your code.
14199
13:39:40,692 --> 13:39:43,722
And so Python, it turns out, is\n
14200
13:39:43,722 --> 13:39:48,671
we're going to start using, it's also\n
14201
13:39:48,671 --> 13:39:52,542
assuming it's been pre-installed,\n
14202
13:39:52,542 --> 13:39:56,622
This is to say that Python is generally\n
14203
13:39:57,881 --> 13:40:01,691
And by that, I mean you get to skip,\n
14204
13:40:02,891 --> 13:40:07,391
There is no manual step in the world of\n
14205
13:40:07,391 --> 13:40:11,052
and then compiling it to zeros and ones,\n
14206
13:40:11,052 --> 13:40:13,391
Instead, these kind of\ntwo steps get collapsed
14207
13:40:13,391 --> 13:40:19,091
into the illusion of one, whereby you,\n
14208
13:40:19,091 --> 13:40:22,721
and let the computer figure\nout how to actually convert it
14209
13:40:22,722 --> 13:40:24,762
to something the computer understands.
14210
13:40:24,762 --> 13:40:28,372
And the way we do that is via this\n
14211
13:40:28,372 --> 13:40:30,432
But now, when you have\nsource code, it's going
14212
13:40:30,432 --> 13:40:33,372
to be passed into an\ninterpreter, not a compiler.
14213
13:40:33,372 --> 13:40:35,922
And the best analog of this\nis just to perhaps point out
14214
13:40:35,921 --> 13:40:38,472
that, in the human world, if\nyou speak, or don't speak
14215
13:40:38,472 --> 13:40:42,162
multiple human languages, it can\n
14216
13:40:42,161 --> 13:40:43,791
from one language to another.
14217
13:40:43,792 --> 13:40:46,692
For instance, here are step-by-step\n
14218
13:40:46,692 --> 13:40:49,062
in a phone book,\nunfortunately, in Spanish.
14219
13:40:49,061 --> 13:40:51,881
Unfortunately, if you don't\nspeak or read Spanish.
14220
13:40:53,082 --> 13:40:55,902
You could run this algorithm, but you're\n
14221
13:40:55,902 --> 13:40:58,652
or you're going to have to open\n
14222
13:40:58,652 --> 13:40:59,982
to English and convert this.
14223
13:40:59,982 --> 13:41:03,582
And the catch with translating\nany language, human or computer
14224
13:41:03,582 --> 13:41:07,372
or otherwise, is that you're going\n
14225
13:41:07,372 --> 13:41:10,362
And so converting this in\nSpanish to this in English
14226
13:41:10,362 --> 13:41:12,881
is just going to take you\nlonger than if this were already
14227
13:41:14,974 --> 13:41:17,891
And that's going to be one of the\n
14228
13:41:17,891 --> 13:41:21,701
Yes, it's a feature that you can\n
14229
13:41:21,701 --> 13:41:24,401
to bother compiling it manually first.
14230
13:41:25,572 --> 13:41:27,336
And things might be a little slower.
14231
13:41:27,336 --> 13:41:28,961
Now, there's ways to chip away at that.
14232
13:41:28,961 --> 13:41:30,336
But we'll see an example thereof.
14233
13:41:30,336 --> 13:41:33,222
In fact, let me transition now\nto just a couple of examples
14234
13:41:33,222 --> 13:41:37,182
that demonstrate how Python is\nnot only easier for many people
14235
13:41:37,182 --> 13:41:39,762
to use, perhaps yourselves\ntoo, because it throws away
14236
13:41:39,762 --> 13:41:42,641
a lot of the annoying syntax,\nit shortens the number of lines
14237
13:41:42,641 --> 13:41:46,332
you have to write, and also it\n
14238
13:41:46,332 --> 13:41:51,262
you can just do so much more without\n
14239
13:41:51,262 --> 13:41:54,192
So, as an example of this,\nlet me switch over here
14240
13:41:54,192 --> 13:42:00,612
to this image from problem set 4, which\n
14241
13:42:01,811 --> 13:42:03,766
And this is the original\nphoto, pretty clear
14242
13:42:03,766 --> 13:42:06,891
and it's even higher res if we looked\n
14243
13:42:06,891 --> 13:42:10,182
But there have been no filters, a\n
14244
13:42:10,182 --> 13:42:13,271
Recall, for problem set four, you\n
14245
13:42:13,271 --> 13:42:14,981
And among them might have been blur.
14246
13:42:14,982 --> 13:42:18,132
And blur was probably among the\nmore challenging of the ones
14247
13:42:18,131 --> 13:42:20,711
because you had to iterate\nover all of the pixels
14248
13:42:20,711 --> 13:42:23,652
you had to take into account what's\n
14249
13:42:24,012 --> 13:42:25,970
I mean, there was a lot\nof math and arithmetic.
14250
13:42:25,970 --> 13:42:29,141
And if you ultimately got it, it was\n
14251
13:42:29,141 --> 13:42:31,302
But that was probably\nseveral hours later.
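For reference, the pixel-by-pixel work described here can be sketched in plain Python. This is only an illustration under simplifying assumptions: the image is a small grid of grayscale values rather than the RGB bitmap from the problem set, and the name `box_blur` is hypothetical.

```python
def box_blur(pixels, radius=1):
    """Return a blurred copy of a 2D grid of grayscale values (0-255)."""
    height = len(pixels)
    width = len(pixels[0])
    blurred = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0
            count = 0
            # Visit the pixel itself and every neighbor within `radius`,
            # skipping positions that fall outside the image.
            for r in range(row - radius, row + radius + 1):
                for c in range(col - radius, col + radius + 1):
                    if 0 <= r < height and 0 <= c < width:
                        total += pixels[r][c]
                        count += 1
            # Each output pixel is the average of its neighborhood.
            blurred[row][col] = round(total / count)
    return blurred


# A tiny stand-in image: one bright pixel surrounded by black.
image = [
    [0, 0, 0],
    [0, 180, 0],
    [0, 0, 0],
]
result = box_blur(image)
print(result)
```

The nested loops and the bounds check at the edges are where most of the arithmetic lives.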
14252
13:42:31,302 --> 13:42:34,062
In a language like\nPython, where there might
14253
13:42:34,061 --> 13:42:37,691
be libraries that had been written\nby others, on whose shoulders
14254
13:42:37,692 --> 13:42:40,402
you can stand, we could\nperhaps do something like this.
14255
13:42:40,402 --> 13:42:44,802
Let me go ahead and run a program, or\n
14256
13:42:44,802 --> 13:42:48,652
And in Blur.py, in VS\nCode, let me just do this.
14257
13:42:48,652 --> 13:42:51,891
Let me import from a library,\nnot the CS50 library
14258
13:42:51,891 --> 13:42:56,141
but the Pillow library, so to\nspeak, a keyword called Image
14259
13:42:56,141 --> 13:42:59,851
and another one called\nImageFilter, then let me go ahead
14260
13:42:59,851 --> 13:43:02,941
and say, let me open the current\nversion of this image, which
14261
13:43:04,262 --> 13:43:06,781
So the before version\nof the image will be
14262
13:43:06,781 --> 13:43:11,072
the result of calling Image.open\nquote unquote "Bridge.bmp"
14263
13:43:11,072 --> 13:43:13,561
and then, let me create\nan after version.
14264
13:43:13,561 --> 13:43:15,362
So you'll see before and after.
14265
13:43:15,362 --> 13:43:21,531
After equals the before version\n.filter of ImageFilter.
14266
13:43:21,531 --> 13:43:23,281
And there is, if I\nread the documentation
14267
13:43:23,281 --> 13:43:25,574
I'll see that there's something\ncalled a box blur, that
14268
13:43:25,574 --> 13:43:28,682
allows you to blur in box\nformat, like one pixel above
14269
13:43:30,271 --> 13:43:31,888
So I'll do one pixel there.
14270
13:43:31,889 --> 13:43:34,472
And then, after that's done, let\nme go ahead and save the file
14271
13:43:38,701 --> 13:43:41,432
Assuming this library\nworks as described
14272
13:43:41,432 --> 13:43:44,582
I am opening the file\nin Python, using line 3.
14273
13:43:44,582 --> 13:43:46,202
And this is somewhat new syntax.
14274
13:43:46,201 --> 13:43:49,771
In the world of Python, we're going to\n
14275
13:43:49,771 --> 13:43:51,841
more, because in the\nworld of Python, you have
14276
13:43:51,841 --> 13:43:56,221
what's called object-oriented\n
14277
13:43:56,222 --> 13:43:58,991
And what this means is that\nyou still have functions
14278
13:43:58,991 --> 13:44:01,502
you still have variables,\nbut sometimes those functions
14279
13:44:01,502 --> 13:44:05,372
are embedded inside of the\nvariables, or, more specifically
14280
13:44:05,372 --> 13:44:07,232
inside of the data types themselves.
14281
13:44:07,232 --> 13:44:10,952
Think back to C. When you wanted\n
14282
13:44:10,951 --> 13:44:15,103
there was a to upper function that takes\n
14283
13:44:15,103 --> 13:44:18,061
And you can pass in any char you\n
14284
13:44:19,411 --> 13:44:22,681
Well, you know what, if that's\nsuch a common paradigm, where
14285
13:44:22,682 --> 13:44:26,372
upper-casing chars is a useful\n
14286
13:44:26,372 --> 13:44:30,992
is it embeds into the string\ndata type, or char if you will
14287
13:44:30,991 --> 13:44:35,761
the ability just to uppercase any char\n
14288
13:44:35,762 --> 13:44:38,672
as though it's a struct\nin C. Recall that structs
14289
13:44:38,671 --> 13:44:40,921
encapsulate multiple types of values.
14290
13:44:40,921 --> 13:44:44,131
In object-oriented programming,\nin a language like Python
14291
13:44:44,131 --> 13:44:48,031
you can encapsulate not just\nvalues, but also functionality.
14292
13:44:48,031 --> 13:44:50,339
Functions can now be inside of structs.
14293
13:44:50,339 --> 13:44:52,381
But we're not going to\ncall them structs anymore.
14294
13:44:52,381 --> 13:44:53,792
We're going to call them objects.
14295
13:44:53,792 --> 13:44:55,652
But that's just a different vernacular.
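A small sketch of that idea, functions living inside the data itself. The `upper` method on strings is part of Python; the `Pixel` class and its `grayscale` method are hypothetical, made up here only to show a struct-like object that carries its own function.

```python
# A str value carries its own functions (methods), so there is no
# separate toupper-style function to pass the value into.
name = "david"
shouted = name.upper()
print(shouted)


class Pixel:
    """Hypothetical example type: values plus functionality in one object."""

    def __init__(self, red, green, blue):
        # These attributes play the role of a struct's fields.
        self.red = red
        self.green = green
        self.blue = blue

    def grayscale(self):
        # A function "inside" the object: it operates on the object's own values.
        average = round((self.red + self.green + self.blue) / 3)
        return Pixel(average, average, average)


pixel = Pixel(30, 60, 90)
gray = pixel.grayscale()
print(gray.red, gray.green, gray.blue)
```

Note the dot syntax in both cases: the value comes first, then the function that belongs to it.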
14296
13:44:57,391 --> 13:45:00,391
Inside of the image library,\nthere's a function called open
14297
13:45:00,391 --> 13:45:03,152
and it takes an argument, the\nname of the file, to open.
14298
13:45:03,152 --> 13:45:06,781
Once I have a variable called before,\n
14299
13:45:06,781 --> 13:45:09,811
an object, inside of\nwhich is now, because it
14300
13:45:09,811 --> 13:45:12,661
was returned from this\nfunction, a function
14301
13:45:12,661 --> 13:45:14,801
called filter, that takes an argument.
14302
13:45:14,802 --> 13:45:18,182
The argument here happens\nto be ImageFilter.BoxBlur(1)
14303
13:45:19,351 --> 13:45:21,324
But it just returns the filter to use.
14304
13:45:21,324 --> 13:45:23,491
And then, after, dot save\ndoes what you might think.
14305
13:45:24,671 --> 13:45:27,991
So instead of using fopen and\nfwrite, you just say dot save
14306
13:45:27,991 --> 13:45:31,031
and that does all of\nthat messy work for you.
14307
13:45:31,031 --> 13:45:33,752
So it's just, what, four\nlines of code total?
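The steps just described can be sketched roughly as follows. It assumes the third-party Pillow library is installed. So that the sketch runs without the lecture's bitmap, it builds a small stand-in image in memory and saves into a temporary folder; with the real file, the stand-in would be replaced by a call to `Image.open` on the bridge bitmap.

```python
import os
import tempfile

from PIL import Image, ImageFilter  # Pillow, a third-party library

# Assumption: a 20x20 white image with a black square stands in for the
# photo opened in the lecture.
before = Image.new("RGB", (20, 20), (255, 255, 255))
before.paste((0, 0, 0), (5, 5, 15, 15))

# filter is a function inside the image object; BoxBlur(1) averages each
# pixel with the pixels one step away in every direction.
after = before.filter(ImageFilter.BoxBlur(1))

# save writes the result to disk, with no manual file handling.
out_path = os.path.join(tempfile.mkdtemp(), "out.bmp")
after.save(out_path)
print(out_path)
```

Passing a larger number to `BoxBlur` widens the box and makes the blur stronger.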
14308
13:45:33,752 --> 13:45:36,762
Let me go ahead and go\ndown to my terminal window.
14309
13:45:36,762 --> 13:45:40,055
Let me go ahead and show you\nwith LS that, at the moment
14310
13:45:40,055 --> 13:45:41,972
whoops, sorry, let me\nnot bother showing that
14311
13:45:41,972 --> 13:45:43,682
because I have other examples to come.
14312
13:45:43,682 --> 13:45:50,832
I'm going to go ahead and do Python\n
14313
13:45:50,832 --> 13:45:52,092
I did need to make a command.
14314
13:45:52,802 --> 13:45:55,862
OK, let me go ahead and type LS\n
14315
13:45:55,862 --> 13:45:58,082
is among the sample code online today.
14316
13:45:58,082 --> 13:46:01,322
There's only one file\ncalled Bridge.bmp, dammit
14317
13:46:01,322 --> 13:46:04,152
I'm trying to get these\nthings ready at the same time.
14318
13:46:05,252 --> 13:46:08,641
Let me move this code into place.
14319
13:46:08,641 --> 13:46:11,231
All right, I've gone ahead\nand moved this file, Blur.py
14320
13:46:11,232 --> 13:46:13,712
into a folder called\nfilter, inside of which
14321
13:46:13,711 --> 13:46:18,601
there's another file called Bridge.bmp,\n
14322
13:46:18,601 --> 13:46:20,911
Let me now go ahead\nand run Python, which
14323
13:46:20,911 --> 13:46:23,221
is my interpreter, and also\nthe name of the language
14324
13:46:23,222 --> 13:46:25,512
and run Python on this file.
14325
13:46:25,512 --> 13:46:27,870
So much like running\nthe Spanish algorithm
14326
13:46:27,870 --> 13:46:29,912
through Google Translate,\nor something like that
14327
13:46:29,911 --> 13:46:32,171
as input, to get back\nthe English output
14328
13:46:32,171 --> 13:46:36,061
this is going to translate the\nPython language to something
14329
13:46:36,061 --> 13:46:38,281
this computer, or this\ncloud-based environment
14330
13:46:38,281 --> 13:46:41,591
understands, and then run the\ncorresponding code, top to bottom
14331
13:46:42,228 --> 13:46:43,561
I'm going to go ahead and Enter.
14332
13:46:43,561 --> 13:46:45,451
No error message is\ngenerally a good thing.
14333
13:46:45,451 --> 13:46:48,481
If I type LS you'll now see out.bmp.
14334
13:46:48,482 --> 13:46:49,817
Let me go ahead and open that.
14335
13:46:49,817 --> 13:46:52,442
And, you know what, just to make\nclear what's really happening
14336
13:46:52,442 --> 13:46:53,609
let me blur it even further.
14337
13:46:53,608 --> 13:46:57,072
Let's make a box that's not\njust one pixel around, but 10.
14338
13:46:58,472 --> 13:47:01,351
And let me just go ahead and\nrerun it with Python of Blur.py.
14339
13:47:03,841 --> 13:47:08,621
Let me go ahead and open Out.bmp\nand show you first the before
14340
13:47:11,072 --> 13:47:14,341
And now, crossing my fingers,\nfour lines of code later
14341
13:47:14,341 --> 13:47:16,279
the result of blurring it, as well.
14342
13:47:16,279 --> 13:47:18,572
So the library is doing all\nof the same kind of legwork
14343
13:47:18,572 --> 13:47:20,641
that you all did for\nthe assignment, but it's
14344
13:47:20,641 --> 13:47:24,824
encapsulated it all into a single\n
14345
13:47:24,824 --> 13:47:27,241
Those of you who might have\nbeen feeling more comfortable
14346
13:47:27,241 --> 13:47:29,116
might have done a little\nsomething like this.
14347
13:47:29,116 --> 13:47:33,421
Let me go ahead and open up one\nother file, called Edges.py.
14348
13:47:33,421 --> 13:47:36,811
And in Edges.py, I'm again going\n
14349
13:47:36,811 --> 13:47:39,531
the Image keyword, and the ImageFilter.
14350
13:47:39,531 --> 13:47:42,031
Then I'm going to go ahead and\ncreate a before image, that's
14351
13:47:42,031 --> 13:47:46,112
a result of calling Image.open\nof the same thing, Bridge.bmp
14352
13:47:46,112 --> 13:47:53,432
then I'm going to go ahead and run a\n
14353
13:47:53,432 --> 13:47:58,372
ImageFilter.FIND_EDGES, which\nis like a constant, if you will
14354
13:47:58,372 --> 13:48:00,230
defined inside of this library for us.
14355
13:48:00,230 --> 13:48:02,272
And then I'm going to do\nafter.save quote unquote
14356
13:48:02,271 --> 13:48:04,731
"Out.bmp," using the same file name.
14357
13:48:04,732 --> 13:48:13,012
I'm now going to run Python of\n
14358
13:48:13,012 --> 13:48:15,452
We'll see what syntax error means soon.
14359
13:48:15,451 --> 13:48:17,991
Let me go ahead and run\nthe code now, Edges.py.
14360
13:48:17,991 --> 13:48:21,351
Let me now open that new file, Out.bmp.
14361
13:48:21,351 --> 13:48:26,031
And before we had this, and now,\n
14362
13:48:26,031 --> 13:48:28,731
if we did the more comfortable\nversion of P set 4
14363
13:48:28,732 --> 13:48:31,862
we now get this, after\njust four lines of code.
14364
13:48:31,862 --> 13:48:34,641
So again, suggesting the power\nof using a language that's better
14365
13:48:34,641 --> 13:48:36,082
optimized for the tool at hand.
14366
13:48:36,082 --> 13:48:39,472
And at the risk of really\nmaking folks sad, let's go ahead
14367
13:48:39,472 --> 13:48:43,342
and re-implement, if we could,\n
14368
13:48:43,341 --> 13:48:47,601
Let me go ahead and open\nanother version of this code
14369
13:48:47,601 --> 13:48:50,828
wherein I have a C\nversion, just from problem
14370
13:48:50,828 --> 13:48:52,911
set five, wherein you\nimplemented a spell checker
14371
13:48:52,911 --> 13:48:55,161
loading 100,000 plus words into memory.
14372
13:48:55,161 --> 13:48:58,911
And then you kept track of just\n
14373
13:48:58,911 --> 13:49:00,861
And that probably took\na while, implementing
14374
13:49:00,862 --> 13:49:03,052
all of those functions in Dictionary.c.
14375
13:49:03,052 --> 13:49:08,762
Let me instead now go into a\nnew file, called Dictionary.py.
14376
13:49:08,762 --> 13:49:11,722
And let me stipulate, for\nthe sake of discussion
14377
13:49:11,722 --> 13:49:14,182
that we already wrote\nin advance, Speller.py
14378
13:49:14,182 --> 13:49:16,372
which corresponds to Speller.c.
14379
13:49:16,372 --> 13:49:17,902
You didn't write either of those.
14380
13:49:17,902 --> 13:49:20,122
Recall for problem set\nfive, we gave you Speller.c.
14381
13:49:20,122 --> 13:49:22,080
Assume that we're going\nto give you Speller.py.
14382
13:49:22,080 --> 13:49:28,552
So the onus on us right now is only\n
14383
13:49:28,552 --> 13:49:31,462
All right, so I'm going to go\nahead and define a few functions.
14384
13:49:31,461 --> 13:49:34,521
And we're going to see now the syntax\n
14385
13:49:34,521 --> 13:49:38,752
I want to go ahead and define\nfirst, a hash table, which
14386
13:49:38,752 --> 13:49:41,362
was the very first thing\nyou defined in Dictionary.c.
14387
13:49:41,362 --> 13:49:46,491
I'm going to go ahead, then, and say\n
14388
13:49:46,491 --> 13:49:48,205
otherwise known as a hash table.
14389
13:49:48,205 --> 13:49:50,122
All right, now let me\ndefine a function called
14390
13:49:50,122 --> 13:49:53,152
check, which was the first function\nyou might have implemented.
14391
13:49:53,152 --> 13:49:55,522
Check is going to take a word,\nand you'll see in Python
14392
13:49:55,521 --> 13:49:56,896
the syntax is a little different.
14393
13:49:56,896 --> 13:49:58,401
You don't specify the return type.
14394
13:49:58,402 --> 13:50:01,132
You use the word def instead to define.
14395
13:50:01,131 --> 13:50:05,061
You still specify the name of the\n
14396
13:50:05,061 --> 13:50:07,731
But you omit any mention of types.
14397
13:50:07,732 --> 13:50:09,802
But you do use a colon and indent.
14398
13:50:09,802 --> 13:50:14,302
So how do I check if a word is in\n
14399
13:50:14,302 --> 13:50:17,962
Well, in Python, I can\njust say, if word in words
14400
13:50:17,961 --> 13:50:23,091
go ahead and return true, else\ngo ahead and return false, done
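As a sketch, the function described here might look like this. The small set of words is a hypothetical stand-in for the collection that a load function would fill from the dictionary file.

```python
# Hypothetical stand-in for the words loaded from a dictionary file.
words = {"cat", "dog", "zebra"}


def check(word):
    # No return type and no parameter type: just def, a name, a colon,
    # and an indented body.
    if word in words:
        return True
    else:
        return False


print(check("cat"))
print(check("bird"))
```

The `in` keyword does the searching that a hand-written hash table lookup would otherwise require.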
14401
13:50:24,470 --> 13:50:26,161
All right, now I want to do like load.
14402
13:50:26,161 --> 13:50:29,161
That was the heavy lift, where you\n
14403
13:50:29,161 --> 13:50:30,828
So let me define a function called load.
14404
13:50:30,828 --> 13:50:33,171
It takes a string, the\nname of a file to load.
14405
13:50:33,171 --> 13:50:36,502
So I'll call that Dictionary,\njust like in C, but no data type.
14406
13:50:36,502 --> 13:50:40,701
Let me go ahead and open a file by\n
14407
13:50:40,701 --> 13:50:43,261
by opening that Dictionary in read mode.
14408
13:50:43,262 --> 13:50:46,882
So this is a little similar to fopen,\n
14409
13:50:46,881 --> 13:50:49,401
Then let me iterate over\nevery line in the file.
14410
13:50:49,402 --> 13:50:54,322
In Python, this is pretty pleasant,\n
14411
13:50:54,322 --> 13:50:59,031
How, now, do I get at the current\n
14412
13:50:59,031 --> 13:51:02,091
because in this file of\nwords, 140,000 words
14413
13:51:02,091 --> 13:51:05,273
there's word backslash n,\nword backslash n, all right?
14414
13:51:05,273 --> 13:51:07,731
Well, let me go ahead and get\na word from the current line
14415
13:51:07,732 --> 13:51:11,362
but strip off, from the right end\n
14416
13:51:11,362 --> 13:51:14,061
the rstrip function\nin Python does for me.
14417
13:51:14,061 --> 13:51:18,891
Then let me go ahead and add to my\n
14418
13:51:19,552 --> 13:51:22,057
Let me go ahead and close\nthe file for good measure.
14419
13:51:22,057 --> 13:51:24,682
And then let me go ahead and\nreturn true, because all was well.
14420
13:51:24,682 --> 13:51:26,842
That's it for the load\nfunction in Python.
14421
13:51:26,841 --> 13:51:28,101
How about the size function?
14422
13:51:28,101 --> 13:51:31,341
This did not take any arguments, it\n
14423
13:51:32,512 --> 13:51:36,502
I can do that by returning the\n
14424
13:51:36,502 --> 13:51:41,182
And then lastly, gone from the\n
14425
13:51:42,612 --> 13:51:45,472
So no matter what I do,\nthere's nothing to unload.
14426
13:51:45,472 --> 13:51:47,342
The computer will do that for me.
14427
13:51:47,341 --> 13:51:51,381
So I give you, in these functions,\nproblem set five in Python.
14428
13:51:51,381 --> 13:51:53,542
So, I'm sorry, we made\nyou write it in C first.
14429
13:51:53,542 --> 13:51:57,141
But the implication now is that,\nwhat are you getting for free
14430
13:51:58,372 --> 13:52:00,891
Well, encapsulated in\nthis one line of code
14431
13:52:00,891 --> 13:52:04,792
is much of what you wrote for\nproblem set five, implementing
14432
13:52:04,792 --> 13:52:07,792
your array for all of your\nletters of the alphabet or more
14433
13:52:07,792 --> 13:52:10,912
all of the linked lists that you\nimplemented to create chains
14434
13:52:10,911 --> 13:52:12,451
to store all of those words.
14435
13:52:13,582 --> 13:52:16,612
It's just someone else in the\nworld wrote that code for you.
14436
13:52:16,612 --> 13:52:19,582
And you can now use it\nby way of a dictionary.
14437
13:52:19,582 --> 13:52:22,072
And actually, I can\nchange this a little bit
14438
13:52:22,072 --> 13:52:25,192
because add is technically not\nthe right function to use here.
14439
13:52:25,192 --> 13:52:28,141
I'm actually treating the dictionary\n
14440
13:52:28,141 --> 13:52:31,942
So I'm going to make one tweak, set\n
14441
13:52:31,942 --> 13:52:34,222
But set just allows it\nto handle duplicates
14442
13:52:34,222 --> 13:52:36,952
and it allows me to just throw\nthings into it by literally
14443
13:52:36,951 --> 13:52:38,841
using a function as simple as add.
14444
13:52:38,841 --> 13:52:41,691
And I'm going to make\none other tweak here
14445
13:52:41,692 --> 13:52:46,312
because, when I'm checking a word,\n
14446
13:52:46,311 --> 13:52:49,042
to me in uppercase or capitalized.
14447
13:52:49,042 --> 13:52:52,402
It's not going to necessarily come\n
14448
13:52:53,991 --> 13:52:58,911
I can force every word to\nlowercase by using word.lower.
14449
13:52:58,911 --> 13:53:01,021
And I don't have to do it\ncharacter for character
14450
13:53:01,021 --> 13:53:06,322
I can do the whole darn string at\n
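Putting the pieces from this walkthrough together, the file might look something like the sketch below. It follows the steps as described (a set, check with lowercasing, load, size, unload); returning True from unload, and the small temporary word list at the end, are assumptions added so the example runs on its own.

```python
import os
import tempfile

# The set plays the role of the hash table from the C version.
words = set()


def check(word):
    # Lowercase the whole string at once before looking it up.
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    # Open the file in read mode and walk over it line by line.
    file = open(dictionary, "r")
    for line in file:
        # Strip the trailing newline from the right end of each line.
        word = line.rstrip()
        words.add(word)
    file.close()
    return True


def size():
    # The number of words loaded is just the length of the set.
    return len(words)


def unload():
    # Nothing to free by hand; Python manages the memory.
    return True


# Assumption: a tiny word list written to a temporary file for the demo.
path = os.path.join(tempfile.mkdtemp(), "small.txt")
with open(path, "w") as handle:
    handle.write("cat\ndog\nzebra\n")

loaded = load(path)
print(size())
print(check("CAT"))
print(check("bird"))
```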
14451
13:53:06,322 --> 13:53:09,381
All right, let me go ahead and\nopen up a terminal window here.
14452
13:53:09,381 --> 13:53:12,639
And let me go into, first,\nmy C version, on the left.
14453
13:53:12,639 --> 13:53:15,682
And actually I'm going to go ahead\n
14454
13:53:15,682 --> 13:53:20,529
And on the right, I'm going to go into\n
14455
13:53:20,529 --> 13:53:23,362
But it's also available online, if\n
14456
13:53:23,362 --> 13:53:26,692
I'm going to go ahead and\nmake speller in C on the left
14457
13:53:26,692 --> 13:53:28,792
and note that it takes\na moment to compile.
14458
13:53:28,792 --> 13:53:33,052
Then I'm going to be ready to\nrun speller of dictionaries
14459
13:53:33,052 --> 13:53:35,852
let's do like the Sherlock\nHolmes text, which is pretty big.
14460
13:53:35,851 --> 13:53:40,491
And then over here, let me get\nready to run Python of speller
14461
13:53:44,254 --> 13:53:46,671
So the syntax is a little\ndifferent at the command prompt.
14462
13:53:46,671 --> 13:53:49,402
I just, on the left, have to\ncompile the code, with make
14463
13:53:49,402 --> 13:53:51,171
and then run it with ./speller.
14464
13:53:51,171 --> 13:53:52,891
On the right, I don't\nneed to compile it.
14465
13:53:52,891 --> 13:53:54,381
But I do need to use the interpreter.
14466
13:53:54,381 --> 13:53:56,752
So even though the lines are\nwrapping a little bit here
14467
13:53:56,752 --> 13:53:58,701
let me go ahead and run it on the right.
14468
13:53:58,701 --> 13:54:00,826
And I'm going to count how\nlong it takes, verbally
14469
13:54:02,091 --> 13:54:05,241
One Mississippi, two Mississippi,\nthree Mississippi, OK
14470
13:54:05,241 --> 13:54:07,711
so it's like three\nseconds, give or take.
14471
13:54:07,711 --> 13:54:10,042
Now running it in\nPython, keeping in mind
14472
13:54:10,042 --> 13:54:13,625
I spent way fewer hours implementing\na spell checker in Python
14473
13:54:13,625 --> 13:54:15,292
than you might have in problem set five.
14474
13:54:15,292 --> 13:54:18,529
But what's the trade-off going to be,\n
14475
13:54:18,529 --> 13:54:20,362
do we all now need to\nbe making consciously?
14476
13:54:20,362 --> 13:54:22,822
Here we go, on the right, in Python.
14477
13:54:22,822 --> 13:54:26,542
One Mississippi, two Mississippi,\n
14478
13:54:26,542 --> 13:54:30,592
five Mississippi, six Mississippi,\n
14479
13:54:30,591 --> 13:54:33,621
nine Mississippi, 10\nMississippi, 11 Mississippi
14480
13:54:33,622 --> 13:54:36,512
all right, so 10 or 11 seconds.
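Counting aloud gives only a rough figure. As a hedged aside, elapsed time can also be measured inside Python itself with the standard `time.perf_counter` function. The helper name `timed` and the workload below are hypothetical and are not the speller from the lecture.

```python
import time


def timed(function, *args):
    """Run function(*args) and return its result with the seconds it took."""
    start = time.perf_counter()
    result = function(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def count_known(candidates, known):
    # Count how many candidate strings appear in the known set.
    return sum(1 for candidate in candidates if candidate in known)


# Hypothetical workload: 100,000 lookups in a set of 100,000 strings.
known = {str(n) for n in range(100000)}
candidates = [str(n) for n in range(100000)]

found, seconds = timed(count_known, candidates, known)
print(found, "lookups succeeded in", round(seconds, 4), "seconds")
```

The measured time will vary from machine to machine and from run to run.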
14481
13:54:38,502 --> 13:54:43,072
Let's go to the group here, which\n
14482
13:54:43,072 --> 13:54:47,302
How might you answer that question,\n
14483
13:54:48,052 --> 13:54:50,260
AUDIENCE: I think Python's\nbetter for the programmer
14484
13:54:50,260 --> 13:54:54,368
more comfortable for the programmer,\n
14485
13:54:54,368 --> 13:54:56,201
DAVID J. MALAN: OK, so\nPython, to summarize
14486
13:54:56,201 --> 13:54:59,981
is better for the programmer,\n
14487
13:54:59,982 --> 13:55:02,982
but C is maybe better for the computer,\n
14488
13:55:02,982 --> 13:55:04,649
I think that's a reasonable formulation.
14489
13:55:07,110 --> 13:55:09,402
AUDIENCE: I think it depends\non the size of the project
14490
13:55:10,432 --> 13:55:12,807
So if it's going to be something\nthat's relatively quick
14491
13:55:12,807 --> 13:55:15,232
I might not care that it\ntakes 10 seconds to do it.
14492
13:55:15,232 --> 13:55:17,432
And it could be way faster\nto do it with Python.
14493
13:55:17,432 --> 13:55:20,592
Whereas with C, if I'm dealing\n
14494
13:55:20,591 --> 13:55:24,822
set or something huge, then that\n
14495
13:55:24,822 --> 13:55:29,262
it might be worth it to put in the\n
14496
13:55:29,262 --> 13:55:32,781
so the process continually will run\n
14497
13:55:32,781 --> 13:55:33,951
DAVID J. MALAN: Absolutely,\na really good answer.
14498
13:55:33,951 --> 13:55:36,822
And let me summarize, is it depends\n
14499
13:55:36,822 --> 13:55:40,572
If you have a very large\ndata set, you might
14500
13:55:40,572 --> 13:55:43,650
want to optimize your code to be as\n
14501
13:55:43,650 --> 13:55:45,942
especially if you're running\nthat code again and again.
14502
13:55:45,942 --> 13:55:47,472
Maybe you're a company like Google.
14503
13:55:47,472 --> 13:55:49,632
People are searching a\nhuge database all the time.
14504
13:55:49,631 --> 13:55:52,271
You really want to squeeze\nevery bit of performance
14505
13:55:52,271 --> 13:55:53,743
as you can out of the computer.
14506
13:55:53,743 --> 13:55:56,201
You might want to have someone\nsmart take a language like C
14507
13:55:56,201 --> 13:55:57,971
and write it at a very low level.
14508
13:55:59,921 --> 13:56:02,671
They're going to have to deal with\n
14509
13:56:02,671 --> 13:56:06,011
But if and when it works correctly, it's\n
14510
13:56:06,012 --> 13:56:08,802
By contrast, if you have\na data set that's big
14511
13:56:08,802 --> 13:56:12,342
and 140,000 words is\nnot small, but you don't
14512
13:56:12,341 --> 13:56:15,461
want to spend like 5 hours,\n10 hours, a week of your time
14513
13:56:15,461 --> 13:56:17,584
building a spell\nchecker or a dictionary
14514
13:56:17,584 --> 13:56:20,502
you can instead leverage a different\n
14515
13:56:20,502 --> 13:56:25,211
and build on top of it, in order to\n
14516
13:56:27,362 --> 13:56:29,310
AUDIENCE: Would you,\nbecause with Python
14517
13:56:29,311 --> 13:56:33,450
doesn't it also like\nconvert the words, or like
14518
13:56:33,449 --> 13:56:35,060
convert the words, for a lesson?
14519
13:56:35,061 --> 13:56:37,103
When we convert that into\nthe same version again
14520
13:56:37,103 --> 13:56:40,670
do we just take that into view?
14521
13:56:40,669 --> 13:56:43,461
DAVID J. MALAN: That's a perfect\n
14522
13:56:43,461 --> 13:56:45,862
wanted to make, which was, is\nthere something in between?
14523
13:56:46,881 --> 13:56:49,491
I'm oversimplifying what this\nlanguage is actually doing.
14524
13:56:49,491 --> 13:56:51,801
It's not as stark a difference\nas saying, like, hey
14525
13:56:51,802 --> 13:56:54,862
Python is four times slower than C.\n
14526
13:56:54,862 --> 13:56:57,982
There are absolutely ways that\nengineers can optimize languages
14527
13:56:57,982 --> 13:56:59,752
as they have already done for Python.
14528
13:56:59,752 --> 13:57:02,362
And in fact, I've configured\nmy settings in such a way
14529
13:57:02,362 --> 13:57:05,298
that I've kind of dramatized\njust how big the difference is.
14530
13:57:05,298 --> 13:57:07,131
It is going to be slower,\nPython, typically
14531
13:57:07,131 --> 13:57:08,451
than the equivalent C program.
14532
13:57:08,451 --> 13:57:10,461
But it doesn't have\nto be as big of a gap
14533
13:57:10,461 --> 13:57:14,241
as it is here, because, indeed, among\n
14534
13:57:14,241 --> 13:57:16,641
is to save some intermediate results.
14535
13:57:16,641 --> 13:57:19,881
Technically speaking, yes,\nPython is interpreting
14536
13:57:19,881 --> 13:57:23,211
Dictionary.py and these\nother files, translating them
14537
13:57:23,211 --> 13:57:24,724
from one language to another.
14538
13:57:24,724 --> 13:57:27,891
But that doesn't mean it has to do that\n
14539
13:57:27,891 --> 13:57:33,542
As you propose, you can save, or cache,\n
14540
13:57:33,542 --> 13:57:36,961
So that the second time and the third\n
14541
13:57:36,961 --> 13:57:39,951
And, in fact, Python itself, the\n
14542
13:57:39,951 --> 13:57:42,502
thereof, itself is\nactually implemented in C.
14543
13:57:42,502 --> 13:57:45,811
So you can make sure that your\n
14544
13:57:45,811 --> 13:57:47,871
And what then is maybe\nthe high level takeaway?
14545
13:57:47,872 --> 13:57:50,842
Yes, if you are going to try to\nsqueeze every bit of performance
14546
13:57:50,841 --> 13:57:54,231
out of your code, and\nmaybe code is constrained.
14547
13:57:54,232 --> 13:57:55,672
Maybe you have very small devices.
14548
13:57:55,671 --> 13:57:57,292
Maybe it's like a watch nowadays.
14549
13:57:57,292 --> 13:58:02,842
Or maybe it's a sensor that's installed\n
14550
13:58:02,841 --> 13:58:06,231
or in infrastructure, where you\ndon't have much battery life
14551
13:58:06,232 --> 13:58:08,152
and you don't have much\nsize, you might want
14552
13:58:08,152 --> 13:58:10,232
to minimize just how\nmuch work is being done.
14553
13:58:10,232 --> 13:58:13,265
And so the faster the code runs,\n
14554
13:58:13,264 --> 13:58:14,932
if it's implemented something low level.
14555
13:58:14,932 --> 13:58:18,832
So C is still very commonly used\n
14556
13:58:18,832 --> 13:58:22,101
But, again, if you just want\nto solve real world problems
14557
13:58:22,101 --> 13:58:26,362
and get real work done, and your time\n
14558
13:58:26,362 --> 13:58:28,522
than the device you're\nrunning it on, long term
14559
13:58:28,521 --> 13:58:31,879
you know what, Python is among the\n
14560
13:58:31,879 --> 13:58:34,671
And frankly, if I were implementing\n
14561
13:58:34,671 --> 13:58:36,231
I'm probably starting with Python.
14562
13:58:36,232 --> 13:58:38,065
And I'm not going to\nwaste time implementing
14563
13:58:38,065 --> 13:58:41,452
all of that low-level stuff, because\n
14564
13:58:41,451 --> 13:58:45,981
modern languages is to use abstractions\n
14565
13:58:45,982 --> 13:58:49,432
And by abstraction, I mean something\n
14566
13:58:49,432 --> 13:58:51,891
that just gives you a\ndictionary, or hash table
14567
13:58:51,891 --> 13:58:55,747
or the equivalent version that I\n
14568
13:58:55,747 --> 13:58:59,242
All right, any questions,\nthen, on Python thus far?
14569
13:59:04,232 --> 13:59:06,442
AUDIENCE: Could you\ncompile the Python code
14570
13:59:06,442 --> 13:59:11,132
or is there some, I'd imagine that\n
14571
13:59:11,131 --> 13:59:14,701
but it feels like if you can just\n
14572
13:59:14,701 --> 13:59:16,614
that would give you the\nbest of both worlds.
14573
13:59:16,614 --> 13:59:18,781
DAVID J. MALAN: Really good\nquestion or observation
14574
13:59:18,781 --> 13:59:20,239
could you just compile Python code?
14575
13:59:20,239 --> 13:59:23,701
Yes, absolutely, this idea of\n
14576
13:59:23,701 --> 13:59:26,011
is not native to the language itself.
14577
13:59:26,012 --> 13:59:28,932
It tends to be native to the\nconventions that we humans use.
14578
13:59:28,932 --> 13:59:31,252
So you could actually\nwrite an interpreter for C
14579
13:59:31,252 --> 13:59:34,502
that would read it top to bottom, left\n
14580
13:59:34,502 --> 13:59:38,162
something the computer understands, but\n
14581
13:59:38,161 --> 13:59:40,081
C is generally a compiled language.
14582
13:59:41,192 --> 13:59:44,531
What Python nowadays is actually\n
14583
13:59:44,531 --> 13:59:46,741
It technically is, sort\nof unbeknownst to us
14584
13:59:46,741 --> 13:59:50,491
compiling the code, technically\n
14585
13:59:50,491 --> 13:59:54,031
into something called byte code,\n
14586
13:59:54,031 --> 13:59:58,031
just doesn't take as much time as it\n
14587
13:59:58,031 --> 14:00:00,898
And this is an area of research\nfor computer scientists working
14588
14:00:00,898 --> 14:00:03,481
in programming languages, to\nimprove these kinds of paradigms.
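A brief sketch of the byte code idea, assuming the standard CPython interpreter. The built-in `compile` function and the `dis` and `py_compile` modules ship with Python; the one-line program and the file names are made up for illustration.

```python
import dis
import os
import py_compile
import tempfile

source = 'print("hello, world")\n'

# Compile the source text into a code object, which holds byte code.
code = compile(source, "<example>", "exec")

# Show the byte code instructions; the exact listing varies by version.
dis.dis(code)

# Run the already-compiled code object.
exec(code)

# Save the compiled form to disk so it need not be rebuilt on a later run.
folder = tempfile.mkdtemp()
script = os.path.join(folder, "hello.py")
with open(script, "w") as file:
    file.write(source)
cached = py_compile.compile(script, cfile=os.path.join(folder, "hello.pyc"))
print(cached)
```

In everyday use the interpreter does this caching on its own for imported files, which is why the step stays out of sight.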
14589
14:00:04,021 --> 14:00:07,261
Well, honestly, for you and I, the\n
14590
14:00:07,262 --> 14:00:10,322
one, run the code and not worry\nabout the stupid second step
14591
14:00:10,322 --> 14:00:11,622
of compiling it all the time.
14592
14:00:12,122 --> 14:00:14,742
It's literally half as many\nsteps for me, the human.
14593
14:00:14,741 --> 14:00:17,021
And that's a nice thing to optimize for.
14594
14:00:17,021 --> 14:00:20,851
And ultimately, too, you might\n
14595
14:00:20,851 --> 14:00:22,441
come with these other languages.
14596
14:00:22,442 --> 14:00:24,482
So you should really\njust be fine-tuning how
14597
14:00:24,482 --> 14:00:28,322
you can enable these features, as\n
14598
14:00:28,322 --> 14:00:31,112
And, in fact, the only time\nI personally ever use C
14599
14:00:31,112 --> 14:00:34,472
is from like September to October\nof every year, during CS50.
14600
14:00:34,472 --> 14:00:36,872
Almost every other month\ndo I reach for Python
14601
14:00:36,872 --> 14:00:40,211
or another language called JavaScript,\n
14602
14:00:40,211 --> 14:00:44,161
which is not to impugn C. It's just that\n
14603
14:00:44,161 --> 14:00:47,551
fits for the amount of time I have to\n
14604
14:00:48,427 --> 14:00:50,927
All right, let's go ahead and\ntake a five minute break here.
14605
14:00:50,927 --> 14:00:53,912
And when we come back, we'll start\n
14606
14:00:54,822 --> 14:00:58,262
So let's go ahead and start writing\nsome code from the beginning
14607
14:00:58,262 --> 14:01:01,232
here, whereby we start small\nwith some simple examples
14608
14:01:01,232 --> 14:01:04,564
and then we'll build our way up to\n
14609
14:01:04,563 --> 14:01:06,271
But what we'll do\nalong the way is first
14610
14:01:06,271 --> 14:01:08,386
look side by side at\nwhat the C code looked
14611
14:01:08,387 --> 14:01:11,162
like way back in week 1\nor 2 or 3 and so forth
14612
14:01:11,161 --> 14:01:13,411
and then write the corresponding\nPython code at right.
14613
14:01:13,411 --> 14:01:16,051
And then we'll transition just\nto focusing on Python itself.
14614
14:01:16,052 --> 14:01:18,844
What I've done in advance today is\n
14615
14:01:18,843 --> 14:01:21,451
from the course's website,\nmy source 6 directory, which
14616
14:01:21,451 --> 14:01:24,346
contains all of the pre-written\nC code from weeks past.
14617
14:01:24,347 --> 14:01:26,222
But it'll also have\ncopies of the Python code
14618
14:01:26,222 --> 14:01:28,182
we'll write here together and look at.
14619
14:01:28,182 --> 14:01:31,967
So first, here is\nHello.c back from week 0.
14620
14:01:31,966 --> 14:01:33,844
And this was version 0 of it.
14621
14:01:33,845 --> 14:01:35,262
I'm going to go ahead and do this.
14622
14:01:35,262 --> 14:01:38,762
I'm going to go ahead and\nsplit my code window up here.
14623
14:01:38,762 --> 14:01:41,564
I'm going to go ahead and create\na new file called Hello.py.
14624
14:01:41,563 --> 14:01:43,771
And this isn't something\nyou'll typically have to do
14625
14:01:43,771 --> 14:01:45,331
laying your code out side by side.
14626
14:01:45,332 --> 14:01:47,402
But I've just clicked the\nlittle icon in VS Code
14627
14:01:47,402 --> 14:01:50,851
that looks like two columns, that\n
14628
14:01:50,851 --> 14:01:53,851
so that we can, in fact, see\nthings, for now, side by side
14629
14:01:53,851 --> 14:01:55,309
with my terminal window down below.
14630
14:01:55,309 --> 14:01:58,269
All right, now I'm going to go ahead\n
14631
14:01:58,269 --> 14:02:01,082
program on the right, which,\nrecall, was just print, quote
14632
14:02:01,082 --> 14:02:03,692
unquote, "Hello, world," and that's it.
14633
14:02:03,692 --> 14:02:05,942
Now down in my terminal\nwindow, I'm going
14634
14:02:05,942 --> 14:02:09,602
to go ahead and run Python of\nHello.py, Enter, and voila
14635
14:02:10,972 --> 14:02:13,472
So again, I'm not going to play\nany further with the C code.
14636
14:02:13,472 --> 14:02:15,452
It's there just to jog\nyour memory left and right.
14637
14:02:15,451 --> 14:02:17,761
So let's now look at a second\nversion of Hello, world
14638
14:02:17,762 --> 14:02:20,974
from that first week, whereby\nif I go and get Hello1.c
14639
14:02:20,974 --> 14:02:22,682
I'm going to drag that\nover to the right.
14640
14:02:22,682 --> 14:02:25,502
Whoops, I'm going to go ahead and\n
14641
14:02:25,502 --> 14:02:28,472
And now, on the right,\nlet's modify Hello.py
14642
14:02:28,472 --> 14:02:32,222
to look a little more like this\nsecond version in C, all right?
14643
14:02:32,222 --> 14:02:36,389
I want to get an answer from\nthe user as a return value
14644
14:02:36,389 --> 14:02:38,222
but I also want to get\nsome input from them.
14645
14:02:38,222 --> 14:02:41,942
So from CS50, I'm going to import the\n
14646
14:02:41,942 --> 14:02:43,692
We're going to get rid\nof that eventually
14647
14:02:43,692 --> 14:02:45,484
but for now, it's a\nhelpful training wheel.
14648
14:02:45,483 --> 14:02:47,701
And then down here, I'm\ngoing to say, answer
14649
14:02:47,701 --> 14:02:51,031
equals getString quote\nunquote, "What's your name?"
14650
14:02:52,502 --> 14:02:53,974
But no semicolon, no data type.
14651
14:02:53,974 --> 14:02:55,891
And then I'm going to\ngo ahead and print, just
14652
14:02:55,891 --> 14:03:01,639
like the first example on the slide,\n
14653
14:03:01,639 --> 14:03:03,182
And now let me go ahead and run this.
14654
14:03:03,182 --> 14:03:06,182
Python, of Hello.py, all right,\nit's asking me what's my name.
14655
14:03:07,891 --> 14:03:13,029
But it's worth calling attention to the\n
14656
14:03:13,029 --> 14:03:15,362
It's not just that the\nindividual functions are simpler.
14657
14:03:15,362 --> 14:03:18,991
What is also now glaringly omitted\nfrom my Python code at right
14658
14:03:18,991 --> 14:03:21,178
both in this version,\nand the previous version.
14659
14:03:21,178 --> 14:03:22,636
What did I not bother implementing?
14660
14:03:23,788 --> 14:03:26,371
DAVID J. MALAN: Yeah, so I didn't\neven need to implement main.
14661
14:03:26,372 --> 14:03:29,732
We'll revisit the main function,\nbecause having a main function
14662
14:03:29,732 --> 14:03:31,382
actually does solve problems sometimes.
14663
14:03:31,381 --> 14:03:32,612
But it's no longer required.
14664
14:03:32,612 --> 14:03:36,272
In C you have to have that to kick-start\n
14665
14:03:36,858 --> 14:03:39,691
And in fact, if you were missing\n
14666
14:03:39,692 --> 14:03:42,555
if you accidentally compiled\nHelpers.c instead of the file
14667
14:03:42,555 --> 14:03:44,972
that contained main, you would\nhave seen a compiler error.
14668
14:03:44,972 --> 14:03:46,180
In Python it's not necessary.
14669
14:03:46,180 --> 14:03:48,932
In Python you can just jump right\n
14670
14:03:49,872 --> 14:03:51,747
Especially if it's a\nsmall program like this
14671
14:03:51,747 --> 14:03:54,732
you don't need the added overhead\n
14672
14:03:54,732 --> 14:03:56,382
So that's one other difference here.
14673
14:03:56,381 --> 14:03:59,911
All right, there are a few other\nways we could say Hello, world.
14674
14:03:59,911 --> 14:04:02,681
Recall that I could use a format string.
14675
14:04:02,682 --> 14:04:06,881
So I could put this whole thing in\n
14676
14:04:06,881 --> 14:04:09,771
And then let me go ahead and\nrun Python of Hello.py again.
14677
14:04:09,771 --> 14:04:11,771
You can perhaps see where\nwe're going with this.
14678
14:04:11,771 --> 14:04:13,691
Let me type my name,\nDavid, and here we go.
14679
14:04:13,692 --> 14:04:16,092
OK, that's the mistake that\nsomeone identified earlier
14680
14:04:17,561 --> 14:04:21,461
Otherwise no variables are\ninterpolated, that is substituted
14681
14:04:22,911 --> 14:04:26,681
So if I go back in and add those\ncurly braces to the f-string
14682
14:04:26,682 --> 14:04:31,154
now let me run Python of Hello.py,\n
14683
14:04:33,701 --> 14:04:37,061
But generally speaking, making\nshorter, more concise code
14684
14:04:38,391 --> 14:04:42,972
So stylistically, the f-string is\n
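A minimal sketch of the interpolation point above. The variable name answer follows the lecture; its value is hard-coded here instead of read from the user, which is an assumption made only so the snippet runs on its own. It suggests that both the f prefix and the curly braces are needed before anything is substituted.

```python
# Hard-coded stand-in for what the user would type at the prompt.
answer = "David"

# Braces but no f prefix: the braces are kept literally.
no_prefix = "hello, {answer}"

# f prefix but no braces: "answer" is just ordinary text.
no_braces = f"hello, answer"

# f prefix and braces together: the variable's value is substituted.
interpolated = f"hello, {answer}"

print(no_prefix)
print(no_braces)
print(interpolated)
```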
14685
14:04:42,972 --> 14:04:45,802
All right, well, what more\ncan we do besides this?
14686
14:04:45,802 --> 14:04:48,702
Well, let me go ahead here and\n
14687
14:04:51,701 --> 14:04:54,671
Let me get rid of the CS50\nlibrary, which we will ultimately
14688
14:04:54,671 --> 14:04:56,141
do in a couple of weeks, anyway.
14689
14:04:56,141 --> 14:04:59,082
I can't use getString,\nbut I can use a function
14690
14:04:59,082 --> 14:05:01,252
that comes with Python called input.
14691
14:05:01,252 --> 14:05:04,572
And, in fact, this is actually a\n
14692
14:05:04,572 --> 14:05:07,902
There's really no downside to\nusing input instead of getString.
14693
14:05:07,902 --> 14:05:09,942
We implement getString\njust for consistency
14694
14:05:09,942 --> 14:05:14,322
with what you saw in C. Python of\n
14695
14:05:14,322 --> 14:05:15,832
Still actually works the same.
14696
14:05:15,832 --> 14:05:17,749
So gone are the CS50\nspecific training wheels.
14697
14:05:17,749 --> 14:05:19,749
But we're going to bring\nthem back shortly, just
14698
14:05:19,749 --> 14:05:21,762
to deal with integers or\nfloats or other values
14699
14:05:21,762 --> 14:05:24,012
too, because it's going to make\nour lives a little simpler
14700
14:05:25,031 --> 14:05:28,871
All right, any questions, before we\n
14701
14:05:28,872 --> 14:05:32,802
from week 1, but now in Python?
14702
14:05:32,802 --> 14:05:34,632
All right, let me go\nahead and open up now.
14703
14:05:34,631 --> 14:05:39,761
Let's say Calculator0.c, which was one\n
14704
14:05:39,762 --> 14:05:43,391
math and operators like that, as\nwell as functions like getInt
14705
14:05:43,391 --> 14:05:48,341
let me go ahead and create a new\nfile now called Calculator.py
14706
14:05:48,341 --> 14:05:51,881
at right, so that I have\nmy C code at left still
14707
14:05:51,881 --> 14:05:53,471
and my Python code at right.
14708
14:05:53,472 --> 14:05:57,132
All right, let me go dive into a\n
14709
14:05:57,131 --> 14:05:59,621
I am going to use getInt\nfrom the CS50 library.
14710
14:06:01,482 --> 14:06:03,862
I'm going to go ahead now\nand get an Int from the user.
14711
14:06:03,862 --> 14:06:07,522
So x equals getInt, and I'll\nask them for an x value
14712
14:06:08,951 --> 14:06:14,322
No need to specify a semicolon,\nthough, or an Int for the x.
14713
14:06:15,461 --> 14:06:18,612
Y is going to get\nanother Int via y colon
14714
14:06:18,612 --> 14:06:23,351
and then down here, I'm going to\n
14715
14:06:23,351 --> 14:06:25,241
So this is already a bit new.
14716
14:06:25,241 --> 14:06:29,921
Recall, the C version required that\n
14717
14:06:30,949 --> 14:06:32,741
Python is just a little\nmore user-friendly.
14718
14:06:32,741 --> 14:06:36,191
If all you want to do is print out a\n
14719
14:06:36,192 --> 14:06:39,132
Don't futz with any percent\nsigns or format codes.
14720
14:06:39,131 --> 14:06:41,682
It's not printf, it's\nindeed just print now.
14721
14:06:41,682 --> 14:06:45,131
All right, let me go ahead and\nrun Python of Calculator.py
14722
14:06:45,131 --> 14:06:50,141
Enter, just do a quick sample,\n1 plus 2 indeed equals 3.
14723
14:06:50,141 --> 14:06:52,932
As an aside, suppose I had\ntaken a different approach
14724
14:06:52,932 --> 14:06:56,029
to importing the whole CS50 library,\n
14725
14:06:56,029 --> 14:06:58,072
You're not going to notice any\nperformance impact here.
14726
14:06:59,211 --> 14:07:02,201
But notice what does not\nwork now, whereas it did work
14727
14:07:02,201 --> 14:07:07,631
in C. Python of Calculator.py, Enter,\n
14728
14:07:08,211 --> 14:07:10,091
So a traceback is just\na term of art that
14729
14:07:10,091 --> 14:07:13,731
says, here is a trace back\nthrough all of the functions
14730
14:07:14,771 --> 14:07:16,691
In the world of C, you\nmight call this a stack
14731
14:07:16,692 --> 14:07:19,459
trace, stack being the operative word.
14732
14:07:19,459 --> 14:07:21,792
Recall that when we talked\nabout the stack and the heap
14733
14:07:21,792 --> 14:07:24,599
the stack, like a stack of trays,\nwas all of the functions that
14734
14:07:24,599 --> 14:07:26,182
might get called, one after the other.
14735
14:07:26,182 --> 14:07:30,851
We had main, we had swap, then swap went\n
14736
14:07:30,851 --> 14:07:34,542
So here's a trace back of all of the\n
14737
14:07:34,542 --> 14:07:37,402
There's not really any functions\nother than my file itself.
14738
14:07:37,402 --> 14:07:38,872
Otherwise there'd be more detail.
14739
14:07:38,872 --> 14:07:42,102
But even though it's a little cryptic,\n
14740
14:07:42,101 --> 14:07:46,481
here, name error, so something related\n
14741
14:07:47,472 --> 14:07:50,711
And this of course, happens\non line 3 over there.
14742
14:07:52,042 --> 14:07:55,692
Well, Python essentially\nallows us to namespace
14743
14:07:55,692 --> 14:07:58,272
our functions that come from libraries.
14744
14:07:58,271 --> 14:08:01,811
There was a problem in C. If\nyou were using the CS50 library
14745
14:08:01,811 --> 14:08:03,701
and thus had access\nto getInt, getString
14746
14:08:03,701 --> 14:08:06,371
and so forth, you could\nnot use another library
14747
14:08:06,372 --> 14:08:08,112
that had the same function names.
14748
14:08:08,112 --> 14:08:10,031
They would collide, and\nthe compiler would not
14749
14:08:10,031 --> 14:08:12,551
know how to link them\ntogether correctly.
14750
14:08:12,552 --> 14:08:18,042
In Python, and other languages\nlike JavaScript, and in Java
14751
14:08:18,042 --> 14:08:21,792
you have support for effectively\n
14752
14:08:21,792 --> 14:08:26,891
You can isolate variables and\n
14753
14:08:26,891 --> 14:08:29,112
like their own container in memory.
14754
14:08:29,112 --> 14:08:32,082
And what this means is,\nif you import all of CS50
14755
14:08:32,082 --> 14:08:36,252
you have to say that the getInt you\n
14756
14:08:36,252 --> 14:08:39,701
So just like with the image\nblurring, and the image edges
14757
14:08:39,701 --> 14:08:44,951
before, where I had to specify image dot\n
14758
14:08:44,951 --> 14:08:48,491
am I specifying with a dot operator,\n
14759
14:08:48,491 --> 14:08:50,932
want CS50.getInt in both places.
14760
14:08:50,932 --> 14:08:54,641
And now if I rerun Python\nof Calculator.py, 1 and 2
14761
14:08:57,311 --> 14:09:01,311
Generally speaking, it depends\non just how many functions
14762
14:09:01,311 --> 14:09:02,561
you're using from the library.
14763
14:09:02,561 --> 14:09:05,561
If you're using a whole bunch of\n
14764
14:09:05,561 --> 14:09:09,854
If you're only using maybe one\nor two, import them line by line.
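To illustrate the two import styles without requiring the cs50 package, here is a sketch that uses Python's standard math module as a stand-in. That substitution is an assumption; the lecture itself uses the CS50 library. The pattern is the same: after importing a whole module, its functions are reached through the module name and a dot, while a bare name that was never imported raises a NameError.

```python
# Stand-in: the standard math module plays the role of the cs50 library.
import math

# Whole-module import: the function is reached through the module's name.
root = math.sqrt(16)

# The bare name has not been imported yet, so using it raises NameError.
try:
    sqrt(16)
except NameError as error:
    message = str(error)

# Importing one name directly makes the bare name available from here on.
from math import sqrt

same_root = sqrt(16)

print(root, same_root)
print(message)
```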
14765
14:09:09,855 --> 14:09:12,272
All right, so let's go ahead\nand make a little tweak here.
14766
14:09:12,271 --> 14:09:15,438
Let's get rid of this library\nand take this training wheel off
14767
14:09:15,438 --> 14:09:18,271
too, as quickly as we introduced\n
14768
14:09:18,271 --> 14:09:20,831
you'll be able to use all\nof these same functions.
14769
14:09:20,832 --> 14:09:24,632
Suppose I get rid of this, and\nI just use the input function
14770
14:09:24,631 --> 14:09:28,231
just like I did by\nreplacing getString earlier.
14771
14:09:28,232 --> 14:09:31,232
Let me go ahead now and run\nthis version of the code.
14772
14:09:31,232 --> 14:09:37,486
Python of Calculator.py, OK,\nhow about 1 plus 2 equals 3.
14773
14:09:39,182 --> 14:09:41,851
All right, obviously wrong, incorrect.
14774
14:09:41,851 --> 14:09:46,411
Can anyone explain what just\nhappened, based on instincts?
14775
14:09:47,911 --> 14:09:49,141
AUDIENCE: You want an answer?
14776
14:09:49,141 --> 14:09:50,266
DAVID J. MALAN: Sure, yeah.
14777
14:09:50,266 --> 14:09:54,451
AUDIENCE: Say you have a number\nof strings that don't have Ints
14778
14:09:54,451 --> 14:09:57,841
so you would part with them and\nsay, printing one, two, better.
14779
14:09:57,841 --> 14:10:01,171
DAVID J. MALAN: Exactly, Python\nis interpreting, or treating
14780
14:10:01,171 --> 14:10:03,332
both x and y as strings,\nwhich is actually
14781
14:10:03,332 --> 14:10:05,641
what the input function\nreturns by default.
14782
14:10:05,641 --> 14:10:08,671
And so plus is now being interpreted\n
14783
14:10:09,182 --> 14:10:12,302
So x plus y isn't x\nplus y mathematically
14784
14:10:12,302 --> 14:10:15,002
but in terms of string\njoining, just like in Scratch.
14785
14:10:15,002 --> 14:10:18,211
So that's why we're getting\n12, or really one two
14786
14:10:18,211 --> 14:10:19,561
which isn't itself a number.
14787
14:10:20,701 --> 14:10:22,471
So we somehow need to convert things.
14788
14:10:22,472 --> 14:10:25,561
And we didn't have this\nability quite as easily in C.
14789
14:10:25,561 --> 14:10:29,191
We did have, like, the atoi\nfunction, ASCII to integer,
14790
14:10:29,192 --> 14:10:30,792
which did allow you to do this.
14791
14:10:30,792 --> 14:10:35,912
The analog in Python is actually just\n
14792
14:10:35,911 --> 14:10:39,271
So just like in C, you\ncan use the keyword int
14793
14:10:39,271 --> 14:10:41,021
but you use it a little differently.
14794
14:10:41,021 --> 14:10:45,822
Notice that I'm not doing parenthesis\n
14795
14:10:45,822 --> 14:10:47,531
I'm using int as a function.
14796
14:10:47,531 --> 14:10:49,951
So indeed, in Python, int is a function.
14797
14:10:49,951 --> 14:10:53,131
float is a function that\nyou can pass values into
14798
14:10:53,131 --> 14:10:54,792
to do this kind of conversion.
14799
14:10:54,792 --> 14:10:58,531
So now, if I run Python\nof Calculator.py, 1 and 2
14800
14:10:58,531 --> 14:11:01,951
now we're back in business,\nand getting the answer of 3.
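The difference between joining and adding can be sketched as follows. The string literals stand in for what input would return after the user types 1 and 2; that substitution is an assumption made so the snippet needs no keyboard.

```python
# input() always returns a str; these literals stand in for typed input.
x = "1"
y = "2"

# With two strings, + joins them end to end.
joined = x + y

# Converting each string with int() makes + do arithmetic instead.
total = int(x) + int(y)

print(joined)  # 12
print(total)   # 3
```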
14801
14:11:01,951 --> 14:11:03,761
But there's kind of a catch here.
14802
14:11:03,762 --> 14:11:04,952
There's always going to be a trade-off.
14803
14:11:04,951 --> 14:11:07,081
Like that sounds amazing that\nit just works in this way.
14804
14:11:07,082 --> 14:11:08,972
We can throw away the\nCS50 library already.
14805
14:11:08,972 --> 14:11:13,652
But what if the user accidentally\n
14806
14:11:13,652 --> 14:11:15,557
like a cat, instead of a number.
14807
14:11:15,557 --> 14:11:17,432
Damn, well, there's one\nof these tracebacks.
14808
14:11:17,432 --> 14:11:19,302
Like, now my program has crashed.
14809
14:11:19,302 --> 14:11:21,864
This is similar in spirit\nto the kinds of segfaults
14810
14:11:21,864 --> 14:11:23,072
that you might have had in C.
14811
14:11:23,072 --> 14:11:24,362
But they're not segfaults per se.
14812
14:11:24,362 --> 14:11:26,029
It doesn't necessarily relate to memory.
14813
14:11:26,029 --> 14:11:31,812
This time it relates to actual\n
14814
14:11:31,811 --> 14:11:34,771
So this time it's not a name\nerror, it's a value error
14815
14:11:34,771 --> 14:11:39,101
invalid literal for int with\nbase 10, quote unquote, "cat."
14816
14:11:39,101 --> 14:11:43,322
So, again, it's written for sort\nof a programmer, more than sort
14817
14:11:43,322 --> 14:11:46,171
of a typical person, because it's\n
14818
14:11:46,171 --> 14:11:47,421
But let's try to interpret it.
14819
14:11:47,421 --> 14:11:51,383
Invalid literal, a literal is just\n
14820
14:11:51,383 --> 14:11:52,841
is the function name, with base 10.
14821
14:11:52,841 --> 14:11:54,691
It's just defaulting to decimal numbers.
14822
14:11:54,692 --> 14:11:56,937
Cat is apparently not a decimal number.
14823
14:11:56,936 --> 14:11:59,561
It doesn't look like it, therefore\nit can't be treated like it.
14824
14:11:59,561 --> 14:12:01,451
Therefore, there's a value error.
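A small sketch of where that message comes from. It triggers the same ValueError on purpose and captures the text instead of letting the program crash. The hexadecimal line is an added illustration of what "base 10" refers to, not something shown in the lecture.

```python
# A decimal-looking string converts cleanly.
ten = int("10")

# "cat" does not look like a decimal number, so int() raises ValueError.
try:
    int("cat")
except ValueError as error:
    message = str(error)

# The base is an optional second argument; 10 is only the default.
two_fifty_five = int("ff", 16)

print(ten, two_fifty_five)
print(message)
```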
14825
14:12:03,271 --> 14:12:06,721
Unfortunately, you would have\nto somehow catch this error.
14826
14:12:06,722 --> 14:12:08,972
And the only way to do\nthat in Python really
14827
14:12:08,972 --> 14:12:11,491
is by way of another\nfeature that C did not have
14828
14:12:11,491 --> 14:12:13,921
namely, what are called exceptions.
14829
14:12:13,921 --> 14:12:18,601
An exception is exactly what just\n
14830
14:12:18,601 --> 14:12:22,112
They are things that can go wrong\n
14831
14:12:22,112 --> 14:12:27,192
that aren't necessarily going to be\n
14832
14:12:27,192 --> 14:12:32,762
So in Python, and in JavaScript, and in\n
14833
14:12:32,762 --> 14:12:35,762
there's this ability to\nactually try to do something
14834
14:12:35,762 --> 14:12:37,537
except if something goes wrong.
14835
14:12:37,536 --> 14:12:39,661
And in fact, I'm going to\nintroduce a bit of syntax
14836
14:12:39,661 --> 14:12:42,078
here, even though we won't\nhave to use this much just yet.
14837
14:12:42,078 --> 14:12:46,502
Instead of just blindly converting\nx to an Int, let me go ahead
14838
14:12:48,491 --> 14:12:51,902
And if there's an exception,\ngo ahead and say something
14839
14:12:51,902 --> 14:12:58,802
like print, that is not an Int.
14840
14:12:58,802 --> 14:13:02,060
And then I'm going to do\nsomething like exit, right there.
14841
14:13:02,059 --> 14:13:03,601
And let me go ahead and do this here.
14842
14:13:03,601 --> 14:13:07,891
Let me try to get y, except\nif there's an exception.
14843
14:13:07,891 --> 14:13:12,519
Then let me go ahead and say, again,\n
14844
14:13:12,519 --> 14:13:14,851
And then I'm going to exit\nfrom there, too. Otherwise I'll
14845
14:13:14,851 --> 14:13:16,381
go ahead and print x plus y.
14846
14:13:16,381 --> 14:13:22,981
If I run Python of\nCalculator.py now, whoops, oh
14847
14:13:22,982 --> 14:13:25,202
forgot my close quote, sorry.
14848
14:13:25,201 --> 14:13:31,081
All right, so close quote, Python of\n
14849
14:13:31,082 --> 14:13:34,322
But if I try to type in\nsomething wrong like cat, now
14850
14:13:34,322 --> 14:13:35,832
it actually detects the error.
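The try and except pattern described above might look roughly like this. The helper name to_int_or_exit is hypothetical, and two choices here are assumptions rather than the lecture's exact code: the sketch catches ValueError specifically instead of every exception, and it exits through sys.exit.

```python
import sys


def to_int_or_exit(text):
    """Convert text to an int, or print a message and stop the program."""
    try:
        return int(text)
    except ValueError:
        # The text did not look like an integer.
        print("That is not an int!")
        sys.exit(1)


# Literal strings stand in for what the user would type.
x = to_int_or_exit("1")
y = to_int_or_exit("2")
print(x + y)
```

Calling to_int_or_exit("cat") would print the message and end the program rather than show a traceback.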
14851
14:13:35,832 --> 14:13:38,372
So what is the CS50\nlibrary in Python doing?
14852
14:13:38,372 --> 14:13:42,122
It's actually doing that try and except\n
14853
14:13:42,122 --> 14:13:45,062
otherwise your programs for\nsomething simple, like a calculator
14854
14:13:45,061 --> 14:13:46,421
start to get longer and longer.
14855
14:13:46,421 --> 14:13:49,682
So we factored that kind of\nlogic out to the CS50 getInt
14856
14:13:49,682 --> 14:13:51,211
function and get float function.
14857
14:13:51,211 --> 14:13:55,305
But underneath the hood, they're\n
14858
14:13:55,305 --> 14:13:56,972
but they're being a little more precise.
14859
14:13:56,972 --> 14:14:00,972
They're detecting a specific error,\n
14860
14:14:00,972 --> 14:14:03,572
so that these functions will\nget executed again and again.
14861
14:14:03,572 --> 14:14:07,232
In fact, the best way to do this is to\n
14862
14:14:07,232 --> 14:14:10,600
then print that error\nmessage out to the user.
14863
14:14:10,599 --> 14:14:13,391
And again, let's not get too into\n
14864
14:14:13,391 --> 14:14:15,281
We've already put into the CS50 library.
14865
14:14:15,281 --> 14:14:17,582
But that's why, for instance,\nwe bootstrap things
14866
14:14:17,582 --> 14:14:20,942
by just using these\nfunctions out of the box.
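As a rough idea of what such a helper could do, here is a sketch of a function that keeps asking until it gets an integer. This is not the CS50 library's actual source. The name prompt_int and the read parameter are hypothetical, and the replies are simulated so the sketch runs without a keyboard.

```python
def prompt_int(prompt, read=input):
    """Keep asking until the reply converts cleanly to an int."""
    while True:
        try:
            return int(read(prompt))
        except ValueError:
            # Not an integer: loop around and ask again.
            pass


# Simulated replies: two bad answers, then a good one.
replies = iter(["cat", "", "42"])
value = prompt_int("x: ", read=lambda _prompt: next(replies))
print(value)  # 42
```

With the default read=input, the same function would prompt a real user and re-prompt on anything that is not an integer.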
14867
14:14:20,942 --> 14:14:24,132
All right, let's do something\nmore with our calculator here.
14868
14:14:25,531 --> 14:14:28,411
In the world of C, we\nhad another version
14869
14:14:28,411 --> 14:14:33,511
of this code, which actually\ndid some division by way of--
14870
14:14:33,512 --> 14:14:38,202
which actually did division of\n
14871
14:14:38,201 --> 14:14:42,511
So let me go ahead and close the C\n
14872
14:14:42,512 --> 14:14:44,463
now, doing some of these\nsame lines of code.
14873
14:14:44,463 --> 14:14:46,171
But I'm going to go\nahead and just assume
14874
14:14:46,171 --> 14:14:48,661
that the user is going to\ncooperate and use proper input.
14875
14:14:48,661 --> 14:14:52,831
So from CS50, import getInt, that\n
14876
14:14:52,832 --> 14:15:00,162
X gets getInt, ask the user\nfor an Int x, y equals getInt
14877
14:15:01,692 --> 14:15:03,531
And then, let's go ahead and do this.
14878
14:15:03,531 --> 14:15:07,631
Let's declare a variable called\n
14879
14:15:07,631 --> 14:15:09,371
Then let's go ahead and print z.
14880
14:15:09,372 --> 14:15:13,762
Still no need for a format string, I\n
14881
14:15:13,762 --> 14:15:15,762
Let me go ahead and run\nPython of Calculator.py.
14882
14:15:15,762 --> 14:15:20,172
Let me do 1, 10, and I get 0.1.
14883
14:15:20,171 --> 14:15:25,781
What did I get in C,\nthough, if you think back.
14884
14:15:25,781 --> 14:15:28,597
What would have happened in C?
14885
14:15:29,942 --> 14:15:32,162
DAVID J. MALAN: Yeah, we\nwould have gotten zero in C.
14886
14:15:32,161 --> 14:15:34,519
But why, in C, when you\ndivide one Int by another
14887
14:15:34,519 --> 14:15:36,436
and those Ints are like\n1 and 10 respectively?
14888
14:15:36,436 --> 14:15:38,199
AUDIENCE: It'll give\nyou an integer back.
14889
14:15:38,199 --> 14:15:39,781
DAVID J. MALAN: It will give you what?
14890
14:15:40,864 --> 14:15:44,432
DAVID J. MALAN: It will give you an\n
14891
14:15:44,432 --> 14:15:46,381
the integer part of it is indeed zero.
14892
14:15:46,381 --> 14:15:48,491
So this was an example of truncation.
14893
14:15:48,491 --> 14:15:51,061
So truncation was an\nissue in C. But it would
14894
14:15:51,061 --> 14:15:53,972
seem as though this is no\nlonger a problem in Python
14895
14:15:53,972 --> 14:15:57,811
insofar as the division operator\nactually handles that for us.
14896
14:15:57,811 --> 14:16:00,752
As an aside, if you want the old\nbehavior, because it actually
14897
14:16:00,752 --> 14:16:03,542
is sometimes useful for\nrounding or flooring values
14898
14:16:03,542 --> 14:16:06,092
you can actually use two slashes.
14899
14:16:06,091 --> 14:16:08,141
And now you get the C behavior.
14900
14:16:08,141 --> 14:16:10,231
So that now 1 divided by 10 is zero.
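The two division operators side by side, as a minimal sketch with the values hard-coded instead of read from the user. One caveat that goes beyond the lecture: the double slash floors, so it matches C's truncation for non-negative values like these but differs for negative ones.

```python
x = 1
y = 10

# A single slash keeps the fractional part, even for two ints.
true_quotient = x / y

# A double slash floors the result to a whole number.
floor_quotient = x // y

# Flooring rounds toward negative infinity; C truncates toward zero instead.
negative_floor = -1 // 10

print(true_quotient)   # 0.1
print(floor_quotient)  # 0
print(negative_floor)  # -1
```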
14901
14:16:10,232 --> 14:16:12,752
So you don't give up that\ncapability, but at least it
14902
14:16:12,752 --> 14:16:14,131
does a more sensible default.
14903
14:16:14,131 --> 14:16:17,551
Most people, especially new programmers,\n
14904
14:16:17,552 --> 14:16:20,522
would want to get 0.1,\nnot 0, for reasons
14905
14:16:20,521 --> 14:16:22,621
that indeed we had to explain weeks ago.
14906
14:16:22,622 --> 14:16:26,461
But what about another problem we\n
14907
14:16:26,461 --> 14:16:28,561
whereby there is imprecision?
14908
14:16:28,561 --> 14:16:31,502
Let me go ahead and, somewhat\n
14909
14:16:32,381 --> 14:16:34,862
I'm going to format\nit using an f-string.
14910
14:16:34,862 --> 14:16:39,241
And I'm going to go ahead and format,\n
14911
14:16:39,972 --> 14:16:43,141
Notice this, if I do Python\nof Calculator.py, 1 and 10
14912
14:16:43,141 --> 14:16:46,292
I get, by default, just\none significant digit.
14913
14:16:46,292 --> 14:16:50,442
But if I use this syntax in Python,\n
14914
14:16:50,442 --> 14:16:53,072
I can actually do, like I did\nin C before,
14915
14:16:53,072 --> 14:16:56,171
50 significant digits\nafter the decimal point.
14916
14:16:56,171 --> 14:17:00,542
So now let me rerun Python\nof Calculator.py 1 and 10
14917
14:17:00,542 --> 14:17:03,512
and let's see if floating point\nimprecision is still with us.
14918
14:17:04,802 --> 14:17:07,472
And you can see as much here,\nthe f-string, the format string
14919
14:17:07,472 --> 14:17:10,512
is just showing us now 50 digits\ninstead of the default one.
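A sketch of the formatting step, assuming the variable is named z as in the lecture and the inputs are hard-coded as 1 and 10. The .50f after the colon asks for 50 digits after the decimal point.

```python
z = 1 / 10

# Default formatting shows the shortest text that identifies the float.
default_text = f"{z}"

# Asking for 50 digits reveals that the stored value is only close to 0.1.
precise_text = f"{z:.50f}"

print(default_text)
print(precise_text)
```

The second line of output is not 0.1 followed by zeros, which is the floating-point imprecision the lecture refers to.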
14920
14:17:10,512 --> 14:17:12,632
So we've not solved all problems.
14921
14:17:12,631 --> 14:17:15,366
But we have solved at least some.
14922
14:17:15,366 --> 14:17:18,241
All right, before we pivot away from\n
14923
14:17:18,241 --> 14:17:21,871
now on syntax or concepts or the like?
14924
14:17:22,591 --> 14:17:25,841
AUDIENCE: Do you think\nthe double slash you get
14925
14:17:25,841 --> 14:17:28,458
has merit, how do you comment on that?
14926
14:17:28,459 --> 14:17:29,792
DAVID J. MALAN: How do you what?
14927
14:17:30,750 --> 14:17:33,932
Really good question, if you're\nusing double slash for division
14928
14:17:33,932 --> 14:17:36,391
with flooring or truncation,\nlike I described
14929
14:17:36,391 --> 14:17:38,372
how do you do a comment in Python.
14930
14:17:39,902 --> 14:17:42,452
And the convention is actually\nto use a complete sentence
14931
14:17:42,451 --> 14:17:43,994
like with a capital T here.
14932
14:17:43,995 --> 14:17:46,412
You don't need a period unless\nthere's multiple sentences.
14933
14:17:46,411 --> 14:17:49,361
And technically, it should be above\n
14934
14:17:49,362 --> 14:17:51,641
So you would use a hash symbol instead.
14935
14:17:53,942 --> 14:17:57,272
All right, let's go ahead and make\n
14936
14:17:57,271 --> 14:17:59,951
Let me go ahead and\nopen up, for instance
14937
14:17:59,951 --> 14:18:05,612
an example called Points1.c,\nwhich we saw a few weeks back.
14938
14:18:05,612 --> 14:18:10,052
And let me go ahead on the other side\n
14939
14:18:10,052 --> 14:18:13,412
This was a program, recall, that\n
14940
14:18:13,411 --> 14:18:15,909
lost on the first assignment.
14941
14:18:15,910 --> 14:18:17,702
And then it went ahead\nand just printed out
14942
14:18:17,701 --> 14:18:20,311
whether they lost fewer points\nthan me, because I lost two
14943
14:18:20,311 --> 14:18:23,638
if you recall the photo, more points\n
14944
14:18:23,639 --> 14:18:26,222
Let me go ahead and zoom out so\nwe can see a bit more of this.
14945
14:18:26,222 --> 14:18:30,730
And let me now, on the top right here,\n
14946
14:18:30,730 --> 14:18:33,272
So I want to first prompt the\nuser for some number of points.
14947
14:18:33,271 --> 14:18:37,061
So from CS50 let's import getInt,\n
14948
14:18:37,061 --> 14:18:39,932
Let's then do points\nequals getInt, and ask
14949
14:18:39,932 --> 14:18:43,951
the user, how many points\ndid you lose, question mark.
14950
14:18:43,951 --> 14:18:48,511
Then let's go ahead and say, if points\n
14951
14:18:48,512 --> 14:18:52,322
print, you lost fewer points than me.
14952
14:18:52,322 --> 14:18:59,792
Otherwise, if it's else if points\n
14953
14:18:59,792 --> 14:19:03,592
you lost more points than me.
14954
14:19:03,591 --> 14:19:07,322
Else let's go ahead and handle\nthe final scenario, which is you
14955
14:19:07,322 --> 14:19:11,122
lost the same number of points as me.
14956
14:19:11,122 --> 14:19:15,752
Before I run this, does anyone want to\n
14957
14:19:16,252 --> 14:19:17,912
AUDIENCE: Else if has to be elif.
14958
14:19:17,911 --> 14:19:21,211
DAVID J. MALAN: Yeah, so else if in\n
14959
14:19:22,302 --> 14:19:26,312
So let me change this to elif, and now\n
14960
14:19:26,311 --> 14:19:29,851
suppose you lost three\npoints on some assignment.
14961
14:19:29,851 --> 14:19:31,711
You lost more points than my two.
14962
14:19:31,711 --> 14:19:34,330
If you only lost one point,\nyou lost fewer points than me.
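The three-way comparison can be sketched like this. Wrapping it in a function named compare and looping over sample values are assumptions made so it runs without a prompt; the lecture's version reads points from the user and prints directly.

```python
MY_POINTS = 2  # the lecturer's own score in this example


def compare(points):
    """Return the message for a given number of lost points."""
    if points < MY_POINTS:
        return "You lost fewer points than me."
    elif points > MY_POINTS:
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


for points in (1, 2, 3):
    print(points, compare(points))
```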
14963
14:19:35,372 --> 14:19:37,562
But notice the code is much tighter.
14964
14:19:37,561 --> 14:19:41,222
In 10 total lines, we did\nwhat took 24 lines, because we've
14965
14:19:41,222 --> 14:19:42,872
thrown away a lot of the syntax.
14966
14:19:42,872 --> 14:19:44,891
The curly braces are\nno longer necessary.
14967
14:19:44,891 --> 14:19:46,752
The parentheses are\ngone, the semicolons.
14968
14:19:46,752 --> 14:19:50,192
So this is why it just tends to\nbe more pleasant pretty quickly
14969
14:19:52,832 --> 14:19:55,292
All right, let's do\none other example here.
14970
14:19:55,292 --> 14:19:59,522
In C, recall that we were able to\n
14971
14:19:59,521 --> 14:20:01,111
if something is even or odd.
14972
14:20:01,112 --> 14:20:05,522
Well, in Python, let me go ahead\n
14973
14:20:05,521 --> 14:20:09,331
and let's look for a moment\nat the C version at left.
14974
14:20:09,332 --> 14:20:13,202
Here was the code in C that we used\n
14975
14:20:13,201 --> 14:20:16,322
And, really, the key\ntakeaway from all these lines
14976
14:20:16,322 --> 14:20:17,811
was just the remainder operator.
14977
14:20:17,811 --> 14:20:19,061
And that one is still with us.
14978
14:20:19,061 --> 14:20:21,519
So this is a simple demonstration,\njust to make that point
14979
14:20:21,519 --> 14:20:25,291
if in Python, I want to determine\n
14980
14:20:25,292 --> 14:20:29,671
Well, let's go ahead and from CS50,\n
14981
14:20:29,671 --> 14:20:35,131
and get a number like n from the user,\n
14982
14:20:35,131 --> 14:20:40,741
And then let's go ahead and say,\nif n percent sign 2 equals 0
14983
14:20:40,741 --> 14:20:44,792
then let's go ahead and\nprint quote unquote "Even."
14984
14:20:44,792 --> 14:20:50,275
Else let's go ahead and print\nout Odd, but before I run this
14985
14:20:50,275 --> 14:20:53,192
anyone want to instinctively, even\n
14986
14:20:57,332 --> 14:20:58,957
DAVID J. MALAN: Yeah, so double equals.
14987
14:20:58,957 --> 14:21:02,372
Again, so even though some of the stuff\n
14988
14:21:02,951 --> 14:21:05,041
So this, too, should\nbe a double equal sign
14989
14:21:05,042 --> 14:21:07,141
because I'm comparing for equality here.
14990
14:21:07,141 --> 14:21:08,675
And why is this the right math?
14991
14:21:08,675 --> 14:21:10,592
Well, if you divide a\nnumber by 2, it's either
14992
14:21:10,591 --> 14:21:12,811
going to have 0 or 1 as a remainder.
14993
14:21:12,811 --> 14:21:15,551
And that's going to determine\nif it's even or odd for us.
14994
14:21:15,552 --> 14:21:18,722
So let's run Python of Parity.py,\ntype in a number like 50
14995
14:21:18,722 --> 14:21:21,182
and hopefully we get, indeed, even.
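A sketch of the parity check with the remainder operator and the double equal sign. The function name parity is hypothetical, and the numbers are passed in directly instead of being read from the user.

```python
def parity(n):
    """Return "even" or "odd" based on the remainder after dividing by 2."""
    if n % 2 == 0:
        return "even"
    else:
        return "odd"


print(parity(50))  # even
print(parity(7))   # odd
```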
14996
14:21:21,182 --> 14:21:23,432
So again, same idea, but now\nwe're down to eight lines
14997
14:21:25,082 --> 14:21:27,332
Well, let's now do something\na little more interactive
14998
14:21:27,332 --> 14:21:31,202
and a little representative of tools\n
14999
14:21:31,201 --> 14:21:36,841
In C, recall that we had this\nagreement program, Agree.c.
15000
14:21:36,841 --> 14:21:40,801
And then let's go ahead and implement\n
15001
14:21:42,391 --> 14:21:45,091
And let's look at the C version first.
15002
14:21:45,091 --> 14:21:47,221
On the left, we used get char here.
15003
14:21:47,222 --> 14:21:49,711
And then we used the\ndouble vertical bars
15004
14:21:49,711 --> 14:21:52,951
to check if c is equal to\ncapital Y or lowercase y.
15005
14:21:52,951 --> 14:21:55,021
And then we did the\nsame thing for n for no.
15006
14:21:55,021 --> 14:22:00,901
And so let's go over here and\nlet's do from CS50, import get--
15007
14:22:00,902 --> 14:22:03,092
OK, get char is not a thing.
15008
14:22:03,091 --> 14:22:05,611
And this here is another\ndifference with Python.
15009
14:22:05,612 --> 14:22:09,031
There is no data type for\nindividual characters.
15010
14:22:09,031 --> 14:22:11,161
You have strings, STRs,\nand, honestly, those
15011
14:22:11,161 --> 14:22:13,141
are fine, because if\nyou have a STR that's
15012
14:22:13,141 --> 14:22:15,481
just one character, for\nall intents and purposes
15013
14:22:15,482 --> 14:22:17,232
it is just a single character.
15014
14:22:17,232 --> 14:22:18,482
So it's just a simplification.
15015
14:22:18,482 --> 14:22:19,722
You don't have to think as much.
15016
14:22:19,722 --> 14:22:22,180
You don't have to worry about\ndouble quotes, single quotes.
15017
14:22:22,180 --> 14:22:25,872
In fact, in Python, you can use\ndouble quotes or single quotes
15018
14:22:25,872 --> 14:22:27,452
so long as you're consistent.
15019
14:22:27,451 --> 14:22:29,491
So long as you're\nconsistent, the single quotes
15020
14:22:29,491 --> 14:22:32,191
do not mean something\ndifferent, like they do in C.
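A short sketch of the point about quotes, assuming nothing beyond built-in Python. The variable names are illustrative only.

```python
single = 'y'   # single quotes
double = "y"   # double quotes

# Both spellings produce the same kind of value: a str of length 1.
print(single == double)
print(type(single).__name__, len(single))
```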
15021
14:22:32,192 --> 14:22:34,862
So I'm going to go ahead\nand use get_string here
15022
14:22:34,862 --> 14:22:37,741
although, strictly speaking, I\n
15023
14:22:39,002 --> 14:22:43,771
I'm going to get a string from the\n
15024
14:22:43,771 --> 14:22:47,078
quote unquote, "Do you agree," like a\n
15025
14:22:47,078 --> 14:22:50,161
where you have to say yes or no, you\n
15026
14:22:51,101 --> 14:22:54,631
And then let's translate the\nconditionals to Python, now, too.
15027
14:22:54,631 --> 14:23:02,371
So if S equals equals quote-unquote\n
15028
14:23:02,372 --> 14:23:08,702
let's go ahead and print out agreed,\n
15029
14:23:08,701 --> 14:23:12,061
equals N or S equals equals little n.
15030
14:23:12,061 --> 14:23:14,579
Let's go ahead, then,\nand print out not agreed.
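The conditionals above could be sketched as follows. The lecture prints the result directly after calling the CS50 library. This sketch returns the text from a function named agree (a name chosen here) so it runs on its own, and the exact output strings are assumptions.

```python
def agree(s):
    """Translate the C-style || checks into Python's `or`."""
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    return None  # anything else matches neither branch


print(agree("Y"))
print(agree("n"))
```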
15031
14:23:14,580 --> 14:23:17,372
And you can already see, perhaps,\n
15032
14:23:17,372 --> 14:23:20,222
Is Python a little more\nEnglish-like, in that
15033
14:23:20,222 --> 14:23:24,132
you just literally use the English word\n
15034
14:23:24,131 --> 14:23:26,891
But it's ultimately\ndoing the same thing.
15035
14:23:26,891 --> 14:23:29,911
Can we simplify this code a bit, though?
15036
14:23:29,911 --> 14:23:31,861
This would be a little\nannoying if we wanted
15037
14:23:31,862 --> 14:23:34,322
to add support, not just\nfor big Y and little y
15038
14:23:34,322 --> 14:23:40,752
but Yes or big Yes or little yes or\n
15039
14:23:40,752 --> 14:23:43,652
There's a lot of permutations\nof Y-E-S or just y
15040
14:23:43,652 --> 14:23:45,241
that we ideally should tolerate.
15041
14:23:45,241 --> 14:23:47,991
Otherwise, the user is going to\n
15042
14:23:47,991 --> 14:23:49,292
which isn't very user-friendly.
15043
14:23:49,292 --> 14:23:51,572
Any intuition for how\nwe could logically
15044
14:23:51,572 --> 14:23:54,792
even if you don't know how to\ndo it in code, make this better?
15045
14:23:55,292 --> 14:23:58,057
AUDIENCE: Write way over\nthe list, and then up
15046
14:23:58,057 --> 14:23:59,432
it's like the things in the list.
15047
14:23:59,432 --> 14:24:03,572
DAVID J. MALAN: Nice, yeah, we saw an\n
15048
14:24:03,572 --> 14:24:06,421
Why don't we take that same\nidea and ask a similar question.
15049
14:24:06,421 --> 14:24:11,341
If S is in the following list\nof values, Y or little y
15050
14:24:11,341 --> 14:24:15,122
or heck, let me add to the list\n
15051
14:24:15,122 --> 14:24:17,300
And it's going to get a\nlittle annoying, admittedly
15052
14:24:17,300 --> 14:24:20,271
but this is still better than the\n
15053
14:24:20,271 --> 14:24:22,161
I could do things like\nthis, and so forth.
15054
14:24:22,161 --> 14:24:24,261
There's a whole bunch more permutations.
15055
14:24:24,262 --> 14:24:26,992
But let's leave this alone,\nand let me just go into here
15056
14:24:26,991 --> 14:24:33,800
and change this to, if S is in the\n
15057
14:24:33,800 --> 14:24:36,981
and I won't do as, let's just not\n
15058
14:24:39,322 --> 14:24:42,472
Python of Agree.py, do I agree?
15059
14:24:45,262 --> 14:24:46,881
All right, how about big Yes.
15060
14:24:46,881 --> 14:24:48,372
OK, that does not seem to work.
15061
14:24:48,372 --> 14:24:50,872
Notice it did not say agreed,\nand it did not say not agreed.
15062
14:24:53,701 --> 14:24:57,291
Well, you know what I could\ndo? I don't really
15063
14:24:57,292 --> 14:24:58,762
need the uppercase and lowercase.
15064
14:24:58,762 --> 14:25:00,711
Let me tighten this\nlist up a little bit.
15065
14:25:00,711 --> 14:25:04,162
And why don't I just\nforce S to be lowercase.
15066
14:25:04,161 --> 14:25:07,521
S.lower, recall, whether\nit's one character or more
15067
14:25:07,521 --> 14:25:10,701
is a function built into\nSTRs now, strings in Python
15068
14:25:10,701 --> 14:25:12,471
that forces the whole\nthing to lowercase.
15069
14:25:13,972 --> 14:25:19,222
Python of Agree.py, little y,\nthat works, big Y, that works.
15070
14:25:19,222 --> 14:25:24,362
Big Yes, that works, big Y,\nlittle e, big S, that also works.
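A sketch of the version reached at this point: membership in a list with `in`, with the input forced to lowercase on each check. The function name agree and the output strings are assumptions made for this sketch.

```python
def agree(s):
    """Accept any capitalization of y/yes and n/no."""
    if s.lower() in ["y", "yes"]:
        return "Agreed."
    elif s.lower() in ["n", "no"]:
        return "Not agreed."
    return None


for answer in ["y", "Y", "YES", "YeS"]:
    print(answer, agree(answer))
```

Note that s.lower() is called twice here, which is the redundancy discussed next.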
15071
14:25:24,362 --> 14:25:27,432
So we've now handled, in one fell\n
15072
14:25:27,432 --> 14:25:29,432
And you know what, we can\ntighten this up a bit.
15073
14:25:29,432 --> 14:25:32,872
Here's an opportunity, in Python,\nfor slightly better design.
15074
14:25:32,872 --> 14:25:36,592
What have I done in here\nthat's a little redundant?
15075
14:25:36,591 --> 14:25:40,701
Does anyone see an opportunity\nto eliminate a redundancy
15076
14:25:40,701 --> 14:25:43,341
doing something more\ntimes than you need?
15077
14:25:45,052 --> 14:25:47,685
AUDIENCE: You can do S dot lower, above.
15078
14:25:47,684 --> 14:25:49,851
DAVID J. MALAN: We could\nmove the S dot lower above.
15079
14:25:49,851 --> 14:25:51,832
Notice that I'm using S dot lower twice.
15080
14:25:51,832 --> 14:25:54,391
But it's going to give me\nthe same answer both times.
15081
14:25:54,391 --> 14:25:56,601
So I could do a couple of things here.
15082
14:25:56,601 --> 14:26:01,222
I could, first of all, get rid of\n
15083
14:26:01,222 --> 14:26:05,241
and then above this, maybe I could\n
15084
14:26:05,241 --> 14:26:08,121
I can't just do this, because\nthat throws the value away.
15085
14:26:08,122 --> 14:26:10,762
It does the math, but it doesn't\nconvert the string itself.
15086
14:26:10,762 --> 14:26:12,362
It's going to return a value.
15087
14:26:12,362 --> 14:26:14,781
So I have to say s equals s.lower.
15088
14:26:15,862 --> 14:26:18,362
Or, honestly, I can chain\nthese things together.
15089
14:26:18,362 --> 14:26:22,592
And this is not something we saw in\n
15090
14:26:22,591 --> 14:26:25,761
and strings have functions\nlike lower in them
15091
14:26:25,762 --> 14:26:28,851
you can chain these functions\n
15092
14:26:28,851 --> 14:26:30,309
dot that, dot this other thing.
15093
14:26:30,309 --> 14:26:33,351
And eventually you want to stop,\n
15094
14:26:33,351 --> 14:26:35,332
But this is reasonable,\nstill fits on the screen.
15095
14:26:36,082 --> 14:26:38,211
It does in one place\nwhat I was doing in two.
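A small sketch of why calling lower on its own line is not enough, and of chaining. The strip call is an extra illustration added here, not something from the lecture's code.

```python
s = "YeS"
s.lower()       # returns "yes", but the result is thrown away
before = s      # still "YeS"

s = s.lower()   # keep the returned value by assigning it back
after = s       # now "yes"

# Chaining: each method returns a str, so another method call can follow.
chained = "  YeS ".strip().lower()
print(before, after, chained)
```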
15096
14:26:39,531 --> 14:26:42,502
Let me go ahead and do Python\nof Agree.py one last time.
15097
14:26:43,641 --> 14:26:46,881
And it's still working as intended.
15098
14:26:46,881 --> 14:26:49,221
Also if I tried those\nother inputs as well.
15099
14:26:49,957 --> 14:26:55,812
AUDIENCE: Could you add on like a for\n
15100
14:26:55,811 --> 14:26:59,222
and then cover all the functions where\n
15101
14:26:59,222 --> 14:27:01,972
where it's uppercase as well, or\n
15102
14:27:05,616 --> 14:27:06,991
DAVID J. MALAN: Let me summarize.
15103
14:27:06,991 --> 14:27:09,862
Could we handle uppercase and\nlowercase together in some form?
15104
14:27:09,862 --> 14:27:11,542
I'm actually doing that already.
15105
14:27:12,891 --> 14:27:15,828
I have to either be all lowercase\nin my logic or all uppercase
15106
14:27:15,828 --> 14:27:17,661
and not worry about\nwhat the human types in
15107
14:27:17,661 --> 14:27:19,761
because no matter what\nthe human types in, I'm
15108
14:27:19,762 --> 14:27:21,472
forcing their input to lowercase.
15109
14:27:21,472 --> 14:27:24,802
And then I am using a\nlowercase list of values.
15110
14:27:24,802 --> 14:27:26,042
If I want to flip that, fine.
15111
14:27:26,042 --> 14:27:27,561
I just have to be self-consistent.
15112
14:27:27,561 --> 14:27:28,941
But I'm handling that already.
15113
14:27:29,745 --> 14:27:33,475
AUDIENCE: Are strings no\nlonger an array of characters?
15114
14:27:33,474 --> 14:27:35,391
DAVID J. MALAN: A really\ngood, loaded question:
15115
14:27:35,391 --> 14:27:38,601
are strings no longer\nan array of characters?
15116
14:27:38,601 --> 14:27:40,641
Conceptually, yes,\nunderneath the hood, no.
15117
14:27:40,641 --> 14:27:42,711
They're a little more\nsophisticated than that
15118
14:27:42,711 --> 14:27:45,112
because with strings,\nyou have a few changes.
15119
14:27:45,112 --> 14:27:47,122
Not only do they have\nfunctions built into them
15120
14:27:47,122 --> 14:27:49,102
because strings are now\nwhat we call objects
15121
14:27:49,101 --> 14:27:51,021
in what's called\nobject-oriented programming.
15122
14:27:51,021 --> 14:27:53,563
And we're going to keep seeing\nexamples of this dot operator.
15123
14:27:53,563 --> 14:27:58,072
They are also immutable, so\nto speak, I-M-M-U-T-A-B-L-E.
15124
14:27:58,072 --> 14:28:01,701
Immutable means they cannot be\nchanged, which means, unlike C
15125
14:28:01,701 --> 14:28:05,271
you can't go into a string and\nchange its individual characters.
15126
14:28:05,271 --> 14:28:08,002
You can make a copy of the\nstring that makes a change
15127
14:28:08,002 --> 14:28:10,220
but you can't change the\noriginal string itself.
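Immutability can be seen in a few lines. This is a sketch using only built-in behavior. The names s, t and message are illustrative.

```python
s = "hi!"
try:
    s[0] = "H"              # changing a character in place is not allowed
except TypeError as error:
    message = str(error)    # Python reports the refusal as a TypeError

t = s.capitalize()          # builds a new string; s itself is untouched
print(s, t)
print(message)
```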
15128
14:28:10,220 --> 14:28:12,262
This is both a little\nannoying, maybe, sometimes.
15129
14:28:12,262 --> 14:28:14,887
But it's also pretty protective,\nbecause you can't do screw-ups
15130
14:28:14,887 --> 14:28:18,202
like I did weeks ago, when I was\ntrying to copy S and call it T.
15131
14:28:18,201 --> 14:28:19,791
And then one affected the other.
15132
14:28:19,792 --> 14:28:23,601
Python, underneath the hood, is\n
15133
14:28:23,601 --> 14:28:25,072
and the pointers and all of that.
15134
14:28:25,072 --> 14:28:27,561
There are no pointers in Python.
15135
14:28:27,561 --> 14:28:32,362
So if that wasn't clear, all of that\n
15136
14:28:32,362 --> 14:28:36,802
is now handled by the language\n
15137
14:28:36,802 --> 14:28:38,962
All right, so let's\nintroduce maybe some loops
15138
14:28:38,961 --> 14:28:40,911
like we've been in the habit of doing.
15139
14:28:40,911 --> 14:28:44,691
Let me open up Meow.c, which was\nan example in C, just meowing
15140
14:28:46,252 --> 14:28:49,322
Let me create a file called\nMeow.py here on the right.
15141
14:28:49,322 --> 14:28:51,711
And notice on the left,\nthis was correct code in C
15142
14:28:51,711 --> 14:28:53,192
but it was kind of poorly designed.
15143
14:28:53,692 --> 14:28:55,972
Because it was a missed\nopportunity for a loop.
15144
14:28:55,972 --> 14:28:58,982
Why say something three times\nwhen you can say it just once?
15145
14:28:58,982 --> 14:29:02,512
So in Python, let me do it\nthe poorly designed way first.
15146
14:29:03,921 --> 14:29:07,731
And, like I generally should not,\n
15147
14:29:07,732 --> 14:29:10,192
run Python of Meow.py, and it works.
15148
14:29:11,839 --> 14:29:13,881
So let me go ahead and\nimprove this a little bit.
15149
14:29:13,881 --> 14:29:15,511
And there's a few ways to do this.
15150
14:29:15,512 --> 14:29:20,572
If I wanted to do this three times, I\n
15151
14:29:20,572 --> 14:29:24,531
For i in range of 3, recall that\nthat was the better version
15152
14:29:24,531 --> 14:29:27,891
rather than arbitrarily enumerate\n
15153
14:29:27,891 --> 14:29:30,012
and print out quote unquote "Meow."
15154
14:29:30,012 --> 14:29:32,599
Now if I run Python of\nMeow, still seems to work.
15155
14:29:32,599 --> 14:29:34,432
So it's a little tighter,\nand, my God, like
15156
14:29:34,432 --> 14:29:36,474
programs can't really get\nmuch shorter than this.
15157
14:29:36,474 --> 14:29:40,822
We're down to two lines of code, no\n
15158
14:29:40,822 --> 14:29:43,101
Let's now improve the\ndesign further, like we
15159
14:29:43,101 --> 14:29:46,072
did in C, by introducing\na function called
15160
14:29:46,072 --> 14:29:47,752
meow, that actually does the meowing.
15161
14:29:47,752 --> 14:29:49,521
So this was our first\nabstraction, recall
15162
14:29:49,521 --> 14:29:54,621
both in Scratch and in C. Let me focus\n
15163
14:29:55,281 --> 14:30:00,006
Let me go ahead and\nfirst define a function.
15164
14:30:03,411 --> 14:30:06,771
Let me first go ahead and do\nthis, for i in range of 3
15165
14:30:06,771 --> 14:30:09,951
let's assume for the moment\nthat there's a meow function
15166
14:30:09,951 --> 14:30:11,241
that I'm just going to call.
15167
14:30:11,241 --> 14:30:14,841
Let's now go ahead and define, using\n
15168
14:30:14,841 --> 14:30:17,691
with the speller\ndemonstration, a function
15169
14:30:17,692 --> 14:30:19,402
called meow that takes no arguments.
15170
14:30:19,402 --> 14:30:21,982
And all it does for now is print meow.
15171
14:30:21,982 --> 14:30:27,142
Let me now go ahead and run\nPython of Meow.py Enter, huh, one
15172
14:30:28,472 --> 14:30:30,601
So this is another name error.
15173
14:30:30,601 --> 14:30:33,601
And, again, name meow is not defined.
15174
14:30:33,601 --> 14:30:35,601
What's your instinct here,\neven though we've not
15175
14:30:35,601 --> 14:30:37,281
tripped over this yet in Python?
15176
14:30:37,281 --> 14:30:39,652
Where does your mind go here?
15177
14:30:40,192 --> 14:30:42,602
AUDIENCE: Does it read top\nto bottom, left to right?
15178
14:30:42,601 --> 14:30:46,121
I'm guessing we could find a new case.
15179
14:30:46,122 --> 14:30:49,542
DAVID J. MALAN: Perfect, as smart,\n
15180
14:30:49,542 --> 14:30:51,292
it still makes certain assumptions.
15181
14:30:51,292 --> 14:30:54,531
And if it hasn't seen a keyword\nyet, it just doesn't exist.
15182
14:30:54,531 --> 14:30:57,521
So if you want it to exist, we\nhave to be a little clever here.
15183
14:30:57,521 --> 14:31:00,611
I could just put it, flip\nit around, like this.
15184
14:31:00,612 --> 14:31:02,991
But this honestly isn't\nparticularly good design.
15185
14:31:03,491 --> 14:31:06,911
Because now, if you, the reader\nof your code, whether you
15186
14:31:06,911 --> 14:31:09,491
wrote it or someone else, you\nkind of have to go fishing now.
15187
14:31:09,491 --> 14:31:11,081
Like where does this program begin?
15188
14:31:11,082 --> 14:31:14,652
And even though, yes, it's obvious\n
15189
14:31:14,652 --> 14:31:17,232
like, if the file were longer,\nyou're going to be annoyed
15190
14:31:17,232 --> 14:31:19,702
and fishing visually for\nthe right lines of code.
15191
14:31:20,919 --> 14:31:22,752
And indeed, this would\nbe a common paradigm.
15192
14:31:22,752 --> 14:31:25,902
When you want to start having\n
15193
14:31:25,902 --> 14:31:29,982
just put your own code in main, so that,\n
15194
14:31:29,982 --> 14:31:32,172
you can solve the problem\nwe just encountered.
15195
14:31:32,171 --> 14:31:35,381
So let me define a function called\nmain that has that same loop
15196
14:31:38,561 --> 14:31:43,871
Let me go into my terminal and\nrun Python of Meow.py, Enter.
15197
14:31:47,021 --> 14:31:50,572
All right, investigate this.
15198
14:31:50,572 --> 14:31:52,811
What could explain this symptom.
15199
14:31:52,811 --> 14:31:54,542
I have not told you the answer yet.
15200
14:31:54,542 --> 14:31:56,292
So all you have is\nyour instinct, assuming
15201
14:31:56,292 --> 14:31:58,241
you've never touched Python before.
15202
14:31:58,241 --> 14:32:03,322
What might explain this symptom,\nwhere nothing is meowing?
15203
14:32:03,822 --> 14:32:05,491
AUDIENCE: Didn't run the main function.
15204
14:32:05,491 --> 14:32:07,699
DAVID J. MALAN: Yeah, I\ndidn't run the main function.
15205
14:32:07,699 --> 14:32:09,911
So in C, this is functionality\nyou get for free.
15206
14:32:09,911 --> 14:32:11,286
You have to have a main function.
15207
14:32:11,286 --> 14:32:14,101
But, heck, so long as you make\nit, it will be called for you.
15208
14:32:14,101 --> 14:32:17,911
In Python, this is just a convention,\nto create a main function
15209
14:32:17,911 --> 14:32:19,721
borrowing a very common name for it.
15210
14:32:19,722 --> 14:32:22,842
But if you want to call that\nmain function, you have to do it.
15211
14:32:22,841 --> 14:32:24,631
So this looks a little\nweird, admittedly
15212
14:32:24,631 --> 14:32:26,551
that you have to call your\nown main function now
15213
14:32:26,552 --> 14:32:28,382
and it has to be at\nthe bottom of the file
15214
14:32:28,381 --> 14:32:31,561
because only once the interpreter\n
15215
14:32:31,561 --> 14:32:34,981
have all of your functions\nbeen defined, higher up.
15216
14:32:34,982 --> 14:32:36,512
But this solves both problems.
15217
14:32:36,512 --> 14:32:38,972
It keeps your code, that's\nthe main part of your code
15218
14:32:38,972 --> 14:32:40,182
at the very top of the file.
15219
14:32:40,182 --> 14:32:43,502
So it's just obvious to you, and\n
15220
14:32:43,502 --> 14:32:45,662
where the program logically starts.
15221
14:32:45,661 --> 14:32:49,831
But it also ensures that main is not\n
15222
14:32:52,182 --> 14:32:54,169
So this is another\nperfect example of we're
15223
14:32:54,169 --> 14:32:55,961
learning a new language\nfor the first time.
15224
14:32:55,961 --> 14:32:57,542
You're not going to have heard\nall of the answers before.
15225
14:32:57,542 --> 14:33:01,351
Just apply some logic, as to, like, all\n
15226
14:33:01,351 --> 14:33:04,711
Start to infer how the\nlanguage does or doesn't work.
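The arrangement described above might look like this sketch: main at the top, the helper below it, and the call to main on the last line.

```python
def main():
    for i in range(3):
        meow()  # fine: meow is looked up when main runs, not when main is defined


def meow():
    print("meow")


# The call goes at the bottom, after both definitions have been read.
# Calling main() above `def meow` would raise NameError.
main()
```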
15227
14:33:04,711 --> 14:33:08,972
If I now go and run this, Python of\n
15228
14:33:08,972 --> 14:33:11,882
And just so you have\nseen it, there is a quote
15229
14:33:11,881 --> 14:33:15,362
unquote "better" way of doing this,\n
15230
14:33:15,362 --> 14:33:18,572
are not going to encounter,\ncertainly in these initial days.
15231
14:33:18,572 --> 14:33:21,961
Typically, you would see in\nonline tutorials or books
15232
14:33:21,961 --> 14:33:25,921
something that looks like this, where\n
15233
14:33:27,332 --> 14:33:30,992
That's functionally the same thing,\n
15234
14:33:30,991 --> 14:33:34,362
if we ourselves were implementing a\n
15235
14:33:34,362 --> 14:33:37,404
But we're going to keep things simpler\n
15236
14:33:37,404 --> 14:33:39,877
because we're not going to\nencounter that problem just yet.
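The form commonly seen in tutorials is most likely the `__name__` check sketched below. That is an assumption, since the transcript does not spell it out. With this guard, main runs when the file is executed directly but not when the file is imported by other code.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# __name__ is "__main__" only when this file is the program being run.
if __name__ == "__main__":
    main()
```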
15237
14:33:39,877 --> 14:33:42,752
All right, let's make one change to\n
15238
14:33:42,752 --> 14:33:47,942
In C, the last version of meow also\n
15239
14:33:47,942 --> 14:33:50,432
took arguments to the function meow.
15240
14:33:50,432 --> 14:33:53,012
So suppose that I want\nto factor this out.
15241
14:33:53,012 --> 14:33:55,772
And I want to just call meow as a\n
15242
14:33:55,771 --> 14:33:57,601
say meow this number of times.
15243
14:33:57,601 --> 14:34:00,811
And I figure out how many times\n
15244
14:34:00,811 --> 14:34:03,511
or using get_int or something\nlike that, to figure out
15245
14:34:05,072 --> 14:34:08,341
Well, now, I have to define\ninside my meow function, an input
15246
14:34:08,341 --> 14:34:14,851
let's call it n, and then use that,\n
15247
14:34:14,851 --> 14:34:18,161
let me go ahead and print\nout meow that many times.
15248
14:34:18,161 --> 14:34:20,341
So again, the only thing\nthat's different from C
15249
14:34:20,341 --> 14:34:24,151
is we don't bother specifying return\n
15250
14:34:24,152 --> 14:34:28,752
and we don't bother specifying the\n
15251
14:34:28,752 --> 14:34:31,451
So same ideas, simpler in some sense.
15252
14:34:31,451 --> 14:34:33,182
We're just throwing away keystrokes.
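A sketch of the parameterized version. Note that neither a return type nor a type for n is written down.

```python
def main():
    meow(3)


def meow(n):
    # n is whatever number the caller passes in
    for i in range(n):
        print("meow")


main()
```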
15253
14:34:33,182 --> 14:34:35,972
All right, let me run this one\nfinal time, Python of Meow.py
15254
14:34:35,972 --> 14:34:38,912
and we still have the same program.
15255
14:34:38,911 --> 14:34:40,631
All right, let me pause here.
15256
14:34:41,302 --> 14:34:42,552
And I know this is going fast.
15257
14:34:42,552 --> 14:34:47,877
But hopefully, the C code\nis still somewhat familiar.
15258
14:34:48,377 --> 14:34:54,052
AUDIENCE: Is there any difference\n
15259
14:34:54,052 --> 14:34:55,302
DAVID J. MALAN: Good question.
15260
14:34:55,302 --> 14:34:57,760
Is there any difference between\nglobal and local variables?
15261
14:34:57,760 --> 14:35:00,372
Short answer, yes, and we would\nrun into that same problem
15262
14:35:00,372 --> 14:35:01,842
if we declare a variable\nin one function
15263
14:35:01,841 --> 14:35:03,966
another function is not\ngoing to have access to it.
15264
14:35:03,966 --> 14:35:07,181
We can solve that by\nputting variables globally.
15265
14:35:07,182 --> 14:35:09,281
But we don't have all of\nthe features we had in C
15266
14:35:09,281 --> 14:35:11,682
like there's no such thing\nas a constant in Python.
15267
14:35:11,682 --> 14:35:13,421
The mentality in the\nPython community is
15268
14:35:13,421 --> 14:35:16,002
if you don't want some value\nto change, don't touch it.
15269
14:35:17,152 --> 14:35:18,762
So there's trade-offs here, too.
15270
14:35:18,762 --> 14:35:21,522
Some languages are stronger\nor more defensive than that.
15271
14:35:21,521 --> 14:35:25,511
But that, too, is part of the mindset\n
15272
14:35:27,167 --> 14:35:29,459
AUDIENCE: There is really\nonly one green line, in the--
15273
14:35:29,459 --> 14:35:30,959
DAVID J. MALAN: Oh, sorry, where's--
15274
14:35:31,601 --> 14:35:34,864
AUDIENCE: There has only been\none green line printed at a time.
15275
14:35:34,864 --> 14:35:36,572
DAVID J. MALAN: That\nis an amazing segue.
15276
14:35:36,572 --> 14:35:37,891
Let's come to that in just\na moment, because we're
15277
14:35:37,891 --> 14:35:40,141
going to recreate also\nthat Mario example, where
15278
14:35:40,141 --> 14:35:43,447
we had like the question marks for\n
15279
14:35:43,447 --> 14:35:45,072
So let's come back to that in a second.
15280
14:35:46,177 --> 14:35:49,883
AUDIENCE: If strings are immutable,\n
15281
14:35:49,883 --> 14:35:51,841
DAVID J. MALAN: Correct,\nstrings are immutable.
15282
14:35:51,841 --> 14:35:55,741
Any time you seem to be modifying\n
15283
14:35:57,002 --> 14:35:59,461
So it's taking a little\nmore memory somewhere.
15284
14:35:59,461 --> 14:36:02,667
But you don't have to deal with\nit. Python's doing that for you.
15285
14:36:02,667 --> 14:36:05,414
AUDIENCE: So you don't free anything.
15286
14:36:05,413 --> 14:36:06,621
DAVID J. MALAN: Say it again?
15287
14:36:07,747 --> 14:36:11,184
AUDIENCE: You don't free\nlike taking leave on stuff.
15288
14:36:11,184 --> 14:36:12,851
DAVID J. MALAN: You don't free anything.
15289
14:36:12,851 --> 14:36:15,391
So if you weren't a big fan,\nover the past couple of weeks
15290
14:36:15,391 --> 14:36:19,381
of malloc or free or\nmemory or addresses, or all
15291
14:36:19,381 --> 14:36:21,511
of those low level\nimplementation details
15292
14:36:21,512 --> 14:36:23,912
Python is the language for\nyou, because all of that
15293
14:36:23,911 --> 14:36:25,861
is handled for you automatically.
15294
14:36:28,982 --> 14:36:34,766
AUDIENCE: Each up for the variable, you\n
15295
14:36:36,222 --> 14:36:40,307
Well, if there isn't a main function in\n
15296
14:36:40,307 --> 14:36:42,432
DAVID J. MALAN: How do you\ndefine a global variable
15297
14:36:42,432 --> 14:36:44,014
if there's no main function in Python?
15298
14:36:44,014 --> 14:36:48,002
Global variables, by definition, always\n
15299
14:36:49,002 --> 14:36:51,822
If I wanted to have a\nvariable that's outside of
15300
14:36:51,822 --> 14:36:56,224
and, therefore, global to\nall of these, like global--
15301
14:36:56,224 --> 14:36:59,141
actually, don't use the word global,\n
15302
14:36:59,141 --> 14:37:03,972
variable equals Foo, F-O-O,\njust as an arbitrary string
15303
14:37:03,972 --> 14:37:07,932
value that a computer scientist would\n
15304
14:37:07,932 --> 14:37:10,521
There are some caveats, though,\nas to how you access that.
15305
14:37:10,521 --> 14:37:12,531
But let's come back\nto that another time.
15306
14:37:12,531 --> 14:37:14,551
But that problem is solvable, too.
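A sketch of a global variable and of the access caveats the lecture defers. These are probably the caveats meant, but that is an assumption. The function names show, shadow and change are made up for illustration.

```python
variable = "foo"  # defined outside every function, so it is global


def show():
    return variable      # reading a global needs no special syntax


def shadow():
    variable = "bar"     # assignment creates a new local; the global is untouched
    return variable


def change():
    global variable      # opt in explicitly to rebinding the global
    variable = "baz"


print(show(), shadow(), show())
```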
15307
14:37:15,052 --> 14:37:16,302
So let's go ahead and do this.
15308
14:37:16,302 --> 14:37:19,572
To come back to the question about\n
15309
14:37:19,572 --> 14:37:21,822
and create a file now called Mario.py.
15310
14:37:21,822 --> 14:37:24,222
Won't bother showing the C code anymore.
15311
14:37:24,222 --> 14:37:26,112
We'll focus just on\nthe new language here.
15312
14:37:26,112 --> 14:37:31,061
But recall that, in Python, in Mario, we\n
15313
14:37:31,061 --> 14:37:34,121
This was a random screen from\nthe side scroller version 1
15314
14:37:35,322 --> 14:37:39,341
And we just want to print like three\n
15315
14:37:39,341 --> 14:37:41,471
Well, in Python, we could\ndo something like this
15316
14:37:41,472 --> 14:37:47,802
print, oh, sorry, for i in the range of\n
15317
14:37:48,349 --> 14:37:50,141
And I think this is\npretty straightforward.
15318
14:37:50,141 --> 14:37:52,781
Python of Mario.py, we\nget our three hashes.
15319
14:37:52,781 --> 14:37:55,371
You could imagine\nparameterizing this now, though
15320
14:37:55,372 --> 14:37:56,872
and getting actual user input.
15321
14:37:58,252 --> 14:38:03,942
Let me go up here and let me go\n
15322
14:38:03,942 --> 14:38:07,612
and then let's get the\ninput from the user.
15323
14:38:07,612 --> 14:38:09,732
So it actually is a\nvalue n, like, all right
15324
14:38:09,732 --> 14:38:14,712
get_int the height of the column\nof bricks that you want to do.
15325
14:38:14,711 --> 14:38:18,792
And then, let's go ahead and print\n
15326
14:38:20,082 --> 14:38:21,906
Let's print out like five hashes.
15327
14:38:21,906 --> 14:38:24,281
OK, one, two, three, four,\nfive, that seems to work, too.
15328
14:38:24,281 --> 14:38:26,199
And it's going to work\nfor any positive value.
15329
14:38:26,199 --> 14:38:29,921
But it's not going to work\nfor, how about negative 1?
15330
14:38:29,921 --> 14:38:31,182
That just doesn't do anything.
15331
14:38:32,269 --> 14:38:35,351
But also recall that it's not going\n
15332
14:38:35,351 --> 14:38:40,512
weird, like, oh, sorry, it is going\n
15333
14:38:42,311 --> 14:38:45,341
We're using CS50's\nget_int function, which is
15334
14:38:45,341 --> 14:38:48,231
handling all of those headaches for us.
15335
14:38:48,232 --> 14:38:51,702
But, what if the user indeed\ntypes a negative number?
15336
14:38:52,631 --> 14:38:54,381
So that was the bug I\nwanted to highlight.
15337
14:38:54,381 --> 14:38:56,771
It would be nice to re-prompt\nthem and re-prompt them.
15338
14:38:56,771 --> 14:38:59,081
And in C, what was the\nprogramming construct we
15339
14:38:59,082 --> 14:39:01,542
used when we wanted to\nask the user a question.
15340
14:39:01,542 --> 14:39:05,802
And then, if they didn't cooperate,\n
15341
14:39:07,271 --> 14:39:08,621
DAVID J. MALAN: Yeah,\ndo while loop, right?
15342
14:39:08,622 --> 14:39:11,352
That was useful, because it's\nalmost the same as a while loop.
15343
14:39:11,351 --> 14:39:14,621
But instead of checking a\ncondition, and then doing something
15344
14:39:14,622 --> 14:39:16,470
you do something and\nthen check a condition
15345
14:39:16,470 --> 14:39:18,762
which makes sense with user\ninput, because what are you
15346
14:39:18,762 --> 14:39:21,137
even going to check if the\nuser hasn't done anything yet?
15347
14:39:21,137 --> 14:39:22,722
You need that inverted logic.
15348
14:39:22,722 --> 14:39:26,531
Unfortunately in Python,\nthere is no do while loop.
15349
14:39:29,262 --> 14:39:32,112
And frankly, those are\nenough to recreate this idea.
15350
14:39:32,112 --> 14:39:35,682
And the way to do this in\nPython, the Pythonic way, which
15351
14:39:35,682 --> 14:39:38,682
is another term of art in the\ncommunity, is to say this.
15352
14:39:38,682 --> 14:39:42,822
Deliberately induce an infinite loop,\n
15353
14:39:42,822 --> 14:39:46,451
And then do what you got to do,\nlike get an int from a user
15354
14:39:46,451 --> 14:39:48,581
asking them for the\nheight of this thing.
15355
14:39:48,582 --> 14:39:54,792
And then, if that is what you want, like\n
15356
14:39:56,542 --> 14:40:01,961
So this is how, in Python, you could\n
15357
14:40:01,961 --> 14:40:03,836
You deliberately induce\nan infinite loop.
15358
14:40:03,836 --> 14:40:05,711
So something's going to\nhappen at least once.
15359
14:40:05,711 --> 14:40:08,801
Then, if you get the answer\nyou want, you break out of it
15360
14:40:08,802 --> 14:40:10,852
effectively achieving the same logic.
15361
14:40:10,851 --> 14:40:13,601
So this is the Pythonic way\nof doing a do while loop.
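The pattern can be sketched as below. To keep the example runnable without the CS50 library, the prompt function is passed in as read_int, a hypothetical stand-in for get_int, and the answers are scripted rather than typed.

```python
def prompt_until_positive(read_int):
    """Ask at least once, and keep asking until the answer is positive."""
    while True:                    # loop forever on purpose
        n = read_int("Height: ")   # the body always runs at least once
        if n > 0:
            break                  # got what we wanted, so leave the loop
    return n


# Scripted answers stand in for a user who cooperates on the third try.
answers = iter([-1, 0, 3])
height = prompt_until_positive(lambda prompt: next(answers))
for i in range(height):
    print("#")
```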
15362
14:40:13,601 --> 14:40:18,281
Let me go ahead and run Python\nof Mario.py, type in 3 this time.
15363
14:40:18,281 --> 14:40:21,192
And now I get back just\nthe 3 hashes as well.
15364
14:40:21,192 --> 14:40:26,832
What if, though, I wanted to\nget rid of, how about ultimately
15365
14:40:26,832 --> 14:40:31,580
that CS50 library function, and\n
15366
14:40:31,580 --> 14:40:33,622
Well, let's go ahead and\ntweak this a little bit.
15367
14:40:33,622 --> 14:40:35,592
Let me go ahead and\nremove this temporarily.
15368
14:40:35,591 --> 14:40:38,201
Give myself a main function, so\nI don't make the same mistake
15369
14:40:39,881 --> 14:40:43,631
And let me give myself a function called\n
15370
14:40:43,631 --> 14:40:47,141
And inside of that function\nis going to be that same code.
15371
14:40:47,141 --> 14:40:50,802
But I don't want to break in\nthis case, I want to return n.
15372
14:40:50,802 --> 14:40:53,815
So, recall, that if you return\nfrom a function, you're done
15373
14:40:53,815 --> 14:40:55,732
you're going to exit\nfrom right at that point.
15374
14:40:56,841 --> 14:40:59,201
You can just say return\nn inside of the loop
15375
14:40:59,201 --> 14:41:01,841
or, if you would prefer\nto break out, you
15376
14:41:01,841 --> 14:41:03,461
could do something like this instead.
15377
14:41:03,461 --> 14:41:09,222
Break, and then down here,\nyou could return, down here
15378
14:41:11,152 --> 14:41:13,812
And let me make one point here\nbefore we go back up to main.
15379
14:41:13,811 --> 14:41:18,011
This is a little different\nfrom C. And this one's subtle.
15380
14:41:18,012 --> 14:41:23,772
What have I done here that in C would\n
15381
14:41:27,381 --> 14:41:28,741
It's super subtle, this one.
15382
14:41:29,241 --> 14:41:32,432
AUDIENCE: So aren't we like\ndefining mostly object
15383
14:41:32,432 --> 14:41:35,991
like we're using it\nfirst, defining an object?
15384
14:41:40,796 --> 14:41:43,671
DAVID J. MALAN: So similar, it's\n
15385
14:41:43,671 --> 14:41:47,502
So it's OK not to declare a\nvariable with like the data type.
15386
14:41:47,502 --> 14:41:51,942
We've addressed that before, but on line\n
15387
14:41:51,942 --> 14:41:55,122
And then we return n on line 12.
15388
14:41:56,711 --> 14:42:01,932
In the world of C, if we had declared\n
15389
14:42:01,932 --> 14:42:04,722
it would have been scoped\nto that loop, which
15390
14:42:04,722 --> 14:42:08,052
means as soon as you get out of that\n
15391
14:42:09,862 --> 14:42:12,612
It would be local to the\ncurly braces therein.
15392
14:42:12,612 --> 14:42:16,241
Here, logically, curly braces\nare gone, but the indentation
15393
14:42:16,241 --> 14:42:20,771
makes clear that n is still inside of\n
15394
14:42:20,771 --> 14:42:23,801
But n is actually still\nin scope in Python.
15395
14:42:23,802 --> 14:42:26,902
The moment you create a variable\n
15396
14:42:26,902 --> 14:42:30,281
It is available everywhere within\nthat function, even outside
15397
14:42:30,281 --> 14:42:32,211
of the loop in which you defined it.
15398
14:42:32,211 --> 14:42:35,591
So this logic is actually OK in Python.
15399
14:42:35,591 --> 14:42:38,659
In C, recall, to solve\nthis same problem
15400
14:42:38,660 --> 14:42:41,202
we would have had to do something\na little hackish like this
15401
14:42:41,201 --> 14:42:46,121
like define n up here on line 8,\n
15402
14:42:46,122 --> 14:42:48,522
and so that it exists on line 13.
15403
14:42:48,521 --> 14:42:52,221
That is no longer an\nissue or need, in Python.
15404
14:42:52,222 --> 14:42:54,222
Once you create a variable,\neven if it's nested
15405
14:42:54,222 --> 14:42:56,389
nested, nested inside of\nsome loops or conditionals
15406
14:42:56,389 --> 14:43:00,042
it still exists within\nthe function itself.
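A minimal sketch of the scoping point, assuming the values come from a list instead of the keyboard so it runs on its own. The name `get_height` follows the lecture; the `candidates` parameter is a hypothetical stand-in for user input.

```python
def get_height(candidates):
    """Return the first positive value found in candidates.

    Hypothetical stand-in for the lecture's input loop: the values come
    from a list rather than from the keyboard.
    """
    for value in candidates:
        n = value      # n is first created here, inside the loop body
        if n > 0:
            break      # leave the loop; n is not destroyed
    return n           # still in scope: n belongs to the whole function


print(get_height([-1, 0, 3]))
```

One caveat with this sketch: if `candidates` is empty, the loop body never runs, `n` is never created, and the `return` raises an error. The variable exists only once some line has actually assigned it.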
15407
14:43:00,042 --> 14:43:04,391
All right, any questions then on this,\n
15408
14:43:04,391 --> 14:43:08,201
rid of the CS50 library again?
15409
14:43:08,201 --> 14:43:10,822
OK, so let me go ahead and\nget the height from the user.
15410
14:43:10,822 --> 14:43:13,279
Let's go ahead and create a\nvariable in main called height.
15411
14:43:13,279 --> 14:43:14,981
Let's call this get height function.
15412
14:43:14,982 --> 14:43:19,902
And then let's use that height value,\n
15413
14:43:19,902 --> 14:43:21,522
And let me see if this all works now.
15414
14:43:22,932 --> 14:43:25,631
Hopefully, I haven't\nmessed up, but I did.
15415
14:43:25,631 --> 14:43:27,981
But this is an easy fix now.
15416
14:43:28,482 --> 14:43:29,607
AUDIENCE: Got to call main.
15417
14:43:29,607 --> 14:43:31,065
DAVID J. MALAN: I got to call main.
15418
14:43:31,065 --> 14:43:32,502
So again, I deleted that earlier.
15419
14:43:33,442 --> 14:43:34,650
So I'm actually calling main.
15420
14:43:34,650 --> 14:43:38,711
Let me rerun Python of\nMario.py, there we go, height 3.
15421
14:43:40,402 --> 14:43:42,402
So let's do one last\nthing with Mario, just
15422
14:43:42,402 --> 14:43:45,502
to tie together that idea now\nof exceptions from before.
15423
14:43:45,502 --> 14:43:47,591
Again, exceptions are\na feature of Python
15424
14:43:47,591 --> 14:43:49,581
whereby you can try to do something.
15425
14:43:49,582 --> 14:43:53,232
And if there's a problem, you can\n
15426
14:43:53,232 --> 14:43:56,592
Previously, I handled it by just yelling\n
15427
14:43:56,591 --> 14:43:59,981
But let's actually use this to\nre-implement CS50's own get_int
15428
14:44:00,762 --> 14:44:03,652
Let me throw away\nCS50's get_int function.
15429
14:44:03,652 --> 14:44:09,402
And now let me go ahead and\nreplace get_int with input.
15430
14:44:09,402 --> 14:44:12,192
But it's not sufficient\nto just use input.
15431
14:44:12,192 --> 14:44:16,002
What do I have to add to\nthis line of code on line 8?
15432
14:44:16,002 --> 14:44:17,262
If I want to get back an int?
15433
14:44:18,311 --> 14:44:20,353
DAVID J. MALAN: Yeah, I\nhave to cast it to an int
15434
14:44:20,353 --> 14:44:23,021
by calling the int\nfunction around that value
15435
14:44:23,021 --> 14:44:25,271
or I could do it on a separate\nline, just to be clear.
15436
14:44:25,271 --> 14:44:28,631
I could also do n equals int of n.
15437
14:44:28,631 --> 14:44:31,542
That would work too, but it's\nsort of an unnecessary extra line.
15438
14:44:31,542 --> 14:44:34,512
This is not sufficient, because\nthat does not change the value.
15439
14:44:35,457 --> 14:44:36,582
But then it throws it away.
15440
14:44:37,713 --> 14:44:40,421
So the conventional way to do this\n
15441
14:44:40,421 --> 14:44:41,879
just to keep things nice and tight.
15442
14:44:43,302 --> 14:44:47,992
If I run Python of Mario.py, I can\n
15443
14:44:47,991 --> 14:44:52,241
I can still type in negative 1, because\n
15444
14:44:52,241 --> 14:44:55,271
What I'm not yet handling\nis weird input like cat
15445
14:44:55,271 --> 14:44:58,281
or some string that is\nnot a base 10 number.
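A short sketch of what `int` does with each kind of input described here. The sample strings are made up for illustration; `int` and `ValueError` are built into Python.

```python
print(int("3") + 1)   # the string "3" becomes the number 3
print(int("-1"))      # negative numbers convert fine too

try:
    int("cat")        # not a base-10 number
except ValueError as error:
    print("ValueError:", error)
```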
15446
14:44:58,281 --> 14:45:00,402
So here, again, is my traceback.
15447
14:45:00,402 --> 14:45:03,522
And notice that here, let\nme scroll up a little bit
15448
14:45:03,521 --> 14:45:08,141
here we can actually see\nmore detail in the traceback.
15449
14:45:08,141 --> 14:45:13,421
Notice that, just like in C, or just\n
15450
14:45:14,622 --> 14:45:18,012
You can see mention of module, that\n
15451
14:45:18,012 --> 14:45:19,535
is my main function, and get height.
15452
14:45:19,535 --> 14:45:20,952
So notice, it's kind of backwards.
15453
14:45:20,951 --> 14:45:23,241
It's top to bottom instead\nof bottom up, as we drew it
15454
14:45:23,241 --> 14:45:25,241
on the board the other\nday, and as we envisioned
15455
14:45:25,241 --> 14:45:27,042
stacks of trays in the cafeteria.
15456
14:45:27,042 --> 14:45:29,202
But this is your stack,\nof functions that
15457
14:45:29,201 --> 14:45:30,851
have been called, from top to bottom.
15458
14:45:30,851 --> 14:45:33,881
Get height is the most recent,\nmain is the very first
15459
14:45:35,722 --> 14:45:40,262
So let's try to do, let's try to do this\n
15460
14:45:41,262 --> 14:45:46,242
I'm going to go in here, and I'm\n
15461
14:45:46,241 --> 14:45:53,591
Whoops, try to do the following, except\n
15462
14:45:53,591 --> 14:45:57,161
then go ahead and say something,\nwell, like before, print
15463
14:45:57,161 --> 14:46:00,351
that's not an integer exclamation point.
15464
14:46:00,351 --> 14:46:03,281
But the difference this time is\nbecause I'm in a loop, the user
15465
14:46:03,281 --> 14:46:05,722
is going to have a chance\nto recover from this issue.
15466
14:46:05,722 --> 14:46:08,862
So if I run Mario.py, 3\nstill works as before.
15467
14:46:08,862 --> 14:46:12,402
If I run Mario.py and type\nin cat, I detect it now
15468
14:46:12,402 --> 14:46:15,762
and because I'm still in that loop,\n
15469
14:46:15,762 --> 14:46:19,572
because I've caught, so to speak, the\n
15470
14:46:19,572 --> 14:46:23,472
here, that's the way in Python\nto detect these kinds of errors
15471
14:46:23,472 --> 14:46:26,202
that would otherwise end up\nbeing on the user's own screen.
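The loop described above might look roughly like this. It is a sketch, not the lecture's exact code: the `read` parameter is a hypothetical addition so the example can run without a keyboard, and the `else` clause is one way to keep only the conversion inside `try`.

```python
def get_height(read=input):
    """Keep prompting until the reply is a positive integer.

    `read` is a hypothetical parameter added for illustration; it defaults
    to the built-in input function.
    """
    while True:
        try:
            n = int(read("Height: "))
        except ValueError:
            # The conversion failed, so stay in the loop and ask again.
            print("That's not an integer!")
        else:
            if n > 0:
                return n


# Simulate a user who types cat, then -1, then 3.
answers = iter(["cat", "-1", "3"])
print(get_height(lambda prompt: next(answers)))
```

Calling `get_height()` with no argument would prompt a real user instead.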
15472
14:46:26,201 --> 14:46:28,061
If I type in cat, dog,\nthat doesn't work.
15473
14:46:28,061 --> 14:46:33,341
If I type in, though, 2, I get my two\n
15474
14:46:33,341 --> 14:46:35,261
Any questions on\nthis? We're not going
15475
14:46:35,262 --> 14:46:37,272
to spend too much time on\nexceptions, but just wanted
15476
14:46:37,271 --> 14:46:40,201
to show you what's involved with\n
15477
14:46:40,701 --> 14:46:42,284
AUDIENCE: Then the hash marks in line.
15478
14:46:42,285 --> 14:46:43,827
DAVID J. MALAN: OK, so let's do this.
15479
14:46:43,826 --> 14:46:45,661
That actually comes to\nthe earlier question
15480
14:46:45,661 --> 14:46:47,581
about printing the\nhashes on the same line
15481
14:46:47,582 --> 14:46:50,330
or maybe something like this,\nwhere we have the little bricks
15482
14:46:50,330 --> 14:46:51,872
in the sky, or little question marks.
15483
14:46:51,872 --> 14:46:54,247
Let's recreate this idea,\nbecause the problem with print
15484
14:46:54,247 --> 14:46:57,452
as was noted earlier, is you're\n
15485
14:46:57,451 --> 14:46:58,981
But what if we don't want that.
15486
14:46:58,982 --> 14:47:01,262
Well, let's change\nthis program entirely.
15487
14:47:01,262 --> 14:47:02,832
Let me throw away all the functions.
15488
14:47:02,832 --> 14:47:05,742
Let's just go to a simpler world,\nwhere we're just doing this.
15489
14:47:05,741 --> 14:47:07,434
So let me start fresh in Mario.py.
15490
14:47:07,434 --> 14:47:09,641
I'm not going to bother with\nexceptions or functions.
15491
14:47:09,641 --> 14:47:15,932
Let's just do a very simple program, to\n
15492
14:47:15,932 --> 14:47:19,381
this time, because there are\nfour of these things in the sky.
15493
14:47:19,381 --> 14:47:21,752
Let's go ahead and just\nprint out a question mark
15494
14:47:21,752 --> 14:47:23,972
to represent each of those bricks.
15495
14:47:23,972 --> 14:47:27,662
Odds are you know this is not going to end\n
15496
14:47:27,661 --> 14:47:30,971
as you've predicted, on separate lines.
15497
14:47:30,972 --> 14:47:33,902
So it turns out that the\nprint function actually
15498
14:47:33,902 --> 14:47:36,842
takes in multiple arguments, not\n
15499
14:47:36,841 --> 14:47:40,171
but also some additional arguments,\nthat allow you to specify
15500
14:47:40,171 --> 14:47:42,691
what the default line ending should be.
15501
14:47:42,692 --> 14:47:45,632
But what's interesting\nabout this is that, if you
15502
14:47:45,631 --> 14:47:49,151
want to change the line\nending to be something like
15503
14:47:49,152 --> 14:47:53,312
quote unquote, that is,\nnothing, instead of backslash n
15504
14:47:53,311 --> 14:47:55,831
this is not sufficient,\nbecause in Python, you
15505
14:47:55,832 --> 14:47:58,292
can have two types of\narguments, or parameters.
15506
14:47:58,292 --> 14:48:01,682
Some arguments are positional, which\n
15507
14:48:01,682 --> 14:48:03,211
a comma separated list of arguments.
15508
14:48:03,211 --> 14:48:06,061
And that's what we did all the time\n
15509
14:48:06,061 --> 14:48:08,186
comma, something, we did\nit in printf all the time
15510
14:48:08,186 --> 14:48:10,502
and in other functions that\ntook multiple arguments.
15511
14:48:10,502 --> 14:48:14,402
In Python, you have, not\nonly positional arguments
15512
14:48:14,402 --> 14:48:18,182
where you just separate them by commas,\n
15513
14:48:19,171 --> 14:48:22,741
There are also named arguments,\nwhich look weird but are
15514
14:48:22,741 --> 14:48:24,661
helpful for reasons like this.
15515
14:48:24,661 --> 14:48:27,421
If you read the\ndocumentation, you will see
15516
14:48:27,421 --> 14:48:31,261
that there is a named argument\nthat Python accepts, called end.
15517
14:48:31,262 --> 14:48:34,202
And if you set that\nequal to something, that
15518
14:48:34,201 --> 14:48:36,721
will be used as the end\nof every line, instead
15519
14:48:36,722 --> 14:48:39,272
of the default, which the\ndocumentation will also say
15520
14:48:39,271 --> 14:48:41,221
is quote unquote backslash n.
15521
14:48:41,222 --> 14:48:45,522
So this line here has no effect\non my logic at the moment.
15522
14:48:45,521 --> 14:48:49,801
But if I change it to just quote\nunquote, essentially overriding
15523
14:48:49,802 --> 14:48:54,992
the default new line character, and\n
15524
14:48:55,800 --> 14:48:57,092
There's a bit of a bug, though.
15525
14:48:57,091 --> 14:49:00,131
My prompt is not meant\nto be on the same line.
15526
14:49:00,131 --> 14:49:02,161
So I can fix that by\njust printing nothing.
15527
14:49:02,161 --> 14:49:05,161
But, really, it's not nothing,\n
15528
14:49:05,161 --> 14:49:09,451
So let me run Python of\nMario.py again, and now we
15529
14:49:09,451 --> 14:49:12,661
have what I intended in the first\n
15530
14:49:13,692 --> 14:49:17,432
And this is just one example\nof an argument that has a name.
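As a sketch of the fix just described: override `end` while printing the row, then print once more to finish the line. The function name `print_row` and its `file` parameter are hypothetical, added only so the output can be captured; `end` and `file` themselves are documented arguments of `print`.

```python
def print_row(width, file=None):
    """Print `width` question marks on one line, then a single new line."""
    for i in range(width):
        print("?", end="", file=file)   # override the default end="\n"
    print(file=file)                    # prints only the new line


print_row(4)
```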
15531
14:49:17,432 --> 14:49:19,802
But this is a common\nparadigm in Python, too
15532
14:49:19,802 --> 14:49:22,772
to not just separate things by\ncommas, but to be very specific
15533
14:49:22,771 --> 14:49:27,331
because the print function might take\n
15534
14:49:27,332 --> 14:49:31,150
And my God, if you had to\nenumerate like 10 or 20 commas
15535
14:49:32,192 --> 14:49:34,109
You're going to get\nthings in the wrong order.
15536
14:49:34,108 --> 14:49:37,121
Named arguments allow you to\nbe resilient against that.
15537
14:49:37,122 --> 14:49:39,211
So you only specify\narguments by name, and it
15538
14:49:39,211 --> 14:49:42,525
doesn't matter what order they are in.
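To illustrate that order does not matter for named arguments, here is a small sketch using `end` together with `sep`. The `sep` argument is not mentioned in the lecture; it is another documented named argument of `print` that sets what goes between the positional values.

```python
# Same named arguments, different order: both calls print ###!
print("#", "#", "#", sep="", end="!\n")
print("#", "#", "#", end="!\n", sep="")
```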
15539
14:49:42,525 --> 14:49:46,682
All right, any questions, then, on\n
15540
14:49:46,682 --> 14:49:50,792
And to be clear, you can do\nsomething like, very weird
15541
14:49:50,792 --> 14:49:56,432
but logically expected, like this, by\n
15542
14:49:56,432 --> 14:49:58,351
But the right way to\nsolve the Mario problem
15543
14:49:58,351 --> 14:50:02,173
would be just to override\nit to be nothing like this.
15544
14:50:02,173 --> 14:50:03,631
All right, how about this for cool.
15545
14:50:03,631 --> 14:50:05,521
And this is why a lot\nof people like Python.
15546
14:50:05,521 --> 14:50:06,961
Suppose you don't really like loops.
15547
14:50:06,961 --> 14:50:08,491
You don't really like\nthree-line programs
15548
14:50:08,491 --> 14:50:11,158
because that was kind of three\ntimes longer than it needs to be.
15549
14:50:11,158 --> 14:50:15,722
What if you just printed out\na question mark four times?
15550
14:50:15,722 --> 14:50:19,902
Python, whoops, Python of\nMario.py, that also works.
15551
14:50:19,902 --> 14:50:23,072
So it turns out that, just like\nthe plus operator in Python
15552
14:50:23,072 --> 14:50:27,091
can join things together,\nthe multiply operator is not
15553
14:50:28,362 --> 14:50:32,592
It actually means, take this and\nconcatenate it four times over.
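A one-line sketch of that idea; the variable name `row` is just for illustration.

```python
row = "?" * 4   # repeat the string four times
print(row)
```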
15554
14:50:32,591 --> 14:50:35,521
So that's a way of just\ndistilling into one line what
15555
14:50:35,521 --> 14:50:39,271
would have otherwise taken multiple\n
15556
14:50:39,271 --> 14:50:43,651
lines in Python, but is really\nnow rather succinct in Python
15557
14:50:44,906 --> 14:50:48,031
Let's do one last Mario example, which\n
15558
14:50:48,031 --> 14:50:50,612
If this is another part\nof the Mario interface
15559
14:50:50,612 --> 14:50:53,322
this is like a grid of like\n3 by 3 bricks, for instance.
15560
14:50:53,322 --> 14:50:57,211
So two dimensions now, not just\n
15561
14:50:57,211 --> 14:50:59,652
Let's print out something\nlike that, using hashes.
15562
14:50:59,652 --> 14:51:02,592
Well, how about, how do I do this.
15563
14:51:02,591 --> 14:51:05,731
So how about for i in range of 3.
15564
14:51:05,732 --> 14:51:10,802
Then I could do for j in range of\n
15565
14:51:10,802 --> 14:51:12,332
and that's reasonable for counting.
15566
14:51:12,332 --> 14:51:17,522
I could now print out a hash symbol,\n
15567
14:51:17,521 --> 14:51:24,182
Python of Mario.py, OK, that's\njust one crazy long column.
15568
14:51:24,182 --> 14:51:27,762
What do I need to fix and where\n
15569
14:51:27,762 --> 14:51:32,372
So 3 by 3 bricks, instead\nof one long column.
15570
14:51:32,972 --> 14:51:37,022
AUDIENCE: Why don't we create\na line and then we'll skip it.
15571
14:51:37,021 --> 14:51:39,971
DAVID J. MALAN: OK, so after\nprinting 3, we want to skip a line.
15572
14:51:39,972 --> 14:51:42,272
So maybe like print\nout a blank line here.
15573
14:51:43,262 --> 14:51:46,442
I like that instinct, right, print\n
15574
14:51:46,442 --> 14:51:48,781
Let's go ahead and run\nPython of Mario.py.
15575
14:51:48,781 --> 14:51:53,101
OK, it's more visible, what\nI'm doing, but still wrong.
15576
14:51:53,101 --> 14:51:55,631
What can I, what's the\nremaining fix, though?
15577
14:51:56,131 --> 14:51:59,311
AUDIENCE: So right behind the two.
15578
14:51:59,311 --> 14:52:02,201
DAVID J. MALAN: Yeah, I'm\ngetting an extra new line here
15579
14:52:02,201 --> 14:52:04,391
which I don't want\nwhile I'm on this row.
15580
14:52:04,391 --> 14:52:08,372
So let me do end equals quote unquote,\n
15581
14:52:10,472 --> 14:52:13,866
Python of Mario.py, voila, now\nwe've got it, in two dimensions.
15582
14:52:13,866 --> 14:52:15,241
And even this, we can tighten up.
15583
14:52:15,241 --> 14:52:17,741
Like, we could just use the\nlittle trick we learned.
15584
14:52:17,741 --> 14:52:21,752
So we could just say,\nprint a hash times 3 times
15585
14:52:21,752 --> 14:52:24,332
and we can get rid of one\nof those loops altogether.
15586
14:52:24,332 --> 14:52:27,452
All it's doing is, whoops, all it's\n
15587
14:52:27,451 --> 14:52:29,581
But, no, I don't want to do that.
15588
14:52:29,582 --> 14:52:31,353
What do I, how do I fix this here.
15589
14:52:31,353 --> 14:52:33,061
I don't think I want\nthis anymore, right?
15590
14:52:33,061 --> 14:52:34,871
Because that's giving\nme an extra new line.
15591
14:52:34,872 --> 14:52:37,782
So now this program is\nreally tightened up.
15592
14:52:37,781 --> 14:52:39,572
Same thing, two lines of code.
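Both versions of the grid, side by side, as a sketch. The function names and the `file` parameter are hypothetical, added so the two outputs can be compared; the loop bodies follow what was just described.

```python
def grid_nested(size, file=None):
    """Nested loops: the inner loop prints one row, the outer repeats it."""
    for i in range(size):
        for j in range(size):
            print("#", end="", file=file)   # stay on the same row
        print(file=file)                    # end the row


def grid_tight(size, file=None):
    """Tighter version: build each row with the * operator."""
    for i in range(size):
        print("#" * size, file=file)


grid_nested(3)
grid_tight(3)
```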
15593
14:52:39,572 --> 14:52:43,741
But we're now implementing this\n
15594
14:52:43,741 --> 14:52:46,961
All right, any questions here on these?
15595
14:52:47,461 --> 14:52:53,311
AUDIENCE: Is there any practical reason\n
15596
14:52:53,311 --> 14:52:56,371
the print function, you\ndon't put any spaces in it.
15597
14:52:56,372 --> 14:52:58,952
DAVID J. MALAN: If I\nprint end, any spaces.
15598
14:52:59,822 --> 14:53:01,961
AUDIENCE: Whenever we\nwrite end, for example
15599
14:53:01,961 --> 14:53:05,372
the print function\nis, you know, in order
15600
14:53:05,372 --> 14:53:10,342
to stop it from going to a new\nline, it seems like any spaces
15601
14:53:10,341 --> 14:53:14,322
we did like end equals and then too close.
15602
14:53:20,764 --> 14:53:24,552
So in a previous version, let me\n
15603
14:53:25,692 --> 14:53:28,242
The convention in Python\nis not to do that.
15604
14:53:28,872 --> 14:53:30,785
It just starts to add too much space.
15605
14:53:30,785 --> 14:53:32,952
And this is a little\ninconsistent, because, earlier
15606
14:53:32,951 --> 14:53:34,991
when we talked about\nlike pluses or spaces
15607
14:53:34,991 --> 14:53:37,271
around the less than or equal\nsigns, I did say add it.
15608
14:53:37,271 --> 14:53:39,531
Here it's actually\nclearer and recommended
15609
14:53:39,531 --> 14:53:40,781
to keep them tighter together.
15610
14:53:40,781 --> 14:53:44,082
Otherwise it just becomes harder\nto read where the gaps are.
15611
14:53:45,341 --> 14:53:50,878
All right, let's do, how about,\nanother five minute break.
15612
14:53:51,461 --> 14:53:54,254
And then we're going to dive into\n
15613
14:53:54,254 --> 14:53:57,682
and then ultimately build with some\n
15614
14:53:59,652 --> 14:54:04,781
All right, so almost all\nof the examples we just did
15615
14:54:04,781 --> 14:54:07,061
were recreations of\nwhat we did in week 1.
15616
14:54:07,061 --> 14:54:09,641
And recall that week 1 was like\nour most syntax-heavy week.
15617
14:54:09,641 --> 14:54:13,451
It was when we were first learning\n
15618
14:54:13,451 --> 14:54:16,421
we began to focus a bit\nmore on ideas, like arrays
15619
14:54:16,421 --> 14:54:18,161
and other higher-level constructs.
15620
14:54:18,161 --> 14:54:21,401
And we'll do that again here, condensing\n
15621
14:54:21,402 --> 14:54:23,772
into a fewer set of examples in Python.
15622
14:54:23,771 --> 14:54:26,541
And we'll culminate by actually\ntaking Python out for a spin
15623
14:54:26,542 --> 14:54:28,822
and doing things that\nwould be way harder to do
15624
14:54:28,822 --> 14:54:33,351
and way more time-consuming to do in C,\n
15625
14:54:33,351 --> 14:54:36,311
But how do you go about figuring\nout what functions exist
15626
14:54:36,311 --> 14:54:39,491
if you didn't hear it in\nclass, you don't see it online
15627
14:54:39,491 --> 14:54:43,002
but you want to see it officially, you\n
15628
14:54:44,741 --> 14:54:47,862
And I will disclaim that, honestly,\n
15629
14:54:49,271 --> 14:54:51,761
Google will often be your\nfriend, so googling something
15630
14:54:51,762 --> 14:54:55,872
you're interested in, to find your way\n
15631
14:54:55,872 --> 14:54:58,932
or StackOverflow.com is\nanother popular website.
15632
14:54:58,932 --> 14:55:01,302
As always, though, the\nline should be googling
15633
14:55:01,302 --> 14:55:04,122
things like, how do I convert\na string to lowercase.
15634
14:55:04,122 --> 14:55:05,592
Like that's reasonable to Google.
15635
14:55:05,591 --> 14:55:09,681
Or how to convert to uppercase or\n
15636
14:55:09,682 --> 14:55:14,472
But googling, of course, things like\n
15637
14:55:14,472 --> 14:55:15,641
of course, crosses the line.
15638
14:55:15,641 --> 14:55:18,599
But moving forward, and really with\n
15639
14:55:18,599 --> 14:55:20,741
and Stack Overflow are\nyour friends, but the line
15640
14:55:20,741 --> 14:55:23,061
is between the reasonable\nand the unreasonable.
15641
14:55:23,061 --> 14:55:26,411
So let me officially use the\nPython documentation search, just
15642
14:55:26,411 --> 14:55:29,051
to search for something\nlike the lowercase function.
15643
14:55:29,052 --> 14:55:31,062
Like, I know I can\nlowercase things in Python.
15644
14:55:32,502 --> 14:55:34,391
So let me just search\nfor the word lower.
15645
14:55:34,391 --> 14:55:37,332
You're going to get, often, an\noverwhelming number of results
15646
14:55:37,332 --> 14:55:40,200
because Python is a pretty big\n
15647
14:55:40,199 --> 14:55:42,491
And you're going to want to\nlook for familiar patterns.
15648
14:55:42,491 --> 14:55:45,581
For whatever reason,\nstring.lower, which is probably
15649
14:55:45,582 --> 14:55:48,942
more popular or more commonly used than\n
15650
14:55:48,942 --> 14:55:51,982
But it's purple, because I clicked\n
15651
14:55:51,982 --> 14:55:54,972
So str.lower is probably\nwhat I want, because I
15652
14:55:54,972 --> 14:55:57,582
am interested at the moment\nin lower casing strings.
15653
14:55:57,582 --> 14:56:01,779
When I click on that, this is an example\n
15654
14:56:02,322 --> 14:56:03,862
It's in this general format.
15655
14:56:03,862 --> 14:56:05,862
Here's my str.lower function.
15656
14:56:05,862 --> 14:56:08,061
This returns a copy of\nthe string, with all
15657
14:56:08,061 --> 14:56:10,271
of the cased characters\nconverted to lowercase
15658
14:56:10,271 --> 14:56:12,191
and the lower-casing\nalgorithm, dot dot dot.
15659
14:56:12,192 --> 14:56:13,690
So that doesn't give me much.
15660
14:56:13,690 --> 14:56:14,982
It doesn't give me sample code.
15661
14:56:14,982 --> 14:56:16,732
But it does say what the function does.
15662
14:56:16,732 --> 14:56:20,412
And if we keep looking, you'll see\n
15663
14:56:20,411 --> 14:56:24,641
I used its analog, rstrip before, right\n
15664
14:56:24,641 --> 14:56:27,521
that is strip, from the end of a\n
15665
14:56:27,521 --> 14:56:29,451
like a new line, or even something else.
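Since the documentation gives no sample code, here is a small sketch of the two string methods mentioned, `lower` and `rstrip`. The sample text is made up for illustration.

```python
text = "Hello, World\n"

print(text.lower())          # a lowercased copy of the string
print(repr(text.rstrip()))   # trailing whitespace, including \n, removed
print(repr(text))            # the original string is unchanged
```

Both methods return a new string rather than changing `text` itself, which matches the documentation's wording, "returns a copy".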
15666
14:56:29,451 --> 14:56:32,932
And if you scroll through\nstring, this web page here.
15667
14:56:32,932 --> 14:56:34,631
And we're halfway down the page already.
15668
14:56:34,631 --> 14:56:36,701
If you see my scroll\nbar, tiny on the right
15669
14:56:36,701 --> 14:56:41,771
there's a huge amount of functionality\n
15670
14:56:41,771 --> 14:56:44,981
And this is just testament to just\n
15671
14:56:44,982 --> 14:56:49,142
But it's also reason for\nreassurance that the goal, when
15672
14:56:49,141 --> 14:56:51,391
playing around with some new\nlanguage and learning it
15673
14:56:51,391 --> 14:56:53,120
is not to learn it exhaustively.
15674
14:56:53,120 --> 14:56:54,912
Just like in English\nor any human language
15675
14:56:54,911 --> 14:56:57,161
there's always going to be\nvocab words you don't know
15676
14:56:57,161 --> 14:57:00,084
ways of presenting the same\ninformation in some language.
15677
14:57:00,084 --> 14:57:01,752
That's going to be the case with Python.
15678
14:57:01,752 --> 14:57:05,141
And what we'll do today and this\nweek in problem set 6 is really
15679
14:57:05,141 --> 14:57:06,641
get your footing with this language.
15680
14:57:06,641 --> 14:57:09,822
But you won't know all of Python,\n
15681
14:57:09,822 --> 14:57:12,822
And, honestly, you won't know all of\n
15682
14:57:12,822 --> 14:57:15,322
unless you're, perhaps, using\nthem full time professionally
15683
14:57:15,322 --> 14:57:18,891
and even then, there's more libraries\n
15684
14:57:18,891 --> 14:57:21,942
So let's actually now\npivot to a few other ideas
15685
14:57:21,942 --> 14:57:24,082
that we'll implement\nin Python, in a moment.
15686
14:57:24,082 --> 14:57:26,531
Let me switch back over to VS Code here.
15687
14:57:26,531 --> 14:57:31,781
And let me whip up, say, a recreation\n
15688
14:57:31,781 --> 14:57:34,404
where we averaged like\nthree scores together.
15689
14:57:34,404 --> 14:57:36,822
And that was an opportunity\nin week 2 to play with arrays
15690
14:57:36,822 --> 14:57:38,951
to realize how constrained arrays are.
15691
14:57:40,241 --> 14:57:41,561
You have to decide in advance.
15692
14:57:41,561 --> 14:57:43,631
But let's see what's\ndifferent here in Python.
15693
14:57:43,631 --> 14:57:48,101
So let me do Scores.py, and let\n
15694
14:57:48,101 --> 14:57:52,301
called scores, sorry, let me give myself\n
15695
14:57:52,302 --> 14:57:54,462
Set it equal to a list\nof three scores, which
15696
14:57:54,461 --> 14:57:59,082
are the same ones we've used\nbefore, 72, 73, 33, in this context
15697
14:57:59,082 --> 14:58:01,152
meant to be scores, not ASCII values.
15698
14:58:01,152 --> 14:58:03,042
And then let's just do\nthe average of these.
15699
14:58:03,042 --> 14:58:05,152
So average will be another variable.
15700
14:58:05,152 --> 14:58:09,432
And it turns out I can do, well,\nhow did I sum these before?
15701
14:58:09,432 --> 14:58:13,101
I probably had a for loop to add\n
15702
14:58:13,101 --> 14:58:16,101
Turns out in Python, you\ncan just say sum of scores
15703
14:58:16,101 --> 14:58:18,051
divided by the length of scores.
15704
14:58:18,052 --> 14:58:19,652
That's going to give me my average.
15705
14:58:19,652 --> 14:58:22,732
So sum is a function that takes\na list, in this case, as input
15706
14:58:22,732 --> 14:58:25,522
and it just does the sum for\nyou, with a for loop or whatever
15707
14:58:26,451 --> 14:58:30,002
Len gives you the length of the\nlist, how many things are in it.
15708
14:58:30,002 --> 14:58:31,762
So I can dynamically figure that out.
15709
14:58:31,762 --> 14:58:36,862
Now let me go ahead and print out,\n
15710
14:58:36,862 --> 14:58:40,150
in curly braces, the actual\naverage, close quote.
15711
14:58:40,150 --> 14:58:42,442
All right, so let's run this\ncode, Python of Scores.py.
15712
14:58:42,442 --> 14:58:47,572
And there is my average, in this\ncase, 59.33333 and so forth
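The program just described, as a sketch; the exact wording of the printed label is an assumption.

```python
scores = [72, 73, 33]

# sum adds up the list; len counts how many values are in it.
average = sum(scores) / len(scores)

print(f"Average: {average}")
```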
15713
14:58:48,832 --> 14:58:51,022
Well, let's actually, now,\nchange this a little bit
15714
14:58:51,021 --> 14:58:54,146
and make it a little more interesting,\n
15715
14:58:54,146 --> 14:58:55,711
rather than hard coding this.
15716
14:58:55,711 --> 14:58:59,089
Let me go back up here and\nuse from cs50 import get_int
15717
14:58:59,089 --> 14:59:01,881
because I don't want to deal with\n
15718
14:59:01,881 --> 14:59:04,341
Like, I just want to use\nsomeone else's function here.
15719
14:59:04,341 --> 14:59:08,121
Let me give myself an\nempty list called scores.
15720
14:59:08,122 --> 14:59:11,002
And this is not something we\nwere able to do in C, right?
15721
14:59:11,002 --> 14:59:13,131
Because in C, if you tried\nto make an empty array
15722
14:59:13,131 --> 14:59:16,112
well, that's pretty stupid,\nbecause you can't add things to it.
15723
14:59:17,432 --> 14:59:19,171
So it wouldn't even let you do that.
15724
14:59:19,171 --> 14:59:22,161
But I can just create\nan empty list in Python
15725
14:59:22,161 --> 14:59:24,861
because lists, unlike arrays,\nare really lengthless.
15726
14:59:26,271 --> 14:59:29,391
But you and I are not dealing with\n
15727
14:59:31,292 --> 14:59:34,957
So now, let's go ahead and get a\n
15728
14:59:34,957 --> 14:59:36,332
How about three of them in total.
15729
14:59:36,332 --> 14:59:41,872
So for i in range of 3, let's go\n
15730
14:59:41,872 --> 14:59:44,332
using get_int, asking them for score.
15731
14:59:44,332 --> 14:59:51,362
And then let's go ahead and append, to\n
15732
14:59:51,362 --> 14:59:53,722
So it turns out that a list,\nand I could read the Python
15733
14:59:53,722 --> 14:59:57,802
documentation to confirm as much,\n
15734
14:59:57,802 --> 15:00:01,677
and functions built into objects\nare generally known as methods
15735
15:00:01,677 --> 15:00:03,052
if you've heard that term before.
15736
15:00:03,052 --> 15:00:05,842
Same idea, but whereas a function\nkind of stands on its own
15737
15:00:05,841 --> 15:00:09,951
a method is a function built\ninto an object, like a list here.
15738
15:00:09,951 --> 15:00:12,438
That's going to achieve the\nsame result. Strictly speaking
15739
15:00:13,521 --> 15:00:17,125
Just like in C, I could tighten this\n
15740
15:00:17,125 --> 15:00:19,042
But, I don't know, I\nkind of like it this way.
15741
15:00:19,042 --> 15:00:22,491
It's more clear, to me, at least, that\n
15742
15:00:22,491 --> 15:00:24,360
and then appending it to the list.
15743
15:00:24,360 --> 15:00:26,152
Now the rest of the\ncode can stay the same.
15744
15:00:26,152 --> 15:00:31,222
Python of Scores.py,\nscore will be 72, 73, 33.
15745
15:00:32,341 --> 15:00:35,361
But now the program's a little\nmore dynamic, which is nice.
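A sketch of the dynamic version. To keep it runnable without the CS50 library or a keyboard, the three scores come from a fixed tuple, which is a stand-in for three calls to `get_int`.

```python
scores = []   # an empty list; it grows as values are appended

for score in (72, 73, 33):   # stand-in for three get_int("Score: ") calls
    scores.append(score)     # append is a method built into the list

average = sum(scores) / len(scores)
print(f"Average: {average}")
```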
15746
15:00:35,362 --> 15:00:37,461
But there's other\nsyntax I could use here.
15747
15:00:37,461 --> 15:00:40,851
Just so you've seen it, Python does\n
15748
15:00:40,851 --> 15:00:43,371
whereby, if you don't\nwant to do scores.append
15749
15:00:43,372 --> 15:00:47,812
you can actually say scores\nplus equals this score.
15750
15:00:47,811 --> 15:00:52,252
So you can actually concatenate\nlists together in Python, too.
15751
15:00:52,252 --> 15:00:54,862
Just as we used plus to\njoin two strings together
15752
15:00:54,862 --> 15:00:57,921
you can use plus to\njoin two lists together.
15753
15:00:57,921 --> 15:01:00,561
The catch is, you need\nto put the one score I'm
15754
15:01:00,561 --> 15:01:03,292
adding here in a list of its\nown, which is kind of silly.
15755
15:01:03,292 --> 15:01:07,851
But it's necessary, so that this\n
15756
15:01:07,851 --> 15:01:10,491
To do this more verbosely,\nwhich most programmers wouldn't
15757
15:01:10,491 --> 15:01:12,831
do, but just for clarity,\nthis is the same thing
15758
15:01:12,832 --> 15:01:15,472
as saying scores plus this score.
15759
15:01:15,472 --> 15:01:19,432
So now maybe it's a little more\n
15760
15:01:19,432 --> 15:01:24,201
plural, sorry, singular, are both\n
15761
15:01:25,381 --> 15:01:28,261
So two different ways, not sure\none is better than the other.
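The alternatives, side by side, as a sketch with made-up values. In each concatenation the single score has to be wrapped in a list of its own.

```python
scores = []

scores += [72]           # concatenate a one-element list
scores = scores + [73]   # the more verbose spelling of the same idea
scores.append(33)        # the method from before

print(scores)
```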
15762
15:01:28,262 --> 15:01:34,162
This way is pretty common, but .append\n
15763
15:01:34,161 --> 15:01:36,861
All right, how about another\nexample from week two.
15764
15:01:36,862 --> 15:01:39,592
This one was called uppercase.
15765
15:01:39,591 --> 15:01:42,841
So let me do this in\nUppercase.py, though, this time.
15766
15:01:42,841 --> 15:01:46,701
And let me import from\nCS50, get string again.
15767
15:01:46,701 --> 15:01:50,541
And let me go ahead and say,\nbefore will be my first variable.
15768
15:01:50,542 --> 15:01:54,022
Let me get a string from the user,\n
15769
15:01:54,021 --> 15:01:59,182
And then let me go ahead and say,\n
15770
15:01:59,182 --> 15:02:01,711
upper-casing to this string.
15771
15:02:01,711 --> 15:02:04,372
Let me change my line ending to\nbe that, using our new trick.
15772
15:02:04,372 --> 15:02:08,012
And this is where things get cool\n
15773
15:02:08,012 --> 15:02:11,572
If I want to iterate over all\nof the characters in a string
15774
15:02:11,572 --> 15:02:14,661
and print them out in uppercase,\n
15775
15:02:14,661 --> 15:02:22,553
For c in the before string, go ahead and\n
15776
15:02:22,553 --> 15:02:25,761
but don't end the line yet, because I\n
15777
15:02:28,012 --> 15:02:31,492
Python of Uppercase.py, let me\ntype in Hello in all lowercase.
15778
15:02:31,491 --> 15:02:33,531
I've just upper-cased the whole string.
15779
15:02:34,222 --> 15:02:36,652
I first get string, calling it before.
15780
15:02:36,652 --> 15:02:39,202
I then just print out some fluffy\ntext that says after colon
15781
15:02:39,201 --> 15:02:41,362
and I get rid of the line ending,\n
15782
15:02:41,362 --> 15:02:43,154
Notice I hit the spacebar\na couple of times
15783
15:02:43,154 --> 15:02:45,141
just so letters line up to be pretty.
15784
15:02:45,141 --> 15:02:47,302
For c in before, this is new.
15785
15:02:47,302 --> 15:02:51,022
This is powerful in C,\nsorry, in Python, whereby
15786
15:02:51,021 --> 15:02:54,111
you don't have to do like int i\nequals 0 and i less than this
15787
15:02:54,112 --> 15:02:58,832
you could just say, for c in the\n
15788
15:02:58,832 --> 15:03:02,031
And then here is just upper-casing\nthat specific character
15789
15:03:02,031 --> 15:03:04,222
and making sure we don't\noutput a new line too soon.
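A minimal sketch of the loop just described, assuming the built-in print function and its end parameter. The function name print_upper is hypothetical, and the string is passed in directly rather than read with get string.

```python
def print_upper(before: str) -> None:
    """Print each character of `before` in uppercase, all on one line."""
    print("After:  ", end="")      # suppress the newline so the letters follow
    for c in before:               # iterate over the characters, no index needed
        print(c.upper(), end="")   # uppercase one character at a time
    print()                        # only now end the line


print_upper("hello")
```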
15790
15:03:04,222 --> 15:03:06,442
But this is actually more\nwork than I need to do.
15791
15:03:06,442 --> 15:03:10,522
Based on what we've seen thus far,\n
15792
15:03:10,521 --> 15:03:12,141
can I tighten this up further?
15793
15:03:12,141 --> 15:03:16,862
Can I collapse lines 5 and 6,\nmaybe even 7, all together?
15794
15:03:16,862 --> 15:03:23,072
If the goal of this program is just\n
15795
15:03:27,002 --> 15:03:28,809
AUDIENCE: Would it be str.upper?
15796
15:03:28,809 --> 15:03:31,141
DAVID J. MALAN: Str.upper,\nyeah, so I could do something
15797
15:03:31,141 --> 15:03:34,021
like this, after gets before.upper.
15798
15:03:34,021 --> 15:03:36,271
So it's not str\nliterally dot upper, str
15799
15:03:36,271 --> 15:03:38,021
just represents the string in question.
15800
15:03:38,021 --> 15:03:41,141
So it would be before.upper,\nbut right idea otherwise.
15801
15:03:41,141 --> 15:03:44,652
And so let me go ahead and just tweak\n
15802
15:03:44,652 --> 15:03:49,332
Let me just go ahead and print out the\n
15803
15:03:49,332 --> 15:03:51,961
So this line is the same, I'm\ngetting a string called before.
15804
15:03:51,961 --> 15:03:55,051
I'm creating another variable\ncalled after, and, as you propose
15805
15:03:55,052 --> 15:03:58,482
I'm calling upper on the whole\n
15806
15:03:59,881 --> 15:04:03,871
And, again, in Python, there aren't\n
15807
15:04:03,872 --> 15:04:05,282
There's only strings, anyway.
15808
15:04:05,281 --> 15:04:07,121
So I might as well do them all at once.
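Roughly, the tightened-up version could look like the sketch below. It assumes only the built-in str.upper method; the helper name to_upper is made up for illustration, and the input is hard-coded instead of coming from get string.

```python
def to_upper(before: str) -> str:
    # str.upper() returns a new, uppercased copy of the whole string
    return before.upper()


after = to_upper("hello")
print(f"After:  {after}")
```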
15809
15:04:07,122 --> 15:04:10,742
So if I rerun the code now,\nPython of Uppercase.py.
15810
15:04:10,741 --> 15:04:15,601
Now I'll type in Hello in all\nlowercase, and, oh, so close
15811
15:04:15,601 --> 15:04:18,631
I think I can get rid of\nthis override, because I'm
15812
15:04:18,631 --> 15:04:22,031
printing the whole thing out at\n
15813
15:04:22,031 --> 15:04:26,402
So now if I type in Hello before,\n
15814
15:04:28,601 --> 15:04:32,432
All right, any questions,\nthen, on lists or on strings
15815
15:04:32,432 --> 15:04:37,762
and what this kind of function,\n
15816
15:04:38,262 --> 15:04:41,281
All right, so a couple other\nbuilding blocks before we start.
15817
15:04:44,531 --> 15:04:46,572
DAVID J. MALAN: To the right, right.
15818
15:04:47,561 --> 15:04:53,723
AUDIENCE: Could you write, very close to\n
15819
15:04:53,724 --> 15:04:55,779
you start creating a variable upper.
15820
15:04:55,779 --> 15:04:58,362
DAVID J. MALAN: Yes, do I have\nto create this variable, upper?
15821
15:04:59,112 --> 15:05:01,391
I could actually tighten\nthis up, and, if you really
15822
15:05:01,391 --> 15:05:04,692
want to see something neat,\ninside of the curly braces
15823
15:05:04,692 --> 15:05:07,572
you don't have to just put\nthe names of variables.
15824
15:05:07,572 --> 15:05:10,122
You can put a small\namount of logic, so long
15825
15:05:10,122 --> 15:05:13,302
as it doesn't start to look stupid and\n
15826
15:05:13,302 --> 15:05:15,462
that it's sort of bad\ndesign at that point.
15827
15:05:15,461 --> 15:05:17,061
I can tighten this up like this.
15828
15:05:17,061 --> 15:05:21,131
And now we're in Python of\nUppercase.py, writing Hello again.
15829
15:05:22,252 --> 15:05:23,802
But I would be careful about this.
15830
15:05:23,802 --> 15:05:27,005
You want to resist the temptation of\n
15831
15:05:27,004 --> 15:05:29,921
inside the curly braces, because\n
15832
15:05:29,921 --> 15:05:32,411
But, absolutely, you\ncould indeed do that, too.
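As a small illustration of putting logic inside the curly braces, here is a sketch that calls upper directly in the f-string. The variable names before and message are illustrative only.

```python
before = "hello"

# The expression inside the braces is evaluated, then formatted into the string
message = f"After:  {before.upper()}"
print(message)
```

Anything much longer than a single method call is probably easier to read as a separate variable.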
15833
15:05:32,411 --> 15:05:35,471
All right, how about command line\narguments, which was one thing
15834
15:05:35,472 --> 15:05:39,552
we introduced in week two also, so\n
15835
15:05:39,552 --> 15:05:43,272
to take input from the user, whoops.
15836
15:05:43,271 --> 15:05:46,791
So we could actually take input\n
15837
15:05:46,792 --> 15:05:49,732
so as to take literally\ncommand line arguments.
15838
15:05:49,732 --> 15:05:52,542
These are a little different,\nbut it follows the same paradigm.
15839
15:05:52,542 --> 15:05:56,381
There's no main by default.\nAnd there's no Def main int
15840
15:05:56,381 --> 15:06:02,572
arg c char, or we called it string,\n
15841
15:06:02,572 --> 15:06:07,031
So if you want access to the\n
15842
15:06:07,031 --> 15:06:11,621
And it turns out, there's another\n
15843
15:06:11,622 --> 15:06:15,702
called sys, and you can import from\n
15844
15:06:15,701 --> 15:06:17,879
So same idea, different place.
15845
15:06:17,879 --> 15:06:19,461
Now I'm going to go ahead and do this.
15846
15:06:19,461 --> 15:06:24,341
Let's write a program that just requires\n
15847
15:06:24,341 --> 15:06:26,572
after the program's\nname, or none at all.
15848
15:06:26,572 --> 15:06:33,192
So if the length of argv equals 2,\n
15849
15:06:33,192 --> 15:06:41,610
Hello comma argv bracket 1 close quote,\n
15850
15:06:41,610 --> 15:06:44,652
total at the prompt, let's just say\n
15851
15:06:45,682 --> 15:06:48,701
So the only thing that's new here\n
15852
15:06:48,701 --> 15:06:51,971
and we're using this fancy f-string\n
15853
15:06:51,972 --> 15:06:55,031
too, it's putting more complex\nlogic in the curly braces.
15854
15:06:55,792 --> 15:07:00,412
In this case, it's a list called argv,\n
15855
15:07:00,411 --> 15:07:04,301
Let's do Python of Argv.py,\nEnter, Hello, world.
15856
15:07:04,302 --> 15:07:08,002
What if I do Argv.py\nDavid at the command line.
15857
15:07:09,252 --> 15:07:11,201
So there's one curiosity here.
15858
15:07:11,201 --> 15:07:15,896
Python is not included in\nargv, whereas in C, dot
15859
15:07:15,896 --> 15:07:18,461
slash whatever was the first thing.
15860
15:07:18,461 --> 15:07:22,031
The analog in Python is that\nthe name of your Python program
15861
15:07:22,031 --> 15:07:26,322
is the first thing, in bracket 0,\n
15862
15:07:26,322 --> 15:07:32,262
the word Python does not appear in\n
15863
15:07:32,262 --> 15:07:34,512
But otherwise, the\nidea of these arguments
15864
15:07:34,512 --> 15:07:36,904
is exactly the same as before.
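A sketch of the argv program described here, assuming the standard sys module. The greeting function is a hypothetical wrapper, added only so the logic can be tried with different argument lists.

```python
import sys


def greeting(argv: list[str]) -> str:
    # argv[0] is the program's name; argv[1], if present, is the first word after it
    if len(argv) == 2:
        return f"hello, {argv[1]}"
    return "hello, world"


print(greeting(sys.argv))
```

Note that the word python itself never shows up in the list; the script name sits at index 0.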
15865
15:07:36,904 --> 15:07:39,072
And in fact, what you can\ndo, which is kind of cool
15866
15:07:39,072 --> 15:07:42,252
is, because argv is a list,\nyou can do things like this.
15867
15:07:42,252 --> 15:07:47,412
For arg in argv, go ahead\nand print out each argument.
15868
15:07:47,411 --> 15:07:49,511
So instead of using a\nfor loop and i and all
15869
15:07:49,512 --> 15:07:53,742
of this, if I do Python of argv Enter,\n
15870
15:07:53,741 --> 15:07:58,481
If I do Python of argv Foo,\nit puts Argv.py and Foo.
15871
15:07:58,482 --> 15:08:03,042
If I do, sorry, if I do Foo and\nbar, those words all print out.
15872
15:08:03,042 --> 15:08:05,292
If I do Foo bar baz, those print out too.
15873
15:08:05,292 --> 15:08:08,351
And Foo and bar or baz are like\na mathematician's x and y and z
15874
15:08:08,351 --> 15:08:11,722
for computer scientists, when you\n
15875
15:08:12,942 --> 15:08:16,542
It reads a little more like English, and\n
15876
15:08:16,542 --> 15:08:20,052
allows you to iterate very quickly\n
15877
15:08:20,052 --> 15:08:22,692
Suppose I only wanted the real\nwords that the human typed
15878
15:08:23,771 --> 15:08:26,981
Like, suppose I want to ignore Argv.py.
15879
15:08:26,982 --> 15:08:30,162
I mean I could do something\nhackish like this.
15880
15:08:30,161 --> 15:08:35,626
If arg equals Argv.py,\nI could just ignore
15881
15:08:35,627 --> 15:08:37,002
you know, let's invert the logic.
15882
15:08:37,002 --> 15:08:39,052
I could do this, for instance.
15883
15:08:39,052 --> 15:08:41,622
So if the arg does not\nequal the program name
15884
15:08:41,622 --> 15:08:44,412
then go ahead and print out the word.
15885
15:08:44,411 --> 15:08:46,361
So I get Foo, bar, and baz only.
15886
15:08:46,362 --> 15:08:50,921
Or, this is what's kind of neat\n
15887
15:08:50,921 --> 15:08:54,921
And let me just take a slice of\nthe array, or the list, instead.
15888
15:08:54,921 --> 15:08:59,332
So it turns out, if argv is\na list, I can actually say
15889
15:08:59,332 --> 15:09:03,582
you know what, go into that list,\n
15890
15:09:03,582 --> 15:09:05,722
and then go all the way to the end.
15891
15:09:05,722 --> 15:09:08,322
And we have not seen this\nsyntax in C. But this
15892
15:09:08,322 --> 15:09:10,932
is a way of slicing a list in Python.
15893
15:09:12,341 --> 15:09:17,381
If I run Python of\nArgv.py, Foo bar baz Enter
15894
15:09:17,381 --> 15:09:21,252
I get only a subset of the\nlist, starting at position 1
15895
15:09:21,252 --> 15:09:23,413
going all of the way to the end.
15896
15:09:23,413 --> 15:09:25,121
And you can even do\nkind of the opposite.
15897
15:09:25,122 --> 15:09:27,852
If, for whatever reason, you\nwant to ignore the last element
15898
15:09:27,851 --> 15:09:33,551
you can say colon, we\ncould say colon negative 1
15899
15:09:33,552 --> 15:09:36,082
and use a negative number,\nwhich we've not seen before
15900
15:09:36,082 --> 15:09:38,992
which slices off the end\nof the list, as well.
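The two slices mentioned can be sketched as follows. The list here is a hard-coded stand-in for sys.argv, assumed for illustration.

```python
argv = ["argv.py", "foo", "bar", "baz"]   # hypothetical stand-in for sys.argv

words = argv[1:]           # from index 1 through the end: skips the program name
all_but_last = argv[:-1]   # from the start up to, but not including, the last element

for arg in words:
    print(arg)
```

A slice produces a new list, so the original argv is left untouched.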
15901
15:09:38,991 --> 15:09:42,521
So there's some syntactic tricks\n
15902
15:09:42,521 --> 15:09:46,661
even if at first glance, you might\n
15903
15:09:46,661 --> 15:09:49,319
All right, let's do one\nother example with exit
15904
15:09:49,319 --> 15:09:51,612
and then we'll start actually\napplying some algorithms
15905
15:09:51,612 --> 15:09:52,737
to make things interesting.
15906
15:09:52,737 --> 15:09:56,991
So in one last program here, let's do\n
15907
15:09:56,991 --> 15:09:58,731
before we introduce some algorithms.
15908
15:10:00,741 --> 15:10:05,421
Let's import from sys, import argv.
15909
15:10:07,012 --> 15:10:09,722
Let's make sure the user gives\nme one command line argument.
15910
15:10:09,722 --> 15:10:16,101
So if the length of argv does not\n
15911
15:10:16,101 --> 15:10:19,311
and print out something like\nmissing command line argument
15912
15:10:19,311 --> 15:10:21,112
just to explain what the problem is.
15913
15:10:25,101 --> 15:10:27,231
But I'm going to use a\nbetter version of exit here.
15914
15:10:27,232 --> 15:10:29,422
Let me import two functions from sys.
15915
15:10:29,421 --> 15:10:33,561
Turns out the better way to do this is\n
15916
15:10:33,561 --> 15:10:36,514
specifically 2, with this exit code.
15917
15:10:36,514 --> 15:10:38,932
Otherwise, down here, I'm going\nto go ahead and print out
15918
15:10:38,932 --> 15:10:43,340
something like Hello, comma\nargv bracket 1, same as before.
15919
15:10:43,339 --> 15:10:44,881
And then I'm going to exit with zero.
15920
15:10:44,881 --> 15:10:46,932
So, again, this was a\nsubtle thing we introduced
15921
15:10:46,932 --> 15:10:49,432
in week two, where you can\nactually have your programs exit
15922
15:10:49,432 --> 15:10:51,951
with some number, where\n0 signifies success
15923
15:10:51,951 --> 15:10:53,871
and anything else signifies error.
15924
15:10:53,872 --> 15:10:55,762
This is just the same idea in Python.
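One way to sketch the exit-status idea, assuming the standard sys.exit function. The main function and the choice of 1 as the error code are assumptions for illustration. sys.exit raises SystemExit, which is caught here only so the status can be printed.

```python
import sys


def main(argv: list[str]) -> int:
    if len(argv) != 2:
        print("Missing command-line argument")
        return 1              # any nonzero status signals an error
    print(f"hello, {argv[1]}")
    return 0                  # zero signals success


# In a real script this would simply be: sys.exit(main(sys.argv))
try:
    sys.exit(main(["exit.py"]))
except SystemExit as e:
    print("exit status:", e.code)
```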
15925
15:10:55,762 --> 15:11:00,442
So if I, for instance, just run the\n
15926
15:11:00,442 --> 15:11:03,141
I meant to say exit here and exit here.
15927
15:11:04,232 --> 15:11:07,022
If I run this like this, I'm\nmissing a command line argument.
15928
15:11:07,021 --> 15:11:09,721
So let me rerun it with\nlike my name at the prompt.
15929
15:11:09,722 --> 15:11:13,552
So I have exactly two command line\n
15930
15:11:14,572 --> 15:11:16,864
And if I do David Malan, it's\nnot going to work either
15931
15:11:16,864 --> 15:11:18,682
because now argv does not equal 2.
15932
15:11:18,682 --> 15:11:21,381
But the difference here is\nthat we're exiting with 1
15933
15:11:21,381 --> 15:11:26,421
so that special programs can detect an\n
15934
15:11:26,421 --> 15:11:28,701
And now there's one other\nway to do this, too.
15935
15:11:28,701 --> 15:11:30,981
Suppose that you're\nimporting a lot of functions
15936
15:11:30,982 --> 15:11:33,465
and you don't really want\nto make a mess of things
15937
15:11:33,464 --> 15:11:35,631
and just have all of these\nfunction names available
15938
15:11:35,631 --> 15:11:38,151
without it being clear\nwhere they came from.
15939
15:11:38,152 --> 15:11:39,982
Let's just import all of sys.
15940
15:11:39,982 --> 15:11:43,702
And let's just change our syntax,\n
15941
15:11:43,701 --> 15:11:46,491
where we just prepend to all\nof these library functions
15942
15:11:46,491 --> 15:11:49,941
sys, just to be super-explicit\nwhere they came from
15943
15:11:49,942 --> 15:11:55,359
and if there's another\nexit or argv value
15944
15:11:55,358 --> 15:11:58,441
that we want to import from a library,\n
15945
15:11:58,442 --> 15:12:01,672
So if I do it one last time here,\nmissing command line argument.
15946
15:12:01,671 --> 15:12:03,711
But David still actually worked.
15947
15:12:03,711 --> 15:12:06,771
All right, only to demonstrate how\n
15948
15:12:06,771 --> 15:12:09,651
Let's now do something more\npowerful, like a search algorithm
15949
15:12:10,553 --> 15:12:13,011
I'm going to go ahead and open\nup a file called Numbers.py
15950
15:12:13,012 --> 15:12:16,942
and let's just do some searching\nor linear search, rather
15951
15:12:20,582 --> 15:12:23,572
How about import sys as before.
15952
15:12:23,572 --> 15:12:29,362
Let me give myself a list of\nnumbers, like 4, 6, 8, 2, 7, 5, 0
15953
15:12:29,362 --> 15:12:31,192
so just a bunch of integers.
15954
15:12:32,692 --> 15:12:36,112
If you recall from week three,\nwe searched for the number 0
15955
15:12:36,112 --> 15:12:38,402
at the end of the lockers on stage.
15956
15:12:38,402 --> 15:12:40,641
So let's just ask that\nquestion in Python.
15957
15:12:40,641 --> 15:12:42,381
No need for a loop or\nanything like that.
15958
15:12:42,381 --> 15:12:46,072
If 0 is in the numbers, go\nahead and print out found.
15959
15:12:46,072 --> 15:12:49,942
And then let's just exit successfully,\n
15960
15:12:49,942 --> 15:12:52,192
let's just say print not found.
15961
15:12:52,192 --> 15:12:55,732
And then we'll sys exit with 1.
15962
15:12:55,732 --> 15:12:58,342
So this is where Python\nstarts to get powerful again.
15963
15:12:59,572 --> 15:13:02,254
Here is your loop, that's doing\nall of the checking for you.
15964
15:13:02,254 --> 15:13:04,671
Underneath the hood, Python\nis going to use linear search.
15965
15:13:04,671 --> 15:13:06,338
You don't have to implement it yourself.
15966
15:13:06,338 --> 15:13:08,841
No while loop, no for loop,\nyou just ask a question.
15967
15:13:08,841 --> 15:13:12,752
If 0 is in numbers,\nthen do the following.
15968
15:13:12,752 --> 15:13:14,872
So that's one feature\nwe now get with Python
15969
15:13:14,872 --> 15:13:16,862
and get to throw away\na lot of that code.
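A short sketch of that membership test, next to a hand-written loop that does conceptually the same thing for a list. The contains helper is hypothetical and exists only for comparison.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]


def contains(haystack: list[int], needle: int) -> bool:
    # Conceptually what `needle in haystack` does for a list: check each element in turn
    for value in haystack:
        if value == needle:
            return True
    return False


if 0 in numbers:
    print("Found")
else:
    print("Not found")
```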
15970
15:13:16,862 --> 15:13:18,351
We can do it with strings, too.
15971
15:13:18,351 --> 15:13:21,362
Let me open a file\ncalled Names.py instead
15972
15:13:21,362 --> 15:13:23,512
and do something that was\neven more involved in C
15973
15:13:23,512 --> 15:13:26,542
because we needed strcmp and\nthe for loop, and so forth.
15974
15:13:26,542 --> 15:13:28,522
Let me import sys for this file.
15975
15:13:28,521 --> 15:13:30,981
Let's give myself a bunch\nof names like we did in C.
15976
15:13:30,982 --> 15:13:38,152
And those were Bill and Charlie\nand Fred and George and Ginny
15977
15:13:38,152 --> 15:13:41,961
and two more, Percy, and lastly Ron.
15978
15:13:41,961 --> 15:13:43,911
And recall, at the\ntime, we looked for Ron.
15979
15:13:43,911 --> 15:13:45,953
And so we had to iterate\nthrough the whole thing
15980
15:13:45,953 --> 15:13:48,331
doing strcmp and i plus\nplus and all of that.
15981
15:13:48,332 --> 15:13:55,281
Now just ask the question, if Ron\n
15982
15:13:55,281 --> 15:13:56,961
and, whoops, let me hide that.
15983
15:13:58,771 --> 15:14:02,701
Let me go ahead and say\nprint, found, as before.
15984
15:14:02,701 --> 15:14:06,231
sys exit 0, just to indicate\nsuccess, and then down here
15985
15:14:06,232 --> 15:14:09,362
if we get to this point,\nwe can say not found.
15986
15:14:09,362 --> 15:14:12,692
And then we'll just sys exit 1 instead.
15987
15:14:12,692 --> 15:14:17,482
So, again, this just does linear search\n
15988
15:14:17,482 --> 15:14:20,932
we found Ron, because, indeed, he's\n
15989
15:14:20,932 --> 15:14:24,711
But we don't need to deal with\nall of the mechanics of it.
15990
15:14:24,711 --> 15:14:27,051
All right, let's take\nthings one step further.
15991
15:14:27,052 --> 15:14:29,362
In week three, we also\nimplemented the idea
15992
15:14:29,362 --> 15:14:33,502
of a phone book, that actually\nassociated keys with values.
15993
15:14:33,502 --> 15:14:36,531
But remember, the phone book in\nC, was kind of a hack, right?
15994
15:14:36,531 --> 15:14:40,042
Because we first had two arrays,\n
15995
15:14:40,042 --> 15:14:43,851
Then we introduced structs, and\n
15996
15:14:43,851 --> 15:14:47,421
And then we had an array of persons.
15997
15:14:47,421 --> 15:14:51,561
You can do this in Python, using\n
15998
15:14:51,561 --> 15:14:54,191
But we can also just use a\ngeneral purpose dictionary
15999
15:14:54,192 --> 15:14:57,942
because just like in P set 5, you\n
16000
15:14:59,622 --> 15:15:02,922
Well, similarly, Python\ncan just do this for us.
16001
15:15:02,921 --> 15:15:05,771
From CS50, let's import get string.
16002
15:15:05,771 --> 15:15:09,281
And now let's give myself\na dictionary of people
16003
15:15:09,281 --> 15:15:13,061
D-I-C-T () open paren closed\nparen gives you a dictionary.
16004
15:15:13,061 --> 15:15:15,822
Or you can simplify\nthe syntax, actually
16005
15:15:15,822 --> 15:15:18,881
and a dictionary again is just keys\n
16006
15:15:18,881 --> 15:15:21,581
You can also just use\ncurly braces instead.
16007
15:15:21,582 --> 15:15:23,542
That gives me an empty dictionary.
16008
15:15:23,542 --> 15:15:26,921
But if I know what I want to put in it\n
16009
15:15:26,921 --> 15:15:34,311
with a number of plus 1-617-495-1000,\n
16010
15:15:34,311 --> 15:15:40,298
David, with plus 1-949-468-2750.
16011
15:15:40,298 --> 15:15:42,881
And it came to my attention,\ntragically, after class that day
16012
15:15:42,881 --> 15:15:44,673
that we had a bug in\nour little Easter egg.
16013
15:15:44,673 --> 15:15:47,711
If today, you would like to call\nme or text me, at that number
16014
15:15:47,711 --> 15:15:50,652
we have fixed the code that\nunderlies that little Easter egg.
16015
15:15:51,612 --> 15:15:53,561
All right, so this now\ngives me a variable
16016
15:15:53,561 --> 15:15:57,641
called people, that's\nassociating keys with values.
16017
15:15:57,641 --> 15:16:01,752
There is some new syntax here in\n
16018
15:16:01,752 --> 15:16:04,811
but the colons, and the quotes\non the left and the right.
16019
15:16:04,811 --> 15:16:07,901
This is a way, in Python,\nof associating keys
16020
15:16:07,902 --> 15:16:11,872
with values, words with definitions,\n
16021
15:16:11,872 --> 15:16:15,072
And it's going to be a super-common\n
16022
15:16:15,072 --> 15:16:18,972
when we look at CSS and HTML and\n
16023
15:16:18,972 --> 15:16:22,362
are like this omnipresent idea in\n
16024
15:16:22,362 --> 15:16:25,822
because it's just a really useful way\n
16025
15:16:25,822 --> 15:16:29,211
So, at this point in the story, we\n
16026
15:16:29,211 --> 15:16:32,711
if you will, of people, associating\nnames with phone numbers
16027
15:16:32,711 --> 15:16:34,197
just like a real world phone book.
16028
15:16:34,197 --> 15:16:37,722
So let's write a program that gets\n
16029
15:16:37,722 --> 15:16:39,912
whose number they would like to look up.
16030
15:16:39,911 --> 15:16:46,031
Then, let's go ahead and say, if that\n
16031
15:16:46,031 --> 15:16:48,612
go ahead and print out\nthat person's number
16032
15:16:48,612 --> 15:16:51,252
by going into the people\ndictionary and going
16033
15:16:51,252 --> 15:16:56,002
to that specific name, within there,\n
16034
15:16:56,002 --> 15:16:58,482
So this is similar in spirit to before.
16035
15:16:58,482 --> 15:17:02,652
Linear search and dictionary lookups\n
16036
15:17:02,652 --> 15:17:05,802
in Python, by just asking the\nquestion, if name in people.
16037
15:17:05,802 --> 15:17:07,692
And this line is just\ngoing to print out
16038
15:17:07,692 --> 15:17:12,232
whoever is in the people\ndictionary, at that name.
16039
15:17:12,232 --> 15:17:16,722
So I'm using square brackets, because\n
16040
15:17:16,722 --> 15:17:19,842
just like you can index into\nan array, or a list in Python
16041
15:17:19,841 --> 15:17:24,671
using numbers, 0, 1, 2, you\ncan very conveniently index
16042
15:17:24,671 --> 15:17:29,601
into a dictionary in Python,\nusing square brackets, as well.
16043
15:17:29,601 --> 15:17:32,591
And just to make clear what's\ngoing on here, let me go
16044
15:17:32,591 --> 15:17:37,002
and create a temporary variable,\n
16045
15:17:37,002 --> 15:17:41,531
And then let's just, or, sorry, let's\n
16046
15:17:41,531 --> 15:17:44,411
And that will just print\nout the number in question.
16047
15:17:44,411 --> 15:17:48,371
In C, and previously in Python,\n
16048
15:17:48,372 --> 15:17:53,472
would have been go to a location in\n
16049
15:17:53,472 --> 15:17:57,311
But that can actually be a string,\n
16050
15:17:57,311 --> 15:17:59,351
And this is what's amazing\nabout dictionaries
16051
15:17:59,351 --> 15:18:02,411
it's not like a big\nline, a big linear thing.
16052
15:18:02,411 --> 15:18:05,261
It's this table, that you can\nlook up in one column the name
16053
15:18:05,262 --> 15:18:07,582
and get back in the\nother column the number.
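The phone book could be sketched like this, using the names and numbers read out in the lecture. The lookup function is a hypothetical wrapper; in the lecture the name comes from get string instead.

```python
from typing import Optional

# Keys (names) on the left of each colon, values (numbers) on the right
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name: str) -> Optional[str]:
    if name in people:          # checks the keys of the dictionary
        return people[name]     # index by string instead of by number
    return None


print(f"Number: {lookup('David')}")
```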
16054
15:18:07,582 --> 15:18:09,641
So let's go ahead and run\nPython of Phonebook.py
16055
15:18:14,622 --> 15:18:18,402
That's not what's\nsupposed to happen at all.
16056
15:18:18,402 --> 15:18:19,961
I think I'm in the wrong place.
16057
15:18:32,351 --> 15:18:36,491
Python of Phonebook.py, what the--
16058
15:18:55,661 --> 15:18:57,776
What am I not understanding here?
16059
15:19:00,701 --> 15:19:03,869
OK, Roxanne, Carter, do you\nsee what I'm doing wrong?
16060
15:19:10,752 --> 15:19:14,631
SPEAKER 47: When you found the test\n
16061
15:19:14,631 --> 15:19:19,911
DAVID J. MALAN: Oh, yeah, found,\nOK, we're going to do this.
16062
15:19:31,881 --> 15:19:33,792
All this is coming out of the video.
16063
15:19:42,805 --> 15:19:44,722
I will try to figure out\nwhat was going wrong.
16064
15:19:44,722 --> 15:19:47,322
The best I can tell, it was\nrunning the wrong program.
16065
15:19:47,322 --> 15:19:49,341
I don't quite understand why.
16066
15:19:49,341 --> 15:19:50,691
So we will diagnose this later.
16067
15:19:50,692 --> 15:19:53,484
I just put the file into a temporary\n
16068
15:19:53,483 --> 15:19:59,231
So let me go ahead and just run\nthis, Python of Phonebook.py
16069
15:19:59,232 --> 15:20:00,762
type in, for instance, my name.
16070
15:20:00,762 --> 15:20:02,940
And there's my corresponding number.
16071
15:20:02,940 --> 15:20:04,482
Have no idea what was just happening.
16072
15:20:04,482 --> 15:20:06,582
But I will get to the\nbottom of it and update you
16073
15:20:06,582 --> 15:20:07,882
if we can put our finger on it.
16074
15:20:07,881 --> 15:20:11,411
So this was just an example, now,\nof implementing a phone book.
16075
15:20:11,411 --> 15:20:14,111
Let's now consider what we\ncan do that's a little more
16076
15:20:14,112 --> 15:20:16,932
powerful, in these examples,\nlike a phone book that
16077
15:20:16,932 --> 15:20:18,671
actually keeps this information around.
16078
15:20:18,671 --> 15:20:22,031
Thus far, these simple phone book\n
16079
15:20:22,031 --> 15:20:25,301
But using CSV files,\ncomma separated values
16080
15:20:25,302 --> 15:20:28,077
maybe we could actually keep\naround the names and numbers
16081
15:20:28,076 --> 15:20:29,951
so that, like on your\nphone, you can actually
16082
15:20:29,951 --> 15:20:32,301
keep your contacts around long-term.
16083
15:20:32,302 --> 15:20:35,582
So I'm going to go ahead now and\n
16084
15:20:35,582 --> 15:20:39,762
And let me just hide this\ndetail, so it's not confusing.
16085
15:20:39,762 --> 15:20:43,152
Whoops, I'm going to change\nmy prompt temporarily.
16086
15:20:43,152 --> 15:20:47,062
So let me go ahead now and\nrefine this example as follows.
16087
15:20:47,061 --> 15:20:50,351
I'm going to go into\nPhonebook.py, and I'm
16088
15:20:50,351 --> 15:20:52,811
going to import a whole\nlibrary called CSV.
16089
15:20:52,811 --> 15:20:54,671
And this is a powerful\none, because Python
16090
15:20:54,671 --> 15:20:58,391
comes with a library that just\nhandles CSV files for you.
16091
15:20:58,391 --> 15:21:02,122
A CSV file is just a file\nwith comma separated values.
16092
15:21:02,122 --> 15:21:06,102
And, in fact, to demonstrate\nthis, let me check on one thing
16093
15:21:06,101 --> 15:21:08,981
here, just to make this\na little more real.
16094
15:21:08,982 --> 15:21:15,532
To demonstrate this, let's\ngo ahead and do this.
16095
15:21:15,531 --> 15:21:18,491
Let me import the CSV library from CS50.
16096
15:21:20,351 --> 15:21:24,072
Let me then open a file,\nusing the open function
16097
15:21:24,072 --> 15:21:28,932
open a file called\nPhonebook.csv, in append format
16098
15:21:28,932 --> 15:21:31,421
in contrast with read\nformat and write format.
16099
15:21:31,421 --> 15:21:34,972
Write just blows it away if it exists,\n
16100
15:21:34,972 --> 15:21:37,452
So I keep this phone book\naround, just like you might
16101
15:21:37,451 --> 15:21:39,389
keep adding contacts to your phone.
16102
15:21:39,389 --> 15:21:41,932
Now let me go ahead and get a\ncouple of values from the user.
16103
15:21:41,932 --> 15:21:45,342
Let me say getString and\nask the user for a name.
16104
15:21:45,341 --> 15:21:50,681
Then let me getString again, and\nask the user for their number.
16105
15:21:50,682 --> 15:21:52,707
And now, let me go ahead and do this.
16106
15:21:52,707 --> 15:21:54,582
And this is new, and\nthis is Python-specific.
16107
15:21:54,582 --> 15:21:57,342
And you would only know this\nby following a tutorial
16108
15:21:57,341 --> 15:21:59,002
or reading the documentation.
16109
15:21:59,002 --> 15:22:01,391
Let me give myself a\nvariable called writer
16110
15:22:01,391 --> 15:22:06,472
and ask the CSV library\nfor a writer to that file.
16111
15:22:06,472 --> 15:22:09,912
Then, let me go ahead and\nuse that writer variable
16112
15:22:09,911 --> 15:22:13,241
use a function or a method\ninside of it, called write row
16113
15:22:13,241 --> 15:22:17,722
to write out a list containing\nthat person's name and number.
16114
15:22:17,722 --> 15:22:20,832
Notice the square brackets\ninside the parentheses
16115
15:22:20,832 --> 15:22:25,872
because I'm just printing a list\n
16116
15:22:25,872 --> 15:22:27,622
And then I'm just going\nto close the file.
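A sketch of those steps with the standard csv module. The add_contact helper and the temporary directory are assumptions made so the example can run anywhere; the lecture writes to Phonebook.csv in the current directory and reads the values with get string.

```python
import csv
import os
import tempfile


def add_contact(path: str, name: str, number: str) -> None:
    file = open(path, "a", newline="")   # "a" appends; "w" would overwrite the file
    writer = csv.writer(file)
    writer.writerow([name, number])      # one list becomes one comma-separated row
    file.close()


path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")
add_contact(path, "Carter", "+1-617-495-1000")
add_contact(path, "David", "+1-949-468-2750")
```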
16117
15:22:27,622 --> 15:22:29,264
So what is the effect of all of this?
16118
15:22:29,264 --> 15:22:31,722
Well, let me go ahead and run\nthis version of Phonebook.py
16119
15:22:31,722 --> 15:22:33,202
and I'm prompted for a name.
16120
15:22:33,201 --> 15:22:41,651
Let's do Carter's first, plus\n1-617-495-1000, and then
16121
15:22:44,292 --> 15:22:47,482
Notice in my current directory,\n
16122
15:22:47,482 --> 15:22:50,952
which I wrote, and\napparently Phonebook.csv.
16123
15:22:50,951 --> 15:22:53,351
CSV just stands for\ncomma separated values.
16124
15:22:53,351 --> 15:22:56,902
And it's like a very simple way\n
16125
15:22:56,902 --> 15:23:00,192
if you will, where the comma represents\n
16126
15:23:00,192 --> 15:23:02,891
There's only two columns\nhere, name and number.
16127
15:23:02,891 --> 15:23:06,101
But, because I'm writing to\nthis file in append mode
16128
15:23:06,101 --> 15:23:09,741
let me run it one more time,\nPython of Phonebook.py
16129
15:23:09,741 --> 15:23:18,011
and let me go ahead and do David\nand plus 1-949-468-2750, Enter.
16130
15:23:18,012 --> 15:23:19,872
And notice what happened\nin the CSV file.
16131
15:23:19,872 --> 15:23:22,902
It automatically updated,\nbecause I'm now persisting
16132
15:23:22,902 --> 15:23:25,522
this data to the file in question.
16133
15:23:25,521 --> 15:23:27,881
So if I wanted to now\nread this file in, I
16134
15:23:27,881 --> 15:23:32,201
could actually go ahead and\ndo linear search on the data
16135
15:23:32,201 --> 15:23:35,171
using a read function to\nactually read from the CSV.
16136
15:23:35,171 --> 15:23:37,871
But, for now, we'll just leave\nit a little simply as write.
16137
15:23:37,872 --> 15:23:39,792
And let me make one refinement here.
16138
15:23:39,792 --> 15:23:43,542
It turns out that, if you're in\nthe habit of re-opening a file
16139
15:23:43,542 --> 15:23:45,851
you don't have to even\nclose it explicitly.
16140
15:23:47,442 --> 15:23:52,572
You can instead say, with the opening\n
16141
15:23:52,572 --> 15:23:57,822
in append mode, calling the thing file,\n
16142
15:23:58,872 --> 15:24:00,899
So the with keyword is\na new thing in Python.
16143
15:24:00,898 --> 15:24:03,731
And it's used in a few different\n
16144
15:24:03,732 --> 15:24:04,857
is to tighten up code here.
16145
15:24:04,857 --> 15:24:06,940
And I'm going to move my\nvariables to the outside
16146
15:24:06,940 --> 15:24:09,432
because they don't need to be\ninside of the with statement
16147
15:24:10,389 --> 15:24:12,974
This just has the effect of\nensuring that you, the programmer
16148
15:24:12,974 --> 15:24:15,312
don't screw up, and accidentally\ndon't close your file.
16149
15:24:15,311 --> 15:24:17,201
In fact, you might\nrecall, from C, Valgrind
16150
15:24:17,201 --> 15:24:21,758
might have complained at you, if you had\n
16151
15:24:21,758 --> 15:24:24,341
you might have had a memory leak\nas a result. The with keyword
16152
15:24:24,341 --> 15:24:28,361
takes care of all of\nthat for you, as well.
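The same write using with might look like the sketch below, plus a possible read-back using csv.reader and a linear search. Both helper names and the temporary file location are hypothetical.

```python
import csv
import os
import tempfile


def add_contact(path, name, number):
    # The file is closed automatically when the with block ends
    with open(path, "a", newline="") as file:
        csv.writer(file).writerow([name, number])


def find_number(path, name):
    # Linear search over the rows of the CSV
    with open(path, newline="") as file:
        for row in csv.reader(file):
            if row and row[0] == name:
                return row[1]
    return None


path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")
add_contact(path, "David", "+1-949-468-2750")
print(find_number(path, "David"))
```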
16153
15:24:28,362 --> 15:24:31,192
How about let's do, want to do this.
16154
15:24:31,192 --> 15:24:34,482
How about, let's do one other thing.
16155
15:24:35,752 --> 15:24:38,802
Let me go ahead and propose,\nthat on your phone or laptop
16156
15:24:38,802 --> 15:24:43,992
here, or online, go to this URL here,\n
16157
15:24:43,991 --> 15:24:46,811
And just to show that these CSVs\n
16158
15:24:46,811 --> 15:24:48,371
and if you've ever\nlike used a Google Form
16159
15:24:48,372 --> 15:24:50,082
or managed a student group,\nor something where you've
16160
15:24:50,082 --> 15:24:52,272
collected data via Google\nForms, you can actually
16161
15:24:52,271 --> 15:24:55,161
export all of that data via CSV files.
16162
15:24:55,161 --> 15:24:57,671
So go ahead to this URL here.
16163
15:24:57,671 --> 15:24:59,472
And those of you\nwatching on demand later
16164
15:24:59,472 --> 15:25:01,061
will find that the form\nis no longer working
16165
15:25:01,061 --> 15:25:02,551
since we're only doing this live.
16166
15:25:02,552 --> 15:25:04,302
But that will lead to\na Google Form that's
16167
15:25:04,302 --> 15:25:07,272
going to let everyone input\ntheir answer to a question
16168
15:25:07,271 --> 15:25:10,182
like what house do you\nwant to end up into
16169
15:25:10,182 --> 15:25:13,152
sort of an approximation of the\nsorting hat in Harry Potter.
16170
15:25:13,152 --> 15:25:17,202
And via this form, will we then\nhave the ability to export
16171
15:25:20,302 --> 15:25:24,132
So let's give you a moment to do that.
16172
15:25:24,131 --> 15:25:26,981
In just a moment, I'll share\nmy version of the screen, which
16173
15:25:26,982 --> 15:25:30,852
is going to let me actually\nopen the file, the form itself.
16174
15:25:30,851 --> 15:25:35,591
And in just a moment, I'll switch over.
16175
15:25:35,591 --> 15:25:37,541
OK, so this is now my\nversion of the form
16176
15:25:37,542 --> 15:25:40,811
here, where we have 200 plus responses\n
16177
15:25:40,811 --> 15:25:44,531
house do you belong in, Gryffindor,\n
16178
15:25:44,531 --> 15:25:49,322
If I go over to responses, I'll see all\n
16179
15:25:49,322 --> 15:25:51,822
So graphical user interface,\nand we could flip through this.
16180
15:25:51,822 --> 15:25:56,531
And it looks like, interestingly,\n40% of Harvard students
16181
15:25:56,531 --> 15:26:00,745
want to be in Gryffindor, 22%\nin Slytherin, and everyone else
16182
15:26:01,661 --> 15:26:03,791
But you might have noticed,\nif ever using a Google Form
16183
15:26:03,792 --> 15:26:05,241
this Google Spreadsheets link.
16184
15:26:05,241 --> 15:26:06,531
So I'm going to go ahead and click that.
16185
15:26:06,531 --> 15:26:08,981
And that's going to automatically open,\n
16186
15:26:08,982 --> 15:26:11,812
But you can do the same thing\nwith Office 365 as well.
16187
15:26:11,811 --> 15:26:14,561
And now you see the raw\ndata as a spreadsheet.
16188
15:26:14,561 --> 15:26:19,421
But in Google Spreadsheets, if I go\n
16189
15:26:19,421 --> 15:26:23,322
notice I can download this as\nan Excel file, a PDF, and also
16190
15:26:23,322 --> 15:26:25,432
a CSV, comma separated values.
16191
15:26:25,432 --> 15:26:27,141
So let me go ahead and do that.
16192
15:26:27,141 --> 15:26:30,442
That gives me a file in my\nDownloads folder on my computer.
16193
15:26:30,442 --> 15:26:34,492
I'm going to now go back\nto my code editor here.
16194
15:26:34,491 --> 15:26:36,701
And what I'm going to go\nahead and do is upload
16195
15:26:36,701 --> 15:26:40,841
this file, from my\nDownloads folder to VS Code
16196
15:26:40,841 --> 15:26:43,131
so that we can actually\nsee it within here.
16197
15:26:43,131 --> 15:26:44,741
And now you can see this open file.
16198
15:26:44,741 --> 15:26:47,741
And I'm going to shorten its name,\n
16199
15:26:47,741 --> 15:26:52,511
I'm going to rename this using the\n
16200
15:26:52,512 --> 15:26:55,889
And then we can see, in the file, that\n
16201
15:26:55,889 --> 15:26:57,972
house, where you have a\nwhole bunch of time stamps
16202
15:26:57,972 --> 15:27:00,792
when people filled out the form,\n
16203
15:27:00,792 --> 15:27:02,502
And then everyone else\njust a moment ago.
16204
15:27:02,502 --> 15:27:05,832
And the second value, after each\n
16205
15:27:05,832 --> 15:27:08,562
Well, let me go ahead here\nand implement a program
16206
15:27:08,561 --> 15:27:12,621
in a file called Hogwarts.py,\nthat processes this data.
16207
15:27:12,622 --> 15:27:14,802
So in Hogwarts.py, let's\njust write a program
16208
15:27:14,802 --> 15:27:17,962
that now reads a CSV, in\nthis case not a phone book
16209
15:27:17,961 --> 15:27:19,932
but everyone's sorting hat information.
16210
15:27:19,932 --> 15:27:21,972
And I'm going to go\nahead and import csv.
16211
15:27:21,972 --> 15:27:25,182
And suppose I want to answer a\nreasonable question, ignoring
16212
15:27:25,182 --> 15:27:28,991
the fact that Google's GUI or graphical\n
16213
15:27:28,991 --> 15:27:31,841
I just want to count up who's\ngoing to be in which house.
16214
15:27:31,841 --> 15:27:36,161
So let me give myself a dictionary\n
16215
15:27:37,302 --> 15:27:39,312
And let me pre-create a few keys.
16216
15:27:39,311 --> 15:27:44,021
Let me say Gryffindor is\ngoing to be initialized to 0
16217
15:27:44,021 --> 15:27:48,341
Hufflepuff will be initialized\nto 0 as well, Ravenclaw
16218
15:27:49,722 --> 15:27:53,292
And finally, Slytherin\nwill be initialized to 0.
16219
15:27:53,292 --> 15:27:56,472
So here's another example of\na dictionary, or a hash table
16220
15:27:56,472 --> 15:27:58,662
just being a very\ngeneral-purpose piece of data.
16221
15:27:58,661 --> 15:28:00,281
You can have keys and values.
16222
15:28:00,281 --> 15:28:01,991
The keys, in this case, are the houses.
16223
15:28:01,991 --> 15:28:05,021
The values are initially zero,\nbut I'm going to use this
16224
15:28:05,021 --> 15:28:10,121
instead of like four separate variables,\n
16225
15:28:12,252 --> 15:28:19,701
With opening Hogwarts.csv, in read mode,\n
16226
15:28:19,701 --> 15:28:22,961
I just want to read it, as\nfile as my variable name.
16227
15:28:22,961 --> 15:28:26,051
Let's go ahead and create\na reader this time
16228
15:28:26,052 --> 15:28:31,232
that is using the reader function in\n
16229
15:28:31,232 --> 15:28:33,732
I'm going to go ahead and ignore\nthe first line of the file
16230
15:28:33,732 --> 15:28:36,792
because, recall, that the first\n
16231
15:28:36,792 --> 15:28:37,972
I want to get the real data.
16232
15:28:37,972 --> 15:28:40,061
So this next function\nis just a little trick
16233
15:28:40,061 --> 15:28:43,252
for ignoring the first line of the file.
16234
15:28:44,322 --> 15:28:48,701
For every other row in the\nreader, that is line by line
16235
15:28:48,701 --> 15:28:51,941
get the current person's house,\nwhich is in row bracket 1.
16236
15:28:51,942 --> 15:28:54,735
This is what the CSV reader\nlibrary is doing for us.
16237
15:28:54,735 --> 15:28:56,652
It's handling all of the\nreading of this file.
16238
15:28:56,652 --> 15:29:00,281
It figures out where the comma is,\n
16239
15:29:00,281 --> 15:29:02,771
it hands you back a list of size 2.
16240
15:29:02,771 --> 15:29:07,611
In bracket 0 is the time stamp,\nin bracket 1 is the house name.
16241
15:29:07,612 --> 15:29:11,351
So, in my code, I can say\nhouse equals row bracket 1.
16242
15:29:11,351 --> 15:29:13,491
I don't care about the time\nstamp for this program.
16243
15:29:13,491 --> 15:29:17,591
And then let's go into my dictionary\n
16244
15:29:17,591 --> 15:29:23,891
into it at the house location, by\n
16245
15:29:23,891 --> 15:29:26,802
And now, at the end\nof this block of code
16246
15:29:26,802 --> 15:29:29,562
that has the effect of iterating\nover every line of the file
16247
15:29:29,561 --> 15:29:31,991
updating my dictionary\nin four different places
16248
15:29:31,991 --> 15:29:35,711
based on whether someone typed\n
16249
15:29:36,222 --> 15:29:40,332
And notice that I'm using the name of\n
16250
15:29:40,332 --> 15:29:44,022
to essentially go up to this little\n
16251
15:29:44,021 --> 15:29:46,541
the 1 to a 2, the 2 to\na 3, instead of having
16252
15:29:46,542 --> 15:29:48,522
like four separate\nvariables, which would just
16253
15:29:48,521 --> 15:29:50,591
be much more annoying to maintain.
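The counting approach described here might be sketched as follows. This is a minimal sketch, not necessarily the exact code typed in lecture: the function name `count_houses` and the sample rows are assumptions made for illustration, and an in-memory `io.StringIO` stands in for the opened CSV file.

```python
import csv
import io


def count_houses(file):
    """Count how many rows name each house (second column of the CSV)."""
    # One dictionary with four keys instead of four separate variables
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}

    reader = csv.reader(file)
    next(reader)  # skip the header row

    for row in reader:
        house = row[1]      # row[0] is the timestamp, row[1] is the house
        houses[house] += 1  # increment the count stored under that key
    return houses


# Hypothetical sample data standing in for the exported form responses
sample = io.StringIO(
    "Timestamp,House\n"
    "10/13/2021 14:00:01,Gryffindor\n"
    "10/13/2021 14:00:05,Slytherin\n"
    "10/13/2021 14:00:09,Gryffindor\n"
)

houses = count_houses(sample)
for house in houses:  # iterating over a dict yields its keys
    count = houses[house]
    print(f"{house}: {count}")
```

With a real file, the same function could be called inside `with open("hogwarts.csv", "r") as file:`, passing `file` as the argument.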
16254
15:29:50,591 --> 15:29:52,811
Down at the bottom, let's\njust print out the results.
16255
15:29:52,811 --> 15:29:56,141
For each house in those\nhouses, iterating over
16256
15:29:56,141 --> 15:29:58,271
the keys therein\nby default in Python
16257
15:29:58,271 --> 15:30:01,151
let's go ahead and print\nout an f-string that says
16258
15:30:01,152 --> 15:30:05,982
the current house has the current count.
16259
15:30:05,982 --> 15:30:11,592
And count will be the result of indexing\n
16260
15:30:13,332 --> 15:30:18,461
So let's run this to summarize\nthe data, Hogwarts.py, 140 of you
16261
15:30:18,461 --> 15:30:22,722
answered Gryffindor, 54 Hufflepuff,\n
16262
15:30:22,722 --> 15:30:25,092
And that's just my now way\nof code, and this is, oh
16263
15:30:25,091 --> 15:30:28,748
my God, so much easier than C, to\n
16264
15:30:28,749 --> 15:30:32,082
And one of the reasons that Python is so\n
16265
15:30:32,082 --> 15:30:36,432
more generally, is that it's actually\n
16266
15:30:37,461 --> 15:30:38,891
And let me clean this up slightly.
16267
15:30:38,891 --> 15:30:41,682
It's a little annoying that\nI just have to know and trust
16268
15:30:41,682 --> 15:30:46,932
that the house name is in bracket\n
16269
15:30:47,961 --> 15:30:53,051
There's something called a\nDictReader in the csv library
16270
15:30:54,402 --> 15:30:58,991
Capital D, capital R, this means\n
16271
15:30:58,991 --> 15:31:01,421
because what a dictionary\nreader does is it
16272
15:31:01,421 --> 15:31:05,411
still returns to me every row from\n
16273
15:31:05,411 --> 15:31:09,081
but it doesn't just give me a list\n
16274
15:31:10,482 --> 15:31:15,522
And it uses, as the keys in that\n
16275
15:31:15,521 --> 15:31:17,981
for every row in the\nfile, which is just to say
16276
15:31:17,982 --> 15:31:20,472
it makes my code a little\nmore readable, because instead
16277
15:31:20,472 --> 15:31:23,112
of doing this little\ntrickery, bracket 1
16278
15:31:23,112 --> 15:31:26,022
I can say bracket, quote unquote\n"House" with a capital H
16279
15:31:26,021 --> 15:31:28,881
because it's capitalized\nin the Google Form itself.
16280
15:31:28,881 --> 15:31:31,319
So the code now is\njust minorly different
16281
15:31:31,319 --> 15:31:34,362
but it's way more resilient, especially\n
16282
15:31:34,362 --> 15:31:36,912
and I'm moving the columns around\nor doing something like that
16283
15:31:36,911 --> 15:31:38,494
where the numbers might get messed up.
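A minimal sketch of the same count using `csv.DictReader`, assuming the header row labels the column "House" as described. The function name `count_houses_by_name` and the sample rows are hypothetical; the columns are deliberately swapped in the sample to show that lookups by name still work.

```python
import csv
import io


def count_houses_by_name(file):
    """Count houses by looking up the "House" column by name."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}

    # DictReader consumes the header row itself and uses it as the keys,
    # so there is no need to skip the first line manually
    reader = csv.DictReader(file)
    for row in reader:
        house = row["House"]  # no reliance on the column's position
        houses[house] += 1
    return houses


# Hypothetical sample with the columns in the opposite order
sample = io.StringIO(
    "House,Timestamp\n"
    "Ravenclaw,10/13/2021 14:00:01\n"
    "Hufflepuff,10/13/2021 14:00:05\n"
    "Ravenclaw,10/13/2021 14:00:09\n"
)

houses = count_houses_by_name(sample)
for house in houses:
    print(f"{house}: {houses[house]}")
```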
16284
15:31:38,495 --> 15:31:41,781
Now I can run this on Hogwarts.py\n
16285
15:31:41,781 --> 15:31:46,481
But I now don't have to worry about\n
16286
15:31:46,482 --> 15:31:51,402
All right, any questions on\nthose capabilities there?
16287
15:31:51,402 --> 15:31:53,921
And that's a teaser of sorts,\nfor some of the manipulation
16288
15:31:56,141 --> 15:32:00,076
All right, so some final\nexamples and flair, to intrigue
16289
15:32:00,076 --> 15:32:01,451
with what you can do with Python.
16290
15:32:01,451 --> 15:32:05,231
I'm going to actually switch over\n
16291
15:32:05,232 --> 15:32:08,422
so that I can actually use\naudio a little more effectively.
16292
15:32:08,421 --> 15:32:10,451
So here's just a terminal\nwindow on macOS.
16293
15:32:10,451 --> 15:32:14,471
I before class have preinstalled\n
16294
15:32:14,472 --> 15:32:16,900
that won't really work\nin VS Code in the cloud
16295
15:32:16,900 --> 15:32:20,057
because they require audio that the\n
16296
15:32:20,057 --> 15:32:22,182
But I'm going to go ahead\nand write an example here
16297
15:32:22,182 --> 15:32:26,080
that involves writing a speech-based\n
16298
15:32:26,733 --> 15:32:28,691
And I'm going to go ahead\nand import a library
16299
15:32:28,692 --> 15:32:32,230
that, again, I pre-installed,\ncalled Python text to speech
16300
15:32:32,230 --> 15:32:34,781
and I'm going to go ahead\nand, per its documentation
16301
15:32:34,781 --> 15:32:39,400
give myself a speech engine, by\n
16302
15:32:40,601 --> 15:32:43,451
I'm then going to use this\nengine's say function
16303
15:32:43,451 --> 15:32:45,701
to do something fun, like Hello, world.
16304
15:32:45,701 --> 15:32:49,002
And then I'm going to go ahead and\n
16305
15:32:50,377 --> 15:32:52,002
All right, I'm going to save this file.
16306
15:32:52,002 --> 15:32:53,502
I'm not using VS Code at the moment.
16307
15:32:53,502 --> 15:32:56,591
I'm using another popular program\n
16308
15:32:56,591 --> 15:32:59,351
called Vim, which is a\ncommand line program that's
16309
15:32:59,351 --> 15:33:01,311
just in this black and white window.
16310
15:33:01,311 --> 15:33:05,370
Let me go ahead now and run\nPython of Speech.py, and--
16311
15:33:07,266 --> 15:33:09,641
DAVID J. MALAN: All right, so\nit's a little computerized
16312
15:33:09,641 --> 15:33:12,635
but it is speech that has been\nsynthesized from this example.
16313
15:33:12,635 --> 15:33:14,802
Let's change it a little\nbit to be more interesting.
16314
15:33:14,802 --> 15:33:16,010
Let's do something like this.
16315
15:33:16,010 --> 15:33:20,472
Let's ask the user for their name,\n
16316
15:33:20,472 --> 15:33:24,372
And then, let's use the little F\n
16317
15:33:24,372 --> 15:33:26,532
but Hello to that person's name.
16318
15:33:26,531 --> 15:33:30,792
Let me save my file, run\nPython of Speech.py, Enter.
16319
15:33:33,881 --> 15:33:36,161
DAVID J. MALAN: All right,\nso it pronounced my name OK
16320
15:33:36,161 --> 15:33:38,828
might struggle with different\nnames, depending on the phonetics.
16321
15:33:38,828 --> 15:33:40,092
But that one seemed to be OK.
16322
15:33:40,091 --> 15:33:42,371
Let's do something else with\nPython, using similarly
16323
15:33:44,302 --> 15:33:49,062
Let me go into today's examples.
16324
15:33:49,061 --> 15:33:54,851
And I'm going to go into a folder\n
16325
15:33:57,311 --> 15:33:59,891
And in this folder, that\nI've written in advance
16326
15:33:59,891 --> 15:34:02,400
are a few files,\nDetect.py, Recognize.py
16327
15:34:02,400 --> 15:34:06,851
and two photos,\nOffice.jpeg and Toby.jpeg.
16328
15:34:06,851 --> 15:34:09,320
If you're familiar with the\nshow, here, for instance
16329
15:34:09,321 --> 15:34:11,331
is the cast photo from The Office here.
16330
15:34:12,821 --> 15:34:15,161
Suppose I want to do\nsomething very Facebook-style
16331
15:34:15,161 --> 15:34:17,381
where I want to analyze\nall of the faces
16332
15:34:17,381 --> 15:34:19,391
or detect all of the faces in there.
16333
15:34:19,391 --> 15:34:21,461
Well, let me go ahead\nand show you a program
16334
15:34:21,461 --> 15:34:24,400
I wrote in advance,\nthat's not terribly long.
16335
15:34:24,400 --> 15:34:25,900
Much of it is actually comments.
16336
15:34:25,900 --> 15:34:27,161
But let's see what I'm doing.
16337
15:34:27,161 --> 15:34:30,521
I'm importing the Pillow library,\n
16338
15:34:30,521 --> 15:34:34,002
I'm importing a library called face\n
16339
15:34:36,650 --> 15:34:39,480
According to its documentation,\nyou go into that library
16340
15:34:39,480 --> 15:34:41,281
and you call a function\ncalled load image
16341
15:34:41,281 --> 15:34:43,891
file, to load something\nlike Office.jpeg
16342
15:34:43,891 --> 15:34:46,561
and then you can use the\nline of code like this.
16343
15:34:46,561 --> 15:34:50,641
Call a function called face\nlocations, passing the image as input
16344
15:34:50,641 --> 15:34:53,641
and you get back a list of\nall of the faces in the image.
16345
15:34:53,641 --> 15:34:57,271
And then down here, a for loop,\nthat iterates over all of those
16346
15:34:58,561 --> 15:35:01,320
And inside of this loop, I\njust do a bit of trickery.
16347
15:35:01,321 --> 15:35:06,102
I figure out the top, right, bottom,\n
16348
15:35:06,101 --> 15:35:08,461
And then, using these\nlines of code here
16349
15:35:08,461 --> 15:35:11,355
I'm using that image library,\nto just draw a box, essentially.
16350
15:35:11,355 --> 15:35:12,480
And the code looks cryptic.
16351
15:35:12,480 --> 15:35:14,671
Honestly, I would have to look\nthis up to write it again.
16352
15:35:14,671 --> 15:35:17,171
But per the documentation, this\njust draws a nice little box
16353
15:35:18,131 --> 15:35:24,721
So let me go ahead and zoom out here,\n
16354
15:35:24,722 --> 15:35:29,912
All right, it's analyzing, analyzing,\n
16355
15:35:30,902 --> 15:35:35,702
And here is every face that my,\nwhat, 10 lines of Python code
16356
15:35:37,932 --> 15:35:40,711
Presumably the library\nis looking for something
16357
15:35:40,711 --> 15:35:43,622
maybe without a mask, that has\ntwo eyes, a nose, and a mouth
16358
15:35:43,622 --> 15:35:45,942
in some kind of arrangement,\nsome kind of pattern.
16359
15:35:45,942 --> 15:35:48,961
So it would seem pretty reliable, at\n
16360
15:35:49,891 --> 15:35:52,182
What if we want to look\nfor someone specific
16361
15:35:52,182 --> 15:35:53,701
for instance, someone that's\nalways getting picked on.
16362
15:35:53,701 --> 15:35:55,284
Well, we could do something like this.
16363
15:35:55,285 --> 15:35:59,582
Recognize.py, which is taking two files\n
16364
15:35:59,582 --> 15:36:01,141
of one person in particular.
16365
15:36:01,141 --> 15:36:03,421
And if you're trying to\nfind Toby in a crowd
16366
15:36:03,421 --> 15:36:06,091
here I conflated the program,\nsorry, this is the version that
16367
15:36:06,091 --> 15:36:08,072
draws a box around the given face.
16368
15:36:08,072 --> 15:36:10,201
Here we have Toby as identified.
16369
15:36:10,741 --> 15:36:14,972
Because that program, Recognize.py,\n
16370
15:36:14,972 --> 15:36:19,322
but long story short, it additionally\nloads as input Toby.jpeg
16371
15:36:19,322 --> 15:36:21,932
in order to recognize\nthat specific face.
16372
15:36:21,932 --> 15:36:24,872
And that specific face is a\ncompletely different photo
16373
15:36:24,872 --> 15:36:29,492
but it looks similar enough to the\n
16374
15:36:29,491 --> 15:36:32,341
Let's do one other that's a\nlittle sensitive to microphones.
16375
15:36:32,341 --> 15:36:37,171
Let me go into, how about my listen\n
16376
15:36:38,131 --> 15:36:40,901
And let's just run Python of Listen0.py.
16377
15:36:40,902 --> 15:36:43,952
I'm going to type in like David.
16378
15:36:43,951 --> 15:36:47,041
Oh, sorry, no, I'm going to--
16379
15:36:52,567 --> 15:36:53,942
Oh, no, that's the wrong version.
16380
15:36:53,942 --> 15:36:55,772
[CHUCKLES] OK, I looked like an idiot.
16381
15:36:58,832 --> 15:37:02,822
And if I say goodbye, I'm talking\n
16382
15:37:02,822 --> 15:37:05,112
Now it's detecting what I'm saying here.
16383
15:37:05,112 --> 15:37:08,652
So this first version of the program is\n
16384
15:37:08,652 --> 15:37:12,993
elif elif, and it's just asking\n
16385
15:37:12,993 --> 15:37:14,951
And that was my mistake\nwith the first example.
16386
15:37:14,951 --> 15:37:17,881
And then, I'm just checking,\nis Hello in the user's words?
16387
15:37:17,881 --> 15:37:19,339
Is how are you in the user's words?
16388
15:37:19,339 --> 15:37:20,673
Didn't see that, but it's there.
16389
15:37:20,673 --> 15:37:21,991
Is goodbye in the user's words?
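The if/elif checks described here could be sketched roughly like this. It is only an illustration of substring matching: the function name `respond`, the reply strings, and the lowercasing step are assumptions, not the contents of the file shown in lecture.

```python
def respond(words):
    """Pick a canned reply based on which phrase appears in the input."""
    words = words.lower()  # assumption: match regardless of capitalization

    # `in` checks whether the phrase occurs anywhere inside the string
    if "hello" in words:
        return "Hello to you too!"
    elif "how are you" in words:
        return "I am well, thanks!"
    elif "goodbye" in words:
        return "Goodbye to you too!"
    else:
        return "Huh?"


print(respond("Hello, world"))
print(respond("So, how are you today?"))
print(respond("OK, goodbye"))
```

Because the branches are tried in order, an input containing more than one phrase gets the reply for whichever phrase is checked first.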
16390
15:37:21,991 --> 15:37:25,801
Now let's do a cooler version, using a\n
16391
15:37:33,241 --> 15:37:40,691
Let's do version 2 of this, that\n
16392
15:37:43,682 --> 15:37:46,232
OK, so now it's artificial intelligence.
16393
15:37:46,232 --> 15:37:48,332
Now let's do something a\nlittle more interesting.
16394
15:37:48,332 --> 15:37:51,752
The third version of this program that\n
16395
15:37:53,402 --> 15:37:55,322
Hello, world, my name is David.
16396
15:37:59,281 --> 15:38:02,521
OK, so that time, it not\nonly analyzed what I said
16397
15:38:02,521 --> 15:38:04,451
but it plucked my name out of it.
16398
15:38:04,451 --> 15:38:07,002
Let's do two final examples.
16399
15:38:07,002 --> 15:38:09,671
This one will generate a QR code.
16400
15:38:09,671 --> 15:38:11,641
Let me go ahead and\nwrite a program called
16401
15:38:11,641 --> 15:38:15,552
QR.py, that very simply does this.
16402
15:38:15,552 --> 15:38:17,342
Let me import a library called os.
16403
15:38:17,341 --> 15:38:19,752
Let me import a library called qrcode.
16404
15:38:19,752 --> 15:38:24,521
Let me grab an image\nhere, that's qrcode.make.
16405
15:38:24,521 --> 15:38:27,961
And let me give you the URL of like a\n
16406
15:38:31,561 --> 15:38:36,362
Let me just type this,\nso I don't get it wrong.
16407
15:38:36,362 --> 15:38:41,822
OK, so if I now use this URL here,\nof a video on YouTube, making
16408
15:38:41,822 --> 15:38:44,334
sure I haven't made any\ntypos, I'm now going
16409
15:38:44,334 --> 15:38:46,292
to go ahead and do two\nlines of code in Python.
16410
15:38:46,292 --> 15:38:49,982
I'm going to first save that as\na file called QR.png, which is
16411
15:38:49,982 --> 15:38:52,012
a two dimensional barcode, a QR code.
16412
15:38:52,012 --> 15:38:53,762
And, indeed, I'm going\nto use this format.
16413
15:38:53,762 --> 15:39:00,312
And I'm going to use the os.system\n
16414
15:39:00,311 --> 15:39:02,612
And if you'd like to take\nout your phone at this point
16415
15:39:02,612 --> 15:39:08,792
you can see the result of my barcode,\n
16416
15:39:08,792 --> 15:39:10,307
Hopefully from afar that will scan.
16417
15:39:16,671 --> 15:39:18,981
And I think that's an\nappropriate line to end on.
16418
15:40:43,362 --> 15:40:45,372
DAVID J. MALAN: This is CS50.
16419
15:40:45,372 --> 15:40:48,702
And this is week 7, the\nweek, here, of Halloween.
16420
15:40:48,701 --> 15:40:51,851
Indeed, special thanks to\nCS50's own Valerie and her mom
16421
15:40:51,851 --> 15:40:55,932
for having created this very festive\n
16422
15:40:55,932 --> 15:40:58,671
Today, we pick up where\nwe left off last time
16423
15:40:58,671 --> 15:41:00,521
when, recall, we introduced Python.
16424
15:41:00,521 --> 15:41:03,581
And that was our big transition\nfrom C, where suddenly things
16425
15:41:03,582 --> 15:41:06,192
started to look new again,\nprobably, syntactically.
16426
15:41:06,192 --> 15:41:09,732
But also, probably things\nhopefully started to feel easier.
16427
15:41:09,732 --> 15:41:13,422
Well, with that said, problem set\n
16428
15:41:14,722 --> 15:41:18,432
But hopefully you've begun to appreciate\n
16429
15:41:19,631 --> 15:41:22,301
You get more out of the box\nwith the language itself.
16430
15:41:22,302 --> 15:41:24,792
And that's going to be so\nuseful over the coming weeks
16431
15:41:24,792 --> 15:41:29,322
as we transition further to introducing\n
16432
15:41:29,322 --> 15:41:31,612
web programming next\nweek and the week after.
16433
15:41:31,612 --> 15:41:34,241
So that by term's end, and perhaps\neven for your final project
16434
15:41:34,241 --> 15:41:37,211
you really are building\nsomething from scratch
16435
15:41:37,211 --> 15:41:40,756
using all of these various\ntools somehow together.
16436
15:41:40,756 --> 15:41:42,881
So before we do that,\nthough, today, let's consider
16437
15:41:42,881 --> 15:41:47,951
what we weren't really able to\ndo last week, which was actually
16438
15:41:47,951 --> 15:41:50,831
create and store data ourselves.
16439
15:41:50,832 --> 15:41:56,052
In Python, we've played around with the\n
16440
15:41:56,052 --> 15:41:59,152
And you've been able to\nread in CSVs from disk
16441
15:41:59,152 --> 15:42:03,222
so to speak, that is, from files\n
16442
15:42:03,222 --> 15:42:06,796
But we haven't necessarily started\n
16443
15:42:06,796 --> 15:42:09,671
And that's a huge limitation, because\n
16444
15:42:09,671 --> 15:42:11,629
we've done thus far with\na couple of exceptions
16445
15:42:11,629 --> 15:42:14,891
have involved my providing input\n
16446
15:42:14,891 --> 15:42:16,849
But then nothing happens to it.
16447
15:42:16,849 --> 15:42:18,641
It disappears the moment\nthe program quits
16448
15:42:18,641 --> 15:42:20,752
because it was only\nbeing stored in memory.
16449
15:42:20,752 --> 15:42:24,432
But today, we'll start to focus all\n
16450
15:42:24,432 --> 15:42:27,442
that is, storing things\nin files and folders
16451
15:42:27,442 --> 15:42:30,281
so that you can actually\nwrite programs that remember
16452
15:42:30,281 --> 15:42:31,961
what it is the human did last time.
16453
15:42:31,961 --> 15:42:34,661
And ultimately, you can\nactually make mobile or web apps
16454
15:42:34,661 --> 15:42:37,391
that actually begin to grow, and\ngrow, and grow their data sets
16455
15:42:37,391 --> 15:42:40,991
as might happen if you get more and\n
16456
15:42:40,991 --> 15:42:44,741
To play, then, with this new capability\n
16457
15:42:44,741 --> 15:42:47,559
let's go ahead and\njust collect some data.
16458
15:42:47,559 --> 15:42:49,391
In fact, those of you\nhere in person, if you
16459
15:42:49,391 --> 15:42:52,302
want to pull up this URL\non your phone or laptop
16460
15:42:52,302 --> 15:42:54,402
that's going to lead\nyou to a Google Form.
16461
15:42:54,402 --> 15:42:59,472
And that Google Form is going to\n
16462
15:43:00,661 --> 15:43:02,411
And it's going to ask\nyou to categorize it
16463
15:43:02,411 --> 15:43:06,611
according to a genre, like comedy,\n
16464
15:43:07,802 --> 15:43:09,552
And this is useful,\nbecause if you've ever
16465
15:43:09,552 --> 15:43:12,862
used a Google Form before, or\n
16466
15:43:12,862 --> 15:43:15,612
it's a really useful mechanism at\n
16467
15:43:15,612 --> 15:43:19,342
and then ultimately, putting\nit into a spreadsheet form.
16468
15:43:19,341 --> 15:43:23,502
So this is a screenshot of\nthe form that those of you
16469
15:43:23,502 --> 15:43:26,421
here in person or tuning in on\nZoom are currently filling out.
16470
15:43:26,421 --> 15:43:27,761
It's asking only two questions.
16471
15:43:27,762 --> 15:43:29,891
What's the title of\nyour favorite TV show?
16472
15:43:29,891 --> 15:43:34,811
And what are one or more genres\ninto which your TV show falls?
16473
15:43:34,811 --> 15:43:38,201
And I'll go ahead and\npivot now to the view
16474
15:43:38,201 --> 15:43:41,008
that I'll be able to see as the\n
16475
15:43:41,008 --> 15:43:42,550
is quite simply a Google spreadsheet.
16476
15:43:42,550 --> 15:43:45,050
Google Forms has this nice\nfeature, if you've never noticed
16477
15:43:45,050 --> 15:43:47,991
that allows you to export your\ndata to a Google Spreadsheet.
16478
15:43:47,991 --> 15:43:50,472
And then from there, we\ncan actually grab the file
16479
15:43:50,472 --> 15:43:52,842
and download it to my\nown Mac or your own PC
16480
15:43:52,841 --> 15:43:55,661
so that we can actually play around\n
16481
15:43:55,661 --> 15:43:57,911
So in fact, let me go\nahead and slide over
16482
15:43:57,911 --> 15:44:01,881
to this, the live Google Spreadsheet.
16483
15:44:01,881 --> 15:44:05,831
And you'll see, probably, a whole\n
16484
15:44:06,701 --> 15:44:09,161
And if we keep scrolling, and\nscrolling, and scrolling--
16485
15:44:10,631 --> 15:44:12,771
There we go, up to 50 plus already.
16486
15:44:12,771 --> 15:44:15,761
If you need that URL again\nhere, if you're just tuning in
16487
15:44:15,762 --> 15:44:18,192
you can go to this URL here.
16488
15:44:18,192 --> 15:44:21,102
And in just a moment,\nwe'll have a bunch of data
16489
15:44:21,101 --> 15:44:24,792
with which we can start to experiment.
16490
15:44:24,792 --> 15:44:26,412
I'll give you a moment or so there.
16491
15:44:33,760 --> 15:44:35,302
Let me hang in there a little longer.
16492
15:44:35,302 --> 15:44:36,760
OK, we've got over 100 submissions.
16493
15:44:37,732 --> 15:44:40,612
Good, even more coming in now.
16494
15:44:40,612 --> 15:44:42,232
And we can see them coming in live.
16495
15:44:42,232 --> 15:44:44,092
Here, let me switch\nback to the spreadsheet.
16496
15:44:44,091 --> 15:44:46,431
The list is growing, and\ngrowing, and growing.
16497
15:44:48,241 --> 15:44:51,831
let me give Carter a moment to\nhelp me export it in real time.
16498
15:44:51,832 --> 15:44:54,982
Carter, just give me a heads\nup when it's reasonable for me
16499
15:44:58,222 --> 15:45:00,482
All right, and I'll begin\nto do this very slowly.
16500
15:45:00,482 --> 15:45:03,062
So I'm going to go up to the File\n
16501
15:45:03,061 --> 15:45:05,451
Download-- you can download a whole\n
16502
15:45:05,451 --> 15:45:07,401
But more simply, and the one\nwe'll start to play with here
16503
15:45:08,972 --> 15:45:12,351
So CSV files we used this past\nweek, why are they useful?
16504
15:45:12,351 --> 15:45:15,531
Now that you've played with them\n
16505
15:45:15,531 --> 15:45:20,601
what's the utility of a CSV file versus\n
16506
15:45:26,199 --> 15:45:28,072
AUDIENCE: Because it's just a text file?
16507
15:45:28,072 --> 15:45:29,947
DAVID J. MALAN: OK, so\nstorage is compelling.
16508
15:45:29,947 --> 15:45:33,052
A simple text file with ASCII or\n
16509
15:45:36,525 --> 15:45:37,860
DAVID J. MALAN: Yeah, well said.
16510
15:45:37,860 --> 15:45:40,101
It's just a simple text\nformat, but using conventions
16511
15:45:40,101 --> 15:45:43,671
like commas you can represent the\n
16512
15:45:43,671 --> 15:45:45,771
backslash n's invisibly\nat the end of your lines
16513
15:45:45,771 --> 15:45:47,341
you can create the idea of rows.
16514
15:45:47,341 --> 15:45:49,341
So it's a very simple\nway of implementing what
16515
15:45:49,341 --> 15:45:51,951
we might call a flat-file database.
16516
15:45:51,951 --> 15:45:54,201
It's a way of storing\ndata in a flat, that is
16517
15:45:54,201 --> 15:45:57,651
very simple file that's just\npure ASCII or Unicode text.
16518
15:45:57,652 --> 15:46:00,682
And more compellingly, I dare\nsay, is that with a CSV file
16519
15:46:02,203 --> 15:46:04,161
Something is portable in\nthe world of computing
16520
15:46:04,161 --> 15:46:07,251
if it means you can use it on a Mac\n
16521
15:46:08,091 --> 15:46:10,822
And portability is nice because if\n
16522
15:46:10,822 --> 15:46:13,101
there'd be a whole bunch of\npeople in this room and online
16523
15:46:13,101 --> 15:46:15,531
who couldn't download it because\n
16524
15:46:16,281 --> 15:46:21,021
Or if they have a Mac, or if it's\n
16525
15:46:21,021 --> 15:46:22,951
a PC user might not be\nable to download it.
16526
15:46:22,951 --> 15:46:25,141
So a CSV is indeed very portable.
16527
15:46:25,141 --> 15:46:28,281
So I'm going to go ahead and\ndownload, quite simply, the CSV
16528
15:46:29,781 --> 15:46:32,301
That's going to put it onto\nmy own Mac's Downloads folder.
16529
15:46:32,302 --> 15:46:36,802
And let me go ahead here, and in just a\n
16530
15:46:36,802 --> 15:46:40,252
Because it actually downloads\nit with a pretty long name.
16531
15:46:40,252 --> 15:46:43,461
And give me just one moment here,\nand you'll see that, indeed
16532
15:46:43,461 --> 15:46:46,551
on my Mac I have a file\ncalled favorites.csv.
16533
15:46:46,552 --> 15:46:48,092
I shortened the name real quick.
16534
15:46:48,091 --> 15:46:54,021
And now what I'm going to do is go\n
16535
15:46:54,021 --> 15:46:55,822
I'm going to open my File Explorer.
16536
15:46:55,822 --> 15:46:59,961
And if I minimize my window here for\n
16537
15:46:59,961 --> 15:47:03,322
is that you can just drag and drop a\n
16538
15:47:03,322 --> 15:47:06,002
And voila, it's going to\nautomatically upload it for you.
16539
15:47:06,002 --> 15:47:08,601
So let me go ahead and full\nscreen here, close my Explorer
16540
15:47:08,601 --> 15:47:10,461
temporarily close my Terminal window.
16541
15:47:10,461 --> 15:47:14,061
And you'll see here a\nCSV file, favorites.csv.
16542
15:47:14,061 --> 15:47:16,731
And the first row, by\nconvention, has whatever
16543
15:47:16,732 --> 15:47:20,062
the columns were in Google\nSpreadsheets, or Office 365
16544
15:47:20,061 --> 15:47:23,961
in Excel online, timestamp,\ncomma, title, comma, genres.
16545
15:47:23,961 --> 15:47:25,731
Then, we have timestamps,\nwhich indicate
16546
15:47:25,732 --> 15:47:27,064
when people started submitting.
16547
15:47:27,063 --> 15:47:29,271
Looks like a couple of people\nwere super eager to get
16548
15:47:30,771 --> 15:47:34,682
And then, you have the\ntitle next, after a comma.
16549
15:47:34,682 --> 15:47:37,491
But there's kind of a\ncuriosity after that.
16550
15:47:37,491 --> 15:47:40,851
Sometimes I see the genre\nlike comedy, comedy, comedy
16551
15:47:40,851 --> 15:47:45,211
but sometimes it's like crime, comma,\n
16552
15:47:45,891 --> 15:47:47,811
And those things are quoted.
16553
15:47:47,811 --> 15:47:49,521
And yet, I didn't do any quotes.
16554
15:47:49,521 --> 15:47:51,141
You probably didn't type any quotes.
16555
15:47:51,141 --> 15:47:55,521
Where are those quotes\ncoming from in this CSV file?
16556
15:47:55,521 --> 15:47:56,991
Why are they there if we infer?
16557
15:48:00,682 --> 15:48:03,650
DAVID J. MALAN: Yeah, so you\nhave a corner case, if you will.
16558
15:48:03,650 --> 15:48:05,692
Because if you're using\ncommas, as you described
16559
15:48:05,692 --> 15:48:09,622
to separate your data into what\nare effectively columns, well
16560
15:48:09,622 --> 15:48:12,352
you've painted yourself into\na corner if your actual data
16561
15:48:13,919 --> 15:48:16,461
So what Google has done, what\nMicrosoft does, what Apple does
16562
15:48:16,461 --> 15:48:19,671
is, they quote any strings\nof text that themselves
16563
15:48:19,671 --> 15:48:23,902
have commas so that these are\nnow English grammatical commas
16564
15:48:26,072 --> 15:48:28,741
So it's a way of escaping\nyour data, if you will.
16565
15:48:28,741 --> 15:48:31,411
And escaping just means to call\nout a symbol in a special way
16566
15:48:31,411 --> 15:48:33,978
so it's not misinterpreted\nas something else.
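The quoting behavior described here can be reproduced with Python's own `csv` module, which by default quotes only the fields that need it. This is a small sketch under assumed data: the header names and the two rows are hypothetical stand-ins for the exported form responses.

```python
import csv
import io

# Write a few rows to an in-memory buffer instead of a real file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Timestamp", "title", "genres"])
writer.writerow(["t1", "The Office", "Comedy"])
writer.writerow(["t2", "Breaking Bad", "Crime, Drama"])  # field contains a comma

text = buffer.getvalue()
print(text)  # only the field with an embedded comma is wrapped in quotes

# Reading the text back: the quoted field comes out as a single value,
# so its comma is treated as data rather than as a column separator
rows = list(csv.reader(io.StringIO(text)))
print(rows[2])
```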
16567
15:48:33,978 --> 15:48:35,811
All right, so this is\nall to say that we now
16568
15:48:35,811 --> 15:48:39,322
have all of this data with which we\n
16569
15:48:39,322 --> 15:48:41,182
start calling a flat-file database.
16570
15:48:41,182 --> 15:48:44,572
So suppose I wanted to now\nstart manipulating this data
16571
15:48:44,572 --> 15:48:47,451
and I want to store it ultimately,\nindeed, in this CSV format.
16572
15:48:47,451 --> 15:48:49,881
How can I actually\nstart to read this data
16573
15:48:49,881 --> 15:48:52,292
maybe clean it up, maybe\ndo some analytics on it
16574
15:48:52,292 --> 15:48:55,912
and actually figure out, what's the most\n
16575
15:48:55,911 --> 15:48:57,531
here over the past few minutes?
16576
15:48:57,531 --> 15:48:59,612
Well, let me go ahead and close this.
16577
15:48:59,612 --> 15:49:04,311
Let me go ahead, then, and open up,\n
16578
15:49:04,311 --> 15:49:07,311
And let's code up a file\ncalled favorites.py.
16579
15:49:07,311 --> 15:49:11,451
And let's go ahead and iteratively start\n
16580
15:49:11,451 --> 15:49:13,171
and printing out what's inside of it.
16581
15:49:13,171 --> 15:49:16,671
So you might recall that we can do\n
16582
15:49:16,671 --> 15:49:20,332
to give myself some CSV\nreading functionality.
16583
15:49:20,332 --> 15:49:24,952
Then, I can go ahead and do something\n
16584
15:49:24,951 --> 15:49:27,621
that I want to open in read mode.
16585
15:49:27,622 --> 15:49:29,332
Quote, unquote, "r" means to read it.
16586
15:49:29,332 --> 15:49:31,942
And then, I can say as\nfile, or whatever other name
16587
15:49:31,942 --> 15:49:35,012
for a variable to say that\nI want to open this file
16588
15:49:35,012 --> 15:49:37,822
and essentially store some kind of\n
16589
15:49:38,942 --> 15:49:42,262
Then, I can give myself a\nreader, and I can say csv.reader
16590
15:49:42,262 --> 15:49:43,739
passing in that file as input.
16591
15:49:43,739 --> 15:49:45,322
And this is the magic of that library.
16592
15:49:45,322 --> 15:49:48,531
It deals with the process of opening\n
16593
15:49:48,531 --> 15:49:51,771
back something that you can just\n
16594
15:49:51,771 --> 15:49:55,851
I do want to skip the first row,\nand recall that I can do this.
16595
15:49:55,851 --> 15:49:59,006
next(reader) is this little trick\n
16596
15:49:59,006 --> 15:50:00,381
Because the first one is special.
16597
15:50:00,381 --> 15:50:02,752
It said timestamp, title, genres.
16598
15:50:02,752 --> 15:50:04,741
That's not your data, that was mine.
16599
15:50:04,741 --> 15:50:07,222
But this means now that\nI've skipped that first row.
16600
15:50:07,222 --> 15:50:10,042
Everything hereafter is going\nto be the title of a show
16601
15:50:10,042 --> 15:50:11,601
that you all like, so let me do this.
16602
15:50:11,601 --> 15:50:16,432
For row in the reader, let's go\nahead and print out the title
16603
15:50:16,432 --> 15:50:18,201
of the show each of you typed in.
16604
15:50:18,201 --> 15:50:22,341
How do I get at the title of\nthe show each of you typed in?
16605
15:50:22,341 --> 15:50:24,081
It's somewhere inside of row.
16606
15:50:26,131 --> 15:50:28,252
So what do I want to\ntype next in order to get
16607
15:50:28,252 --> 15:50:34,262
at the title of the current\nrow just as a quick check here?
16608
15:50:34,262 --> 15:50:36,692
What do I want to type to\nget at the title of the row
16609
15:50:36,692 --> 15:50:40,862
keeping in mind, again, that it\nwas timestamp, title, genres?
16610
15:50:42,237 --> 15:50:44,252
DAVID J. MALAN: So row\nbracket 1 would give me
16611
15:50:44,252 --> 15:50:47,982
the second column, 0 index, that is,\n
16612
15:50:47,982 --> 15:50:49,862
So this program isn't\nthat interesting yet
16613
15:50:49,862 --> 15:50:52,711
but it's a quick and dirty way to\n
16614
15:50:53,161 --> 15:50:55,328
Let me actually just do a\nlittle bit of a check here
16615
15:50:55,328 --> 15:50:57,721
and see if it contains\nthe data I think it does.
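A rough sketch of the program described so far. The lecture opens the real CSV file in read mode with open(..., "r"); here an in-memory string with made-up rows stands in for it so the example runs on its own.

```python
import csv
import io

# Hypothetical stand-in for the real file of responses.
sample = io.StringIO(
    "Timestamp,title,genres\n"
    "t1,How i met your mother,Comedy\n"
    't2,The Office,"Comedy, Drama"\n'
)

titles_seen = []
reader = csv.reader(sample)
next(reader)  # skip the header row
for row in reader:
    # Each row is a list; index 1 is the second column, the title.
    titles_seen.append(row[1])
    print(row[1])
```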
16616
15:50:57,722 --> 15:50:59,832
Let me maximize my Terminal window here.
16617
15:50:59,832 --> 15:51:03,152
Let me run Python of\nfavorites.py, hitting Enter.
16618
15:51:03,152 --> 15:51:07,862
And you'll see now a purely\ntextual list of all of the shows
16619
15:51:09,902 --> 15:51:12,421
But what's noteworthy about it?
16620
15:51:12,421 --> 15:51:15,301
Specific shows aside,\njudgment aside as to people's
16621
15:51:15,302 --> 15:51:19,982
TV tastes, what's interesting or\nnoteworthy about the data that
16622
15:51:19,982 --> 15:51:23,432
might create some problems for us\n
16623
15:51:23,432 --> 15:51:25,082
and figure out what's the most popular?
16624
15:51:25,082 --> 15:51:27,932
How many people like this or that?
16625
15:51:29,493 --> 15:51:32,800
AUDIENCE: User errors [INAUDIBLE].
16626
15:51:32,800 --> 15:51:34,842
DAVID J. MALAN: Yeah,\nthere might be user errors
16627
15:51:34,841 --> 15:51:38,141
or just stylistic differences that\n
16628
15:51:42,311 --> 15:51:45,591
Let's see if I can see an\nexample on the screen here.
16629
15:51:45,591 --> 15:51:49,681
Yeah, so friends here is an all\n
16630
15:51:50,482 --> 15:51:51,712
We can sort of mitigate that.
16631
15:51:51,711 --> 15:51:54,822
But this is just a tiny example\nof where data in the real world
16632
15:51:56,099 --> 15:51:57,641
And that probably wasn't even a typo.
16633
15:51:57,641 --> 15:52:02,928
It was just someone not caring as much\n
16634
15:52:02,928 --> 15:52:05,261
Your users are going to type\nwhat they're going to type.
16635
15:52:05,262 --> 15:52:08,922
So let's see if we can't now begin\nto get at more specific data
16636
15:52:08,921 --> 15:52:10,841
and maybe even clean\nsome of this data up.
16637
15:52:10,841 --> 15:52:15,911
Let me go back into my file\ncalled favorites.py here
16638
15:52:15,911 --> 15:52:19,691
and let's actually do something a\n
16639
15:52:19,692 --> 15:52:23,412
Instead of a reader, recall that there\n
16640
15:52:23,411 --> 15:52:25,091
just a little more user friendly.
16641
15:52:25,091 --> 15:52:30,011
And it means I can type in dictionary\n
16642
15:52:30,012 --> 15:52:36,012
But now, when I iterate over this\n
16643
15:52:36,012 --> 15:52:39,402
When using a DictReader instead\nof a reader, recall, and this
16644
15:52:39,402 --> 15:52:43,031
is just a peculiarity\nof the CSV library
16645
15:52:43,031 --> 15:52:47,871
this gives me back, not a list\nof cells, but what instead
16646
15:52:47,872 --> 15:52:50,292
which is marginally more\nuser friendly for me?
16647
15:52:53,802 --> 15:52:56,202
I can now use open bracket,\nquotes, and the title.
16648
15:52:56,201 --> 15:52:59,112
Because what's coming back\nnow is a dict object, that is
16649
15:52:59,112 --> 15:53:02,351
a dictionary which has keys and values.
16650
15:53:02,351 --> 15:53:04,402
The keys of which are\nthe column headings.
16651
15:53:04,402 --> 15:53:06,741
The values of which are the\ndata I actually care about.
16652
15:53:06,741 --> 15:53:09,371
So this is just marginally\nbetter because, one, it's
16653
15:53:09,372 --> 15:53:12,672
just way more obvious to me, the\n
16654
15:53:13,391 --> 15:53:15,652
I don't remember what\ncolumn the title was.
16655
15:53:17,232 --> 15:53:18,792
That's something you're\ngoing to forget over time.
16656
15:53:18,792 --> 15:53:21,851
And God forbid someone changes the\n
16657
15:53:21,851 --> 15:53:24,731
the columns in Excel, or Apple\nNumbers, or Google Spreadsheets.
16658
15:53:24,732 --> 15:53:27,320
That's going to break all\nof your numeric indices.
16659
15:53:27,319 --> 15:53:29,112
And so a dictionary\nreader is arguably just
16660
15:53:29,112 --> 15:53:32,652
better design because it's\nmore robust against changes
16661
15:53:32,652 --> 15:53:34,302
and potential errors like that.
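A small sketch of why the DictReader is more robust. The helper name read_titles and both sample strings are assumptions for illustration; the point is that looking up row["title"] keeps working even if the columns are reordered.

```python
import csv
import io


def read_titles(text):
    """Return the title column, looked up by header name, not position."""
    reader = csv.DictReader(io.StringIO(text))
    return [row["title"] for row in reader]


# Hypothetical data: the same row with the columns in two different orders.
original = "Timestamp,title,genres\nt1,Friends,Comedy\n"
reordered = "title,genres,Timestamp\nFriends,Comedy,t1\n"

print(read_titles(original))
print(read_titles(reordered))
```

With a plain reader and row[1], the second layout would have printed the genre instead of the title.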
16662
15:53:34,302 --> 15:53:37,902
Now the effect of this change isn't\n
16663
15:53:37,902 --> 15:53:42,162
If I run Python of favorites.py,\n
16664
15:53:42,161 --> 15:53:46,631
But I've now not made any assumptions\n
16665
15:53:48,762 --> 15:53:51,672
Well, let's go ahead and now\nfilter out some duplicates.
16666
15:53:51,671 --> 15:53:55,301
Because there's a lot of commonality\n
16667
15:53:55,302 --> 15:53:58,002
see if we can't filter out duplicates.
16668
15:53:58,002 --> 15:54:04,241
If I'm reading a CSV file top to bottom,\n
16669
15:54:04,241 --> 15:54:06,991
I want to implement to\nfilter out duplicates?
16670
15:54:06,991 --> 15:54:10,241
It's not going to be quite as simple as\n
16671
15:54:10,241 --> 15:54:12,641
I'm going to have to build this.
16672
15:54:12,641 --> 15:54:15,972
But logically, if you're reading\na file from top to bottom
16673
15:54:15,972 --> 15:54:20,202
how might you go about, in\nPython or just any context
16674
15:54:20,201 --> 15:54:23,381
getting rid of duplicate values?
16675
15:54:31,951 --> 15:54:35,072
I could use a list and I could\nadd each title to the list
16676
15:54:35,072 --> 15:54:38,411
but first check if I put\nthis into the list before.
16677
15:54:38,411 --> 15:54:40,481
So let's try a little\nsomething like that.
16678
15:54:40,482 --> 15:54:43,682
Let me go ahead and create a variable\n
16679
15:54:43,682 --> 15:54:46,682
I'll call it titles, for instance,\ninitialize to an empty list
16680
15:54:46,682 --> 15:54:48,391
open bracket, close bracket.
16681
15:54:48,391 --> 15:54:53,201
And then, inside of my loop\nhere, instead of printing it out
16682
15:54:53,201 --> 15:54:54,971
let's start to make a decision.
16683
15:54:54,972 --> 15:55:04,673
So if the current row's\ntitle is in the titles list
16684
15:55:04,673 --> 15:55:05,881
I don't want to put it there.
16685
15:55:05,881 --> 15:55:08,923
And actually, let me invert the logic\n
16686
15:55:08,923 --> 15:55:13,531
So if it's not the case that\nrow bracket title is in titles
16687
15:55:13,531 --> 15:55:21,542
then, go ahead and do something like\n
16688
15:55:21,542 --> 15:55:24,512
And recall that we saw\n.append a week or so ago
16689
15:55:24,512 --> 15:55:27,042
where it just allows you to\nappend to the current list.
16690
15:55:27,042 --> 15:55:30,152
And then, what can I do at\nthe very end, after I'm all
16691
15:55:30,152 --> 15:55:31,802
done reading the whole file?
16692
15:55:31,802 --> 15:55:35,162
Why don't I go ahead and\nsay, for title in titles
16693
15:55:35,161 --> 15:55:38,072
go ahead and print\nout the current title?
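The two loops just described might look roughly like this. The rows list is a hypothetical stand-in for what a DictReader would yield.

```python
# Hypothetical rows, shaped like the dictionaries a DictReader returns.
rows = [
    {"title": "Friends"},
    {"title": "The Office"},
    {"title": "Friends"},
    {"title": "friends"},
]

titles = []
for row in rows:
    # Only remember a title the first time it appears.
    if row["title"] not in titles:
        titles.append(row["title"])

for title in titles:
    print(title)
```

Exact duplicates are gone, but "Friends" and "friends" both survive, because the comparison is case sensitive.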
16694
15:55:38,072 --> 15:55:42,042
So it's two loops now, and we can come\n
16695
15:55:42,042 --> 15:55:44,715
But let me go ahead here and\nrerun Python of favorites.py.
16696
15:55:44,714 --> 15:55:47,881
Let me increase the size of my Terminal\n
16697
15:55:53,461 --> 15:55:56,432
I don't think I'm seeing\nduplicates, although I
16698
15:55:56,432 --> 15:55:59,342
am seeing some near duplicates.
16699
15:55:59,341 --> 15:56:02,101
For instance, there's Friends again.
16700
15:56:02,101 --> 15:56:05,402
And if we keep going, and\ngoing, and going, and going
16701
15:56:06,601 --> 15:56:12,411
Oh, interesting, so that's curious\n
16702
15:56:12,411 --> 15:56:13,971
and I have this one here, too.
16703
15:56:13,972 --> 15:56:16,461
So how might we clean this up further?
16704
15:56:16,461 --> 15:56:18,735
I like your instincts, and\nit's a step closer to it.
16705
15:56:18,735 --> 15:56:20,902
What are we going to have\nto do to really filter out
16706
15:56:24,487 --> 15:56:29,386
AUDIENCE: You could set\neverything to lower [INAUDIBLE].
16707
15:56:30,262 --> 15:56:32,012
What are the common\nmistakes to summarize?
16708
15:56:32,012 --> 15:56:34,747
We could ignore the capitalization\naltogether and maybe
16709
15:56:34,747 --> 15:56:37,372
just force everything to lowercase,\nor everything to uppercase.
16710
15:56:37,372 --> 15:56:39,262
Doesn't matter which, but\nlet's just be consistent.
16711
15:56:39,262 --> 15:56:42,137
And for those of you who might have\n
16712
15:56:42,137 --> 15:56:44,992
the spacebar at the beginning of\nyour input or even at the end
16713
15:56:46,521 --> 15:56:49,831
Stripping whitespace is a common\n
16714
15:56:49,832 --> 15:56:53,211
So let me go back into my\ncode here, and let me go ahead
16715
15:56:53,211 --> 15:56:55,882
and tweak the title a little bit.
16716
15:56:55,881 --> 15:56:58,701
Let me say that the current\ntitle inside of this loop
16717
15:56:58,701 --> 15:57:01,641
is not going to be just\nthe current row's title.
16718
15:57:01,641 --> 15:57:05,781
But let me go ahead and strip off,\n
16719
15:57:06,601 --> 15:57:09,601
If you read the documentation for the\n
16720
15:57:09,601 --> 15:57:12,572
It gets rid of whitespace to the\nleft, whitespace to the right.
16721
15:57:12,572 --> 15:57:15,652
And then, if I want to force\neverything to maybe uppercase
16722
15:57:15,652 --> 15:57:18,052
I can just uppercase the entire string.
16723
15:57:18,052 --> 15:57:21,652
And remember, what's handy about Python\n
16724
15:57:21,652 --> 15:57:24,891
calls together by just\nusing dots again and again.
16725
15:57:24,891 --> 15:57:26,841
And that just takes\nwhatever just happened
16726
15:57:26,841 --> 15:57:29,661
like the whitespace got stripped\noff, then, it additionally
16727
15:57:29,661 --> 15:57:31,621
uppercases the whole thing as well.
16728
15:57:31,622 --> 15:57:36,322
So now, I'm going to just check whether\n
16729
15:57:36,322 --> 15:57:40,101
And if not, I'm going to go\nahead and append that title
16730
15:57:40,101 --> 15:57:42,542
massaged into this different\nformat, if you will.
16731
15:57:42,542 --> 15:57:44,692
So I'm throwing away some information.
16732
15:57:44,692 --> 15:57:49,102
I'm sacrificing all of the\nnuances of your grammar and input
16733
15:57:50,362 --> 15:57:52,822
But at least I'm trying to\ncanonicalize, that is
16734
15:57:52,822 --> 15:57:55,292
standardize what the\ndata actually looks like.
16735
15:57:55,292 --> 15:57:59,152
So let me go ahead and run Python\n
16736
15:57:59,152 --> 15:58:00,754
Oh, and this is just user error.
16737
15:58:00,754 --> 15:58:02,211
Maybe you haven't seen this before.
16738
15:58:02,211 --> 15:58:06,211
This just looks like\na mistake on my part.
16739
15:58:06,211 --> 15:58:08,542
I meant to say not even uppercase.
16740
15:58:09,544 --> 15:58:11,752
The function is called upper,\nnow that I think of it.
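With the method name corrected to upper, the cleanup step could be sketched as follows. The raw_titles list is made up for illustration.

```python
# Hypothetical inputs with stray whitespace and mixed capitalization.
raw_titles = ["Friends", " friends ", "FRIENDS", "the office"]

titles = []
for raw in raw_titles:
    # strip() removes whitespace on both ends, then upper() uppercases the result.
    title = raw.strip().upper()
    if title not in titles:
        titles.append(title)

print(titles)
```

Calling raw.uppercase() instead would raise an AttributeError, since Python strings have no method by that name.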
16741
15:58:12,171 --> 15:58:14,671
Let's go and increase the size\nof the Terminal window again.
16742
15:58:16,161 --> 15:58:20,871
And now, it's a little more overwhelming\n
16743
15:58:22,641 --> 15:58:28,641
But I don't think I'm seeing\nmultiple Friends, so to speak.
16744
15:58:28,641 --> 15:58:31,796
There's one Friends\nup here and that's it.
16745
15:58:31,796 --> 15:58:33,171
I'm back up at my prompt already.
16746
15:58:33,171 --> 15:58:35,612
So we seem now to be\nfiltering out duplicates.
16747
15:58:35,612 --> 15:58:38,923
Now, before we dive in further and\n
16748
15:58:38,923 --> 15:58:40,131
what else could we have done?
16749
15:58:40,131 --> 15:58:42,441
Well, it turns out that\nin Python, too, you often
16750
15:58:42,442 --> 15:58:44,692
do get a lot of functionality\nbuilt into the language.
16751
15:58:44,692 --> 15:58:47,655
And I'm kind of implementing\nmyself the idea of a set.
16752
15:58:47,654 --> 15:58:49,822
If you think back to\nmathematics, a set is typically
16753
15:58:49,822 --> 15:58:53,811
something with a bunch of values\n
16754
15:58:53,811 --> 15:58:56,631
Recall that Python\nalready has this for us.
16755
15:58:56,631 --> 15:58:59,542
And we saw it really briefly\nwhen I whipped up the dictionary
16756
15:58:59,542 --> 15:59:01,531
implementation a couple of weeks back.
16757
15:59:01,531 --> 15:59:06,322
So I could actually define my titles\n
16758
15:59:06,322 --> 15:59:11,421
and this would just modestly allow\n
16759
15:59:11,421 --> 15:59:14,211
that I don't have to bother\nchecking for duplicates anyway.
16760
15:59:14,211 --> 15:59:18,051
I can instead just say\nsomething like, titles.add
16761
15:59:18,052 --> 15:59:21,149
the current title, like this.
16762
15:59:21,148 --> 15:59:24,231
Marginally better design if you know\n
16763
15:59:24,232 --> 15:59:26,095
getting more functionality out of this.
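A sketch of the same filtering with a set, under the same made-up inputs as before. The membership check disappears because a set silently ignores values it already contains.

```python
# Hypothetical inputs with stray whitespace and mixed capitalization.
raw_titles = ["Friends", " friends ", "FRIENDS", "the office"]

titles = set()
for raw in raw_titles:
    # add() has no effect if the value is already in the set.
    titles.add(raw.strip().upper())

print(len(titles))
```

One trade-off worth knowing: a set does not promise any particular order when you iterate over it.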
16764
15:59:26,095 --> 15:59:28,012
All right, so let's clean\nthe data up further.
16765
15:59:28,012 --> 15:59:31,492
We've now gone ahead and fixed\nthe problem of case sensitivity.
16766
15:59:31,491 --> 15:59:34,281
We threw away whitespace in case\nsomeone had hit the spacebar
16767
15:59:35,332 --> 15:59:39,302
Let's go ahead now and sort these\n
16768
15:59:39,302 --> 15:59:42,622
So instead of just printing out\nthe titles in the same order
16769
15:59:42,622 --> 15:59:47,484
you all inputted them, but filtering\n
16770
15:59:47,483 --> 15:59:49,941
and use another function in\nPython you might not have seen
16771
15:59:49,942 --> 15:59:52,281
which is literally\ncalled sorted, and will
16772
15:59:52,281 --> 15:59:57,241
take care of the process of\nactually sorting titles for you.
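As a quick illustration with a hypothetical set of titles, sorted accepts any iterable and hands back a new list in ascending order.

```python
# Hypothetical titles, already stripped and uppercased.
titles = {"THE OFFICE", "FRIENDS", "COMMUNITY"}

# sorted() leaves the set untouched and returns a sorted list.
ordered = sorted(titles)
for title in ordered:
    print(title)
```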
16773
15:59:57,241 --> 15:59:59,781
Let me go ahead and increase\nthe font size of my Terminal
16774
15:59:59,781 --> 16:00:01,911
run Python of favorites.py,\nand hit Enter.
16775
16:00:01,911 --> 16:00:05,751
And now you can really see how many of\n
16776
16:00:06,561 --> 16:00:08,932
Now it's a little easier\nto wrap our minds around
16777
16:00:08,932 --> 16:00:11,991
just because it's at least\nsorted alphabetically.
16778
16:00:11,991 --> 16:00:15,561
But now you can really see some of\n
16779
16:00:17,192 --> 16:00:21,742
But a few of you decided to stylize\n
16780
16:00:21,741 --> 16:00:24,862
Brooklyn 99 is a couple\nof different ways here.
16781
16:00:24,862 --> 16:00:28,042
And I think if we keep going we'll see\n
16782
16:00:28,042 --> 16:00:31,802
did not fix by focusing on\nwhitespace and capitalization alone.
16783
16:00:31,802 --> 16:00:35,212
So already here, this is only,\nwhat, 100 plus, 200 rows.
16784
16:00:35,211 --> 16:00:38,137
Already real-world data\nstarts to get messy quickly
16785
16:00:38,137 --> 16:00:40,012
and that might not bode\nwell when we actually
16786
16:00:40,012 --> 16:00:41,992
want to keep around real\ndata from real users.
16787
16:00:41,991 --> 16:00:44,366
You can imagine an actual\nwebsite or a mobile application
16788
16:00:44,366 --> 16:00:47,002
dealing with this kind\nof thing at scale.
16789
16:00:47,002 --> 16:00:48,421
Well, let's go ahead and do this.
16790
16:00:48,421 --> 16:00:51,711
Let's actually figure out the\npopularity of these various shows
16791
16:00:51,711 --> 16:00:57,021
by now iterating over my data, and\n
16792
16:00:58,341 --> 16:01:03,231
We're going to ignore the problems\n
16793
16:01:03,232 --> 16:01:07,732
Sorry, yeah, Avatar,\nwhere there were things
16794
16:01:07,732 --> 16:01:12,412
that were different beyond just\nwhitespace and capitalization.
16795
16:01:12,411 --> 16:01:14,181
But let's go ahead and\nkeep track of, now
16796
16:01:14,182 --> 16:01:17,631
how many of you inputted\neach of these titles.
16797
16:01:18,790 --> 16:01:21,082
I'm still going to take this\napproach of iterating over
16798
16:01:21,082 --> 16:01:23,452
the CSV file from top to bottom.
16799
16:01:23,451 --> 16:01:25,701
We've used a couple of\ndata structures thus far
16800
16:01:25,701 --> 16:01:29,601
a list to keep track of titles,\n
16801
16:01:29,601 --> 16:01:32,691
But what if I now want to keep\naround a little more information?
16802
16:01:32,692 --> 16:01:38,452
For each title, I want to keep around\n
16803
16:01:39,472 --> 16:01:43,222
I'm throwing away the total\nnumber of times I see these shows.
16804
16:01:43,222 --> 16:01:45,862
How could I start to keep that around?
16805
16:01:45,862 --> 16:01:47,186
AUDIENCE: Use a dictionary.
16806
16:01:47,186 --> 16:01:49,311
DAVID J. MALAN: We could\nuse a dictionary, and how?
16807
16:01:51,476 --> 16:01:53,434
DAVID J. MALAN: Perfect,\nreally good instincts.
16808
16:01:53,434 --> 16:01:55,191
Using a dictionary,\ninsofar as it lets us
16809
16:01:55,192 --> 16:01:58,641
store keys and values, that is,\n
16810
16:01:59,192 --> 16:02:01,822
This is why a dictionary\nor hash tables more
16811
16:02:01,822 --> 16:02:04,972
generally are such a useful,\npractical data structure.
16812
16:02:04,972 --> 16:02:08,461
Because they just let you remember\n
16813
16:02:08,461 --> 16:02:11,211
So if the keys are going\nto be the titles I've seen
16814
16:02:11,211 --> 16:02:15,201
the values could be the number of\n
16815
16:02:15,201 --> 16:02:19,649
And so it's kind of like just\n
16816
16:02:19,650 --> 16:02:22,192
For instance, if I were going\nto do this on a piece of paper
16817
16:02:22,192 --> 16:02:24,682
I might just have two\ncolumns here, where
16818
16:02:24,682 --> 16:02:29,932
maybe this is the title that I've\n
16819
16:02:29,932 --> 16:02:33,592
This is, in effect, a\ndictionary in Python.
16820
16:02:33,591 --> 16:02:36,831
It's two columns, keys on the\nleft, values on the right.
16821
16:02:36,832 --> 16:02:38,961
And this, if I can implement\nin code, will actually
16822
16:02:38,961 --> 16:02:42,921
allow me to store this data, and\n
16823
16:02:42,921 --> 16:02:44,792
to figure out which is the most popular.
16824
16:02:45,722 --> 16:02:49,582
Let me go ahead and change my titles\n
16825
16:02:49,582 --> 16:02:54,502
Let's have it be a dictionary instead,\n
16826
16:02:54,502 --> 16:02:58,942
two curly braces that are empty gives\n
16827
16:03:00,201 --> 16:03:02,781
I think most of my\ncode can stay the same.
16828
16:03:02,781 --> 16:03:06,201
But down here, I don't want\nto just blindly add titles
16829
16:03:07,521 --> 16:03:10,401
I somehow need to keep\ntrack of the count.
16830
16:03:10,402 --> 16:03:14,031
And unfortunately, if I just\ndo this-- let's do titles
16831
16:03:14,031 --> 16:03:18,921
bracket, title, plus equals 1.
16832
16:03:18,921 --> 16:03:21,472
This is a reasonable\nfirst attempt at this.
16833
16:03:22,762 --> 16:03:28,072
If titles is a dictionary and I want\n
16834
16:03:28,072 --> 16:03:30,862
the syntax for that, like before,\nis titles, bracket, and then
16835
16:03:30,862 --> 16:03:34,222
the key you want to use to\nindex into the dictionary.
16836
16:03:34,222 --> 16:03:37,342
It's not a number in this case,\nit's an actual word, a title.
16837
16:03:37,341 --> 16:03:39,561
And you're just going\nto increment it by one
16838
16:03:39,561 --> 16:03:42,351
and then eventually I'll come\nback and finish my second loop
16839
16:03:42,351 --> 16:03:45,051
and do things in terms of the order.
16840
16:03:45,052 --> 16:03:48,922
But for now, let's just keep\ntrack of the total counts.
16841
16:03:48,921 --> 16:03:51,002
Let me go ahead and\nincrease my Terminal window.
16842
16:03:51,002 --> 16:03:54,932
Let me do Python of\nfavorites.py and hit Enter.
16843
16:03:55,432 --> 16:03:59,482
How I Met Your Mother is\ngiving me a key error.
16844
16:04:04,461 --> 16:04:07,671
And in fact, just to give a\nlittle bit of a breadcrumb here
16845
16:04:09,482 --> 16:04:12,262
Let me open up the CSV\nfile again real quickly.
16846
16:04:12,262 --> 16:04:15,802
And wow, we didn't even get\npast the second row in the file
16847
16:04:15,802 --> 16:04:17,552
or the first show in the file.
16848
16:04:17,552 --> 16:04:20,182
Notice that How I Met Your\nMother, somewhat lowercased
16849
16:04:20,182 --> 16:04:22,904
is the very first show therein.
16850
16:04:22,904 --> 16:04:24,862
What's your instinct for\nwhy this is happening?
16851
16:04:24,862 --> 16:04:27,092
AUDIENCE: You don't\nhave a starting point.
16852
16:04:27,091 --> 16:04:29,008
DAVID J. MALAN: I don't\nhave a starting point.
16853
16:04:30,262 --> 16:04:35,182
I'm blindly indexing into the dictionary\n
16854
16:04:35,182 --> 16:04:37,222
that doesn't yet exist\nin the dictionary.
16855
16:04:37,222 --> 16:04:39,952
And so Python throws\nwhat's called a key error
16856
16:04:39,951 --> 16:04:42,781
because the key you're trying\nto use just doesn't exist yet.
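The failure can be reproduced in a few lines. This sketch catches the exception only so that the example keeps running; the lecture's program simply crashed at this point.

```python
titles = {}

try:
    # += has to read the current value first, and there is none yet.
    titles["HOW I MET YOUR MOTHER"] += 1
except KeyError as error:
    missing_key = error.args[0]
    print("KeyError:", missing_key)
```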
16857
16:04:42,781 --> 16:04:46,341
So logically, how could we fix this?
16858
16:04:47,031 --> 16:04:50,451
We got half of the problem solved,\n
16859
16:04:50,451 --> 16:04:52,191
case of nothing being there.
16860
16:04:52,936 --> 16:04:54,271
AUDIENCE: Creating a counter.
16861
16:04:54,271 --> 16:04:54,961
DAVID J. MALAN: Creating a--
16862
16:04:55,711 --> 16:04:57,582
DAVID J. MALAN: Creating\nthe counter itself.
16863
16:04:57,582 --> 16:04:59,531
So maybe I could do something like this.
16864
16:04:59,531 --> 16:05:03,722
Let me close my Terminal window\nand let me ask a question first.
16865
16:05:03,722 --> 16:05:10,322
If the current title is in the\n
16866
16:05:10,322 --> 16:05:12,781
that's going to give me a\ntrue-false answer it turns out.
16867
16:05:12,781 --> 16:05:17,582
Then, I can safely say, titles,\nbracket, title, plus equals 1.
16868
16:05:17,582 --> 16:05:22,082
And recall, this is just shorthand\n
16869
16:05:25,510 --> 16:05:28,052
That's the same thing as this\nbut it's a little more succinct
16870
16:05:30,152 --> 16:05:34,832
Else, if it's logically not the case\n
16871
16:05:34,832 --> 16:05:38,952
dictionary, then I probably want to\n
16872
16:05:38,951 --> 16:05:40,656
Feel free to just shout it out.
16873
16:05:42,156 --> 16:05:46,661
I just have to put some value there\n
16874
16:05:47,161 --> 16:05:49,711
So now that I've got this\ngoing on, let me go ahead
16875
16:05:49,711 --> 16:05:51,961
and undo my sorting temporarily.
16876
16:05:51,961 --> 16:05:54,991
And now let me go ahead and do this.
16877
16:05:54,991 --> 16:05:58,801
I can, as a quick check, let me\ngo ahead and just run the code
16878
16:05:58,802 --> 16:06:00,391
as is, Python of favorites.py.
16879
16:06:02,372 --> 16:06:05,132
It's printing correctly, no key\nerrors, but it's not sorted.
16880
16:06:05,131 --> 16:06:06,961
And I'm not seeing any of the counts.
16881
16:06:06,961 --> 16:06:09,122
Let me just quickly add\nthe counts, and there's
16882
16:06:09,122 --> 16:06:10,872
a couple of ways I could do this.
16883
16:06:10,872 --> 16:06:18,242
I could, say, print out the title, and\n
16884
16:06:18,241 --> 16:06:22,561
how about just, comma,\ntitles, bracket, title?
16885
16:06:22,561 --> 16:06:24,362
So I'm going to print\ntwo things at once
16886
16:06:24,362 --> 16:06:26,942
both the current title\nin the dictionary
16887
16:06:26,942 --> 16:06:29,641
and whatever its value\nis by indexing into it.
16888
16:06:29,641 --> 16:06:31,481
Let me increase my Terminal window.
16889
16:06:31,482 --> 16:06:35,762
Let me run Python of\nfavorites.py, Enter, and OK.
16890
16:06:39,421 --> 16:06:42,902
None of you said a whole\nlot of TV shows, it seems.
16891
16:06:42,902 --> 16:06:47,031
What's the logical error here?
16892
16:06:47,031 --> 16:06:50,801
What did I do wrong if I\nlook back at my code here?
16893
16:06:55,832 --> 16:07:00,482
To summarize, I initialized the\n
16894
16:07:00,482 --> 16:07:03,842
but I should have initialized it at\n
16895
16:07:03,841 --> 16:07:05,561
Or I should change my code a bit.
16896
16:07:05,561 --> 16:07:08,222
So for instance, if I go back\nin here, the simplest fix
16897
16:07:08,222 --> 16:07:11,732
is probably to initialize to 1,\n
16898
16:07:11,732 --> 16:07:14,552
obviously, I'm seeing this\ntitle for the very first time.
16899
16:07:14,552 --> 16:07:16,922
Or I could change my logic a little bit.
16900
16:07:16,921 --> 16:07:18,811
I could do something like this instead.
16901
16:07:18,811 --> 16:07:24,182
If the current title is not in titles,\n
16902
16:07:24,182 --> 16:07:28,201
And then I could get rid of\nthe else, and now blindly index
16903
16:07:30,241 --> 16:07:34,441
Because now, on line 11, I\ncan trust that lines 9 and 10
16904
16:07:34,442 --> 16:07:37,382
took care of the initialization\nfor me if need be.
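A sketch of the second version of the fix, with made-up inputs. The line numbers mentioned above refer to the lecture's own file, not to this snippet.

```python
# Hypothetical inputs with stray whitespace and mixed capitalization.
raw_titles = ["Friends", " friends ", "The Office", "FRIENDS"]

titles = {}
for raw in raw_titles:
    title = raw.strip().upper()
    # Make sure the key exists before incrementing it.
    if title not in titles:
        titles[title] = 0
    titles[title] += 1

for title in titles:
    print(title, titles[title])
```

Starting at 0 and then always incrementing gives the same counts as starting at 1 inside an else branch.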
16905
16:07:38,911 --> 16:07:42,491
This one's a little nicer, maybe\nbecause it's one line fewer.
16906
16:07:42,491 --> 16:07:45,811
But I think both approaches are\n
16907
16:07:45,811 --> 16:07:47,761
But the key thing, no\npun intended, is that we
16908
16:07:47,762 --> 16:07:52,442
have to make sure the key exists\n
16909
16:07:59,741 --> 16:08:03,031
So otherwise, everyone would have\n
16910
16:08:03,031 --> 16:08:04,531
how many people said the same thing.
16911
16:08:04,531 --> 16:08:06,578
Now the code is as it should be.
16912
16:08:06,578 --> 16:08:08,911
So let me go ahead and open\nup my Terminal window again.
16913
16:08:08,911 --> 16:08:13,051
Let me run Python of favorites.py,\n
16914
16:08:13,052 --> 16:08:14,492
Some shows weren't that popular.
16915
16:08:14,491 --> 16:08:16,171
There's just 1s and maybe 2s.
16916
16:08:16,171 --> 16:08:21,911
But I bet if we sort these things we\n
16917
16:08:23,232 --> 16:08:29,862
Well, turns out, when dealing\nwith a dictionary like this--
16918
16:08:29,862 --> 16:08:32,502
let's go ahead and just\nsort the titles themselves.
16919
16:08:32,502 --> 16:08:37,472
So let's reintroduce the sorted function\n
16920
16:08:37,472 --> 16:08:40,077
Let me go ahead now and\nrun Python of favorites.py.
16921
16:08:40,076 --> 16:08:42,451
Now it's just a little easier\nto wrap your mind around it
16922
16:08:42,451 --> 16:08:43,909
because at least it's alphabetical.
16923
16:08:43,910 --> 16:08:47,942
But it's not sorted by\nvalue, it's sorted by key.
16924
16:08:47,942 --> 16:08:51,512
But sure enough, if we scroll\ndown, there's something down here
16925
16:08:51,512 --> 16:08:54,544
for instance, like,\nlet's see, The Office.
16926
16:08:54,544 --> 16:08:56,252
That's definitely\ngoing to be a contender
16927
16:08:56,252 --> 16:08:58,201
for most popular, 15 responses.
16928
16:08:58,201 --> 16:09:01,201
But let's see what's actually\ngoing to bubble up to the top.
16929
16:09:01,201 --> 16:09:06,211
Unfortunately, the sorted function\n
16930
16:09:09,752 --> 16:09:12,841
But it turns out, in Python,\nif you read the documentation
16931
16:09:12,841 --> 16:09:14,851
for the sorted function,\nyou can actually
16932
16:09:14,851 --> 16:09:19,921
pass in other arguments that\ntell it how to sort things.
16933
16:09:19,921 --> 16:09:22,771
For instance, if I want to\ndo things in reverse order
16934
16:09:22,771 --> 16:09:27,481
I can add a second parameter to\n
16935
16:09:28,652 --> 16:09:30,961
You literally say,\nreverse equals true, so
16936
16:09:30,961 --> 16:09:34,171
that the position of it in the\n
16937
16:09:34,171 --> 16:09:37,103
If I now rerun this after\nincreasing my Terminal window
16938
16:09:37,103 --> 16:09:39,061
you'll see now that it's\nin the opposite order.
16939
16:09:39,061 --> 16:09:41,671
Now adventure and Anne\nwith an E is at the bottom
16940
16:09:41,671 --> 16:09:43,752
of the output instead of the top.
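A brief sketch of the reverse parameter, using hypothetical counts. Note that sorting a dictionary this way sorts its keys; the counts play no part yet.

```python
# Hypothetical counts for illustration only.
counts = {"FRIENDS": 2, "THE OFFICE": 3, "COMMUNITY": 1}

ascending = sorted(counts)
# reverse is a named parameter, so its position among the arguments is flexible.
descending = sorted(counts, reverse=True)

print(ascending)
print(descending)
```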
16941
16:09:43,752 --> 16:09:52,186
How can I tell it to sort\nby values instead of by key?
16942
16:09:52,186 --> 16:09:53,561
Well, let's go ahead and do this.
16943
16:09:53,561 --> 16:09:56,281
Let me go ahead and define a function.
16944
16:09:56,281 --> 16:09:58,411
I'm just going to call it\nf to keep things simple.
16945
16:09:58,411 --> 16:10:01,531
And this f function is going\nto take a title as input.
16946
16:10:01,531 --> 16:10:06,481
And given a given title, it's going\n
16947
16:10:06,482 --> 16:10:09,902
So actually, maybe a better name\nfor this would be get_value
16948
16:10:09,902 --> 16:10:12,162
and/or we could come up\nwith something else as well.
16949
16:10:12,161 --> 16:10:14,641
The purpose of the\nget_value function, to be clear
16950
16:10:14,641 --> 16:10:19,542
is to take as input a title and\n
16951
16:10:20,802 --> 16:10:23,102
Well, it turns out that the\nsorted function in Python
16952
16:10:23,101 --> 16:10:27,211
according to its documentation,\nalso takes a key parameter
16953
16:10:27,211 --> 16:10:31,201
where you can pass in, crazy\nenough, the name of a function
16954
16:10:31,201 --> 16:10:36,991
that it will use in order to determine\n
16955
16:10:36,991 --> 16:10:41,891
or by the value, or in other cases,\n
16956
16:10:41,891 --> 16:10:44,731
So there's a curiosity here,\nthough, that's very deliberate.
16957
16:10:44,732 --> 16:10:46,892
Key is the name of the\nparameter, just like reverse
16958
16:10:46,891 --> 16:10:48,434
was the name of this other parameter.
16959
16:10:48,434 --> 16:10:51,451
The value of it, though,\nis not a function call.
16960
16:10:52,921 --> 16:10:55,862
Notice I am not doing\nthis, no parentheses.
16961
16:10:55,862 --> 16:11:00,741
I'm instead passing in get value,\n
16962
16:11:00,741 --> 16:11:03,241
And this is a feature of Python\nand certain other languages.
16963
16:11:03,241 --> 16:11:06,451
Just like variables, you can\nactually pass whole functions
16964
16:11:06,451 --> 16:11:10,901
around so that they can be called\n
16965
16:11:10,902 --> 16:11:14,222
So what this means is that the\n
16966
16:11:14,222 --> 16:11:16,752
they didn't know what you're\ngoing to want to sort by today.
16967
16:11:16,752 --> 16:11:21,152
But if you provide them with a function\n
16968
16:11:21,152 --> 16:11:23,342
their sorted function\nwill use that function
16969
16:11:23,341 --> 16:11:27,181
to determine, OK, if you don't want to\n
16970
16:11:28,292 --> 16:11:31,141
This is going to tell\nit to sort by the value
16971
16:11:31,141 --> 16:11:34,091
by returning the specific\nvalue we care about.
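Sorting by value with a named function might look roughly like this. The dictionary contents are invented, and get_value is written with an underscore here since Python names cannot contain spaces.

```python
# Hypothetical tallies standing in for the counts built from the CSV.
counts = {"FRIENDS": 9, "THE OFFICE": 15, "COMMUNITY": 8}


def get_value(title):
    # Given a title (a key), return its count (the value).
    return counts[title]


# Pass the function itself, with no parentheses; sorted() calls it once
# per key and orders the keys by whatever it returns.
ranked = sorted(counts, key=get_value, reverse=True)
for title in ranked:
    print(title, counts[title])
```

Writing key=get_value() instead would call the function immediately, with no argument, and fail before sorted ever ran.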
16972
16:11:34,091 --> 16:11:37,921
So let me go ahead now and rerun this\n
16973
16:11:39,961 --> 16:11:42,451
Here we have now an example\nof all of the titles you all
16974
16:11:42,451 --> 16:11:47,521
typed in, albeit forced to uppercase\n
16975
16:11:47,521 --> 16:11:50,074
And now, The Office is\nan easy win over Friends
16976
16:11:50,074 --> 16:11:52,741
versus Community, versus Game of\nThrones, Breaking Bad, and then
16977
16:11:52,741 --> 16:11:55,081
a lot of variants thereafter.
16978
16:11:55,082 --> 16:11:57,022
So there's a lot of steps to go through.
16979
16:11:57,021 --> 16:11:58,896
This isn't that bad once\nyou've done it once
16980
16:11:58,896 --> 16:12:00,813
and you know what these\nfunctions are, and you
16981
16:12:00,813 --> 16:12:02,211
know that these parameters exist.
16982
16:12:03,351 --> 16:12:07,432
That's 17 lines of code\njust to analyze a CSV file
16983
16:12:07,432 --> 16:12:10,762
that you all created by way of\nthose Google Form submissions.
16984
16:12:10,762 --> 16:12:13,702
But it took me a lot of work just\n
16985
16:12:13,701 --> 16:12:15,618
And indeed, that's going\nto be among the goals
16986
16:12:15,618 --> 16:12:18,261
for today, ultimately, is, how\ncan we just make this easier?
16987
16:12:18,262 --> 16:12:20,137
It's one thing to learn\nnew things in Python
16988
16:12:20,137 --> 16:12:22,641
but if we can avoid writing\ncode, or this much code
16989
16:12:22,641 --> 16:12:24,182
that's going to be a good thing.
16990
16:12:24,182 --> 16:12:26,362
And so one other technique\nwe can introduce here
16991
16:12:26,362 --> 16:12:28,912
that does allow us to\nwrite a little less code
16992
16:12:28,911 --> 16:12:31,072
is, we can actually get\nrid of this function.
16993
16:12:31,072 --> 16:12:34,582
It turns out, in Python, if you\njust need to make a function
16994
16:12:34,582 --> 16:12:37,281
but it's going to be used and\nthen essentially thrown away
16995
16:12:37,281 --> 16:12:40,131
it's not something you're going\n
16996
16:12:40,131 --> 16:12:42,891
it's not like a library function\nthat you want to keep around--
16997
16:12:42,891 --> 16:12:45,021
you can actually just do this.
16998
16:12:45,021 --> 16:12:48,771
You can change the value\nof this key parameter
16999
16:12:48,771 --> 16:12:51,291
to be what's called a\nlambda function, which
17000
16:12:51,292 --> 16:12:54,381
is a fancy way of saying a function\n
17001
16:12:57,741 --> 16:13:00,871
Well, it's kind of stupid that\nI invented this name on line 13.
17002
16:13:00,872 --> 16:13:04,012
I used it on line 16, and\nthen I never again used it.
17003
16:13:04,012 --> 16:13:07,862
If it's only being used in one place,\n
17004
16:13:07,862 --> 16:13:10,342
So if you instead, in\nPython, say lambda
17005
16:13:10,341 --> 16:13:13,191
and then type out the\nname of the parameter
17006
16:13:13,192 --> 16:13:15,472
you want this anonymous\nfunction to take
17007
16:13:15,472 --> 16:13:19,802
you can then say, go ahead\nand return this value.
17008
16:13:19,802 --> 16:13:22,372
Now let's notice the\ninconsistencies here.
17009
16:13:22,372 --> 16:13:25,192
When you use this special lambda\nkeyword that says, hey Python
17010
16:13:25,192 --> 16:13:28,192
give me an anonymous function,\na function with no name
17011
16:13:28,192 --> 16:13:31,952
it then says, Python, this anonymous\n
17012
16:13:31,951 --> 16:13:34,040
Notice there's no parentheses.
17013
16:13:34,040 --> 16:13:35,841
And that's deliberate, if confusing.
17014
16:13:35,841 --> 16:13:38,250
It just tightens things up a little bit.
17015
16:13:38,250 --> 16:13:42,111
Notice that there's no return keyword,\n
17016
16:13:42,112 --> 16:13:44,122
up a bit, albeit inconsistently.
17017
16:13:44,122 --> 16:13:47,992
But this line of code\nI've just highlighted
17018
16:13:47,991 --> 16:13:51,771
is actually identical in\nfunctionality to this.
17019
16:13:51,771 --> 16:13:53,662
But it throws away the word def.
17020
16:13:53,661 --> 16:13:55,191
It throws away the word get value.
17021
16:13:55,192 --> 16:13:58,852
It throws away the parentheses, and\n
17022
16:14:00,021 --> 16:14:02,631
And it's well suited\nfor a problem like this
17023
16:14:02,631 --> 16:14:05,301
where I just want to pass in\na tiny little function that
17024
16:14:06,432 --> 16:14:08,182
But it's not something\nI'm going to reuse.
17025
16:14:08,182 --> 16:14:10,442
It doesn't need multiple\nlines to take up space.
17026
16:14:10,442 --> 16:14:12,501
It's just a nice, elegant one liner.
17027
16:14:12,500 --> 16:14:14,181
That's all a lambda function does.
17028
16:14:14,182 --> 16:14:17,251
It allows you to create an anonymous\n
17029
16:14:17,250 --> 16:14:22,551
And then the function you're passing it\n
17030
16:14:22,552 --> 16:14:26,152
Indeed, if I run Python of favorites.py\n
17031
16:14:26,152 --> 16:14:28,141
the result is exactly the same.
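To make the equivalence concrete, here is a small sketch that sorts the same made-up dictionary twice, once with a named function and once with a lambda. The data and variable names are assumptions for illustration only.

```python
# Hypothetical tallies; only the shape of the data matters here.
counts = {"FRIENDS": 9, "THE OFFICE": 15, "COMMUNITY": 8}


def get_value(title):
    return counts[title]


# Named function passed as the key.
by_name = sorted(counts, key=get_value, reverse=True)

# Same logic as an anonymous function: the parameter goes before the
# colon, the returned expression after it; no def, no name, no return.
by_lambda = sorted(counts, key=lambda title: counts[title], reverse=True)

print(by_name == by_lambda)
```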
17032
16:14:28,141 --> 16:14:31,701
And we see at the bottom here\nall of those small results.
17033
16:14:31,701 --> 16:14:36,151
Are there any questions, then, on\nthis syntax, on these ideas?
17034
16:14:36,152 --> 16:14:39,112
The goal here has been to write\n
17035
16:14:39,112 --> 16:14:44,201
to analyze or clean up data like this.
17036
16:14:48,614 --> 16:14:51,781
DAVID J. MALAN: Could you use the lambda\n
17037
16:14:51,781 --> 16:14:54,601
It's really meant for one\nline of code, generally.
17038
16:14:54,601 --> 16:14:56,762
So you don't use the return keyword.
17039
16:14:56,762 --> 16:14:59,222
You just say what it\nis you want to return.
17040
16:15:03,021 --> 16:15:04,271
DAVID J. MALAN: Good question.
17041
16:15:04,271 --> 16:15:06,301
Could you do more in\nthat one line if it's
17042
16:15:06,302 --> 16:15:08,012
got to be a more involved algorithm?
17043
16:15:08,012 --> 16:15:11,162
Yes, but you would just ultimately\nreturn the value in question.
17044
16:15:11,161 --> 16:15:13,288
In short, if it's getting\nat all sophisticated
17045
16:15:13,288 --> 16:15:15,121
you don't use the lambda\nfunction in Python.
17046
16:15:15,122 --> 16:15:17,852
You go ahead and actually\njust define a name for it
17047
16:15:17,851 --> 16:15:19,574
even if it's a one-off name.
17048
16:15:19,574 --> 16:15:21,991
JavaScript, another language\nwe'll look at in a few weeks
17049
16:15:21,991 --> 16:15:24,932
makes heavier use, I dare\nsay, of lambda functions.
17050
16:15:24,932 --> 16:15:27,372
And those can actually be\nmultiple, multiple lines
17051
16:15:27,372 --> 16:15:30,862
but Python does not\nsupport that instinct.
17052
16:15:31,362 --> 16:15:33,069
So let's go ahead and\ndo one other thing.
17053
16:15:33,069 --> 16:15:35,682
Office was clearly popping out\nof the code here quite a bit.
17054
16:15:35,682 --> 16:15:38,101
Let's go ahead and write a\nslightly different program
17055
16:15:38,101 --> 16:15:40,741
that maybe just focuses on\nThe Office for the moment
17056
16:15:42,241 --> 16:15:46,591
So let me go ahead and throw most of\n
17057
16:15:46,591 --> 16:15:48,421
when I'm inside of my inner loop.
17058
16:15:48,421 --> 16:15:51,391
And let me go ahead, and I don't\n
17059
16:15:51,391 --> 16:15:53,582
All I want to do is focus\non the current title.
17060
16:15:53,582 --> 16:15:56,072
How could I detect if\nsomeone likes The Office?
17061
16:15:56,072 --> 16:15:59,131
Well, I could say something like--
17062
16:16:01,652 --> 16:16:03,692
We'll just focus on The Office.
17063
16:16:03,692 --> 16:16:09,272
If title equals, equals The Office,\n
17064
16:16:13,741 --> 16:16:15,199
There's no dictionary involved now.
17065
16:16:15,199 --> 16:16:17,221
It's just a simple integer variable.
17066
16:16:17,222 --> 16:16:21,092
And then, down here\nI'll say something like
17067
16:16:21,091 --> 16:16:26,311
number of people who like The\nOffice is, whatever this value is.
17068
16:16:26,311 --> 16:16:29,191
And I'll put in counter in\ncurly braces, and then I'll
17069
16:16:29,192 --> 16:16:31,125
turn this whole thing into an f-string.
17070
16:16:31,125 --> 16:16:32,792
All right, let me go ahead and run this.
17071
16:16:32,792 --> 16:16:35,442
Python of favorites.py, Enter.
17072
16:16:35,442 --> 16:16:37,952
Number of people who\nlike The Office is 15.
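A sketch of the counting program described here. To keep it self-contained it reads from an in-memory stand-in rather than the real favorites.csv, and the "title" column name and the sample rows are assumptions.

```python
import csv
import io

# Stand-in for favorites.csv; the column name and rows are made up.
data = io.StringIO("title\nThe Office\nFriends\n the office \nOffice\n")

counter = 0
for row in csv.DictReader(data):
    # Normalize whitespace and capitalization before comparing.
    title = row["title"].strip().upper()
    if title == "THE OFFICE":
        counter += 1

print(f"Number of people who like The Office is {counter}")
```

With an exact comparison like this, a row that says only Office is not counted, which is the weakness the next few minutes explore.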
17073
16:16:39,332 --> 16:16:42,872
But let's go ahead now and\ndeliberately muddy the data a bit.
17074
16:16:42,872 --> 16:16:46,502
All of you were very nice in\nthat you typed in The Office.
17075
16:16:46,502 --> 16:16:48,572
But you can imagine\nsomeone just typing Office
17076
16:16:48,572 --> 16:16:51,033
for instance, maybe there, maybe there.
17077
16:16:51,033 --> 16:16:53,491
And many people might just\nwrite Office, you could imagine.
17078
16:16:53,491 --> 16:16:55,741
Didn't happen here, but\nsuppose it did, and probably
17079
16:16:55,741 --> 16:16:58,631
would have if we had even more\nand more submissions over time.
17080
16:16:58,631 --> 16:17:02,341
Now let's go ahead and rerun this\n
17081
16:17:02,341 --> 16:17:04,471
Now only 13 people like The Office.
17082
16:17:05,491 --> 16:17:11,131
The data is now as I mutated it to have\n
17083
16:17:11,131 --> 16:17:16,391
How could I change my Python code to\n
17084
16:17:16,391 --> 16:17:20,731
What could I change up here in\norder to improve this situation?
17085
16:17:24,091 --> 16:17:27,641
AUDIENCE: You write\nthe title [INAUDIBLE].
17086
16:17:27,641 --> 16:17:30,391
DAVID J. MALAN: Yeah, so I could\n
17087
16:17:30,391 --> 16:17:34,711
If title equals The Office,\nor title equals, equals just
17088
16:17:35,779 --> 16:17:38,072
And I still don't have to\nworry about capitalization.
17089
16:17:38,072 --> 16:17:41,154
I don't have to worry about spaces\n
17090
16:17:41,154 --> 16:17:43,542
Now I can go ahead and rerun this code.
17091
16:17:43,542 --> 16:17:45,332
Let me go run it a third time.
17092
16:17:50,552 --> 16:17:54,002
You could imagine this\nnot scaling very well.
17093
16:17:54,002 --> 16:17:57,105
Avatar had three different\n
17094
16:17:57,105 --> 16:17:59,522
if we dug deeper that there\nmight have been more variants.
17095
16:17:59,521 --> 16:18:01,771
Could we do something a\nlittle more general purpose?
17096
16:18:01,771 --> 16:18:03,572
Well, we could do something like this.
17097
16:18:07,224 --> 16:18:09,391
this is kind of a cool thing\nyou can do with Python.
17098
16:18:09,391 --> 16:18:12,271
It's very English-like, just ask\nthe question, albeit tersely.
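The English-like test being described is presumably Python's in operator on strings. A minimal sketch, with an invented list of already-uppercased titles:

```python
# Hypothetical titles, already stripped and forced to uppercase.
titles = ["THE OFFICE", "OFFICE", "THE V OFFICE", "FRIENDS"]

counter = 0
for title in titles:
    # True whenever the substring appears anywhere in the title.
    if "OFFICE" in title:
        counter += 1

print(counter)
```

A substring test is generous: it counts any title that merely contains the word, including variants nobody anticipated.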
17099
16:18:12,271 --> 16:18:16,002
This, interestingly, just\ngot me into trouble.
17100
16:18:16,002 --> 16:18:18,482
Now, all of a sudden, we're up to 16.
17101
16:18:18,482 --> 16:18:21,594
Does anyone know what the other one is?
17102
16:18:21,593 --> 16:18:23,439
AUDIENCE: Someone put V Office.
17103
16:18:23,440 --> 16:18:24,607
DAVID J. MALAN: What Office?
17104
16:18:24,607 --> 16:18:27,462
AUDIENCE: Someone entered\na V Office, [INAUDIBLE].
17105
16:18:29,858 --> 16:18:31,191
DAVID J. MALAN: Oh, interesting.
17106
16:18:42,472 --> 16:18:44,811
OK, this one's actually going\nto be hard to correct for.
17107
16:18:44,811 --> 16:18:46,641
I can't really think of a general--
17108
16:18:46,641 --> 16:18:51,201
well, this is actually a good\nexample of how data gets messy fast.
17109
16:18:51,201 --> 16:18:53,421
And you could imagine doing\nsomething where, OK, we
17110
16:18:53,421 --> 16:18:58,261
could have like 26 conditions if someone\n
17111
16:18:58,762 --> 16:18:59,992
You could imagine doing that.
17112
16:18:59,991 --> 16:19:02,741
But then there's surely going to\n
17113
16:19:02,741 --> 16:19:04,951
So that's actually a hard one to fix.
17114
16:19:04,951 --> 16:19:10,072
But it turns out we got lucky and now\n
17115
16:19:10,072 --> 16:19:12,002
But the data is itself messy.
17116
16:19:12,002 --> 16:19:15,292
Let me show another way that just\n
17117
16:19:15,292 --> 16:19:20,092
It turns out that there's this feature\n
17118
16:19:20,091 --> 16:19:22,381
among them, called regular expressions.
17119
16:19:22,381 --> 16:19:24,381
And this is actually a\nreally powerful technique
17120
16:19:24,381 --> 16:19:26,214
that we'll just scratch\nthe surface of here.
17121
16:19:26,214 --> 16:19:29,631
But it's going to be really useful,\n
17122
16:19:29,631 --> 16:19:34,221
in web programming, any time you want\n
17123
16:19:34,222 --> 16:19:37,252
And actually, just to make\nthis clear, give me a moment
17124
16:19:37,252 --> 16:19:39,502
before I switch screens here.
17125
16:19:39,502 --> 16:19:43,792
And let me open up a\nGoogle Form from scratch.
17126
16:19:43,792 --> 16:19:47,572
Give me just a moment to\ncreate something real quick.
17127
16:19:47,572 --> 16:19:50,601
If you've never noticed this\nbefore when creating a Google Form
17128
16:19:53,752 --> 16:19:55,701
And if you want the user\nto type in something
17129
16:19:55,701 --> 16:19:58,375
very specific as a short\ntext answer like this
17130
16:19:58,375 --> 16:20:01,042
you might know that there's toggles\nlike this in Google's world
17131
16:20:02,271 --> 16:20:04,611
Or you can do response validation.
17132
16:20:04,612 --> 16:20:07,012
You could say, what's your email?
17133
16:20:07,012 --> 16:20:12,592
And then you could say something\nlike, text is an email.
17134
16:20:12,591 --> 16:20:17,871
So here's an example in Google Forms\n
17135
16:20:17,872 --> 16:20:22,492
But a feature most of you have probably\n
17136
16:20:22,491 --> 16:20:24,831
is this thing called a\nregular expression, where
17137
16:20:24,832 --> 16:20:26,781
you can actually define a pattern.
17138
16:20:26,781 --> 16:20:30,171
And I could actually reimplement that\n
17139
16:20:30,171 --> 16:20:36,411
I can say, let the user type in anything\n
17140
16:20:36,411 --> 16:20:41,941
then something else, then a\nliteral period, then, for instance
17141
16:20:43,021 --> 16:20:45,291
So it's very cryptic,\nadmittedly, at first glance.
17142
16:20:45,292 --> 16:20:48,772
But this means any\ncharacter 0 or more times.
17143
16:20:48,771 --> 16:20:51,502
This means any character 0 or more times.
17144
16:20:51,502 --> 16:20:54,067
This means a literal\nperiod, because apparently
17145
16:20:54,067 --> 16:20:57,502
dot means any character in\nthe context of these patterns.
17146
16:20:57,502 --> 16:21:01,412
Then this thing means any\ncharacter 0 or more times.
17147
16:21:01,411 --> 16:21:04,011
So I should actually be\na little more nitpicky.
17148
16:21:04,012 --> 16:21:06,872
You don't want 0 or more times,\nyou want 1 or more times.
17149
16:21:06,872 --> 16:21:10,682
So this with the plus means\nany character 1 or more times.
17150
16:21:10,682 --> 16:21:12,362
So there has to be something there.
17151
16:21:12,362 --> 16:21:16,972
And I think I want the same thing\n
17152
16:21:16,972 --> 16:21:21,502
Or heck, if I want to restrict this\n
17153
16:21:21,502 --> 16:21:24,862
I could change that last\nthing to literally .edu.
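The same idea can be tried in Python's re module. This is only a sketch of the pattern as described aloud: the exact text typed into the form is not shown in the transcript, and the ^ and $ anchors and the example addresses are my own additions.

```python
import re

# One or more of any character, "@", one or more of any character,
# then a literal period (escaped with a backslash) followed by "edu".
pattern = r"^.+@.+\.edu$"

for candidate in ["student@example.edu", "student@example.com", "@.edu"]:
    # re.search returns a match object or None; bool() makes that readable.
    print(candidate, bool(re.search(pattern, candidate)))
```

The backslash matters: without it, the dot would mean any character rather than a literal period.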
17154
16:21:24,862 --> 16:21:26,912
And so long story short,\neven though this looks
17155
16:21:26,911 --> 16:21:31,761
I'm sure, pretty cryptic, there's\n
17156
16:21:31,762 --> 16:21:35,242
and JavaScript, and Java, and other\n
17157
16:21:35,241 --> 16:21:37,771
patterns in a standardized way.
17158
16:21:37,771 --> 16:21:41,271
And this pattern is actually something\n
17159
16:21:41,271 --> 16:21:43,491
And let me switch back to\nPython for a second just
17160
16:21:43,491 --> 16:21:45,261
to do the same kind of idea.
17161
16:21:45,262 --> 16:21:48,292
Let me toggle back to my code here.
17162
16:21:48,292 --> 16:21:52,072
Let me put up, for instance, a\nsummary of what it is you can do.
17163
16:21:52,072 --> 16:21:58,372
And here's just a quick summary\n
17164
16:21:58,372 --> 16:22:04,672
A period may represent any character.\n
17165
16:22:05,311 --> 16:22:08,362
So the dot means anything,\nso it can be A or nothing.
17166
16:22:09,502 --> 16:22:14,872
It can be A, B, A, B, C. It can be any\n
17167
16:22:14,872 --> 16:22:18,202
Change that to a plus and you now\n
17168
16:22:18,201 --> 16:22:21,201
Question mark means\nsomething is optional.
17169
16:22:21,201 --> 16:22:24,711
Caret symbol means start matching at\n
17170
16:22:24,711 --> 16:22:30,442
Dollar sign means stop matching\nat the end of the user's input.
17171
16:22:30,442 --> 16:22:32,552
So we won't play with\nall of these just now.
17172
16:22:32,552 --> 16:22:36,812
But let me go over here and\nactually tackle this Office problem.
17173
16:22:36,811 --> 16:22:40,792
Let me go ahead and import a new library\n
17174
16:22:42,622 --> 16:22:45,902
And then, down here, let me say this.
17175
16:22:50,421 --> 16:22:55,551
Let's just search for Office, quote,\n
17176
16:22:55,552 --> 16:22:58,072
Then we're going to go ahead\nand increase the counter.
17177
16:22:58,072 --> 16:23:00,381
So it turns out that the\nregular expression library
17178
16:23:00,381 --> 16:23:04,432
has a function called search that\n
17179
16:23:04,432 --> 16:23:07,311
and then, as its second\nargument the string you
17180
16:23:07,311 --> 16:23:09,502
want to analyze for that pattern.
17181
16:23:09,502 --> 16:23:13,222
So it's sort of looking for a needle\n
17182
16:23:13,222 --> 16:23:17,421
Let me go ahead now and run this\nversion of the program, Enter.
17183
16:23:17,421 --> 16:23:21,591
And now I screwed up because I forgot\n
17184
16:23:24,491 --> 16:23:27,141
Number of people who\nlike The Office is now 0.
17185
16:23:28,311 --> 16:23:30,981
thank you-- big step backwards.
17186
16:23:36,951 --> 16:23:39,936
I forced all my input to uppercase,\n
17187
16:23:39,936 --> 16:23:41,811
So we'll come back to\nother approaches there.
17188
16:23:42,771 --> 16:23:45,141
OK, now we're back up to 16.
17189
16:23:45,141 --> 16:23:47,542
But I could even, let's say--
17190
16:23:47,542 --> 16:23:50,452
I could tolerate just The Office.
17191
16:23:50,451 --> 16:23:55,461
How about this, or how about\nsomething like, or The Office?
17192
16:23:57,714 --> 16:23:59,631
And let me use these\nother special characters.
17193
16:23:59,631 --> 16:24:02,721
This caret sign means the\nbeginning of the string.
17194
16:24:02,722 --> 16:24:06,082
This dollar sign weirdly\nrepresents the end of the string.
17195
16:24:06,082 --> 16:24:09,832
I'm adding in some parentheses just\n
17196
16:24:11,811 --> 16:24:15,921
And this is saying start matching\n
17197
16:24:15,921 --> 16:24:20,002
Check if the beginning of the string is\n
17198
16:24:21,262 --> 16:24:23,882
And then, you better be\nat the end of the string.
17199
16:24:23,881 --> 16:24:26,991
So they can't keep typing words\nbefore or after that input.
17200
16:24:26,991 --> 16:24:29,031
Let me go ahead and rerun the program.
17201
16:24:29,031 --> 16:24:32,841
And now we're down to 15, which\nused to be our correct answer
17202
16:24:32,841 --> 16:24:36,111
but then we noticed The V Office.
17203
16:24:38,021 --> 16:24:41,541
It's going to be messier\nto deal with that.
17204
16:24:41,542 --> 16:24:46,897
How about if I tolerate any\ncharacter represented by dot
17205
16:24:48,982 --> 16:24:53,452
Now if I rerun it, now I really\nhave this expressive capability.
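Here is a rough comparison of three patterns along the lines of the ones discussed. The transcript does not show exactly what was typed, so both the patterns and the list of titles below are assumptions chosen to show how each refinement changes the count.

```python
import re

# Hypothetical, already-uppercased titles.
titles = ["THE OFFICE", "OFFICE", "THE V OFFICE", "THE-OFFICE", "FRIENDS"]


def count(pattern):
    # re.search(pattern, string): the pattern comes first, the text second.
    return sum(1 for title in titles if re.search(pattern, title))


print(count(r"OFFICE"))                 # matches anywhere in the string
print(count(r"^(OFFICE|THE OFFICE)$"))  # exactly one of two spellings
print(count(r"^(OFFICE|THE.OFFICE)$"))  # dot tolerates any one character
```

The caret and dollar sign pin the match to the whole string, so the unanchored pattern counts the most titles and the strict alternation counts the fewest.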
17206
16:24:53,451 --> 16:24:57,682
So this is only to say, there are so\n
17207
16:24:58,732 --> 16:25:01,292
And some of these tools are\nmore sophisticated than others.
17208
16:25:01,292 --> 16:25:04,298
This is one that you've actually\n
17209
16:25:04,298 --> 16:25:06,381
in the context of Google\nForms for years if you're
17210
16:25:06,381 --> 16:25:09,112
in the habit of creating these for\n
17211
16:25:09,112 --> 16:25:11,182
But it's now something\nyou can start to leverage.
17212
16:25:11,182 --> 16:25:14,781
And we're just scratching the surface\n
17213
16:25:14,781 --> 16:25:18,981
But let's now do one final example\n
17214
16:25:18,982 --> 16:25:20,872
And let's actually\nwrite a program that's
17215
16:25:20,872 --> 16:25:25,382
a little more general purpose that\n
17216
16:25:25,381 --> 16:25:27,112
and figure out its popularity.
17217
16:25:27,112 --> 16:25:29,752
So let me go ahead and simplify this.
17218
16:25:29,752 --> 16:25:31,972
Let's get rid of our\nregular expressions.
17219
16:25:31,972 --> 16:25:35,281
Let's go ahead and continue\ncapitalizing the title.
17220
16:25:36,921 --> 16:25:41,362
at the beginning of this program,\n
17221
16:25:42,722 --> 16:25:45,662
So title equals, let's\nask the user for input
17222
16:25:45,661 --> 16:25:48,831
which is essentially the same thing\n
17223
16:25:50,302 --> 16:25:53,512
And then whatever they type in,\n
17224
16:25:53,512 --> 16:25:56,242
and uppercase the thing again.
17225
16:25:56,241 --> 16:26:01,161
And now, inside of my loop, I\ncould say something like this.
17226
16:26:01,161 --> 16:26:08,001
If the current row's title after\n
17227
16:26:08,002 --> 16:26:12,262
it to uppercase, too, equals\nthe user's title, then, go ahead
17228
16:26:12,262 --> 16:26:14,781
and maybe increment a counter.
17229
16:26:14,781 --> 16:26:16,502
So I still need that counter back.
17230
16:26:16,502 --> 16:26:21,951
So let me go ahead and define this\n
17231
16:26:21,951 --> 16:26:24,061
And then, at the very\nend of this program
17232
16:26:24,061 --> 16:26:26,391
let me go ahead and print\nout just the popularity
17233
16:26:26,391 --> 16:26:28,381
of whatever the human typed in.
17234
16:26:28,381 --> 16:26:31,371
So again, the only difference is\n
17235
16:26:32,061 --> 16:26:34,491
I'm initializing my\ncounter to 0, then I'm
17236
16:26:34,491 --> 16:26:38,002
searching for their\ntitle in the CSV file
17237
16:26:38,002 --> 16:26:41,152
by doing the same massaging of the\n
17238
16:26:41,152 --> 16:26:43,912
and getting rid of the whitespace.
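A sketch of this more general program. In the lecture the title comes from input(); here it is passed as a parameter so the example runs without a prompt. The function name popularity, the "title" column, and the sample rows are all hypothetical.

```python
import csv
import io


def popularity(query, file):
    """Count rows whose cleaned-up title equals the cleaned-up query."""
    wanted = query.strip().upper()
    counter = 0
    for row in csv.DictReader(file):
        # Apply the same massaging to each row as to the user's input.
        if row["title"].strip().upper() == wanted:
            counter += 1
    return counter


# Stand-in for open("favorites.csv"); contents are made up.
sample = io.StringIO("title\nThe Office\nthe office \nOffice\nFriends\n")
print(popularity("the office", sample))
```

Because both sides are stripped and uppercased, the user can type the title in any capitalization, but a row that says just Office still does not match.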
17239
16:26:43,911 --> 16:26:47,121
So now, when I run Python\nof favorites.py, Enter
17240
16:26:47,122 --> 16:26:55,372
I could type in the office all lowercase\n
17241
16:27:02,042 --> 16:27:05,982
Because I'm the one that went in and\n
17242
16:27:05,982 --> 16:27:08,372
If we fixed those, we\nwould be back up to 15.
17243
16:27:08,372 --> 16:27:12,992
If we added support for The V\n
17244
16:27:12,991 --> 16:27:15,691
All right, any questions then\non these various manipulations?
17245
16:27:15,692 --> 16:27:17,525
And if you're feeling\nlike, oh, my god, this
17246
16:27:17,525 --> 16:27:20,442
is so much Python code just to do\n
17247
16:27:20,442 --> 16:27:22,502
And indeed, even though\nit's a powerful language
17248
16:27:22,502 --> 16:27:26,012
and can solve these kinds of problems,\n
17249
16:27:26,012 --> 16:27:28,812
just to ask a single question like this.
17250
16:27:28,811 --> 16:27:32,461
But any questions on how we did\n
17251
16:27:38,641 --> 16:27:40,141
Let's take a five-minute break here.
17252
16:27:40,141 --> 16:27:42,572
When we come back, we'll do it better.
17253
16:27:43,822 --> 16:27:45,951
And the rest of today\nis ultimately about, how
17254
16:27:45,951 --> 16:27:50,182
can we store, and manipulate,\nand change, and retrieve data
17255
16:27:50,182 --> 16:27:53,432
more efficiently than we might\nby just writing raw code?
17256
16:27:53,432 --> 16:27:56,781
This isn't to say that you shouldn't\n
17257
16:27:57,622 --> 16:28:02,362
And in fact, it might be super common\n
17258
16:28:02,362 --> 16:28:04,415
from users that you might\nwant to clean it up.
17259
16:28:04,415 --> 16:28:07,582
And maybe the best way to do that is\n
17260
16:28:07,582 --> 16:28:09,711
you can make all of the\nrequisite changes and fixes
17261
16:28:09,711 --> 16:28:12,864
like we did with The Office,\nfor instance, again and again
17262
16:28:12,864 --> 16:28:15,531
and reuse that code, especially\nif more and more submissions are
17263
16:28:16,491 --> 16:28:18,891
But another theme of\ntoday, ultimately, is
17264
16:28:18,891 --> 16:28:22,980
that sometimes there are different,\n
17265
16:28:22,980 --> 16:28:24,772
And in fact, now at\nthis point in the term
17266
16:28:24,771 --> 16:28:27,651
as we begin to introduce not\njust Python, but in a moment
17267
16:28:27,652 --> 16:28:31,461
a language called SQL, and next\n
17268
16:28:31,461 --> 16:28:34,491
and the week after that, synthesizing\n
17269
16:28:34,491 --> 16:28:37,761
together is to just kind\nof paint a picture of how
17270
16:28:37,762 --> 16:28:41,242
you might decide what the trade-offs are\n
17271
16:28:42,171 --> 16:28:45,112
Because undoubtedly you can\nsolve problems moving forward
17272
16:28:45,112 --> 16:28:48,002
in many different ways\nwith many different tools.
17273
16:28:48,002 --> 16:28:50,362
So let's give you another\ntool, one with which
17274
16:28:50,362 --> 16:28:53,512
you can implement a proper\nrelational database.
17275
16:28:53,512 --> 16:28:56,391
What we just saw in\nthe form of CSV files
17276
16:28:56,391 --> 16:28:59,152
are what we might call\nflat-file databases.
17277
16:28:59,152 --> 16:29:02,842
Again, just a very simple file, flat\n
17278
16:29:04,612 --> 16:29:09,622
And that is all ultimately\nstoring ASCII or Unicode text.
17279
16:29:09,622 --> 16:29:12,742
A relational database, though,\nis something that's actually
17280
16:29:12,741 --> 16:29:16,191
closer to a proper spreadsheet program.
17281
16:29:16,192 --> 16:29:18,781
A CSV is an individual\nsheet, if you will
17282
16:29:18,781 --> 16:29:20,601
from a spreadsheet when you export it.
17283
16:29:20,601 --> 16:29:22,801
If you had multiple\nsheets in a spreadsheet
17284
16:29:22,802 --> 16:29:24,937
you would have to export multiple CSVs.
17285
16:29:24,936 --> 16:29:26,811
And that gets annoying\nquickly in code if you
17286
16:29:26,811 --> 16:29:29,331
have to open up this CSV,\nthis CSV, all of which
17287
16:29:29,332 --> 16:29:32,421
represent different sheets or\ntabs in a proper spreadsheet.
17288
16:29:32,421 --> 16:29:36,862
A relational database is more\nlike a spreadsheet program
17289
16:29:36,862 --> 16:29:39,982
that you, a programmer,\nnow can interact with.
17290
16:29:41,482 --> 16:29:45,022
You can read data from it, and you\n
17291
16:29:45,021 --> 16:29:47,491
tables storing all of your data.
17292
16:29:47,491 --> 16:29:49,581
So whereas Excel, and Numbers,\nand Google Spreadsheets
17293
16:29:49,582 --> 16:29:52,432
are meant to be used really by humans\n
17294
16:29:52,432 --> 16:29:55,192
clicking, and pointing, and\nmanipulating things graphically
17295
16:29:55,192 --> 16:29:57,502
a relational database\nusing a language called
17296
16:29:57,502 --> 16:30:02,662
SQL is one in which the programmer\nhas similar capabilities
17297
16:30:04,341 --> 16:30:08,061
Specifically, using a language\ncalled SQL, and at a scale
17298
16:30:08,061 --> 16:30:11,011
that's much grander\nthan spreadsheets alone.
17299
16:30:11,012 --> 16:30:13,762
In fact, if you try on your Mac\n
17300
16:30:13,762 --> 16:30:16,432
got tens of thousands\nof rows, it'll probably
17301
16:30:16,432 --> 16:30:20,122
work fine, hundreds of thousands\n
17302
16:30:20,122 --> 16:30:22,342
At some point your Mac or\nPC is going to struggle
17303
16:30:22,341 --> 16:30:24,471
to open particularly large data sets.
17304
16:30:24,472 --> 16:30:26,961
And that, too, is where\nproper databases come
17305
16:30:26,961 --> 16:30:29,481
into play and proper\nlanguages for databases come
17306
16:30:29,482 --> 16:30:31,462
into play, when it's all about scale.
17307
16:30:31,461 --> 16:30:34,731
And indeed, most any mobile app or\n
17308
16:30:34,732 --> 16:30:38,762
might write should probably plan\n
17309
16:30:38,762 --> 16:30:41,072
So we need the right\ntools for that problem.
17310
16:30:41,072 --> 16:30:44,451
So fortunately, even though we're\n
17311
16:30:44,451 --> 16:30:49,701
it only does four things fundamentally,\n
17312
16:30:49,701 --> 16:30:53,211
SQL, this language for\ndatabases, supports the ability
17313
16:30:53,211 --> 16:30:57,741
to create data, read data,\nupdate data, and delete data.
17314
16:30:58,762 --> 16:31:02,031
There's a few more keywords that\n
17315
16:31:03,091 --> 16:31:04,799
But at the end of the\nday, even if you're
17316
16:31:04,800 --> 16:31:07,522
starting to feel like this\nis a lot very quickly
17317
16:31:07,521 --> 16:31:10,281
it all boils down to these\nfour basic operations.
17318
16:31:10,281 --> 16:31:12,981
And the four commands\nin SQL, if you will
17319
16:31:12,982 --> 16:31:17,122
functions in a sense that implement\n
17320
16:31:17,122 --> 16:31:19,612
They're almost the same but\nwith some slight variance.
17321
16:31:19,612 --> 16:31:24,622
The ability to create or insert data\n
17322
16:31:27,472 --> 16:31:30,389
Delete is the same, but drop\nis also a keyword as well.
17323
16:31:30,389 --> 16:31:32,182
So we'll see these and\na few other keywords
17324
16:31:32,182 --> 16:31:35,752
in SQL that, at the end of the day, just\n
17325
16:31:35,752 --> 16:31:39,652
data using verbs, if\nyou will, like these.
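As a preview, the four operations can be tried from Python using its built-in sqlite3 module against a throwaway in-memory database. The table name favorites, its columns, and the values are hypothetical; the SQL keywords are the standard ones.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # temporary database, nothing saved to disk

# Create: make a table, then insert a row into it.
db.execute("CREATE TABLE favorites (title TEXT, votes INTEGER)")
db.execute("INSERT INTO favorites (title, votes) VALUES (?, ?)",
           ("The Office", 15))

# Read: SELECT retrieves rows.
rows = db.execute("SELECT title, votes FROM favorites").fetchall()

# Update: change existing rows in place.
db.execute("UPDATE favorites SET votes = votes + 1 WHERE title = ?",
           ("The Office",))
updated = db.execute("SELECT votes FROM favorites").fetchone()[0]

# Delete: remove rows; DROP removes the whole table.
db.execute("DELETE FROM favorites WHERE title = ?", ("The Office",))
remaining = db.execute("SELECT COUNT(*) FROM favorites").fetchone()[0]
db.execute("DROP TABLE favorites")
db.close()

print(rows, updated, remaining)
```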
17326
16:31:39,652 --> 16:31:43,372
So to do that, what's\nthe syntax going to be?
17327
16:31:43,372 --> 16:31:45,632
Well, we won't get into the\nweeds too quickly on this.
17328
16:31:45,631 --> 16:31:47,991
But here's a representative\nsyntax of how
17329
16:31:47,991 --> 16:31:51,051
you can create using this\nlanguage called SQL, in your very
17330
16:31:51,052 --> 16:31:53,362
own database, a brand new table.
17331
16:31:53,362 --> 16:31:56,252
This is so easy in Excel, and Google\n
17332
16:31:56,252 --> 16:31:58,252
You want a new sheet, you\nclick the plus button.
17333
16:31:59,031 --> 16:32:00,832
You give it a name,\nand boom, you're done.
17334
16:32:00,832 --> 16:32:05,391
In the world of programming, though, if\n
17335
16:32:05,391 --> 16:32:08,781
spreadsheet in the computer's memory,\n
17336
16:32:08,781 --> 16:32:13,762
like a sheet, that has a name, and then\n
17337
16:32:13,762 --> 16:32:17,332
But unlike Google Spreadsheets,\nand Apple Numbers, and Excel
17338
16:32:17,332 --> 16:32:20,415
you have to decide as the\nprogrammer what types of data
17339
16:32:20,415 --> 16:32:22,582
you're going to be storing\nin each of these columns.
17340
16:32:22,582 --> 16:32:24,772
Now even though Excel,\nand Google Spreadsheets
17341
16:32:24,771 --> 16:32:28,651
and Numbers do allow you to format\n
17342
16:32:28,652 --> 16:32:33,022
it's not strongly typed data like it\n
17343
16:32:33,021 --> 16:32:35,541
And heck, even in Python\nthere's underlying data types.
17344
16:32:35,542 --> 16:32:37,500
Even if you don't have\nto type them explicitly
17345
16:32:37,500 --> 16:32:40,241
databases are going to want to\nknow, are you storing integers?
17346
16:32:40,241 --> 16:32:41,981
Are you storing real numbers or floats?
17347
16:32:43,482 --> 16:32:46,302
Because especially as your\ndata scales, the more hints
17348
16:32:46,302 --> 16:32:49,752
you give the database about your\n
17349
16:32:49,752 --> 16:32:52,841
the faster it can help you\nget at and store that data.
17350
16:32:52,841 --> 16:32:54,644
So types are about to\nbe important again
17351
16:32:54,644 --> 16:32:57,101
but there's not going to be\nthat many of them, fortunately.
17352
16:32:57,101 --> 16:32:59,981
Now how can I go about converting,\nfor instance, some real data
17353
16:32:59,982 --> 16:33:02,832
like that from you,\nmy favorites.csv file
17354
16:33:02,832 --> 16:33:04,781
into a proper relational database?
17355
16:33:04,781 --> 16:33:07,991
Well, it turns out that\nusing SQL I can do this
17356
16:33:07,991 --> 16:33:10,601
in VS Code on my own Mac,\nor PC, or in the cloud
17357
16:33:10,601 --> 16:33:13,796
here by just importing\nthe CSV into a database.
17358
16:33:13,796 --> 16:33:15,671
We'll see eventually\nhow to do this manually.
17359
16:33:15,671 --> 16:33:17,963
For now, I'm going to use\nmore of an automated process.
17360
16:33:17,963 --> 16:33:20,021
So let me go over to VS Code here.
17361
16:33:20,021 --> 16:33:22,511
Let me type ls to see\nwhere we left off before.
17362
16:33:22,512 --> 16:33:26,347
I had two files, favorites.csv, which\n
17363
16:33:26,347 --> 16:33:27,972
Recall that I made a couple of changes.
17364
16:33:27,972 --> 16:33:31,391
We deleted a couple of Thes\nfrom the file for The Office.
17365
16:33:31,391 --> 16:33:33,942
But this is the same file\nas before, and then we
17366
16:33:33,942 --> 16:33:36,552
have favorites.py, which\nwe'll set aside for now.
17367
16:33:36,552 --> 16:33:40,212
I'm going to go ahead now\nand run a command, SQLite3.
17368
16:33:40,211 --> 16:33:43,362
So in the world of\nrelational databases, there's
17369
16:33:43,362 --> 16:33:48,372
many different products out there,\nmany different software that
17370
16:33:48,372 --> 16:33:50,711
implements the SQL language.
17371
16:33:51,942 --> 16:33:55,422
There's something called MySQL\n
17372
16:33:55,421 --> 16:33:57,461
Facebook, for instance,\nused it early on.
17373
16:33:57,461 --> 16:34:00,372
PostgreSQL, Microsoft\nAccess, SQL Server, Oracle
17374
16:34:00,372 --> 16:34:02,300
and maybe a whole bunch\nof other product names
17375
16:34:02,300 --> 16:34:04,092
you might have encountered\nover time, which
17376
16:34:04,091 --> 16:34:08,322
is to say there's many different\ntypes of tools, and servers
17377
16:34:08,322 --> 16:34:10,332
and software in which you can use SQL.
17378
16:34:10,332 --> 16:34:13,122
We're going to use a very lightweight\n
17379
16:34:14,711 --> 16:34:17,021
This is the version of\nSQL that's generally
17380
16:34:17,021 --> 16:34:19,361
used on iPhones and\nAndroid devices these days.
17381
16:34:19,362 --> 16:34:22,272
If you download an app that stores\ndata like your own contacts
17382
16:34:22,271 --> 16:34:24,341
it's typically stored using SQLite.
17383
16:34:24,341 --> 16:34:28,051
Because it's fairly lightweight,\n
17384
16:34:28,052 --> 16:34:31,152
thousands, even tens of\nthousands of pieces of data
17385
16:34:31,152 --> 16:34:33,312
even using this lightweight\nversion thereof.
17386
16:34:33,311 --> 16:34:36,131
SQLite3 is like version 3 of this tool.
17387
16:34:36,131 --> 16:34:41,682
We're going to go ahead and run SQLite3\n
17388
16:34:41,682 --> 16:34:45,461
It's conventional in the world of\n
17389
16:34:45,461 --> 16:34:47,832
I'm going to create a\ndatabase called favorites.db.
17390
16:34:47,832 --> 16:34:52,351
Once I'm inside of the program, now I'm\n
17391
16:34:52,351 --> 16:34:54,101
Again, not something\nyou have to memorize
17392
16:34:54,101 --> 16:34:55,809
just something you\ncan look up as needed.
17393
16:34:55,809 --> 16:34:59,441
And then, I'm going to\nimport favorites.csv
17394
16:34:59,442 --> 16:35:05,602
into a table, that is, a sheet, if\n
17395
16:35:05,601 --> 16:35:09,371
Now I'm going to hit Enter and I'm\n
17396
16:35:10,991 --> 16:35:13,511
Now I have three files\nin my current directory--
17397
16:35:13,512 --> 16:35:17,472
the CSV file, the Python file\nfrom before, and now favorites.db.
17398
16:35:17,472 --> 16:35:21,522
But if I did this right, all of the\n
17399
16:35:21,521 --> 16:35:25,182
has now been loaded into a proper\ndatabase where I can now use
17400
16:35:25,182 --> 16:35:28,521
this SQL language to access it instead.
17401
16:35:28,521 --> 16:35:33,072
So let's go ahead again and run SQLite3\n
17402
16:35:33,072 --> 16:35:35,982
And now, at the SQLite\nprompt I can start
17403
16:35:35,982 --> 16:35:38,502
to play around and\nsee what this data is.
17404
16:35:38,502 --> 16:35:41,951
For instance, I can\nlook, by typing .schema
17405
16:35:41,951 --> 16:35:44,703
at what the schema is of\nmy data, what's the design.
17406
16:35:44,703 --> 16:35:47,411
Now no thought was put into the\n
17407
16:35:47,411 --> 16:35:49,241
because I automated the whole process.
17408
16:35:49,241 --> 16:35:52,091
Once we start creating\nour own databases we'll
17409
16:35:52,091 --> 16:35:55,091
give more thought to the data\n
17410
16:35:55,091 --> 16:35:59,561
But we can see what SQLite\npresumed I wanted just
17411
16:35:59,561 --> 16:36:01,871
by importing the data by default.
17412
16:36:01,872 --> 16:36:06,461
What the import command did for me a\n
17413
16:36:06,461 --> 16:36:09,851
It automated the process of creating\n
17414
16:36:11,112 --> 16:36:14,322
And then notice, in parentheses\nit gave me three columns--
17415
16:36:14,322 --> 16:36:18,701
timestamp, title, and genres, which\n
17416
16:36:18,701 --> 16:36:21,341
All three of which have\nbeen decreed to be text.
17417
16:36:21,341 --> 16:36:24,521
Again, once we're more comfortable\nwe'll create our own tables
17418
16:36:24,521 --> 16:36:26,351
choose our own types and column names.
17419
16:36:26,351 --> 16:36:28,691
But for now, I just automated\nthe whole process just
17420
16:36:28,692 --> 16:36:33,461
to get us started by using this\nbuilt-in import command as well.
17421
16:36:34,152 --> 16:36:36,972
So what now can I begin to do?
17422
16:36:36,972 --> 16:36:42,252
Well, if I wanted to, for instance,\n
17423
16:36:42,252 --> 16:36:44,936
I might execute a couple\nof different commands.
17424
16:36:48,341 --> 16:36:53,762
Let me find the right one here--\none of which would be select.
17425
16:36:53,762 --> 16:36:56,951
Select being one of our\nmost versatile tools
17426
16:36:56,951 --> 16:36:58,521
to select data from this database.
17427
16:36:58,521 --> 16:37:01,061
So if I have these three\ncolumns here-- timestamp
17428
16:37:01,061 --> 16:37:04,362
title, and genres, suppose I\nwant to select all of the titles.
17429
16:37:04,362 --> 16:37:09,131
Doing that earlier in Python\nrequired importing the CSV library
17430
16:37:09,131 --> 16:37:14,081
opening the file, creating a reader or\n
17431
16:37:14,082 --> 16:37:16,842
adding every title to a dictionary\nor just printing it out
17432
16:37:17,652 --> 16:37:20,512
There was a dozen or so lines\nof code when we first began.
17433
16:37:22,182 --> 16:37:26,561
Select title from\nfavorites, semicolon, done.
17434
16:37:26,561 --> 16:37:30,911
So now, with this particular\n
17435
16:37:30,911 --> 16:37:34,271
and it's simulating what it looks like\n
17436
16:37:36,211 --> 16:37:39,421
Select title from\nfavorites is a distillation
17437
16:37:39,421 --> 16:37:42,871
in a different language called\nSQL of all the lines of code
17438
16:37:42,872 --> 16:37:46,082
I wrote early on when we first\n
17439
16:37:46,082 --> 16:37:50,882
SQL is therefore optimized for\n
17440
16:37:50,881 --> 16:37:52,841
and ultimately, deleting data.
17441
16:37:52,841 --> 16:37:56,041
So here's perhaps a better tool\n
17442
16:37:56,042 --> 16:37:59,372
Tossing it into a more\npowerful, versatile format
17443
16:37:59,372 --> 16:38:02,569
might allow you now to get\nmore work done more quickly
17444
16:38:02,569 --> 16:38:04,112
without having to reinvent the wheel.
17445
16:38:04,112 --> 16:38:06,851
Someone else has figured out\nhow to select data like this.
17446
16:38:09,101 --> 16:38:12,391
Well, let me go ahead and pull\n
17447
16:38:14,732 --> 16:38:19,182
Give me one second to find this.
17448
16:38:19,182 --> 16:38:23,432
So suppose I want to now select\ndata a little more powerfully.
17449
16:38:23,432 --> 16:38:25,561
So here's what I just\ndid in a canonical way.
17450
16:38:25,561 --> 16:38:27,061
So select typically works like this.
17451
16:38:27,061 --> 16:38:31,201
You select columns from a\nspecific table, semicolon.
17452
16:38:31,201 --> 16:38:33,601
Unfortunately, stupid\nsemicolons are back.
17453
16:38:33,601 --> 16:38:38,051
Select columns from table then, is\n
17454
16:38:38,052 --> 16:38:42,463
More specifically, I selected one\n
17455
16:38:42,463 --> 16:38:43,921
Favorites is the name of the table.
17456
16:38:45,031 --> 16:38:48,781
Suppose I wanted to get two things, like\n
17457
16:38:48,781 --> 16:38:53,762
I could instead do select title,\ncomma, genres from favorites
17458
16:38:53,762 --> 16:38:55,562
and then, a semicolon, and Enter.
17459
16:38:55,561 --> 16:38:57,451
It's going to look a\nlittle ugly on my screen
17460
16:38:57,451 --> 16:38:59,011
because some of these titles and--
17461
16:38:59,012 --> 16:39:02,641
OK, one of you really went\nall out with Community.
17462
16:39:02,641 --> 16:39:06,002
You can see that it's just\nwrapping in an ugly way
17463
16:39:06,002 --> 16:39:08,641
but it's just now\nshowing me two columns.
17464
16:39:08,641 --> 16:39:12,182
If we scroll up to the very top\nagain, the left most of one
17465
16:39:12,182 --> 16:39:13,622
Black Mirror went all out, too.
17466
16:39:14,491 --> 16:39:17,341
And now, OK, we're going to\nhave to clean some of these up.
17467
16:39:17,341 --> 16:39:19,456
Game of Thrones, good comedy, yes.
17468
16:39:22,891 --> 16:39:24,822
Keep going, keep going, keep going.
17469
16:39:24,822 --> 16:39:28,211
So now we've selected two of\nthe columns that we care about.
17470
16:39:28,711 --> 16:39:31,722
OK, so it's crazy wide because\nof all of those genres.
17471
16:39:31,722 --> 16:39:34,476
But it allows me to select\nexactly the data I want.
17472
16:39:34,476 --> 16:39:37,351
Let's go back to the titles, though,\n
17473
16:39:39,091 --> 16:39:43,651
For instance, it turns out, using\n
17474
16:39:45,241 --> 16:39:48,184
You've got a lot of functions, similar\n
17475
16:39:48,184 --> 16:39:49,351
where you can have formulas.
17476
16:39:49,351 --> 16:39:51,661
SQL provides you with some\nof the same heuristics that
17477
16:39:51,661 --> 16:39:55,691
allow you to apply operations\nlike these on entire columns.
17478
16:39:55,692 --> 16:39:58,262
For instance, you can take\naverages, count the total
17479
16:39:58,262 --> 16:40:01,351
get the distinct values, force\n
17480
16:40:02,561 --> 16:40:04,951
So let's try distinct, for instance.
17481
16:40:04,951 --> 16:40:08,791
Let me go back to my Terminal,\nand let's say, select
17482
16:40:08,792 --> 16:40:14,101
how about the distinct titles\nfrom the favorites table?
17483
16:40:14,855 --> 16:40:16,772
I didn't bother selecting\nthe genres because I
17484
16:40:16,771 --> 16:40:18,104
want it to be a little prettier.
17485
16:40:18,105 --> 16:40:23,432
And you can see here that we\nhave just the distinct titles
17486
16:40:23,432 --> 16:40:25,889
except for issues of formatting.
17487
16:40:25,889 --> 16:40:27,722
So whitespace is going\nto be an issue again.
17488
16:40:27,722 --> 16:40:29,555
Capitalization is going\nto be a thing again.
17489
16:40:30,572 --> 16:40:34,622
One of the things I was doing in Python\n
17490
16:40:34,622 --> 16:40:36,272
and then getting rid of whitespace.
17491
16:40:36,271 --> 16:40:37,741
But we could combine some of these.
17492
16:40:37,741 --> 16:40:40,682
I could do something like\nforce every title to uppercase
17493
16:40:40,682 --> 16:40:41,849
then get the distinct value.
17494
16:40:41,849 --> 16:40:44,724
And that's actually going to get\n
17495
16:40:44,724 --> 16:40:47,202
And again, I did it all in\none simple line that was fast.
17496
16:40:47,201 --> 16:40:49,368
So let me pull up at the\nbottom of the screen again.
17497
16:40:49,368 --> 16:40:53,432
I selected distinct upper\ntitles from favorites
17498
16:40:53,432 --> 16:40:56,412
and that did everything for\nme at once in just one breath.
17499
16:40:56,411 --> 16:40:58,981
Suppose I want to get the total\nnumber of counts of titles.
17500
16:40:58,982 --> 16:41:05,492
How about select count of all\nof those titles from favorites?
17501
16:41:05,491 --> 16:41:09,331
Semicolon, Enter, and now\nyou get back a mini table
17502
16:41:09,332 --> 16:41:13,302
that contains just your\nanswer, 158 in this case.
17503
16:41:13,302 --> 16:41:15,902
So that's the total\nnumber of, not distinct
17504
16:41:15,902 --> 16:41:18,031
but total titles that\nwe had in the file.
17505
16:41:18,031 --> 16:41:21,902
And we could continue to manipulate\n
17506
16:41:23,792 --> 16:41:26,891
But there's also additional\nfiltration we can do.
17507
16:41:26,891 --> 16:41:32,351
We can also qualify our selections by\n
17508
16:41:32,351 --> 16:41:35,972
So just as in Scratch, and C, and\n
17509
16:41:35,972 --> 16:41:41,612
you can have the same in SQL as well,\n
17510
16:41:44,732 --> 16:41:46,682
Like allows me to do approximations.
17511
16:41:46,682 --> 16:41:48,842
If I want to get something\nthat's like The Office
17512
16:41:48,841 --> 16:41:51,661
but not necessarily\nT-H-E, space, Office
17513
16:41:51,661 --> 16:41:54,781
I could do pattern\nmatching using like here.
17514
16:41:54,781 --> 16:41:58,512
Order by, limit, and group by are\n
17515
16:41:58,512 --> 16:42:01,322
So let me go back and do\na couple of these here.
17516
16:42:01,322 --> 16:42:07,951
How about, let me just get, oh, I don't\n
17517
16:42:07,951 --> 16:42:10,189
but limit it to 10 results.
17518
16:42:10,190 --> 16:42:13,232
That might be one thing that's helpful\n
17519
16:42:13,232 --> 16:42:15,452
of the data at the top there instead.
17520
16:42:15,451 --> 16:42:21,871
How about, select all of the titles\n
17521
16:42:21,872 --> 16:42:25,052
is like, quote, unquote, "Office"?
17522
16:42:25,052 --> 16:42:28,082
And this will give me only two answers.
17523
16:42:28,082 --> 16:42:32,522
Those are the two rows, recall, that I\n
17524
16:42:32,521 --> 16:42:37,221
Notice that like allows me to\ntolerate uppercase and lowercase.
17525
16:42:37,222 --> 16:42:40,222
Because if I instead\njust use the equal sign
17526
16:42:40,222 --> 16:42:46,402
and in SQL a single equal sign\ndoes, in fact, mean equality.
17527
16:42:46,402 --> 16:42:48,711
For comparison's sake,\nit's not doing assignment.
17528
16:42:48,711 --> 16:42:51,411
This is not how you assign data in SQL.
17529
16:42:51,411 --> 16:42:53,281
I got back no answers there.
17530
16:42:53,281 --> 16:42:56,961
So indeed, the equal sign\nis giving me literal answers
17531
16:42:56,961 --> 16:42:59,332
that searches just for what I typed in.
17532
16:42:59,332 --> 16:43:00,711
How could I get all of these?
17533
16:43:00,711 --> 16:43:04,432
Well, similar in spirit to regular\n
17534
16:43:04,432 --> 16:43:06,961
in SQL, I could do something like this.
17535
16:43:06,961 --> 16:43:10,671
I can select the title from favorites\n
17536
16:43:12,052 --> 16:43:17,792
But I can add, a bit weirdly, percent\n
17537
16:43:17,792 --> 16:43:23,272
So the language SQL supports the\nsame notion of pattern matching
17538
16:43:23,271 --> 16:43:24,888
but much more limited out of the box.
17539
16:43:24,889 --> 16:43:26,722
If we want more powerful\nregular expressions
17540
16:43:26,722 --> 16:43:28,762
we probably do want\nto use Python instead.
17541
16:43:28,762 --> 16:43:32,062
But the percent sign here\nmeans 0 or more characters
17542
16:43:32,061 --> 16:43:34,682
on the left, 0 or more\ncharacters on the right.
17543
16:43:34,682 --> 16:43:39,802
So this will just grab any title that\n
17544
16:43:40,442 --> 16:43:44,632
And now I get all 16, it would\nseem, of those results, again.
17545
16:43:45,832 --> 16:43:48,502
Well, I can just get the\ncount of those titles
17546
16:43:48,502 --> 16:43:51,482
and get back that\nanswer instead as well.
17547
16:43:51,482 --> 16:43:54,982
So again, it takes some\ngetting used to, the vocabulary
17548
16:43:54,982 --> 16:43:56,315
and the syntax that you can use.
17549
16:43:56,315 --> 16:43:58,024
There's these building\nblocks and others.
17550
16:43:58,023 --> 16:44:00,841
But SQL is really designed, again,\n
17551
16:44:01,822 --> 16:44:06,421
For instance, I've never really\n
17552
16:44:06,421 --> 16:44:12,201
So right now if I do select,\nhow about title from favorites
17553
16:44:12,201 --> 16:44:18,682
where title like, quote, unquote,\n
17554
16:44:18,682 --> 16:44:20,671
We can see that there's\na whole bunch of them.
17555
16:44:21,862 --> 16:44:23,302
Let's just do a quick count.
17556
16:44:25,232 --> 16:44:28,642
Well, delete from favorites.
17557
16:44:28,641 --> 16:44:35,841
OK, you and me, delete from favorites,\n
17558
16:44:35,841 --> 16:44:39,621
Nothing seems to happen,\nbut bye-bye Friends.
17559
16:44:44,451 --> 16:44:46,731
So now we've actually changed the data.
17560
16:44:46,732 --> 16:44:50,452
And this is what's compelling\nabout a proper database.
17561
16:44:50,451 --> 16:44:54,661
Yes, you could technically write Python\n
17562
16:44:55,881 --> 16:44:58,161
You can change using quote,\nunquote, "a" for append
17563
16:44:58,161 --> 16:45:01,251
or quote, unquote, "w" for\nwrite, instead of quote, unquote
17564
16:45:02,605 --> 16:45:05,272
But it's definitely a little more\ninvolved to do that in Python.
17565
16:45:05,271 --> 16:45:07,591
But with SQL, you can update\nthe data in real time.
17566
16:45:07,591 --> 16:45:11,091
And if I were actually running a\n
17567
16:45:11,091 --> 16:45:13,221
for a mobile app, that\nchange, theoretically
17568
16:45:13,222 --> 16:45:15,502
would be reflected everywhere\non your own devices
17569
16:45:15,502 --> 16:45:17,552
if you're somehow talking\nto this application.
17570
16:45:17,552 --> 16:45:19,337
So that's the direction we're headed.
17571
16:45:19,336 --> 16:45:20,961
This other thing has been bothering me.
17572
16:45:20,961 --> 16:45:27,981
So select, how about title from\nfavorites, where title equals
17573
16:45:33,021 --> 16:45:37,491
How about we update\nfavorites by setting title
17574
16:45:37,491 --> 16:45:44,421
equal to The Office, where title\n
17575
16:45:45,811 --> 16:45:47,991
And now, if I select\nthe same thing again
17576
16:45:47,991 --> 16:45:50,152
I can go up and down with\nmy arrow keys quickly.
17577
16:45:50,152 --> 16:45:52,432
Now there is no The V Office.
17578
16:45:52,432 --> 16:45:54,542
We've actually changed that value.
17579
16:45:55,641 --> 16:46:01,222
Select genres from favorites,\nwhere the title is title
17580
16:46:01,222 --> 16:46:04,792
equals Game of Thrones, semicolon.
17581
16:46:04,792 --> 16:46:08,342
These were kind of long, and I\n
17582
16:46:08,341 --> 16:46:14,901
So how about we update favorites,\nset genres equal to, sure
17583
16:46:14,902 --> 16:46:17,991
action, adventure, sure, drama?
17584
16:46:19,732 --> 16:46:22,042
Fantasy, sure, thriller, war.
17585
16:46:22,042 --> 16:46:26,391
OK, anything really but\ncomedy, I would say.
17586
16:46:26,391 --> 16:46:28,502
Let's go ahead and hit Enter now.
17587
16:46:28,502 --> 16:46:33,141
And now, if I select genres again, same\n
17588
16:46:34,502 --> 16:46:36,591
So whether or not that\nis right is probably
17589
16:46:36,591 --> 16:46:38,361
a bit subjective and argumentative.
17590
16:46:38,362 --> 16:46:42,262
But I have at least cleaned up my\n
17591
16:46:42,262 --> 16:46:46,012
Create, read, update, delete,\nyou can do it that easily.
17592
16:46:47,631 --> 16:46:51,771
Beware, worse, using drop, whereby\nyou can drop an entire table.
17593
16:46:51,771 --> 16:46:54,651
But via these kinds of\ncommands, we can actually now
17594
16:46:54,652 --> 16:46:58,732
manipulate our data much more\nrapidly and with single thoughts.
17595
16:46:58,732 --> 16:47:01,732
And in fact, if you're an aspiring\n
17596
16:47:01,732 --> 16:47:05,662
or analyst in the real world, SQL\n
17597
16:47:05,661 --> 16:47:08,914
because it allows you to really\ndive into data quickly, and ask
17598
16:47:08,915 --> 16:47:11,332
questions of the data, and get\nback answers quite quickly.
17599
16:47:11,332 --> 16:47:12,872
And this is a simple data set.
17600
16:47:12,872 --> 16:47:17,182
You can do this with much larger\ndata sets as we soon will, too.
17601
16:47:17,182 --> 16:47:20,391
Any questions on what\nwe've seen of SQL thus far?
17602
16:47:20,391 --> 16:47:22,512
Only scratched the\nsurface, but again, it
17603
16:47:22,512 --> 16:47:28,442
boils down to creating, reading,\nupdating, and deleting data.
17604
16:47:30,752 --> 16:47:33,162
Well, let's consider\nthe design of this data.
17605
16:47:33,161 --> 16:47:37,241
Recall that if I do .schema, that\n
17606
16:47:37,241 --> 16:47:39,331
the so-called schema of my data.
17607
16:47:40,351 --> 16:47:42,991
It gets the job done, and frankly,\neverything the user typed in
17608
16:47:42,991 --> 16:47:46,951
was arguably text, including the\n
17609
16:47:46,951 --> 16:47:49,381
But so the data set\nitself is somewhat simple.
17610
16:47:49,381 --> 16:47:54,911
But if we look at the data set itself,\n
17611
16:47:54,911 --> 16:47:57,302
Select genres from favorites.
17612
16:47:57,302 --> 16:47:59,882
And let me point out one other\nthing stylistically, too.
17613
16:47:59,881 --> 16:48:04,391
I am very deliberately capitalizing\n
17614
16:48:04,391 --> 16:48:08,042
and I'm lowercasing all of the\ncolumn names and the table names.
17615
16:48:08,042 --> 16:48:11,101
This is a convention, and\nhonestly, it just helps you read
17616
16:48:11,101 --> 16:48:14,551
I think, the code when you're\nco-mingling your names for columns
17617
16:48:14,552 --> 16:48:17,942
and tables with proper SQL keywords.
17618
16:48:17,942 --> 16:48:23,370
But I could just as easily do\nselect genres from favorites
17619
16:48:23,370 --> 16:48:26,491
but again, the SQL specific keywords\n
17620
16:48:26,491 --> 16:48:29,582
So stylistically, we would\nrecommend this, selecting genres
17621
16:48:38,192 --> 16:48:40,562
I accidentally made\nevery show, including
17622
16:48:40,561 --> 16:48:45,752
The Office about action, adventure,\n
17623
16:48:45,752 --> 16:48:49,802
How did I do that accidentally?
17624
16:48:57,802 --> 16:48:59,701
I think I did say\nbeware around this time.
17625
16:48:59,701 --> 16:49:03,152
So the SQL database took me--\nliterally, I updated favorites
17626
16:49:03,152 --> 16:49:06,351
setting genres equal to that,\nsemicolon, end of thought.
17627
16:49:06,351 --> 16:49:08,841
I really wanted to say\nwhere title equals
17628
16:49:08,841 --> 16:49:11,361
quote, unquote, "Game of Thrones."
17629
16:49:11,362 --> 16:49:14,421
Unfortunately, there isn't an\nundo command or time machine
17630
16:49:14,421 --> 16:49:17,271
with a SQL database, so\nthe best we can do here
17631
16:49:17,271 --> 16:49:21,591
is, let's actually get\nrid of favorites.db.
17632
16:49:21,591 --> 16:49:27,681
Let's run SQLite3 of favorites.db\n
17633
16:49:27,682 --> 16:49:29,781
Let me change myself into CSV mode.
17634
16:49:29,781 --> 16:49:35,271
Let me import, into my\nfavorites table, the CSV file.
17635
16:49:35,271 --> 16:49:39,591
And now, Friends is back,\nfor better or for worse
17636
16:49:39,591 --> 16:49:40,851
but so are all of our genres.
17637
16:49:43,432 --> 16:49:46,671
If I now reload the file\nand do select, star, from--
17638
16:49:47,182 --> 16:49:51,180
Select genres from favorites,\nthat was the result I was getting.
17639
16:49:51,180 --> 16:49:53,972
It's much messier, but that's\n
17640
16:49:53,972 --> 16:49:55,639
But now we're back to the original data.
17641
16:49:55,639 --> 16:49:58,042
Lesson here, be sure\nto back up your work.
17642
16:49:58,582 --> 16:50:02,192
So what more can we\nnow do with this data?
17643
16:50:02,192 --> 16:50:05,822
Well, I don't love the design of the\n
17644
16:50:05,822 --> 16:50:08,542
One, we didn't have\nany sort of validation
17645
16:50:08,542 --> 16:50:10,552
but user input is going to be messy.
17646
16:50:10,552 --> 16:50:13,132
There's just a lot of\nredundancy in here.
17647
16:50:15,061 --> 16:50:17,101
Let me select all the\ncomedies you all typed in.
17648
16:50:17,101 --> 16:50:23,301
So select title from\nfavorites, where genres equals
17649
16:50:25,461 --> 16:50:31,072
OK, so there's all of the shows\nthat are explicitly comedies.
17650
16:50:31,072 --> 16:50:34,281
But I think there might\nactually be others.
17651
16:50:37,521 --> 16:50:39,381
What was a comedy and a drama?
17652
16:50:39,381 --> 16:50:44,481
How about let's search for the-- oops,\n
17653
16:50:44,482 --> 16:50:49,042
OK, so The Office, in this case, was\n
17654
16:50:49,042 --> 16:50:52,351
It's Always Sunny in Philadelphia,\nand Gilmore Girls as well.
17655
16:50:52,351 --> 16:50:56,792
But notice that I get many more\nwhen I just search for comedy.
17656
16:50:56,792 --> 16:51:01,792
So the catch here is that, because I\n
17657
16:51:01,792 --> 16:51:04,372
the way Google did, as\na comma-separated list
17658
16:51:04,372 --> 16:51:08,932
it's actually really hard and messy\n
17659
16:51:08,932 --> 16:51:12,171
that are somewhere described as comedy.
17660
16:51:12,171 --> 16:51:15,021
Because if I search for quote,\n
17661
16:51:15,021 --> 16:51:18,951
I'm going to get are this one, whatever\n
17662
16:51:20,582 --> 16:51:22,012
But I'm not going to get this one.
17663
16:51:22,012 --> 16:51:23,512
I'm not going to get this one.
17664
16:51:24,411 --> 16:51:28,131
If I'm searching for, where genres\n
17665
16:51:28,131 --> 16:51:29,691
why am I missing those other shows?
17666
16:51:37,362 --> 16:51:39,732
It's not just a comedy,\nit's a comedy and a drama
17667
16:51:39,732 --> 16:51:42,022
and a comedy or a news\nshow, and so forth.
17668
16:51:42,021 --> 16:51:45,851
So I have to search for these commas,\n
17669
16:51:45,851 --> 16:51:47,902
Let me copy this so I can do this.
17670
16:51:47,902 --> 16:51:51,491
Let me search for where\ngenres equals comedy.
17671
16:51:51,491 --> 16:51:58,572
How about, or genres equals\ncomedy, drama, or genres
17672
16:51:58,572 --> 16:52:01,991
equals this whole thing,\ncomedy, news, talk show?
17673
16:52:01,991 --> 16:52:03,711
I'm going to get more and more results.
17674
16:52:03,711 --> 16:52:05,241
But that's not going to scale well.
17675
16:52:05,241 --> 16:52:08,152
What could I do instead\nof enumerating with ors
17676
16:52:08,152 --> 16:52:11,141
all of the different permutations\nof genres, do you think?
17677
16:52:15,872 --> 16:52:19,772
So I could use the keyword is,\nsimilar in Python to the word in.
17678
16:52:19,771 --> 16:52:22,322
I could use the like\nkeyword so that so long
17679
16:52:22,322 --> 16:52:27,421
as the genres is like\ncomedy somewhere in there
17680
16:52:27,421 --> 16:52:31,241
that's going to give me all of them,\n
17681
16:52:31,241 --> 16:52:34,606
But let me go ahead and just\nopen the form from earlier.
17682
16:52:37,591 --> 16:52:40,414
Let me see if I can open this\nreal quick before I toggle over.
17683
16:52:40,415 --> 16:52:42,332
If we look back at the\nform, recall that there
17684
16:52:42,332 --> 16:52:47,972
were all of those checkboxes\nasking for the specific genres
17685
16:52:49,872 --> 16:52:55,052
And if I open this, let me full screen\n
17686
16:52:55,052 --> 16:52:58,262
You'll see all of the\ngenres here, none of which
17687
16:52:58,262 --> 16:53:04,022
are that worrisome except for a\n
17688
16:53:04,021 --> 16:53:08,671
Where might the like keyword\nalone get me into trouble?
17689
16:53:13,021 --> 16:53:16,531
DAVID J. MALAN: Yeah, music and musical\n
17690
16:53:16,531 --> 16:53:19,002
Because, one, they're separate genres.
17691
16:53:19,002 --> 16:53:21,652
But if I just search for\nsomething that's like music
17692
16:53:21,652 --> 16:53:24,152
I'm going to accidentally suck\nin all of the musicals, which
17693
16:53:25,292 --> 16:53:28,652
If music is a music video or\nwhatever, and musical is actually
17694
16:53:28,652 --> 16:53:31,961
a different type of show, I\ndon't want to just do that.
17695
16:53:31,961 --> 16:53:33,451
So it seems just very messy.
17696
16:53:33,451 --> 16:53:37,023
I could probably hack something together\n
17697
16:53:37,982 --> 16:53:40,862
But this is just not a\ngood design for the data.
17698
16:53:40,862 --> 16:53:43,232
Google has done it this\nway because it's just
17699
16:53:43,232 --> 16:53:47,342
simple to actually keep the user's\ndata all in a single column
17700
16:53:47,341 --> 16:53:49,861
and just as they did,\nseparate it by commas.
17701
16:53:49,862 --> 16:53:54,061
But a really\nmessy way to use CSV is
17702
16:53:54,061 --> 16:53:58,171
by putting comma-separated values\n
17703
16:53:58,171 --> 16:54:00,691
Arguably, the folks at\nGoogle probably just did this
17704
16:54:01,862 --> 16:54:03,987
And they didn't want to\ngive people multiple sheets
17705
16:54:03,987 --> 16:54:07,561
or complicate things using some other\n
17706
16:54:07,561 --> 16:54:09,792
But I bet there's a better\nway for us to do this.
17707
16:54:09,792 --> 16:54:11,202
And let me go ahead and do this.
17708
16:54:11,201 --> 16:54:13,319
Let me go back into my code here.
17709
16:54:13,319 --> 16:54:15,362
And in just a moment, I'm\ngoing to grab a program
17710
16:54:15,362 --> 16:54:19,592
that I wrote in advance that's going\n
17711
16:54:19,591 --> 16:54:24,451
iterate over all of the rows, and load\n
17712
16:54:24,451 --> 16:54:27,971
two tables, one called\nshows, and one called genres
17713
16:54:27,972 --> 16:54:30,482
so as to actually separate\nthese two things out.
17714
16:54:30,482 --> 16:54:33,072
Give me just a moment to grab the code.
17715
16:54:33,072 --> 16:54:36,061
And when I run this, I'll\nonly have to run it once.
17716
16:54:36,061 --> 16:54:38,432
Let me go ahead and\nrun Python in a moment
17717
16:54:38,432 --> 16:54:41,281
and I'll reveal the results in a sec.
17718
16:54:41,281 --> 16:54:44,131
This is going to be version\n8 of the code online.
17719
16:54:44,131 --> 16:54:47,822
When I do this, let me go\nahead and open up this file.
17720
16:54:47,822 --> 16:54:51,601
Give me a second to move\nit into this directory.
17721
16:54:53,911 --> 16:54:56,856
So here we have version 8 of\nthis that's available online
17722
16:54:56,857 --> 16:54:58,232
that's going to do the following.
17723
16:54:58,232 --> 16:55:00,190
And I'll gloss over some\nof the details just so
17724
16:55:00,190 --> 16:55:04,082
that we don't get stuck in the\nweeds of some of this code.
17725
16:55:04,082 --> 16:55:06,722
I'm going to be using, at\nthe top of this program
17726
16:55:06,722 --> 16:55:11,162
as we'll soon see, a CS50 library,\n
17727
16:55:11,161 --> 16:55:14,072
or get_int, or get_float, but\nbecause there's some built-in SQL
17728
16:55:14,072 --> 16:55:17,222
functionality that we didn't discuss\n
17729
16:55:18,091 --> 16:55:22,021
But inside of the CS50 library we'll\n
17730
16:55:22,021 --> 16:55:26,671
SQL that gives you the ability using\n
17731
16:55:26,671 --> 16:55:31,381
technically called a URI, that allows\n
17732
16:55:31,381 --> 16:55:33,721
And long story short, all\nof the subsequent code
17733
16:55:33,722 --> 16:55:37,921
is going to iterate over this\n
17734
16:55:37,921 --> 16:55:41,582
And it's going to import it\ninto the SQLite database
17735
16:55:41,582 --> 16:55:44,772
but it's going to use two\ntables instead of just one.
17736
16:55:44,771 --> 16:55:46,981
So give me just a moment\nto run this, and then I'll
17737
16:55:48,942 --> 16:55:51,612
This is going to be\nrun on favorites.csv.
17738
16:55:56,851 --> 16:56:00,826
And taking a look here,\ngive me just a moment.
17739
16:56:11,421 --> 16:56:14,362
This program should not\nbe taking this long.
17740
16:56:24,841 --> 16:56:27,691
Let me just skim this code real\n
17741
16:56:30,182 --> 16:56:35,732
Reader, title, show ID,\ninsert into shows.
17742
16:56:35,732 --> 16:56:40,712
[INAUDIBLE] genres split, DB execute.
17743
16:56:41,222 --> 16:56:42,752
This is me debugging in real time.
17744
16:56:42,752 --> 16:56:48,184
All those times we encourage you to use\n
17745
16:56:48,184 --> 16:56:50,101
We'll see how quickly I\ncan recover from this.
17746
16:56:50,101 --> 16:56:51,902
Python of favorites version 8.
17747
16:56:54,442 --> 16:56:57,412
OK, so here's me debugging in real time.
17748
16:56:58,224 --> 16:56:59,932
Oh, maybe I just didn't\nwait long enough.
17749
16:57:01,281 --> 16:57:05,241
What I'm doing is printing out\nthe dictionary that represents
17750
16:57:05,241 --> 16:57:06,883
each row that you all typed in.
17751
16:57:06,883 --> 16:57:08,341
And we're actually making progress.
17752
16:57:09,561 --> 16:57:11,851
I was too impatient and\ndidn't wait long enough.
17753
16:57:13,311 --> 16:57:15,711
All right, so all we have\nto do sometimes is wait.
17754
16:57:15,711 --> 16:57:19,762
Let me go ahead now and open\nthis file using SQLite3.
17755
16:57:19,762 --> 16:57:23,422
So in SQLite3 I now have a\ndifferent version of favorites.db.
17756
16:57:23,421 --> 16:57:25,311
I named it number 8 for consistency.
17757
16:57:25,311 --> 16:57:28,621
Once I've run the program I can\ndo .schema to look inside of it.
17758
16:57:28,622 --> 16:57:32,542
And here's what the two tables in\n
17759
16:57:32,542 --> 16:57:36,262
I've created a table called shows, this\n
17760
16:57:36,262 --> 16:57:39,752
that are favorites,\nthat has two columns.
17761
16:57:39,752 --> 16:57:42,154
One is called ID, one is called Title.
17762
16:57:42,154 --> 16:57:44,362
But now I'm going to start\ntaking out for a spin some
17763
16:57:44,362 --> 16:57:45,982
of the other features of SQL.
17764
16:57:45,982 --> 16:57:50,242
And besides there being text, it turns\n
17765
16:57:50,241 --> 16:57:52,252
Besides there being a\ndata type called text
17766
16:57:52,252 --> 16:57:55,491
there's also a special key\nphrase that you can specify
17767
16:57:55,491 --> 16:57:57,171
that the title can never be null.
17768
16:57:57,171 --> 16:58:00,502
Think back to our use\nof NULL in C. Think back
17769
16:58:00,502 --> 16:58:02,601
to the keyword None in Python.
17770
16:58:02,601 --> 16:58:06,171
This is a database constraint that\n
17771
16:58:06,171 --> 16:58:07,972
can't have of favorite TV show.
17772
16:58:07,972 --> 16:58:11,662
If you submit the form, you have\nto have typed in a title for it
17773
16:58:11,661 --> 16:58:13,591
to end up in our database here.
17774
16:58:13,591 --> 16:58:16,281
And you'll notice one other new feature.
17775
16:58:16,281 --> 16:58:18,801
It turns out, on this\ntable I'm defining what's
17776
16:58:18,802 --> 16:58:22,312
called a primary key,\nspecifically to be the ID column.
17777
16:58:22,311 --> 16:58:23,941
More on that in just a moment.
17778
16:58:23,942 --> 16:58:28,042
Meanwhile, the second table my code\n
17779
16:58:28,042 --> 16:58:33,561
gives me a column called\nshow ID, and then, a genre
17780
16:58:33,561 --> 16:58:36,481
the value of which is text\nthat can also not be null.
17781
16:58:36,482 --> 16:58:38,102
And then more on this in a moment.
17782
16:58:38,101 --> 16:58:41,182
This table has what we're\ngoing to call a foreign key
17783
16:58:41,182 --> 16:58:45,891
specifically the show ID column,\nwhich references the shows table's ID column.
17784
16:58:45,891 --> 16:58:48,591
So before we get into\nthe weeds of this, this
17785
16:58:48,591 --> 16:58:52,461
is now a way of creating the\nrelation in relational database.
17786
16:58:52,461 --> 16:58:56,481
If I have two tables now, not\njust one, they can somehow
17787
16:58:56,482 --> 16:59:00,062
be linked together by a common column.
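The two-table schema described here can be sketched as follows. This is a minimal sketch that uses Python's built-in sqlite3 module and an in-memory database rather than the actual favorites8.db; the column names (id, title, show_id, genre) follow the description above, and the two rejected inserts are made-up examples.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
    CREATE TABLE shows (
        id INTEGER,
        title TEXT NOT NULL,
        PRIMARY KEY(id)
    );
    CREATE TABLE genres (
        show_id INTEGER,
        genre TEXT NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id)
    );
""")

# NOT NULL rejects a row that has no title.
try:
    conn.execute("INSERT INTO shows (title) VALUES (NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

# The foreign key rejects a genre that points at a show that does not exist.
try:
    conn.execute("INSERT INTO genres (show_id, genre) VALUES (999, 'Comedy')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Both inserts should fail, which is the point of the constraints: the database itself refuses data that would break the link between the two tables.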
17788
16:59:00,061 --> 16:59:03,201
In other words, the shows column--
17789
16:59:03,201 --> 16:59:06,682
shows table is going to give\nme a table with two columns--
17790
16:59:08,421 --> 16:59:12,141
Every title you gave me, I'm\ngoing to assign a unique value.
17791
16:59:12,141 --> 16:59:16,942
The genres table, meanwhile, is\n
17792
16:59:16,942 --> 16:59:19,822
singular with that same idea.
17793
16:59:19,822 --> 16:59:26,491
And the result of this, to pop back to\n
17794
16:59:26,491 --> 16:59:30,801
Select star from shows\nof this new database
17795
16:59:30,802 --> 16:59:34,372
and you'll see that I've given,\n
17796
16:59:36,004 --> 16:59:39,171
I didn't filter out duplicates or do\n
17797
16:59:39,771 --> 16:59:42,271
So there's going to be some\nduplicates here because I didn't
17798
16:59:42,271 --> 16:59:44,182
want to get rid of anyone's data.
17799
16:59:44,182 --> 16:59:47,302
But you'll see that,\nindeed, I've given everyone
17800
16:59:47,302 --> 16:59:49,762
a unique identifier, from\nthe very first person who
17801
16:59:49,762 --> 16:59:53,992
typed How I Met Your Mother, all\n
17802
16:59:53,991 --> 17:00:01,101
Meanwhile, if I do select star from\n
17803
17:00:01,101 --> 17:00:03,621
a column in the original\ndata, now you'll
17804
17:00:03,622 --> 17:00:08,537
see a much better design for this data.
17805
17:00:08,536 --> 17:00:09,661
Notice what I've done here.
17806
17:00:09,661 --> 17:00:12,711
Let me go all the way to the top and\n
17807
17:00:12,711 --> 17:00:16,849
is called show ID, the other\nof which is called genre.
17808
17:00:16,849 --> 17:00:18,891
And again, I wrote some\ncode to do this because I
17809
17:00:18,891 --> 17:00:22,016
had to take Google's messy output where\n
17810
17:00:22,016 --> 17:00:25,671
I had to tear away the commas and\n
17811
17:00:27,112 --> 17:00:29,542
Even though we haven't\nintroduced the syntax via which
17812
17:00:29,542 --> 17:00:32,572
we can reconstitute the\ndata and reassociate
17813
17:00:32,572 --> 17:00:35,872
your genres with your\ntitles, why, at a glance
17814
17:00:35,872 --> 17:00:38,482
might this be a better design now?
17815
17:00:38,482 --> 17:00:42,082
Even though I've doubled the\nnumber of tables from one to two
17816
17:00:42,082 --> 17:00:46,882
why is this probably a step in the\ndirection of a better design?
17817
17:00:46,881 --> 17:00:48,471
What might your instincts be?
17818
17:00:53,061 --> 17:00:56,572
Again, first time with SQL,\nwhy is it better, perhaps
17819
17:00:56,572 --> 17:00:59,122
that we've done this\nwith our genres table?
17820
17:01:02,542 --> 17:01:06,702
Oh, just because we had the\n
17821
17:01:09,190 --> 17:01:14,472
We've cleaned up the data by giving\n
17822
17:01:14,472 --> 17:01:16,601
column in the original\nGoogle Spreadsheet
17823
17:01:16,601 --> 17:01:19,371
its own cell in this table, if you will.
17824
17:01:19,372 --> 17:01:22,272
And now notice show ID\nmight appear multiple times.
17825
17:01:22,271 --> 17:01:26,871
Whoever typed in How I Met Your Mother,\n
17826
17:01:26,872 --> 17:01:29,802
And so we see that\nshow ID 1 is a comedy.
17827
17:01:30,987 --> 17:01:32,862
I forget the name of\nthe second show offhand.
17828
17:01:32,862 --> 17:01:37,302
But that person, whoever was\nassigned show ID 2 checked off
17829
17:01:37,302 --> 17:01:39,192
a whole bunch of the genre boxes.
17830
17:01:39,192 --> 17:01:42,882
That happened again with show ID 3, 4.
17831
17:01:42,881 --> 17:01:46,281
Persons 5, 6, 7 only checked one box.
17832
17:01:46,281 --> 17:01:50,381
And so you can see now that we've\n
17833
17:01:50,381 --> 17:01:53,031
might call a one-to-many relationship.
17834
17:01:53,031 --> 17:01:58,391
A one-to-many relationship, whereby\n
17835
17:01:58,391 --> 17:02:02,141
it can now have many genres\nassociated with it, each of which
17836
17:02:02,141 --> 17:02:06,591
is represented by a separate row here.
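A rough sketch of how a script might load the form data into two tables and produce this one-to-many layout. It is not the actual favorites8.py: the CSV column names (title, genres), the ", " separator, and the sample rows are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for favorites.csv; the rows below are made up.
sample_csv = io.StringIO(
    "title,genres\n"
    "How I Met Your Mother,Comedy\n"
    'The Sopranos,"Comedy, Crime, Drama"\n'
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
conn.execute(
    "CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
    "FOREIGN KEY(show_id) REFERENCES shows(id))"
)

for row in csv.DictReader(sample_csv):
    # One row per submitted title; SQLite assigns the next unused id.
    cursor = conn.execute("INSERT INTO shows (title) VALUES (?)", (row["title"].strip(),))
    show_id = cursor.lastrowid
    # One row per checked genre, instead of one comma-separated cell.
    for genre in row["genres"].split(", "):
        conn.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, genre))

genre_rows = conn.execute(
    "SELECT show_id, genre FROM genres ORDER BY show_id, genre"
).fetchall()
print(genre_rows)
```

Each show ID appears once in shows but may appear several times in genres, once per genre.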
17837
17:02:06,591 --> 17:02:10,301
So again, if I go ahead and\nselect star from shows--
17838
17:02:10,302 --> 17:02:14,082
let's limit it to the first 10 just\n
17839
17:02:14,082 --> 17:02:17,112
How I Met Your Mother, The Sopranos\nwas the second input there.
17840
17:02:17,112 --> 17:02:20,442
It would seem that now that I've\ncreated the data in this way
17841
17:02:20,442 --> 17:02:25,266
I could ideally somehow search the\n
17842
17:02:25,266 --> 17:02:26,891
I don't have to worry about the commas.
17843
17:02:26,891 --> 17:02:29,266
I don't have to worry about\nthe hackish approach of music
17844
17:02:29,266 --> 17:02:30,972
being a substring of musical.
17845
17:02:30,972 --> 17:02:33,652
But how can I actually\nget back at this data?
17846
17:02:33,652 --> 17:02:35,092
Well, let's go ahead and do this.
17847
17:02:35,091 --> 17:02:39,041
Suppose I did want to get back\nmaybe all of the comedies.
17848
17:02:39,042 --> 17:02:42,461
All of the comedies, no matter whether\n
17849
17:02:42,461 --> 17:02:44,891
box or multiple boxes instead.
17850
17:02:44,891 --> 17:02:48,671
How now, given that I\nhave two tables, could I
17851
17:02:48,671 --> 17:02:53,082
go about selecting only\nthe titles of comedies?
17852
17:02:53,082 --> 17:02:55,122
I've actually made the\nproblem a little harder
17853
17:02:55,122 --> 17:02:58,062
but again, SQL is going to\ngive me a solution for this.
17854
17:02:58,061 --> 17:03:00,371
The problem is that if I\nwant to search for comedies
17855
17:03:00,372 --> 17:03:03,162
I have to check the genres table first.
17856
17:03:03,161 --> 17:03:04,991
And then what's that going to give me?
17857
17:03:04,991 --> 17:03:08,921
If I search the genres\ntable for comedies
17858
17:03:08,921 --> 17:03:11,951
what's that going to\ngive me back potentially?
17859
17:03:13,451 --> 17:03:14,451
DAVID J. MALAN: Maybe show ID.
17860
17:03:15,243 --> 17:03:21,161
Let me do select show ID from genres,\n
17861
17:03:21,161 --> 17:03:22,661
equals, quote, unquote, "comedy."
17862
17:03:22,661 --> 17:03:25,811
No commas, no like, no percent signs.
17863
17:03:25,811 --> 17:03:30,281
Because literally, that column now is\n
17864
17:03:31,061 --> 17:03:33,011
Let me go ahead and hit Enter here.
17865
17:03:33,012 --> 17:03:36,042
OK, so I got back a whole\nbunch of ID numbers.
17866
17:03:36,042 --> 17:03:38,442
Now this could very\nquickly get annoying.
17867
17:03:38,442 --> 17:03:43,272
It looks like show ID 1, 2, 4, 5, 6,\n
17868
17:03:43,271 --> 17:03:49,031
So I could do something really\n
17869
17:03:49,031 --> 17:03:54,731
where ID equals 1, or ID equals 2.
17870
17:03:54,732 --> 17:03:57,702
This is not going to\nscale very well, but this
17871
17:03:57,701 --> 17:03:59,951
is why SQL is especially powerful.
17872
17:03:59,951 --> 17:04:04,761
You can actually compose one\nSQL question from multiple ones.
17873
17:04:05,781 --> 17:04:09,822
Why don't I select the title\nwhere the ID of the show
17874
17:04:09,822 --> 17:04:13,391
is in the following list of IDs?
17875
17:04:13,391 --> 17:04:20,052
Select show ID from genres, where the\n
17876
17:04:20,982 --> 17:04:23,892
So I've got two SQL queries.
17877
17:04:23,891 --> 17:04:27,192
One is deliberately nested\ninside of parentheses.
17878
17:04:27,192 --> 17:04:30,102
That's going to give me back\nthat whole list of show IDs.
17879
17:04:30,101 --> 17:04:32,411
But that's exactly what\nI want to then look up
17880
17:04:32,411 --> 17:04:36,341
the titles for by selecting title\n
17881
17:04:38,722 --> 17:04:44,052
And so now if I hit Enter,\nI get back only those shows
17882
17:04:44,052 --> 17:04:47,862
that were somehow flagged as\ncomedy, whether you in the audience
17883
17:04:47,862 --> 17:04:51,912
checked one box for comedy,\ntwo boxes, or all of the boxes.
17884
17:04:51,911 --> 17:04:54,269
Somehow we teased out\ncomedy, again, just
17885
17:04:54,269 --> 17:04:56,561
by using that Python script,\nwhich loaded this data not
17886
17:04:56,561 --> 17:04:59,141
into one big table, but instead, two.
17887
17:04:59,141 --> 17:05:01,762
And if we want to clean this\nup, let's do a couple of things.
17888
17:05:01,762 --> 17:05:05,922
Let's, outside of the\nparentheses, do order by title.
17889
17:05:05,921 --> 17:05:08,891
This is a way of sorting\nthe data in SQL very easily.
17890
17:05:08,891 --> 17:05:13,481
Now we have a whole list of the\nsame titles that are now sorted.
17891
17:05:13,482 --> 17:05:17,982
And what was the keyword with which\n
17892
17:05:19,601 --> 17:05:24,941
Same query, but let's select only the\n
17893
17:05:24,942 --> 17:05:27,112
And notice, I've very\ndeliberately done it this way.
17894
17:05:27,112 --> 17:05:28,862
And to this day, any\ntime I'm using SQL, I
17895
17:05:28,862 --> 17:05:31,529
don't just start at the beginning\nand type out my whole thought
17896
17:05:31,529 --> 17:05:33,281
and just get it right on the first try.
17897
17:05:33,281 --> 17:05:35,951
I very commonly start\nwith the subquery, if you
17898
17:05:35,951 --> 17:05:38,141
will, the thing in\nparentheses, just to get myself
17899
17:05:38,141 --> 17:05:39,942
one step toward what I care about.
17900
17:05:41,381 --> 17:05:43,932
Then I add to it, just like\nwe've encouraged in Python and C
17901
17:05:43,932 --> 17:05:47,711
taking baby steps in order to get to\n
17902
17:05:48,822 --> 17:05:51,402
And other than this\nmistake, which we didn't
17903
17:05:51,402 --> 17:05:55,692
fix because I re-imported the data after\n
17904
17:05:55,692 --> 17:06:00,531
we now have an alphabetized\nlist of all of the same data.
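The nested query built up over the last few minutes, with the distinct and order by refinements, might look like this when run from Python. This is a sketch using the built-in sqlite3 module; the sample rows are made up, and the genre is assumed to be stored as 'Comedy'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
    -- Made-up sample rows, including a duplicate title.
    INSERT INTO shows (id, title) VALUES
        (1, 'The Office'), (2, 'Breaking Bad'), (3, 'Friends'), (4, 'The Office');
    INSERT INTO genres (show_id, genre) VALUES
        (1, 'Comedy'), (2, 'Crime'), (2, 'Drama'), (3, 'Comedy'), (4, 'Comedy');
""")

# The inner query returns the matching show IDs; the outer query looks up
# their titles, removes duplicates, and sorts them.
query = """
    SELECT DISTINCT title FROM shows
    WHERE id IN (SELECT show_id FROM genres WHERE genre = 'Comedy')
    ORDER BY title
"""
comedies = [title for (title,) in conn.execute(query)]
print(comedies)
```

Writing the parenthesized subquery first and testing it on its own, then wrapping the outer query around it, mirrors the baby-steps approach described above.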
17905
17:06:00,531 --> 17:06:06,012
But now it's better designed, because we\n
17906
17:06:10,061 --> 17:06:14,292
What questions do we have, if any here?
17907
17:06:25,622 --> 17:06:27,622
DAVID J. MALAN: Oh, now\nthat we have a database
17908
17:06:27,622 --> 17:06:29,982
how do we transfer it to a CSV?
17909
17:06:31,402 --> 17:06:33,912
And in fact, there's a\ncommand within SQLite
17910
17:06:33,911 --> 17:06:36,931
that allows you to export\nyour data back to a CSV file.
17911
17:06:36,932 --> 17:06:38,682
If you want to email\nit to someone and you
17912
17:06:38,682 --> 17:06:41,771
want them to be able to open it in\n
17913
17:06:41,771 --> 17:06:44,351
Numbers, or the like, you can\ngo in the other direction.
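The lecture refers to a command inside SQLite for this export. As an alternative, here is a sketch of doing the same thing from Python with the built-in csv and sqlite3 modules; the table contents and the in-memory buffer (standing in for a real file such as shows.csv) are assumptions for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
conn.executemany("INSERT INTO shows (title) VALUES (?)", [("The Office",), ("Friends",)])

cursor = conn.execute("SELECT id, title FROM shows ORDER BY id")

buffer = io.StringIO()  # stands in for open("shows.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])  # header row
writer.writerows(cursor)  # one CSV line per database row

exported = buffer.getvalue()
print(exported)
```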
17914
17:06:44,351 --> 17:06:47,231
Generally though, once\nyou're in the world of SQL
17915
17:06:47,232 --> 17:06:49,962
you're probably storing\nyour data there long term.
17916
17:06:49,961 --> 17:06:52,932
And you're probably updating it,\n
17917
17:06:53,614 --> 17:06:55,781
For instance, the one command\nI did not show earlier
17918
17:06:55,781 --> 17:06:58,661
is, suppose someone forgot a show.
17919
17:06:58,661 --> 17:07:00,981
Let's see, did I see this in the output?
17920
17:07:00,982 --> 17:07:02,922
All right, so Curb Your Enthusiasm.
17921
17:07:04,872 --> 17:07:06,502
Did anyone see it last night?
17922
17:07:07,002 --> 17:07:10,084
All right, well, just the one person\n
17923
17:07:10,084 --> 17:07:12,082
What's another show that\ndidn't make the list?
17924
17:07:13,559 --> 17:07:14,891
It's now on Netflix, apparently.
17925
17:07:21,732 --> 17:07:25,252
Well, we want to insert\nmaybe an ID and a title.
17926
17:07:25,252 --> 17:07:27,491
But I don't actually\ncare what the ID is
17927
17:07:27,491 --> 17:07:28,991
so I'm just going to insert a title.
17928
17:07:28,991 --> 17:07:31,121
And the value I'm going\nto give to that title
17929
17:07:31,122 --> 17:07:34,242
is going to be, quote,\nunquote, "Seinfeld."
17930
17:07:34,241 --> 17:07:37,152
And then, let me go\nahead and hit semicolon.
17931
17:07:37,152 --> 17:07:39,702
Nothing seems to happen, but\nlet me rerun the big query
17932
17:07:39,701 --> 17:07:41,591
from before looking for comedies.
17933
17:07:41,591 --> 17:07:45,191
And unfortunately, Seinfeld has\n
17934
17:07:45,192 --> 17:07:47,052
so let's get this right, too.
17935
17:07:47,052 --> 17:07:50,712
What intuitively I'm going to\nhave to do to associate, now
17936
17:07:53,322 --> 17:07:55,482
I just inserted into the shows table.
17937
17:07:55,482 --> 17:07:59,232
What more needs to happen before\n
17938
17:08:03,851 --> 17:08:08,292
So I need to insert into the\ngenres table two things now
17939
17:08:08,292 --> 17:08:13,522
a show ID, like this, and\nthen, the name of the genre
17940
17:08:14,682 --> 17:08:16,152
What values do I want to insert?
17941
17:08:16,152 --> 17:08:18,137
Well, the show ID, I better grab that.
17942
17:08:18,137 --> 17:08:19,512
Oh, I don't even know what it is.
17943
17:08:19,512 --> 17:08:21,112
I'm going to have to\nfigure out what that is.
17944
17:08:21,112 --> 17:08:23,002
So I could do this in a couple of ways.
17945
17:08:24,491 --> 17:08:28,265
Select star from shows,\nwhere title equals
17946
17:08:28,266 --> 17:08:32,112
quote, unquote,\n"Seinfeld," semicolon. 159.
17947
17:08:32,112 --> 17:08:37,122
So now I could do, insert\ninto genres a show ID
17948
17:08:37,122 --> 17:08:45,852
and a genre name, the values 159, and,\n
17949
17:08:46,601 --> 17:08:50,051
And now, if I scroll back in my history\n
17950
17:08:50,052 --> 17:08:52,032
again, looking for\nall distinct comedies
17951
17:08:52,031 --> 17:08:54,442
now Seinfeld has made the list.
17952
17:08:54,442 --> 17:08:57,978
But I did this manually so I\ndidn't actually capitalize it.
17953
17:09:03,042 --> 17:09:08,781
Set title equal to Seinfeld, semicolon.
17954
17:09:09,281 --> 17:09:13,121
OK, thank you, where title equals,\nquote, unquote, "Seinfeld."
17955
17:09:13,122 --> 17:09:14,862
Let's not make that mistake again.
17956
17:09:15,671 --> 17:09:18,612
And now, if I execute that really\nbig query, now Seinfeld is
17957
17:09:18,612 --> 17:09:21,822
indeed, considered a comedy.
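The insert, lookup, and update steps just walked through can be sketched as below, again with Python's sqlite3 module on an empty in-memory database. The sketch assumes the title was first typed in lowercase, and the assigned ID will differ from the 159 seen in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
""")

# Insert only a title (lowercase on purpose); SQLite assigns the id.
conn.execute("INSERT INTO shows (title) VALUES (?)", ("seinfeld",))

# Look up the id that was assigned rather than guessing it.
(show_id,) = conn.execute(
    "SELECT id FROM shows WHERE title = ?", ("seinfeld",)
).fetchone()

# Associate the new show with a genre.
conn.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, "Comedy"))

# Fix the capitalization. The WHERE clause matters: without it, every row is updated.
conn.execute("UPDATE shows SET title = ? WHERE title = ?", ("Seinfeld", "seinfeld"))

comedies = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM shows WHERE id IN "
        "(SELECT show_id FROM genres WHERE genre = 'Comedy')"
    )
]
print(comedies)
```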
17958
17:09:21,822 --> 17:09:23,302
So where are we going with this?
17959
17:09:23,302 --> 17:09:25,582
Well, thus far we've been doing\nall of this pretty manually.
17960
17:09:25,582 --> 17:09:28,122
And this is absolutely what an\n
17961
17:09:28,122 --> 17:09:30,539
might do if just manipulating\na pretty large data set just
17962
17:09:30,538 --> 17:09:33,351
to get at interesting answers\nthat might be across one
17963
17:09:33,351 --> 17:09:34,781
two, or even many more tables.
17964
17:09:34,781 --> 17:09:37,781
Eventually, in a few weeks, we're\n
17965
17:09:37,781 --> 17:09:41,951
by writing code in Python\nthat generates SQL to do this.
17966
17:09:41,951 --> 17:09:44,621
If you go to most any website\non the internet today
17967
17:09:44,622 --> 17:09:48,762
and you, for instance, log in, odds are\n
17968
17:09:50,771 --> 17:09:53,351
Well, the website might not\nbe implemented in Python
17969
17:09:53,351 --> 17:09:56,951
but it's probably implemented in some\n
17970
17:09:58,451 --> 17:10:03,671
And that language is probably using\n
17971
17:10:03,671 --> 17:10:07,271
to use SQL to get your\nusername, get your password
17972
17:10:07,271 --> 17:10:09,338
and compare the two against\nwhat you've typed in.
17973
17:10:09,338 --> 17:10:11,921
And actually, it's hopefully not\ngetting your actual password
17974
17:10:11,921 --> 17:10:13,504
but something called the hash thereof.
17975
17:10:13,504 --> 17:10:15,701
But there's probably a\ndatabase involved doing that.
17976
17:10:15,701 --> 17:10:18,941
When you buy something on\nAmazon.com and you click Check Out
17977
17:10:18,942 --> 17:10:22,062
odds are there's some\ncode on Amazon's server
17978
17:10:22,061 --> 17:10:25,211
that's looking at what it is\nyou added to your shopping cart
17979
17:10:25,211 --> 17:10:29,031
and then maybe using a for loop of some\n
17980
17:10:29,031 --> 17:10:33,521
It's doing a whole bunch of SQL\n
17981
17:10:34,631 --> 17:10:37,601
There's other types of databases,\ntoo, but SQL databases
17982
17:10:37,601 --> 17:10:39,989
or relational databases\nare quite popular.
17983
17:10:39,989 --> 17:10:42,072
So let's go ahead and write\none other program here
17984
17:10:42,072 --> 17:10:46,661
in Python that now merges these\ntwo languages together, whereby
17985
17:10:46,661 --> 17:10:50,081
I'm going to use SQL\ninside of a Python program
17986
17:10:50,082 --> 17:10:53,772
so I can implement my logic\nof my program in Python
17987
17:10:55,391 --> 17:10:59,601
But when I want to get at some data I\n
17988
17:10:59,601 --> 17:11:02,531
So let me go ahead\nand open favorites.py.
17989
17:11:05,082 --> 17:11:10,542
And let me go ahead and throw away\n
17990
17:11:10,542 --> 17:11:13,031
just now add SQL to the mix.
17991
17:11:13,031 --> 17:11:16,781
From the CS50 library, let's\nimport the SQL function.
17992
17:11:16,781 --> 17:11:19,601
This will be useful to use\nbecause most third-party libraries
17993
17:11:19,601 --> 17:11:22,731
that deal with SQL and Python are\n
17994
17:11:22,732 --> 17:11:25,732
So I think you'll find\nthis library easier to use.
17995
17:11:25,732 --> 17:11:27,122
Let's then do the following.
17996
17:11:27,122 --> 17:11:29,242
Create a variable\ncalled db for database.
17997
17:11:29,241 --> 17:11:30,741
But I could call it anything I want.
17998
17:11:30,741 --> 17:11:34,432
Let's use that URI, which is\na fancy way of saying something
17999
17:11:34,432 --> 17:11:43,131
that looks like a URL, but that actually\n
18000
17:11:44,451 --> 17:11:47,961
Let's now ask the user for a title by\n
18001
17:11:49,161 --> 17:11:53,301
And let's strip off any whitespace\n
18002
17:11:53,302 --> 17:11:56,072
And then, let's go ahead and do this.
18003
17:11:57,561 --> 17:12:01,911
I'm going to go ahead now and write\n
18004
17:12:01,911 --> 17:12:05,181
to talk to the original favorites.db.
18005
17:12:05,182 --> 17:12:09,322
So again, I'm not using the two-table\n
18006
17:12:09,322 --> 17:12:12,661
I'm using the original that we\nimported from your own data
18007
17:12:12,661 --> 17:12:14,431
and I'm going to do the following.
18008
17:12:14,432 --> 17:12:19,491
I'm going to use db.execute to execute\n
18009
17:12:19,491 --> 17:12:28,042
I'm going to select the count\nof shows from the favorites
18010
17:12:28,042 --> 17:12:35,302
table, where the title the user\n
18011
17:12:35,302 --> 17:12:37,252
And why I'm doing that is as follows.
18012
17:12:37,252 --> 17:12:40,942
Just like in C, when we had\npercent S, in SQL for now
18013
17:12:40,942 --> 17:12:42,832
the analogue is going\nto be a question mark.
18014
17:12:42,832 --> 17:12:44,332
So same idea, different syntax.
18015
17:12:44,332 --> 17:12:46,522
Instead of percent S,\nit's just a question mark.
18016
17:12:46,521 --> 17:12:51,621
And using a comma outside of this\n
18017
17:12:51,622 --> 17:12:54,682
function I can pass in\na SQL string, a command
18018
17:12:54,682 --> 17:12:59,177
then any arguments I want to plug\n
18019
17:12:59,177 --> 17:13:01,552
So the goal at hand is to\nactually write a program that's
18020
17:13:01,552 --> 17:13:07,762
going to search favorites.csv, a.k.a.,\n
18021
17:13:07,762 --> 17:13:10,641
of people that liked a particular show.
18022
17:13:10,641 --> 17:13:14,391
So this is going to select the count\n
18023
17:13:14,391 --> 17:13:18,741
where the title they typed in is like\n
18024
17:13:19,311 --> 17:13:22,072
This db execute function returns a list.
18025
17:13:23,154 --> 17:13:25,529
And you would only know that\nby my telling you or reading
18026
17:13:26,631 --> 17:13:29,481
And therefore, if I want to\nget back to the total count
18027
17:13:29,482 --> 17:13:34,282
I'm going to go ahead and grab\nthe first row from those rows.
18028
17:13:34,281 --> 17:13:36,561
Because it's only going\nto give me back the count.
18029
17:13:36,561 --> 17:13:41,781
And then I'm going to go ahead and\n
18030
17:13:41,781 --> 17:13:43,281
But it's going to be a little weird.
18031
17:13:43,281 --> 17:13:46,762
Technically the column is going to be\n
18032
17:13:47,762 --> 17:13:49,522
Let me add one more feature to the mix.
18033
17:13:49,521 --> 17:13:51,621
You can actually give\nnicknames to columns
18034
17:13:51,622 --> 17:13:55,432
that are coming back, especially if they\n
18035
17:13:55,432 --> 17:13:59,512
I can just call that column\ncounter, in all lowercase.
18036
17:13:59,512 --> 17:14:06,961
That means I can now say get back the\n
18037
17:14:06,961 --> 17:14:08,701
So just to recap, what have we done?
18038
17:14:08,701 --> 17:14:11,351
We've imported the CS50\nlibrary SQL function.
18039
17:14:11,351 --> 17:14:14,421
We've, with this line of\ncode, opened the favorites.db
18040
17:14:14,421 --> 17:14:20,152
file that you and I created earlier\n
18041
17:14:20,152 --> 17:14:23,482
I'm now just asking the user for\n
18042
17:14:23,482 --> 17:14:27,412
I'm now executing this SQL\nquery on that database
18043
17:14:27,411 --> 17:14:30,591
plugging in whatever the\nhuman typed in as their title
18044
17:14:30,591 --> 17:14:32,511
in order to get back a total count.
18045
17:14:32,512 --> 17:14:36,292
And I'm giving the count a\nnickname, an alias of counter
18046
17:14:36,292 --> 17:14:39,202
just so it's more self-explanatory.
18047
17:14:39,201 --> 17:14:43,671
This function, db execute, no matter\n
18048
17:14:43,671 --> 17:14:45,811
even if there's only\none row inside of it.
18049
17:14:45,811 --> 17:14:48,651
So this line of code just gives\nme the first and only row.
18050
17:14:48,652 --> 17:14:53,302
And then, this goes inside of that row,\n
18051
17:14:53,302 --> 17:14:59,072
and gives me the key counter\nand the value it corresponds to.
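A sketch of the same idea using Python's built-in sqlite3 module in place of the CS50 library, so it runs without extra installs. The favorites rows are made up, and setting row_factory to sqlite3.Row is this sketch's way of getting rows that can be indexed by column name, much like the dictionaries that db.execute is described as returning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be indexed by column name
conn.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")
conn.executemany(
    "INSERT INTO favorites (title, genres) VALUES (?, ?)",
    [("The Office", "Comedy"), ("the office", "Comedy"), ("Friends", "Comedy")],
)

title = "The Office".strip()  # stands in for input("Title: ").strip()

# The ? is a placeholder; the value is passed separately, never pasted into the SQL.
rows = conn.execute(
    "SELECT COUNT(*) AS counter FROM favorites WHERE title LIKE ?", (title,)
).fetchall()

row = rows[0]             # the first and only row
counter = row["counter"]  # the column was renamed with AS
print(counter)
```

Passing the title as a separate argument instead of formatting it into the string is what lets the library escape the user's input safely.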
18052
17:14:59,072 --> 17:15:00,682
So what, to be clear, is this doing?
18053
17:15:00,682 --> 17:15:03,561
Let's go ahead and run this manually\n
18054
17:15:03,561 --> 17:15:07,311
Let me run SQLite3 on favorites--
18055
17:15:08,722 --> 17:15:12,752
On favorites.db, let me\nimport the data again.
18056
17:15:12,752 --> 17:15:20,252
So .mode csv, then .import from\nfavorites.csv into a favorites table.
18057
17:15:20,252 --> 17:15:22,671
So I've just recreated the\nsame data set that you all
18058
17:15:22,671 --> 17:15:25,123
gave me earlier in favorites.db.
18059
17:15:25,124 --> 17:15:27,832
If I were to do this manually,\n
18060
17:15:27,832 --> 17:15:34,552
Select, count star from favorites,\nwhere title like, and let's
18061
17:15:34,552 --> 17:15:37,612
just manually type it\nin for now, The Office.
18062
17:15:37,612 --> 17:15:40,671
We'll search for the one\nwith the word The, semicolon.
18063
17:15:41,902 --> 17:15:44,122
But technically, notice what I get back.
18064
17:15:44,122 --> 17:15:50,422
I technically get back a miniature\n
18065
17:15:50,421 --> 17:15:52,432
What if I want to rename that column?
18066
17:15:52,432 --> 17:15:54,182
That's where the as keyword comes in.
18067
17:15:54,182 --> 17:15:56,421
So select count star as counter.
18068
17:15:58,252 --> 17:16:01,224
I just get back-- same\nsimple table, but I've
18069
17:16:01,224 --> 17:16:03,891
renamed the column to be counter\njust because it's a little more
18070
17:16:03,891 --> 17:16:05,752
self-explanatory as to what it is.
18071
17:16:05,752 --> 17:16:08,542
So what am I doing\nwith this line of code?
18072
17:16:08,542 --> 17:16:12,772
This line of code is returning to\n
18073
17:16:12,771 --> 17:16:16,161
in the form of a list of dictionaries.
18074
17:16:16,161 --> 17:16:20,961
The list contains one\nrow, as we'll see, and it
18075
17:16:20,961 --> 17:16:26,141
contains one column, as we'll\nsee, the key for which is counter.
18076
17:16:26,141 --> 17:16:27,881
So let's now run the code itself.
18077
17:16:27,881 --> 17:16:32,621
I'm going to get out of SQLite3 and I'm\n
18078
17:16:33,461 --> 17:16:34,881
I'm being prompted for a title.
18079
17:16:34,881 --> 17:16:39,131
I'm going to type in The Office and\n
18080
17:16:40,152 --> 17:16:42,792
Well, there's a typo again\nbecause I re-imported the CSV.
18081
17:16:42,792 --> 17:16:46,612
I had deleted two of the Thes, so\n
18082
17:16:46,612 --> 17:16:51,101
So there's 12 total that have,\nquote, unquote, "The Office."
18083
17:16:54,201 --> 17:16:57,161
We've combined some\nPython with some SQL
18084
17:16:57,161 --> 17:17:00,131
but we've relegated all of the\n
18085
17:17:00,131 --> 17:17:02,141
the selecting of something,\ngotten rid of all
18086
17:17:02,141 --> 17:17:04,902
of the with keyword, the\nopen keyword, the for loop
18087
17:17:04,902 --> 17:17:06,942
the reader, the DictReader,\nand all of that.
18088
17:17:06,942 --> 17:17:11,802
And it's just one line of SQL now,\n
18089
17:17:11,802 --> 17:17:17,062
All right, any questions on what we've\n
18090
17:17:26,411 --> 17:17:29,754
DAVID J. MALAN: When does this\n
18091
17:17:31,372 --> 17:17:33,582
So let's do that by changing\nthe problem at hand.
18092
17:17:33,582 --> 17:17:36,312
This program was designed just\nto select the total count.
18093
17:17:36,311 --> 17:17:41,411
Let's go ahead and\nselect, for instance, all
18094
17:17:41,411 --> 17:17:46,046
of the ways you all typed in The Office\n
18095
17:17:49,451 --> 17:17:53,811
If I do this in SQLite3, let\nme go ahead and do this again
18096
17:17:53,811 --> 17:17:55,311
after increasing my Terminal window.
18097
17:17:56,262 --> 17:18:00,912
Select title from favorites,\nwhere the title is like
18098
17:18:00,911 --> 17:18:04,176
quote, unquote, "The Office," semicolon.
18099
17:18:04,177 --> 17:18:07,302
I get back all of these different rows,\n
18100
17:18:07,302 --> 17:18:09,252
There's actually another\nlittle typo in there
18101
17:18:09,252 --> 17:18:12,972
with some capitalization of the\nE, and the C, and the E. That
18102
17:18:12,972 --> 17:18:16,182
would be an example of a query\nthat gives me back therefore
18103
17:18:17,421 --> 17:18:19,332
So let's now change my Python program.
18104
17:18:19,332 --> 17:18:24,882
If I now, in my Python program, do\n
18105
17:18:24,881 --> 17:18:26,451
containing all of those titles.
18106
17:18:26,451 --> 17:18:31,691
I can now do, for row in rows, I can\n
18107
17:18:31,692 --> 17:18:34,732
and now manipulate all\nof those things together.
18108
17:18:34,732 --> 17:18:36,102
Let me keep both on the screen.
18109
17:18:36,101 --> 17:18:37,661
Let me run Python of favorites.py.
18110
17:18:37,661 --> 17:18:41,661
And that for loop now should\niterate, what, 10 or more times
18111
17:18:41,661 --> 17:18:43,361
once for each of those titles.
18112
17:18:43,362 --> 17:18:47,351
And indeed, if I type in\nThe Office again, Enter.
18113
17:18:52,091 --> 17:18:55,394
Oh, I should not be renaming\ntitle to counter this time.
18114
17:18:55,394 --> 17:18:57,101
So that's just a dumb\nmistake on my part.
18115
17:18:58,752 --> 17:19:01,572
And now I should see after\ntyping in The Office
18116
17:19:01,572 --> 17:19:03,762
Enter, a whole bunch of The Offices.
18117
17:19:03,762 --> 17:19:05,862
And because I'm using\nlike, even the
18118
17:19:05,862 --> 17:19:08,891
miscapitalizations are coming through,\n
18119
17:19:08,891 --> 17:19:11,231
It doesn't matter if it's\nuppercase or lowercase.
18120
17:19:11,232 --> 17:19:15,642
Whereas had I used the equal sign\n
18121
17:19:17,277 --> 17:19:20,592
All right, any questions on this next?
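To make the difference between like and the equal sign concrete, here is a small sketch with Python's sqlite3 module. The titles are made-up stand-ins for the variously capitalized submissions, and the loop mirrors the for row in rows pattern from the program above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE favorites (title TEXT)")
conn.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("The Office",), ("the office",), ("ThE OffiCE",), ("Friends",)],
)

title = "The Office"

# LIKE ignores ASCII case, so every capitalization of the title matches.
like_rows = conn.execute(
    "SELECT title FROM favorites WHERE title LIKE ?", (title,)
).fetchall()
for row in like_rows:
    print(row["title"])

# The equal sign matches only the exact spelling.
equal_rows = conn.execute(
    "SELECT title FROM favorites WHERE title = ?", (title,)
).fetchall()

like_count, equal_count = len(like_rows), len(equal_rows)
```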
18122
17:19:20,591 --> 17:19:25,041
All right, so let's transition\nto a larger, juicier data
18123
17:19:25,042 --> 17:19:26,952
set, and consider some\nof the issues that
18124
17:19:26,951 --> 17:19:31,211
arise when actually now using SQL and\n
18125
17:19:31,211 --> 17:19:34,311
using SQL for mobile apps, web\napps, and generally speaking
18126
17:19:36,141 --> 17:19:39,432
So let's start with a larger\ndata set just like that.
18127
17:19:39,432 --> 17:19:45,141
Give me just a moment to switch screens\n
18128
17:19:45,141 --> 17:19:48,311
which is an actual relational\ndatabase that we've created out
18129
17:19:48,311 --> 17:19:51,881
of a real-world data set from IMDb.
18130
17:19:51,881 --> 17:19:54,551
So InternetMovieDatabase.com\nis a website
18131
17:19:54,552 --> 17:19:57,132
where you can search for TV\nshows, and movies, and actors
18132
17:19:57,131 --> 17:20:00,221
and so forth, all using their\ndatabase behind the scenes.
18133
17:20:00,222 --> 17:20:04,872
IMDb wonderfully makes their data\n
18134
17:20:04,872 --> 17:20:08,302
but TSV files, tab-separated values.
18135
17:20:08,302 --> 17:20:11,802
And so what we did is, before class\n
18136
17:20:11,802 --> 17:20:15,641
We wrote a Python program\nsimilar to my favorites8.py file
18137
17:20:15,641 --> 17:20:19,091
earlier that read in\nall of those TSV files
18138
17:20:19,091 --> 17:20:24,161
created some SQL tables\nin an IMDb database
18139
17:20:24,161 --> 17:20:28,611
for you in SQLite that has multiple\ntables and multiple columns.
18140
17:20:28,612 --> 17:20:32,531
So let's go and wrap our minds around\n
18141
17:20:32,531 --> 17:20:36,281
Let me go back to VS Code\nhere, and in just a moment
18142
17:20:36,281 --> 17:20:40,601
I'm going to go ahead and copy the\n
18143
17:20:40,601 --> 17:20:45,851
And I'm going to go ahead and increase\n
18144
17:20:45,851 --> 17:20:48,911
Whenever playing around with a\n
18145
17:20:48,911 --> 17:20:51,822
typing .schema is perhaps a good\n
18146
17:20:52,752 --> 17:20:54,461
And things just escalated quickly.
18147
17:20:54,461 --> 17:20:56,981
There's a lot in this data\nset, because, indeed, there's
18148
17:20:56,982 --> 17:21:01,092
going to be tens of hundreds of\n
18149
17:21:01,091 --> 17:21:04,572
and also problem set 7, where we'll\n
18150
17:21:06,262 --> 17:21:09,281
So what is the schema that\nwe have created for you
18151
17:21:09,281 --> 17:21:12,491
from IMDb's actual real-world data?
18152
17:21:12,491 --> 17:21:14,292
One, there's a table called shows.
18153
17:21:14,292 --> 17:21:17,292
And notice we've just added whitespace\n
18154
17:21:17,292 --> 17:21:19,391
to make it a little more\nstylistically readable.
18155
17:21:19,391 --> 17:21:23,322
The shows table has an ID\ncolumn, a title column, a year
18156
17:21:23,322 --> 17:21:26,082
and the total number of\nepisodes for a given show.
18157
17:21:26,082 --> 17:21:31,092
And the types of those columns are\n
18158
17:21:31,091 --> 17:21:33,431
So it turns out there's\nactually a few different data
18159
17:21:33,432 --> 17:21:39,192
types that are worth being aware of when\n
18160
17:21:39,192 --> 17:21:43,512
In fact, in SQLite there's\nfive data types, and only five
18161
17:21:43,512 --> 17:21:46,992
fortunately, one of which is, indeed,\n
18162
17:21:46,991 --> 17:21:50,351
numeric, which is kind of a\ncatchall for dates and times
18163
17:21:50,351 --> 17:21:52,661
things that are numeric\nbut are not just integers
18164
17:21:52,661 --> 17:21:54,851
and not just real numbers, for instance.
18165
17:21:54,851 --> 17:21:58,362
Real number is what we've generally\n
18166
17:21:58,362 --> 17:22:00,400
Text, of course, is\njust text, but notice
18167
17:22:00,400 --> 17:22:02,442
that you don't have to\nworry about how big it is.
18168
17:22:02,442 --> 17:22:04,452
Like in Python, it will size to fit.
18169
17:22:04,451 --> 17:22:07,182
And then there's BLOB, which\nis binary large object, which
18170
17:22:07,182 --> 17:22:10,641
is for just raw 0s and 1s, like\nfor files or things like that.
18171
17:22:10,641 --> 17:22:12,911
But we'll generally use\nthe other four of these.
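A minimal sketch of the data types just described, assuming Python's built-in `sqlite3` module and an in-memory database. The table name `examples` and its values are made up for illustration; `typeof()` is SQLite's built-in function that reports how a value ended up being stored.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table with one column per type named in the lecture.
conn.execute(
    "CREATE TABLE examples (i INTEGER, n NUMERIC, r REAL, t TEXT, b BLOB)"
)
conn.execute(
    "INSERT INTO examples VALUES (?, ?, ?, ?, ?)",
    (188, "2005", 3.14, "The Office", b"\x00\x01"),
)

# Ask SQLite how each value was actually stored.
row = conn.execute(
    "SELECT typeof(i), typeof(n), typeof(r), typeof(t), typeof(b) FROM examples"
).fetchone()
print(row)
```

Note that the NUMERIC column received the string "2005" and stored it as an integer, which is consistent with NUMERIC acting as a catchall.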
18172
17:22:12,911 --> 17:22:16,301
And so, indeed, when we\nimported this data for you
18173
17:22:16,302 --> 17:22:21,612
we decided that every show would be\n
18174
17:22:21,612 --> 17:22:24,802
Every show has, of course, a\ntitle, which should not be null.
18175
17:22:24,802 --> 17:22:26,662
Otherwise, why is it in the database?
18176
17:22:26,661 --> 17:22:30,171
Every show has a year,\nwhich is numeric according
18177
17:22:30,171 --> 17:22:31,521
to that definition a moment ago.
18178
17:22:31,521 --> 17:22:34,881
And the total number of episodes for\n
18179
17:22:34,881 --> 17:22:38,451
What now is with these primary keys\n
18180
17:22:38,451 --> 17:22:43,432
A primary key is the column that\n
18181
17:22:43,432 --> 17:22:46,461
In our case, with the\nfavorites, I automatically
18182
17:22:46,461 --> 17:22:50,091
gave each of your submissions a unique\n
18183
17:22:50,091 --> 17:22:52,701
typed in The Office,\nyour submission still
18184
17:22:52,701 --> 17:22:57,651
had a unique identifier, a number\n
18185
17:22:57,652 --> 17:23:01,671
with your genres, just\nas we saw a moment ago.
18186
17:23:01,671 --> 17:23:04,621
In this version of IMDb,\nthere's also genres.
18187
17:23:04,622 --> 17:23:07,461
But they don't come from\nus, they come from IMDb.com.
18188
17:23:07,461 --> 17:23:11,661
And so a genre has a show ID, and\n
18189
17:23:11,661 --> 17:23:15,231
But these are real-world genres\nwith a bit more filtration.
18190
17:23:15,232 --> 17:23:19,862
Notice, though, just like my\nversion, there's a foreign key.
18191
17:23:19,862 --> 17:23:25,311
A foreign key is the appearance\nof another table's primary key
18192
17:23:27,391 --> 17:23:30,442
So when you have a table\nlike genres, which is somehow
18193
17:23:30,442 --> 17:23:36,262
cross referencing the original shows\n
18194
17:23:36,262 --> 17:23:40,612
called ID, and those same numbers\nappear in the genres table
18195
17:23:40,612 --> 17:23:45,592
under the column called show ID, by\n
18196
17:23:45,591 --> 17:23:47,991
It's the same numbers but\nit's foreign in the sense
18197
17:23:47,991 --> 17:23:50,601
that the number is being\nused in this table
18198
17:23:50,601 --> 17:23:54,472
even though it's officially defined\n
18199
17:23:54,472 --> 17:23:57,112
This is what we mean by\nrelational databases.
18200
17:23:57,112 --> 17:24:02,512
You have multiple tables with some\n
18201
17:24:02,512 --> 17:24:06,202
And those numbers allow you to line\n
18202
17:24:06,201 --> 17:24:09,381
that you can reconnect the\nshows with their genres
18203
17:24:09,381 --> 17:24:12,141
just like we did with our\nsmaller data set a moment ago.
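The primary key and foreign key relationship described above can be sketched as follows, assuming Python's built-in `sqlite3` module. The two-table schema is a simplified guess at the one shown on screen, and the id `1` and the genre `Comedy` are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE shows (
        id INTEGER,
        title TEXT NOT NULL,
        year NUMERIC,
        episodes INTEGER,
        PRIMARY KEY(id)
    );
    CREATE TABLE genres (
        show_id INTEGER NOT NULL,
        genre TEXT NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id)
    );
    INSERT INTO shows (id, title, year, episodes) VALUES (1, 'The Office', 2005, 188);
    INSERT INTO genres (show_id, genre) VALUES (1, 'Comedy');
""")

# The shared number lets the two tables be lined up again.
rows = conn.execute(
    "SELECT title, genre FROM shows JOIN genres ON shows.id = genres.show_id"
).fetchall()
print(rows)

# A genre that points at a nonexistent show is rejected.
try:
    conn.execute("INSERT INTO genres (show_id, genre) VALUES (999, 'Drama')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```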
18204
17:24:12,141 --> 17:24:14,391
This logic is extended further.
18205
17:24:14,391 --> 17:24:18,411
Notice that the IMDb database we've\n
18206
17:24:18,411 --> 17:24:22,072
like TV show stars, the actors therein.
18207
17:24:22,072 --> 17:24:25,552
And that table, interestingly,\nhas no mention of people
18208
17:24:25,552 --> 17:24:27,562
and no mention of shows, per se.
18209
17:24:27,561 --> 17:24:31,072
It only has a column called\nshow ID, which is an integer
18210
17:24:31,072 --> 17:24:33,561
and a person ID, which is an integer.
18211
17:24:33,561 --> 17:24:39,661
Meanwhile, if we scrolled\ndown to the bottom
18212
17:24:39,661 --> 17:24:42,831
you will see a table called people.
18213
17:24:42,832 --> 17:24:48,351
And we have decided in IMDb's world\n
18214
17:24:48,351 --> 17:24:52,851
will have a unique identifier that's\n
18215
17:24:52,851 --> 17:24:56,841
date, which is numeric, and\nthen, again, specifying that ID
18216
17:24:56,841 --> 17:25:00,691
is going to be their primary key.
18217
17:25:02,281 --> 17:25:07,981
Well, it turns out that TV stars and\n
18218
17:25:07,982 --> 17:25:13,072
So using this relational database,\n
18219
17:25:13,072 --> 17:25:15,112
We're factoring out commonalities.
18220
17:25:15,112 --> 17:25:17,912
And if a person can be\ndifferent things in life
18221
17:25:17,911 --> 17:25:20,601
well, we're defining them\nfirst and foremost as people.
18222
17:25:20,601 --> 17:25:23,491
And then, notice these two\ntables are almost the same.
18223
17:25:23,491 --> 17:25:26,002
The stars table has a show\nID, which is a number
18224
17:25:26,002 --> 17:25:28,012
and a person ID, which\nis a number, which
18225
17:25:28,012 --> 17:25:36,052
allows us via this middleman table, if\n
18226
17:25:36,052 --> 17:25:41,422
Similarly, the writers table allows\n
18227
17:25:41,421 --> 17:25:43,561
by just recording those numbers.
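A small sketch of the "middleman" tables described above, assuming Python's built-in `sqlite3` module. The column names follow the spoken description (a show ID and a person ID); the person, the show, and their ids are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE stars (
        show_id INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id),
        FOREIGN KEY(person_id) REFERENCES people(id)
    );
    CREATE TABLE writers (
        show_id INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id),
        FOREIGN KEY(person_id) REFERENCES people(id)
    );
    -- One hypothetical person who both stars in and writes the same show.
    INSERT INTO people VALUES (1, 'Person A', 1970);
    INSERT INTO shows VALUES (10, 'Show X');
    INSERT INTO stars VALUES (10, 1);
    INSERT INTO writers VALUES (10, 1);
""")

# The person is defined once; each role is just a pair of numbers.
starred = conn.execute("SELECT show_id FROM stars WHERE person_id = 1").fetchall()
wrote = conn.execute("SELECT show_id FROM writers WHERE person_id = 1").fetchall()
print(starred, wrote)
```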
18228
17:25:43,561 --> 17:25:46,322
So if we go into this data\nset, let's do the following.
18229
17:25:46,322 --> 17:25:49,682
Let's do select star\nfrom people semicolon.
18230
17:25:49,682 --> 17:25:52,372
So a huge amount of data is coming back.
18231
17:25:52,372 --> 17:25:56,822
This is hundreds of thousands of rows\n
18232
17:25:56,822 --> 17:25:59,661
So this is real-world data\nnow flying across the screen.
18233
17:25:59,661 --> 17:26:03,501
There's a lot of people in the TV show\n
18234
17:26:06,021 --> 17:26:07,318
There's a lot of data there.
18235
17:26:07,319 --> 17:26:09,652
So my god, if you had to do\nanything manual in this data
18236
17:26:09,652 --> 17:26:12,002
set it's probably not going\nto work out very well.
18237
17:26:12,002 --> 17:26:14,932
And actually, we're up to, what,\na million people in this data
18238
17:26:14,932 --> 17:26:17,332
set, plus, which would mean\nthis probably isn't even
18239
17:26:17,332 --> 17:26:20,932
going to open very well in Excel, or\n
18240
17:26:20,932 --> 17:26:23,042
SQL probably is the\nbetter approach here.
18241
17:26:23,042 --> 17:26:25,702
Let's search for someone\nspecific, like select star
18242
17:26:25,701 --> 17:26:31,401
from people, where name equals\n
18243
17:26:32,044 --> 17:26:33,502
All right, so there's Steve Carell.
18244
17:26:33,502 --> 17:26:39,442
He is person number\n136,797, born in 1962.
18245
17:26:39,442 --> 17:26:41,882
And that's as much data as\nwe have on Steve Carell here.
18246
17:26:41,881 --> 17:26:44,551
How do we figure out what\nshows, for instance, he's in?
18247
17:26:44,552 --> 17:26:48,842
Well, let's see, select\nstar from shows, semicolon.
18248
17:26:48,841 --> 17:26:52,521
There's a crazy number of shows\nout there in the IMDb database.
18249
17:26:52,521 --> 17:26:55,491
And you can see it here again\nflying across the screen.
18250
17:26:55,491 --> 17:26:58,972
Feels like we're going to have to\n
18251
17:26:58,972 --> 17:27:02,402
to get at all of Steve Carell's shows.
18252
17:27:02,402 --> 17:27:04,562
So how are we going to do that?
18253
17:27:04,561 --> 17:27:07,072
Well, god, this is a lot of data here.
18254
17:27:07,072 --> 17:27:10,461
And in fact, yeah, we\nhave, what, 15 million
18255
17:27:10,461 --> 17:27:12,622
shows plus in this data set, too.
18256
17:27:12,622 --> 17:27:15,682
So doing things efficiently is\nnow going to start to matter.
18257
17:27:17,131 --> 17:27:18,801
Let me select a specific show.
18258
17:27:18,802 --> 17:27:23,932
Select star from shows where title\n
18259
17:27:23,932 --> 17:27:26,262
And there presumably shouldn't\nbe typos in this data
18260
17:27:26,262 --> 17:27:28,812
because it comes from the\nreal website IMDb.com.
18261
17:27:30,341 --> 17:27:33,551
Turns out there's been a lot of\nThe Offices out in the world.
18262
17:27:33,552 --> 17:27:37,512
The one that started in 2005\nis the one that we want
18263
17:27:37,512 --> 17:27:40,332
presumably the most\npopular with 188 episodes.
18264
17:27:41,561 --> 17:27:46,991
Maybe we could do and year\nequals, how about 2005?
18265
17:27:46,991 --> 17:27:50,591
All right, so now we've got\nback just the ID of The Office
18266
17:27:52,572 --> 17:27:55,451
Let me turn on a timer\nwithin SQLite just
18267
17:27:55,451 --> 17:27:57,131
to get a sense of running time now.
18268
17:27:58,451 --> 17:28:01,391
Select star from shows, where\ntitle equals The Office
18269
17:28:04,192 --> 17:28:05,711
Let's just do titles for now.
18270
17:28:06,792 --> 17:28:08,802
All right, so not terribly long.
18271
17:28:08,802 --> 17:28:12,332
It found it pretty fast, but it looks\n
18272
17:28:12,332 --> 17:28:15,351
0.02 seconds, not bad for just a title.
18273
17:28:15,351 --> 17:28:18,551
But just to plant a seed, it\nturns out that we can probably
18274
17:28:20,271 --> 17:28:23,561
Let me create something called an\n
18275
17:28:23,561 --> 17:28:25,511
in CRUD for creating something.
18276
17:28:25,512 --> 17:28:28,152
And I'm going to call this title index.
18277
17:28:28,152 --> 17:28:32,682
And I'm going to create\nit on the shows table
18278
17:28:32,682 --> 17:28:34,614
specifically on the title column.
18279
17:28:34,614 --> 17:28:37,031
And we'll see in a moment what\nthis is going to do for me.
18280
17:28:38,262 --> 17:28:42,472
Took a moment, like 0.349 seconds,\n
18281
17:28:42,472 --> 17:28:46,932
But now watch, if I select star from\n
18282
17:28:46,932 --> 17:28:49,572
previously it took me 0.021 seconds.
18283
17:28:53,021 --> 17:28:56,531
Literally no time at all, or so low\n
18284
17:28:56,531 --> 17:28:58,811
And I'll do it again just\nto get a sense of things.
18285
17:29:00,201 --> 17:29:05,362
Now even though 0.021 seconds, not crazy\n
18286
17:29:05,362 --> 17:29:07,902
a lot of users running a real\nwebsite or real mobile app.
18287
17:29:07,902 --> 17:29:11,322
Every millisecond we can start to\n
18288
17:29:13,152 --> 17:29:17,171
Well, we actually just created\nsomething called an index.
18289
17:29:17,171 --> 17:29:19,301
And this is a nice way\nto tie in, now, some
18290
17:29:19,302 --> 17:29:21,762
of our week 5 discussion\nof data structures
18291
17:29:21,762 --> 17:29:23,592
and our week 3 discussion\nof running times.
18292
17:29:23,591 --> 17:29:26,621
An index in a database is\nsome kind of fancy data
18293
17:29:26,622 --> 17:29:31,452
structure that allows the database\n
18294
17:29:31,451 --> 17:29:35,921
Literally, as you just saw, these\n
18295
17:29:37,491 --> 17:29:39,641
And so when I first\nsearched for The Office
18296
17:29:39,641 --> 17:29:43,122
it was literally doing linear search,\n
18297
17:29:46,211 --> 17:29:48,671
It's not that slow, 0.021 seconds.
18298
17:29:48,671 --> 17:29:52,362
But that's relatively slow just\ntheoretically, algorithmically
18299
17:29:53,921 --> 17:29:57,432
But if you instead create\nan index using syntax
18300
17:29:57,432 --> 17:30:03,222
like this, which I just did, creating an\n
18301
17:30:03,222 --> 17:30:06,561
table, that's like giving the\ndatabase a clue in advance saying
18302
17:30:06,561 --> 17:30:10,002
hey, I know I'm going to search on\n
18303
17:30:10,002 --> 17:30:12,771
Do something with data\nstructures to speed things up.
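One way to see the effect of the index without relying on timings is to ask SQLite for its query plan. This is a sketch using Python's built-in `sqlite3` module; the table contents are synthetic, and the exact wording of the plan varies between SQLite versions, so the comments only state what to look for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
conn.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [(f"Show {i}",) for i in range(10000)] + [("The Office",)],
)

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM shows WHERE title = 'The Office'"

before = plan(query)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX title_index ON shows (title)")
after = plan(query)   # with the index: a search that names title_index

print(before)
print(after)
```

In the `sqlite3` command-line shell, `.timer on` is the setting that prints the running times quoted in the lecture.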
18304
17:30:12,771 --> 17:30:15,371
And so if you think back to our\ndiscussion of data structures
18305
17:30:17,061 --> 17:30:21,401
Maybe it's using a trie or a hash\n
18306
17:30:21,402 --> 17:30:25,272
structure is generally going to lift\n
18307
17:30:26,152 --> 17:30:28,961
So it's just much faster\nto find data, especially
18308
17:30:28,961 --> 17:30:31,841
if it's sorting it now\nbased on title, and not
18309
17:30:31,841 --> 17:30:33,371
just storing it in one long list.
18310
17:30:33,372 --> 17:30:35,562
And in fact, in the world\nof relational databases
18311
17:30:35,561 --> 17:30:37,901
the type of structure that's\noften used in a database
18312
17:30:37,902 --> 17:30:39,262
is something called a B-tree.
18313
17:30:40,512 --> 17:30:44,382
Different use of the letter B, but it\n
18314
17:30:45,161 --> 17:30:47,021
It's not binary because\nsome of the nodes
18315
17:30:47,021 --> 17:30:50,231
might have more than\ntwo children or fewer
18316
17:30:50,232 --> 17:30:53,532
but it's a very wide but\nrelatively shallow tree.
18317
17:30:55,332 --> 17:30:59,112
And the upside of that is that if\n
18318
17:30:59,112 --> 17:31:01,402
the database can find it more quickly.
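To make "wide but relatively shallow" concrete, here is a rough back-of-the-envelope sketch in Python. The row count and the figure of 100 children per node are assumptions chosen for illustration, not SQLite's actual parameters.

```python
def levels_needed(n_items, children_per_node):
    """Count the levels a balanced tree needs so its bottom level can reach n_items."""
    levels, reachable = 0, 1
    while reachable < n_items:
        reachable *= children_per_node
        levels += 1
    return levels

n = 1_000_000                              # assumed number of rows
binary_levels = levels_needed(n, 2)        # at most two children per node
wide_levels = levels_needed(n, 100)        # assumed 100 children per node

print(binary_levels)  # 20
print(wide_levels)    # 3
```

Fewer levels means fewer steps from the root to the row being searched for, compared with up to a million steps for a linear search over the same rows.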
18319
17:31:01,402 --> 17:31:06,612
And the reason it took half a second,\n
18320
17:31:06,612 --> 17:31:10,601
is because SQLite needed to take\nsome non-zero amount of time
18321
17:31:10,601 --> 17:31:12,972
to just build up this tree in memory.
18322
17:31:12,972 --> 17:31:17,241
And it has algorithms for doing so based\n
18323
17:31:17,241 --> 17:31:20,682
But you spend a bit of time\nup front, a third of a second.
18324
17:31:22,811 --> 17:31:25,811
Every subsequent query, if I\nkeep doing it again and again
18325
17:31:25,811 --> 17:31:29,381
is going to be crazy\nlow, 0.000, maybe 0.001.
18326
17:31:29,381 --> 17:31:33,581
But an order of magnitude, a\nfactor of 10 or 100 faster than it
18327
17:31:36,131 --> 17:31:39,701
So we have these indexes which\nallow us to get at data faster.
18328
17:31:39,701 --> 17:31:42,671
But what if we want to\nactually get data that's
18329
17:31:42,671 --> 17:31:44,711
now across these multiple tables?
18330
17:31:45,582 --> 17:31:48,402
And how might these indices\nor indexes help further?
18331
17:31:48,402 --> 17:31:52,241
Well, it turns out there is\na way that we've seen already
18332
17:31:52,241 --> 17:31:54,851
indirectly to join two tables together.
18333
17:31:54,851 --> 17:31:58,752
Previously, when I selected\nthe ID of The Office
18334
17:31:58,752 --> 17:32:03,082
and then I searched for it in the other\n
18335
17:32:03,082 --> 17:32:05,752
I was joining two tables together.
18336
17:32:05,752 --> 17:32:08,241
And it turns out there's a\ncouple of ways to do this.
18337
17:32:08,241 --> 17:32:11,891
Let's go ahead now and, for instance,\n
18338
17:32:11,891 --> 17:32:14,021
Not just The Office\nbut all of them, too.
18339
17:32:14,021 --> 17:32:21,656
Unfortunately, if we look at our schema,\n
18340
17:32:21,656 --> 17:32:27,201
oh, shows over here has no\nmention of the TV stars in them.
18341
17:32:27,201 --> 17:32:30,471
And people have no mention of shows.
18342
17:32:30,472 --> 17:32:34,702
We somehow need to use this\ntable here to connect the two.
18343
17:32:34,701 --> 17:32:40,161
And this is called a join table, in the\n
18344
17:32:40,161 --> 17:32:43,131
it joins the two tables\ntogether logically.
18345
17:32:43,131 --> 17:32:47,091
And so if you're savvy enough with SQL,\n
18346
17:32:47,091 --> 17:32:51,351
earlier and like recombine\ntables by using these common IDs
18347
17:32:53,671 --> 17:32:58,072
Let me go ahead and figure out,\n
18348
17:32:58,072 --> 17:32:59,311
So how am I going to do this?
18349
17:32:59,311 --> 17:33:04,461
Well, if I select star from people,\n
18350
17:33:04,461 --> 17:33:06,182
fortunately, there's only one of them.
18351
17:33:06,182 --> 17:33:12,141
So this gives me back his name,\nhis ID, and his birth year.
18352
17:33:12,141 --> 17:33:14,302
But it's really only his\nID that I care about.
18353
17:33:15,021 --> 17:33:20,841
Because in order to get back his shows,\n
18354
17:33:20,841 --> 17:33:22,981
So I need to know his ID number.
18355
17:33:22,982 --> 17:33:24,932
So what could I do with this?
18356
17:33:24,932 --> 17:33:29,572
Well, remember the schema\nand the stars table.
18357
17:33:29,572 --> 17:33:33,171
I've just gotten, from the\npeople table, Steve Carell's ID.
18358
17:33:33,171 --> 17:33:38,871
I bet by transitivity I could\nnow use his person ID, his ID
18359
17:33:38,872 --> 17:33:41,242
to get back all of his show IDs.
18360
17:33:41,241 --> 17:33:44,511
And then once I've got all of his show\n
18361
17:33:44,512 --> 17:33:46,672
and get back all of his shows' titles.
18362
17:33:46,671 --> 17:33:50,781
So the answer is actually English\n
18363
17:33:51,631 --> 17:33:52,981
So let me go ahead and do this.
18364
17:33:52,982 --> 17:33:57,082
Let me, again, get Steve\nCarell's ID number, but not star.
18365
17:33:58,402 --> 17:34:00,952
It's a wildcard character in SQL.
18366
17:34:00,951 --> 17:34:03,651
Let me just select the\nID of Steve Carell.
18367
17:34:03,652 --> 17:34:06,982
And that gives me back 136,797.
18368
17:34:06,982 --> 17:34:08,782
And it's only giving me back one value.
18369
17:34:08,781 --> 17:34:12,051
The thing called ID is just\nthe column heading up above.
18370
17:34:12,052 --> 17:34:16,702
Now, suppose I want to\nselect all of the show IDs
18371
17:34:16,701 --> 17:34:18,801
that Steve Carell is affiliated with.
18372
17:34:18,802 --> 17:34:25,912
Let me select show ID from stars,\nwhere the person ID in stars
18373
17:34:25,911 --> 17:34:28,822
happens to equal Steve Carell's ID.
18374
17:34:28,822 --> 17:34:32,661
So again, I'm building up my answer in\n
18375
17:34:32,661 --> 17:34:36,771
On the right, in parentheses,\nI'm getting Steve Carell's ID.
18376
17:34:36,771 --> 17:34:40,671
On the left, I am now\nselecting all of the show IDs
18377
17:34:40,671 --> 17:34:44,752
that have some connection with\n
18378
17:34:44,752 --> 17:34:47,192
This answer, too, is not\ngoing to be that illuminating.
18379
17:34:47,192 --> 17:34:50,762
It's just a whole bunch of integers\n
18380
17:34:50,762 --> 17:34:53,112
But let's take this one step further.
18381
17:34:53,112 --> 17:34:54,862
And even though my\ncode is getting long, I
18382
17:34:54,862 --> 17:34:57,262
could hit Enter and format\nit nicely, especially
18383
17:34:57,262 --> 17:34:59,002
if I were doing this in a code file.
18384
17:34:59,002 --> 17:35:00,921
But I'm just doing it\ninteractively for now.
18385
17:35:00,921 --> 17:35:04,761
Let's now select all of the\ntitles from the shows table
18386
17:35:04,762 --> 17:35:13,442
where the ID of the show is in\nthis following previous query.
18387
17:35:13,442 --> 17:35:15,082
So again, the query is getting long.
18388
17:35:15,082 --> 17:35:17,542
But notice, it's the\nthird and last step.
18389
17:35:17,542 --> 17:35:21,292
Select title from the shows\ntable, where the ID of the show
18390
17:35:21,292 --> 17:35:23,932
is in the list of all\nof the show IDs that
18391
17:35:23,932 --> 17:35:27,381
came back from the stars table\n
18392
17:35:27,381 --> 17:35:28,822
How did we get that person ID?
18393
17:35:30,021 --> 17:35:36,502
Well, I selected, in my innermost\n
18394
17:35:36,502 --> 17:35:38,781
So now, when I hit Enter, voila.
18395
17:35:38,781 --> 17:35:41,866
I get all of Steve Carell's\nTV shows up until now.
18396
17:35:41,866 --> 17:35:44,991
And if I want to tidy this up further,\n
18397
17:35:47,521 --> 17:35:50,881
Now I've got it all\nalphabetized as before.
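The three-step nested query built up above can be sketched like this, assuming Python's built-in `sqlite3` module and a tiny stand-in for the real data set. Steve Carell's id and birth year come from the lecture; every other id, name, and title except The Office is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people VALUES (136797, 'Steve Carell', 1962), (2, 'Someone Else', 1980);
    INSERT INTO shows VALUES (1, 'The Office'), (2, 'Made-Up Show'), (3, 'Unrelated Show');
    INSERT INTO stars VALUES (1, 136797), (2, 136797), (3, 2);
""")

# Innermost query: the person's id.
# Middle query: the show ids linked to that person.
# Outer query: the titles of those shows, alphabetized.
titles = conn.execute("""
    SELECT title FROM shows WHERE id IN
        (SELECT show_id FROM stars WHERE person_id =
            (SELECT id FROM people WHERE name = 'Steve Carell'))
    ORDER BY title
""").fetchall()
print(titles)
```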
18398
17:35:50,881 --> 17:35:53,421
So again, with SQL comes\nthe ability to search--
18399
17:35:53,421 --> 17:35:56,631
I mean, look how quickly\nwe do this, 0.094 seconds
18400
17:35:56,631 --> 17:35:59,991
to search across three different\ntables to get back this answer.
18401
17:35:59,991 --> 17:36:04,161
But my data is now all neatly\ndesigned in individual tables
18402
17:36:04,161 --> 17:36:07,341
which is going to be important\n
18403
17:36:07,341 --> 17:36:09,681
But let me take this one step further.
18404
17:36:09,682 --> 17:36:12,271
Let me go ahead and do this.
18405
17:36:12,271 --> 17:36:16,921
Let me go ahead and point\nout that with this query
18406
17:36:16,921 --> 17:36:20,211
notice that I'm searching on--
18407
17:36:20,211 --> 17:36:24,051
let's say I'm searching\non a person ID here.
18408
17:36:24,052 --> 17:36:27,752
And at the end here, I'm\nsearching on a name column here.
18409
17:36:27,752 --> 17:36:30,572
So let me actually go ahead and do this.
18410
17:36:30,572 --> 17:36:34,851
Let me go ahead and see\nif we can't speed this up.
18411
17:36:34,851 --> 17:36:38,432
This query at the moment\ntakes 0.092 seconds.
18412
17:36:38,432 --> 17:36:41,271
Let's see if we can't speed this\n
18413
17:36:41,271 --> 17:36:44,271
a few more of those B-trees\nin the database's memory.
18414
17:36:44,271 --> 17:36:49,581
Create an index called person index, and\n
18415
17:36:52,192 --> 17:36:53,830
It's taking a moment, taking a moment.
18416
17:36:53,830 --> 17:36:56,122
That's almost a full second\nbecause that's a big table.
18417
17:36:56,122 --> 17:37:00,391
Let's create another index called\nshow index on the stars table.
18418
17:37:00,891 --> 17:37:03,292
Because I want to search\nby the show ID also.
18419
17:37:03,292 --> 17:37:05,122
That was part of my big query.
18420
17:37:06,002 --> 17:37:09,152
OK, just more than\nabout 2/3 of a second.
18421
17:37:09,152 --> 17:37:11,781
Now let's create one last one,\nanother index called name index
18422
17:37:11,781 --> 17:37:14,502
but I could call these things\n
18423
17:37:14,902 --> 17:37:16,891
Because I'm also searching\non the name column.
18424
17:37:16,891 --> 17:37:19,072
So in short, I'm\ncreating indexes on each
18425
17:37:19,072 --> 17:37:22,792
of the columns that are somehow\ninvolved in my search query
18426
17:37:22,792 --> 17:37:25,022
going from one table to the other.
18427
17:37:25,021 --> 17:37:32,932
Now let's go back to the previous\nquery, which, recall, took--
18428
17:37:36,112 --> 17:37:37,987
Well, it was roughly\nthis order of magnitude.
18429
17:37:37,987 --> 17:37:39,482
We're not seeing the data now.
18430
17:37:39,482 --> 17:37:42,502
But let me go ahead and run\nmy original big query once.
18431
17:37:42,502 --> 17:37:45,722
And boom, we're down to almost nothing.
18432
17:37:45,722 --> 17:37:48,202
So again, creating\nthese indexes in memory
18433
17:37:48,201 --> 17:37:52,981
has the effect of rapidly\nspeeding up our computation time.
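The three indexes created above can be sketched as follows, assuming Python's built-in `sqlite3` module and a tiny made-up data set (only Steve Carell's id and birth year and the title The Office come from the lecture). The index names follow the spoken ones, written with underscores. The printed plan is left without an expected value because its wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people VALUES (136797, 'Steve Carell', 1962), (2, 'Someone Else', 1980);
    INSERT INTO shows VALUES (1, 'The Office'), (2, 'Made-Up Show'), (3, 'Unrelated Show');
    INSERT INTO stars VALUES (1, 136797), (2, 136797), (3, 2);

    -- One index per column that the big query filters on.
    CREATE INDEX person_index ON stars (person_id);
    CREATE INDEX show_index ON stars (show_id);
    CREATE INDEX name_index ON people (name);
""")

query = """
    SELECT title FROM shows WHERE id IN
        (SELECT show_id FROM stars WHERE person_id =
            (SELECT id FROM people WHERE name = 'Steve Carell'))
    ORDER BY title
"""

# Confirm which indexes now exist in the database.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
]
print(index_names)

# The answer is unchanged; only the way SQLite finds it differs.
titles = conn.execute(query).fetchall()
print(titles)

# Show how SQLite intends to run the query now that the indexes exist.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row[3])
```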
18434
17:37:52,982 --> 17:37:56,482
Now if you've ever used, for instance,\n
18435
17:37:56,482 --> 17:38:00,472
here on campus, or Yale's analogue, you\n
18436
17:38:00,472 --> 17:38:04,671
This could be one of the reasons why\n
18437
17:38:04,671 --> 17:38:07,011
thousands of courses\ntend to be slow, if
18438
17:38:07,012 --> 17:38:10,222
and I'm only conjecturing, if the\n
18439
17:38:10,222 --> 17:38:12,232
If you're building your\nown web application
18440
17:38:12,232 --> 17:38:14,512
and you're finding that users\nare waiting and waiting
18441
17:38:14,512 --> 17:38:17,632
and things are spinning and spinning,\n
18442
17:38:17,631 --> 17:38:21,112
Well, it could absolutely just be bad\n
18443
17:38:21,112 --> 17:38:23,612
Or it might be that you\nhaven't thought about, well
18444
17:38:23,612 --> 17:38:27,112
what column should be optimized\nfor searches and filtration
18445
17:38:27,112 --> 17:38:31,046
like I've done here in order\nto speed up subsequent queries?
18446
17:38:31,046 --> 17:38:33,171
Again, from the outside\nin, we can only conjecture.
18447
17:38:33,171 --> 17:38:36,921
But ultimately, this is\njust one of the things that
18448
17:38:36,921 --> 17:38:39,650
explains performance problems as well.
18449
17:38:39,650 --> 17:38:42,442
All right, let's point out just a\n
18450
17:38:42,442 --> 17:38:45,112
and then we'll consider,\nbigger picture, some problems
18451
17:38:45,112 --> 17:38:47,451
that might arise in this world.
18452
17:38:47,451 --> 17:38:52,221
If these nested, nested queries\nstart to get a little much
18453
17:38:52,222 --> 17:38:54,502
there are other ways,\njust so you've seen it
18454
17:38:54,502 --> 17:38:57,262
that you can execute\nsimilar logic in SQL.
18455
17:38:57,262 --> 17:38:59,752
For instance, if I\nknow in advance that I
18456
17:38:59,752 --> 17:39:04,732
want to connect Steve Carell to\n
18457
17:39:04,732 --> 17:39:06,532
we can do something more like this.
18458
17:39:06,531 --> 17:39:17,391
Select title from the people table,\n
18459
17:39:21,832 --> 17:39:25,124
And again, this is not something you'll\n
18460
17:39:25,124 --> 17:39:29,422
But just so you've seen other\n
18461
17:39:30,502 --> 17:39:35,031
This is an explicit way to say, take\n
18462
17:39:35,031 --> 17:39:37,371
table in the other hand,\nand somehow join them
18463
17:39:37,372 --> 17:39:39,332
as I keep doing with my fingertips here.
18464
17:39:40,951 --> 17:39:45,711
Join them so that the people, the ID\n
18465
17:39:45,711 --> 17:39:48,601
with the person ID in the stars table.
18466
17:39:48,601 --> 17:39:50,601
But that's not quite everything.
18467
17:39:50,601 --> 17:39:54,082
I could also say, join\nfurther on the shows table
18468
17:39:54,082 --> 17:40:00,632
where the stars show ID\nequals the shows ID column.
18469
17:40:01,881 --> 17:40:11,331
That's saying, go further and join\n
18470
17:40:11,332 --> 17:40:14,332
joining the show ID\ncolumn with the ID column.
18471
17:40:14,332 --> 17:40:17,092
Again, this starts to get a\nlittle messy to think about.
18472
17:40:17,091 --> 17:40:21,171
But now I can just say, where name\n
18473
17:40:21,171 --> 17:40:24,411
I can do in one query what previously\n
18474
17:40:24,411 --> 17:40:25,941
and get back the same answers.
18475
17:40:25,942 --> 17:40:30,722
And I can still add in my order\nby title to get back the result.
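The spoken query is cut off in the subtitles, so the statement here is a reconstruction from the description above (people joined to stars on the person ID, then stars joined to shows on the show ID), not a verbatim copy of what was typed. It assumes Python's built-in `sqlite3` module and a small made-up data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people VALUES (136797, 'Steve Carell', 1962), (2, 'Someone Else', 1980);
    INSERT INTO shows VALUES (1, 'The Office'), (2, 'Made-Up Show'), (3, 'Unrelated Show');
    INSERT INTO stars VALUES (1, 136797), (2, 136797), (3, 2);
""")

# Explicit joins: line up people with stars, then stars with shows.
titles = conn.execute("""
    SELECT title FROM people
    JOIN stars ON people.id = stars.person_id
    JOIN shows ON stars.show_id = shows.id
    WHERE name = 'Steve Carell'
    ORDER BY title
""").fetchall()
print(titles)
```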
18476
17:40:30,722 --> 17:40:35,122
And if I do this a little more\n
18477
17:40:36,091 --> 17:40:41,961
Let me type this out by adding a\n
18478
17:40:41,961 --> 17:40:43,461
I'm going to leave it alone for now.
18479
17:40:43,461 --> 17:40:46,042
We can type it on multiple\nlines in other contexts.
18480
17:40:46,042 --> 17:40:49,042
And let me do one last thing.
18481
17:40:50,353 --> 17:40:52,311
I'm going to show it,\nbut this is not something
18482
17:40:52,311 --> 17:40:53,781
you should ingrain just yet either.
18483
17:40:53,781 --> 17:40:56,961
Select title from\npeople, stars, and shows.
18484
17:40:56,961 --> 17:41:00,201
If you know in advance that you want\n
18485
17:41:00,201 --> 17:41:03,471
you can just enumerate them,\none table name after the other.
18486
17:41:03,472 --> 17:41:08,838
And then you can say where\npeople.id equals stars.person_id.
18487
17:41:08,838 --> 17:41:10,671
And now I'm hitting\nEnter so that it formats
18488
17:41:10,671 --> 17:41:12,411
a little more readably on my screen.
18489
17:41:12,411 --> 17:41:20,481
And stars.show_id equals shows.id,\n
18490
17:41:20,482 --> 17:41:25,072
In short, you specify that you\n
18491
17:41:26,031 --> 17:41:31,792
And then you tell the database how to\n
18492
17:41:31,792 --> 17:41:35,031
that is, the columns that\nhave those integers in common.
18493
17:41:35,031 --> 17:41:38,061
If I hit Enter now, I get\nthe same exact results, even
18494
17:41:38,061 --> 17:41:41,451
more so if I also add\nin an order by title.
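A sketch of the comma-separated form, checked against the explicit JOIN form on the same made-up data. It assumes Python's built-in `sqlite3` module. The final condition on the name is an assumption, since the spoken query is truncated in the subtitles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people VALUES (136797, 'Steve Carell', 1962), (2, 'Someone Else', 1980);
    INSERT INTO shows VALUES (1, 'The Office'), (2, 'Made-Up Show'), (3, 'Unrelated Show');
    INSERT INTO stars VALUES (1, 136797), (2, 136797), (3, 2);
""")

# List the tables, then say which columns must match.
implicit = conn.execute("""
    SELECT title FROM people, stars, shows
    WHERE people.id = stars.person_id
    AND stars.show_id = shows.id
    AND name = 'Steve Carell'
    ORDER BY title
""").fetchall()

# The same logic written with explicit joins.
explicit = conn.execute("""
    SELECT title FROM people
    JOIN stars ON people.id = stars.person_id
    JOIN shows ON stars.show_id = shows.id
    WHERE name = 'Steve Carell'
    ORDER BY title
""").fetchall()

print(implicit)
print(implicit == explicit)
```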
18495
17:41:43,612 --> 17:41:45,531
That's why I didn't\nwant to do this earlier.
18496
17:41:45,531 --> 17:41:48,531
I have to go back through my history\n
18497
17:41:48,531 --> 17:41:49,981
the multi-line query this time.
18498
17:41:52,622 --> 17:41:56,707
But this is only to say that, even\n
18499
17:41:56,707 --> 17:41:59,332
more sophisticated, and we put\nsome of it over here, some of it
18500
17:41:59,332 --> 17:42:03,472
over here, some of it over here so as to\n
18501
17:42:03,472 --> 17:42:07,252
like putting commas in the data, we\n
18502
17:42:07,252 --> 17:42:09,622
that we might want across\nthese several tables.
18503
17:42:09,622 --> 17:42:13,922
And using indexes, we can\nsignificantly speed up these processes
18504
17:42:13,921 --> 17:42:17,481
so as to handle 10 times as\nmany, 100 times as many users
18505
17:42:17,482 --> 17:42:19,012
on the same actual database.
18506
17:42:19,012 --> 17:42:20,362
There is going to be a downside.
18507
17:42:20,362 --> 17:42:22,881
And thinking back to our\ndiscussion of algorithms and data
18508
17:42:22,881 --> 17:42:27,461
structures in past weeks, what might be\n
18509
17:42:27,461 --> 17:42:31,451
Because as of now, I created four\n
18510
17:42:31,451 --> 17:42:34,901
the title column, and\nsome other columns, too.
18511
17:42:34,902 --> 17:42:37,272
Why wouldn't I just go\nahead and index everything
18512
17:42:37,271 --> 17:42:39,731
if it's clearly speeding things up?
18513
17:42:41,112 --> 17:42:44,232
Any time you're starting to benefit\n
18514
17:42:44,232 --> 17:42:47,292
odds are you're sacrificing\nspace, or vice versa.
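The space side of that trade-off can be observed directly. This sketch assumes Python's built-in `sqlite3` module and 20,000 synthetic titles. `PRAGMA page_count` reports how many pages the database currently occupies. The actual numbers depend on the SQLite build, so none are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
conn.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [(f"Hypothetical Show {i}",) for i in range(20000)],
)
conn.commit()

def pages():
    """Number of pages the database currently uses."""
    return conn.execute("PRAGMA page_count").fetchone()[0]

before = pages()
conn.execute("CREATE INDEX title_index ON shows (title)")
conn.commit()
after = pages()

# The index is extra data stored alongside the table, so the count grows.
print(before, after)
```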
18515
17:42:47,292 --> 17:42:50,741
And probably indexing absolutely\neverything is a little dumb
18516
17:42:50,741 --> 17:42:54,771
because you're going to waste way more\n
18517
17:42:54,771 --> 17:42:56,951
So figuring out where the\nright inflection point is
18518
17:42:56,951 --> 17:43:01,752
is part of the process of designing and\n
18519
17:43:01,752 --> 17:43:06,252
Now unfortunately, a whole lot of\n
18520
17:43:06,252 --> 17:43:10,211
and they continue to in the real\n
18521
17:43:10,211 --> 17:43:12,101
And in fact, here on\nout, if you're reading
18522
17:43:12,101 --> 17:43:16,871
something technical about SQL databases,\n
18523
17:43:16,872 --> 17:43:20,002
and passwords leaking out,\nunfortunately, all too often
18524
17:43:20,002 --> 17:43:22,847
it is because of what are\ncalled SQL injection attacks.
18525
17:43:22,847 --> 17:43:24,972
And just to give you a\nsense now to counterbalance
18526
17:43:24,972 --> 17:43:26,930
maybe [INAUDIBLE] enthusiasm\nfor like, oh, that
18527
17:43:26,930 --> 17:43:28,961
was neat how we can\ndo things so quickly.
18528
17:43:28,961 --> 17:43:32,021
With great power comes\nresponsibility in this world, too.
18529
17:43:32,021 --> 17:43:34,661
And so many people introduce\nbugs into their code
18530
17:43:34,661 --> 17:43:42,501
by not quite appreciating how it is the\n
18531
17:43:43,732 --> 17:43:46,542
Here, for instance, is a\ntypical login screen for Yale.
18532
17:43:46,542 --> 17:43:49,122
And here's the analogue for\nHarvard where you're prompted
18533
17:43:49,122 --> 17:43:51,792
every day probably, for your\nusername and your password
18534
17:43:51,792 --> 17:43:53,802
your email address and\nyour password here.
18535
17:43:53,802 --> 17:43:57,762
Suppose, though, that\nbehind this login page
18536
17:43:57,762 --> 17:44:00,372
whether Harvard's or Yale's,\nthere's some website.
18537
17:44:00,372 --> 17:44:03,612
And that website is using\nSQL underneath the hood
18538
17:44:03,612 --> 17:44:06,042
to store all of the\nHarvard or Yale people's
18539
17:44:06,042 --> 17:44:09,281
usernames, passwords, ID\nnumbers, courses, transcripts
18540
17:44:10,311 --> 17:44:12,822
So there's a SQL database\nunderneath the website.
18541
17:44:12,822 --> 17:44:15,701
Well, what might go\nwrong with this process?
18542
17:44:15,701 --> 17:44:18,191
Unfortunately, there's\nsome special syntax in SQL
18543
17:44:18,192 --> 17:44:19,872
just like there is in C and Python.
18544
17:44:19,872 --> 17:44:22,302
For instance, there are\ncomments in SQL, too.
18545
17:44:22,302 --> 17:44:26,022
If you do two hyphens, dash,\ndash, that's a comment in SQL.
18546
17:44:26,021 --> 17:44:31,511
And if you, the programmer, aren't\n
18547
17:44:31,512 --> 17:44:34,902
such that you defend against\npotentially adversarial attacks
18548
17:44:34,902 --> 17:44:36,502
you might do something like this.
18549
17:44:36,502 --> 17:44:41,412
Suppose that I somewhat\nmaliciously or curiously log in
18550
17:44:41,411 --> 17:44:44,471
by typing my username,\nMalan@harvard.edu, and then maybe
18551
17:44:44,472 --> 17:44:46,332
a single quote and a dash, dash.
18552
17:44:47,021 --> 17:44:50,201
Because I'm trying to suss out\nif there is a vulnerability here
18553
17:44:53,252 --> 17:44:56,502
But if I were the owner of the website\n
18554
17:44:56,502 --> 17:45:00,641
I might try using potentially\ndangerous characters in my input.
18555
17:45:01,631 --> 17:45:05,682
Because single quote is used for\n
18556
17:45:05,682 --> 17:45:07,152
single quotes or double quotes.
18557
17:45:07,152 --> 17:45:10,272
Dash, dash, I claim now,\nis used for commenting.
18558
17:45:10,271 --> 17:45:13,301
But let's now imagine what\nthe code underneath the hood
18559
17:45:13,302 --> 17:45:17,502
might be for something like\nYale's login or Harvard's login.
18560
17:45:17,502 --> 17:45:19,942
What if it's code that looks like this?
18561
17:45:19,942 --> 17:45:21,882
So let me read it from left to right.
18562
17:45:21,881 --> 17:45:26,051
Suppose that they are using something\n
18563
17:45:26,052 --> 17:45:28,572
and they've got some SQL\ntyped into the website that
18564
17:45:28,572 --> 17:45:32,502
says select star from users,\nwhere username equals this
18565
17:45:34,391 --> 17:45:37,851
And they're plugging in\nusername and password.
18566
17:45:38,949 --> 17:45:41,531
Well, when the user types their\nusername and password, hits Enter
18567
17:45:41,531 --> 17:45:44,262
I probably want to select\nthat user from my database
18568
17:45:44,262 --> 17:45:46,362
to see if the username\nand password match.
18569
17:45:46,362 --> 17:45:49,061
So the underlying SQL\nmight be, select star
18570
17:45:49,061 --> 17:45:51,131
from users, where username\nequals question mark
18571
17:45:51,131 --> 17:45:52,548
and password equals question mark.
18572
17:45:57,341 --> 17:46:02,771
And if we get back one row,\npresumably Malan@harvard.edu
18573
17:46:04,311 --> 17:46:06,531
We should let him proceed\nfrom there on out.
18574
17:46:06,531 --> 17:46:10,481
So that's some pseudo code, if\nyou will, for this scenario.
18575
17:46:10,482 --> 17:46:14,922
What if, though, this code is not\n
18576
17:46:14,921 --> 17:46:16,841
is, and isn't using question marks?
18577
17:46:16,841 --> 17:46:20,098
So the question mark syntax\nis a fairly common SQL thing
18578
17:46:20,099 --> 17:46:22,182
where the question marks\nare used as placeholders
18579
17:46:22,182 --> 17:46:24,732
just like %s was in printf.
18580
17:46:24,732 --> 17:46:28,242
But this function, db.execute\nfrom CS50's library
18581
17:46:28,241 --> 17:46:30,761
and third-party libraries\nas well, is also
18582
17:46:30,762 --> 17:46:33,132
doing some good stuff\nwith these question marks
18583
17:46:33,131 --> 17:46:35,171
and defending against\nthe following attack.
18584
17:46:35,171 --> 17:46:38,261
Suppose that you were not using\na third-party library like ours
18585
17:46:38,262 --> 17:46:41,832
and you were just manually constructing\n
18586
17:46:41,832 --> 17:46:45,281
You were to do something like this\n
18587
17:46:45,281 --> 17:46:47,141
You're comfortable with\nformat strings now.
18588
17:46:47,141 --> 17:46:50,224
You've gotten into the habit of using\n
18589
17:46:50,224 --> 17:46:52,572
Suppose that you, the\naspiring programmer
18590
17:46:52,572 --> 17:46:55,002
are just using techniques\nthat you've been taught.
18591
17:46:55,002 --> 17:46:58,002
So you have an f-string\nwith select star from users
18592
17:46:58,002 --> 17:47:01,811
where username equals, quote,\n
18593
17:47:01,811 --> 17:47:06,612
And password equals, quote,\nunquote, "password" in curly braces.
18594
17:47:06,612 --> 17:47:09,612
As of what, two weeks\nago, this was a perfectly
18595
17:47:09,612 --> 17:47:14,802
legitimate technique in Python\nto plug in values into a string.
18596
17:47:14,802 --> 17:47:18,972
But notice if you are using\nsingle quotes yourself
18597
17:47:18,972 --> 17:47:24,092
and the user has typed single\nquotes into their input, what
18598
17:47:25,262 --> 17:47:29,851
Where are we going with this if you're\n
18599
17:47:29,851 --> 17:47:33,691
into your own prepared string of text?
18600
17:47:40,942 --> 17:47:46,012
Worst case, they could insert what is\n
18601
17:47:47,152 --> 17:47:50,482
Generally speaking, if you're using\n
18602
17:47:50,482 --> 17:47:52,342
to surround the user's\ninput, you'd better
18603
17:47:52,341 --> 17:47:54,591
hope that they don't have\nan apostrophe in their name.
18604
17:47:54,591 --> 17:47:57,216
Or you'd better hope that they\ndon't type a single quote as well.
18605
17:47:57,216 --> 17:48:01,366
Because what if their single quote\n
18606
17:48:01,366 --> 17:48:03,241
and then the rest of\nthis is somehow ignored?
18607
17:48:03,241 --> 17:48:04,671
Well, let's consider\nhow this might happen.
18608
17:48:05,851 --> 17:48:08,182
This got a little\nblurry here, but let me
18609
17:48:08,182 --> 17:48:10,192
plug in here-- wow, that looks awful.
18610
17:48:13,281 --> 17:48:15,771
Just change this to white\nso it's more readable.
18611
17:48:15,771 --> 17:48:22,072
What happens if the\nuser does this instead?
18612
17:48:22,072 --> 17:48:24,652
They type in, like I\ndid into the screenshot
18613
17:48:24,652 --> 17:48:28,101
Malan@harvard.edu,\nsingle quote, dash, dash.
18614
17:48:28,101 --> 17:48:30,411
What has just happened\nlogically, even though we've
18615
17:48:30,411 --> 17:48:32,241
only just begun with SQL today?
18616
17:48:32,241 --> 17:48:37,491
Well, select star from users, where\n
18617
17:48:38,661 --> 17:48:42,681
What's bad about the rest of this?
18618
17:48:42,682 --> 17:48:45,093
Dash, dash, I claim,\nmeans a comment, which
18619
17:48:45,093 --> 17:48:47,551
means my color coding is going\nto be a little blurry again.
18620
17:48:47,552 --> 17:48:50,482
But everything after the\ndash, dash is just ignored.
18621
17:48:50,482 --> 17:48:52,782
The logic of\nthe SQL query, then, is
18622
17:48:52,781 --> 17:48:56,101
to just say, select\nMalan@harvard.edu from the database
18623
17:48:56,101 --> 17:48:58,832
not even checking the password anymore.
18624
17:48:58,832 --> 17:49:01,522
Therefore, you will get\nback at least one row.
18625
17:49:01,521 --> 17:49:06,531
So length of rows will equal 1, and so\n
18626
17:49:06,531 --> 17:49:09,531
logs the user in, gives them\naccess to my my.harvard account
18627
17:49:10,491 --> 17:49:15,981
And they've pretended to be me simply\n
18628
17:49:15,982 --> 17:49:17,787
dash in the username field.
18629
17:49:17,786 --> 17:49:19,911
Again, please don't go\nstart doing this later today
18630
17:49:19,911 --> 17:49:21,481
on Harvard, Yale, or other websites.
18631
17:49:21,482 --> 17:49:23,012
But it could be as simple as that.
18632
17:49:23,512 --> 17:49:25,372
Because the programmer\npracticed what they
18633
17:49:25,372 --> 17:49:29,452
were taught, which was just to\nuse curly braces to plug in
18634
17:49:30,902 --> 17:49:33,932
But if you don't understand how the\n
18635
17:49:33,932 --> 17:49:37,597
and if you don't distrust your users\n
18636
17:49:37,597 --> 17:49:39,472
out there there's going\nto be, unfortunately
18637
17:49:39,472 --> 17:49:44,722
some adversary who just wants to try\n
18638
17:49:45,832 --> 17:49:48,322
This is what's known as\na SQL injection attack
18639
17:49:48,322 --> 17:49:52,341
because the user can type something\n
18640
17:49:52,341 --> 17:49:56,451
and trick your database into doing\n
18641
17:49:56,451 --> 17:50:00,171
like, for instance, logging the user in.
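To make the attack concrete, here is a minimal sketch, assuming Python's built-in `sqlite3` module rather than CS50's library, with a hypothetical `users` table and a made-up password stored in plain text purely for illustration. The function names `login_unsafe` and `login_safe` are invented for this example. The first builds the query with an f-string; the second uses question-mark placeholders.

```python
import sqlite3

# Hypothetical users table; the password is made up and stored in
# plain text only to keep the illustration short.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute(
    "INSERT INTO users (username, password) VALUES (?, ?)",
    ("malan@harvard.edu", "correct-password"),
)


def login_unsafe(username, password):
    """Vulnerable: user input is pasted straight into the SQL text."""
    query = (
        f"SELECT * FROM users WHERE username = '{username}' "
        f"AND password = '{password}'"
    )
    return len(db.execute(query).fetchall()) == 1


def login_safe(username, password):
    """Safer: placeholders keep user input as data, never as SQL."""
    rows = db.execute(
        "SELECT * FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchall()
    return len(rows) == 1


# The quote closes the string early; the dashes comment out the rest.
attack = "malan@harvard.edu'--"

print(login_unsafe(attack, "wrong"))  # True: password check was skipped
print(login_safe(attack, "wrong"))    # False: input treated as a literal
```

With the f-string, the database receives `WHERE username = 'malan@harvard.edu'--' AND password = 'wrong'`, so everything after the dashes is ignored. With placeholders, the whole input, quote and dashes included, is compared as an ordinary string.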
18642
17:50:00,171 --> 17:50:02,222
Worst case, they could\neven do something else.
18643
17:50:02,222 --> 17:50:06,832
Maybe the user types a semicolon, then\n
18644
17:50:06,832 --> 17:50:10,522
You could imagine doing semicolon\nupdate table grades, where
18645
17:50:10,521 --> 17:50:14,481
name equals Malan, and set the\ngrade equal to A instead of B
18646
17:50:16,012 --> 17:50:18,891
The ability to inject\nSQL into the database
18647
17:50:18,891 --> 17:50:22,161
means you can do anything you want with\n
18648
17:50:25,341 --> 17:50:28,221
And now, just a quick, little\n
18649
17:50:34,752 --> 17:50:38,362
OK, to, like, one of us, two of us.
18650
17:50:39,741 --> 17:50:41,902
All right, so let's move\non to one last condition.
18651
17:50:41,902 --> 17:50:44,472
There's one other problem\nthat can go awry here.
18652
17:50:44,472 --> 17:50:45,722
Oh, and I should explain this.
18653
17:50:45,722 --> 17:50:50,842
So this is an allusion to the son,\n
18654
17:50:50,841 --> 17:50:54,151
The word drop, table, students, and\n
18655
17:50:54,152 --> 17:50:56,781
This is humor that only\nCS people would understand
18656
17:50:56,781 --> 17:51:00,381
because it's the mom realizing,\n
18657
17:51:01,650 --> 17:51:04,942
Less funny when you explain it, but once\n
18658
17:51:06,802 --> 17:51:10,192
So one final threat, now\nthat you are graduating
18659
17:51:10,192 --> 17:51:14,662
to the world of proper databases\nand away from CSV files alone.
18660
17:51:14,661 --> 17:51:17,511
Things can go wrong\nwhen using databases
18661
17:51:17,512 --> 17:51:21,180
and honestly, even using CSV\nfiles if you have multiple users.
18662
17:51:21,180 --> 17:51:22,972
And thus far, you and\nI have had the luxury
18663
17:51:22,972 --> 17:51:25,889
in almost every program we've written\n
18664
17:51:25,889 --> 17:51:27,172
It's just you using your code.
18665
17:51:27,171 --> 17:51:30,112
And even if your teaching fellow\nor TA is using it, probably
18666
17:51:31,402 --> 17:51:36,112
But the world gets interesting if you\n
18667
17:51:36,112 --> 17:51:40,072
on websites, such that now you might\n
18668
17:51:40,072 --> 17:51:42,771
to log in at the same time,\nliterally clicking a button
18669
17:51:42,771 --> 17:51:44,752
at the same, or nearly the same time.
18670
17:51:44,752 --> 17:51:47,722
What happens, then, if\na computer is trying
18671
17:51:47,722 --> 17:51:50,632
to handle requests from two\ndifferent people at once
18672
17:51:50,631 --> 17:51:52,822
as might happen all\nthe time on a website?
18673
17:51:52,822 --> 17:51:54,951
You might get what are\ncalled race conditions.
18674
17:51:54,951 --> 17:51:58,401
And this is a problem in computing in\n
18675
17:51:58,402 --> 17:52:02,302
with Python, really just any\ntime you have shared data
18676
17:52:02,302 --> 17:52:04,492
like a database, as follows.
18677
17:52:04,491 --> 17:52:08,961
This apparently is one of the\nmost liked Instagram posts ever.
18678
17:52:08,961 --> 17:52:11,451
It is literally just\na picture of an egg.
18679
17:52:11,451 --> 17:52:12,987
Has anyone clicked on this egg?
18680
17:52:15,561 --> 17:52:19,222
So go search for this photo if you'd\n
18681
17:52:19,222 --> 17:52:21,452
The account is world_record_egg.
18682
17:52:21,451 --> 17:52:24,381
This is just a screenshot of\n
18683
17:52:24,381 --> 17:52:25,881
If you're in the habit\nof using Instagram
18684
17:52:25,881 --> 17:52:28,839
or like any social media site, there's\n
18685
17:52:28,839 --> 17:52:30,261
or a heart button these days.
18686
17:52:30,262 --> 17:52:32,242
And that's actually a\nreally hard problem.
18687
17:52:32,241 --> 17:52:35,391
Such a simple idea to count\nthe number of likes something
18688
17:52:35,391 --> 17:52:38,332
has, but that means\nsomeone has to click on it.
18689
17:52:38,332 --> 17:52:40,252
Your code has to detect the click.
18690
17:52:40,252 --> 17:52:43,201
Your code has to update the database,\n
18691
17:52:43,201 --> 17:52:48,081
even if multiple people are perhaps\n
18692
17:52:48,082 --> 17:52:53,692
And unfortunately, bad things can\n
18693
17:52:53,692 --> 17:52:55,882
at the same time on a computer.
18694
17:52:57,031 --> 17:53:01,012
So here's some more code, half\n
18695
17:53:01,921 --> 17:53:05,572
Suppose that what happens when you,\n
18696
17:53:05,572 --> 17:53:08,811
on the like button on\nthe Instagram post.
18697
17:53:08,811 --> 17:53:13,101
Suppose that code, like the following,\n
18698
17:53:13,101 --> 17:53:19,531
db.execute of select likes from\n
18699
17:53:22,432 --> 17:53:24,622
I'm assuming that that\nphotograph has a unique ID.
18700
17:53:24,622 --> 17:53:28,012
It's some big integer, whatever\nit was, randomly assigned.
18701
17:53:28,012 --> 17:53:30,472
I'm assuming that when\nyou click on the heart
18702
17:53:30,472 --> 17:53:33,502
the unique ID is somehow\nsent to Instagram servers
18703
17:53:33,502 --> 17:53:36,082
so that their code can call it ID.
18704
17:53:36,082 --> 17:53:39,171
And I'm assuming that Instagram\nis using its SQL database
18705
17:53:39,171 --> 17:53:43,131
and selecting, from a posts\ntable, the current number of likes
18706
17:53:43,131 --> 17:53:46,502
of that egg for that given ID number.
18707
17:53:47,002 --> 17:53:50,294
Because I need to know how many likes it\n
18708
17:53:50,294 --> 17:53:51,531
and then update the database.
18709
17:53:51,531 --> 17:53:55,051
I need to select the data, then\nI need to update the data here.
18710
17:53:55,552 --> 17:53:59,122
So in some Python code here,\nlet's store, in a variable called
18711
17:53:59,122 --> 17:54:03,292
likes, whatever comes back in the\n
18712
17:54:03,292 --> 17:54:06,002
Again, this is new syntax\nspecific to our library
18713
17:54:06,002 --> 17:54:09,171
but a common way of getting back\nthe first row and the column called
18714
17:54:10,141 --> 17:54:12,262
So at this point in the\nstory, likes is storing
18715
17:54:12,262 --> 17:54:14,804
the total number of likes, in\nthe millions or whatever it is
18716
17:54:17,241 --> 17:54:21,741
Execute update posts,\nset the number of likes
18717
17:54:21,741 --> 17:54:25,949
equal to this value, where the\nID of the post equals this value.
18718
17:54:25,949 --> 17:54:27,531
What do I want to update the likes to?
18719
17:54:27,531 --> 17:54:31,881
Whatever likes currently is plus\n1, and then plugging in the ID.
18720
17:54:33,682 --> 17:54:37,072
I'm checking the value of\nthe likes, and maybe it's 10.
18721
17:54:37,072 --> 17:54:40,372
I'm changing 10 to 11 and\nthen updating the table.
18722
17:54:40,372 --> 17:54:43,732
But a problem can arise\nif two people have
18723
17:54:43,732 --> 17:54:48,352
clicked on that egg at roughly the\n
18724
17:54:49,682 --> 17:54:52,192
Well, in the world of\ndatabases and servers
18725
17:54:52,192 --> 17:54:56,461
and the Instagrams of the world have\n
18726
17:54:56,461 --> 17:55:00,771
So they can support millions,\nbillions even, of users nowadays.
18727
17:55:02,252 --> 17:55:05,872
Well, typically code like this\nis not what we'll call atomic.
18728
17:55:05,872 --> 17:55:09,772
To be atomic means that it all\nexecutes together or not at all.
18729
17:55:09,771 --> 17:55:14,841
Rather, code typically is executed,\n
18730
17:55:14,841 --> 17:55:18,451
And if your code is running on a server\n
18731
17:55:18,451 --> 17:55:20,871
which is absolutely the case\nfor an app like Instagram
18732
17:55:20,872 --> 17:55:23,902
if you and I click on the\nheart at roughly the same time
18733
17:55:23,902 --> 17:55:27,502
for efficiency, the computer,\nthe server, owned by Instagram
18734
17:55:27,502 --> 17:55:29,841
might execute this line of code for me.
18735
17:55:29,841 --> 17:55:31,878
Then it might execute\nthis line of code for you.
18736
17:55:31,879 --> 17:55:34,461
Then this line of code for me,\nthen this line of code for you
18737
17:55:34,461 --> 17:55:37,044
then this line of code for me,\nthen this line of code for you.
18738
17:55:37,044 --> 17:55:41,891
That is to say, our queries might\n
18739
17:55:41,891 --> 17:55:44,711
Because it'd be a little obnoxious\n
18740
17:55:44,711 --> 17:55:47,381
I'm blocked out while you're\ninteracting with the site.
18741
17:55:47,381 --> 17:55:50,119
It'd be a lot nicer for efficiency\nand fairness if somehow they
18742
17:55:50,120 --> 17:55:52,662
do a little bit of work for me,\na little bit of work for you
18743
17:55:52,661 --> 17:55:55,881
and back and forth, and back and\nforth, equitably on the server.
18744
17:55:55,881 --> 17:55:58,661
So that's what typically happens\nby default. These lines of code
18745
17:56:00,762 --> 17:56:05,112
And they can happen in alternating\norder with other users.
18746
17:56:05,112 --> 17:56:07,150
You can get them combined like this.
18747
17:56:07,150 --> 17:56:11,182
Same order top to bottom, but other\n
18748
17:56:11,182 --> 17:56:15,342
So suppose that the number of\n
18749
17:56:15,341 --> 17:56:19,421
And suppose that Carter and I both click\n
18750
17:56:19,421 --> 17:56:21,761
And suppose this line of\ncode gets executed for me
18751
17:56:21,762 --> 17:56:25,211
and that gives me a value\nin likes, ultimately, of 10.
18752
17:56:25,211 --> 17:56:28,631
Suppose, then, that the computer takes\n
18753
17:56:28,631 --> 17:56:31,031
does the same code for\nCarter, and gets back
18754
17:56:31,031 --> 17:56:33,252
what value for the\ncurrent number of likes?
18755
17:56:34,512 --> 17:56:36,550
Because mine has not been recorded yet.
18756
17:56:36,550 --> 17:56:39,252
At this point in the story,\nsomewhere in the computer's memory
18757
17:56:39,252 --> 17:56:41,561
there's a likes variable\nfor me, storing 10.
18758
17:56:41,561 --> 17:56:44,741
There's a likes variable\nstoring 10 for Carter.
18759
17:56:44,741 --> 17:56:46,631
Then this line of code executes for me.
18760
17:56:46,631 --> 17:56:50,560
It updates the database to be likes\n
18761
17:56:50,561 --> 17:56:55,781
Then Carter's code is executed,\n
18762
17:56:59,502 --> 17:57:03,550
Because his value of likes happened\n
18763
17:57:03,550 --> 17:57:06,881
And so the metaphor here, that if we\n
18764
17:57:06,881 --> 17:57:10,131
actually act out, is something that was\n
18765
17:57:10,131 --> 17:57:15,640
systems class, whereby the most similar\n
18766
17:57:15,641 --> 17:57:17,712
if you've got a mini\nfridge in your dorm room.
18767
17:57:17,711 --> 17:57:23,781
And one of you and your roommates comes\n
18768
17:57:23,781 --> 17:57:26,502
oh, we're out of milk, was\nhow the story went in my day.
18769
17:57:26,502 --> 17:57:30,525
So you close the refrigerator, and\n
18770
17:57:30,525 --> 17:57:31,900
and get in line to buy some milk.
18771
17:57:31,900 --> 17:57:33,650
Meanwhile, your roommate comes home.
18772
17:57:33,650 --> 17:57:38,265
They, too, inspect the state of your\n
18773
17:57:38,266 --> 17:57:40,391
open the door, and realize,\noh, we're out of milk.
18774
17:57:41,322 --> 17:57:43,362
Close the fridge, go\nacross the street, and head
18775
17:57:43,362 --> 17:57:45,400
to maybe a different store,\nor the line is long enough
18776
17:57:45,400 --> 17:57:47,192
that you don't see each\nother at the store.
18777
17:57:47,192 --> 17:57:51,042
So long story short, you both eventually\n
18778
17:57:51,042 --> 17:57:52,900
now there's milk from\nyour other roommate
18779
17:57:52,900 --> 17:57:56,021
there because you both\nmade a decision on this
18780
17:57:56,021 --> 17:58:01,150
based on the state of a variable\n
18781
17:58:01,150 --> 17:58:03,042
And you didn't somehow communicate.
18782
17:58:03,042 --> 17:58:05,952
Now in the real world, this\nis absolutely solvable.
18783
17:58:05,951 --> 17:58:09,461
How would you fix this or avoid\nthis problem in the real world?
18784
17:58:09,461 --> 17:58:11,112
Literally, own roommate, own fridge.
18785
17:58:11,112 --> 17:58:13,092
AUDIENCE: Text your\nroommate [INAUDIBLE].
18786
17:58:14,091 --> 17:58:15,824
Let them know, so somehow communicate.
18787
17:58:15,824 --> 17:58:18,281
And in fact, the terminology\nhere would be multiple threads
18788
17:58:18,281 --> 17:58:20,800
can somehow intercommunicate\nby having shared state
18789
17:58:20,800 --> 17:58:22,572
like the iMessage thread on your phone.
18790
17:58:23,531 --> 17:58:26,832
You could, more dramatically,\nlock the refrigerator somehow
18791
17:58:26,832 --> 17:58:30,912
thereby making the milk\npurchasing process atomic.
18792
17:58:30,911 --> 17:58:33,730
The fundamental problem is\nthat for efficiency, again
18793
17:58:33,730 --> 17:58:37,031
computers tend to\nintermingle logic that needs
18794
17:58:37,031 --> 17:58:41,322
to happen when it's happening across\n
18795
17:58:42,521 --> 17:58:45,161
You need to make sure that all\nthree of these lines of code
18796
17:58:45,161 --> 17:58:48,610
execute for me, and then\nfor Carter, and then for you
18797
17:58:48,610 --> 17:58:51,050
if you want to ensure that\nthis count is correct.
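The lost update can be reproduced without any real concurrency by running the two requests' steps in the interleaved order described. This is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `posts` table that starts at 10 likes; the helper names `read_likes` and `write_likes` are made up for illustration.

```python
import sqlite3

# Hypothetical posts table with one post that already has 10 likes.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def read_likes(post_id):
    """Step 1 of a like: select the current count."""
    row = db.execute(
        "SELECT likes FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0]


def write_likes(post_id, value):
    """Step 2 of a like: store the new count."""
    db.execute("UPDATE posts SET likes = ? WHERE id = ?", (value, post_id))


# Both requests read before either one writes.
likes_david = read_likes(1)   # 10
likes_carter = read_likes(1)  # 10, because David's like is not recorded yet

write_likes(1, likes_david + 1)   # database now says 11
write_likes(1, likes_carter + 1)  # database is set to 11 again

final_likes = read_likes(1)
print(final_likes)  # 11, even though two people clicked
```

Two clicks arrived, but the count only went from 10 to 11, because the second write was computed from a stale read.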
18798
17:58:51,050 --> 17:58:54,561
And for years, when social media\n
18799
17:58:54,561 --> 17:58:56,442
this was a super hard problem.
18800
17:58:56,442 --> 17:58:59,262
Twitter used to go down all\nof the time, and tweets
18801
17:58:59,262 --> 17:59:01,601
and retweets were a thing\nthat were similarly happening
18802
17:59:02,832 --> 17:59:04,362
These are hard problems to solve.
18803
17:59:04,362 --> 17:59:06,031
And thankfully, there are solutions.
18804
17:59:06,031 --> 17:59:08,781
And we won't get into the weeds\n
18805
17:59:08,781 --> 17:59:11,442
but know that there are\nsolutions in the form of things
18806
17:59:11,442 --> 17:59:14,952
called locks, which I use that\n
18807
17:59:14,951 --> 17:59:18,911
Software locks can allow you to\n
18808
17:59:18,911 --> 17:59:20,891
look at it until you're done with it.
18809
17:59:20,891 --> 17:59:23,112
There are things called\ntransactions, which
18810
17:59:23,112 --> 17:59:26,442
allow you to do the equivalent of\n
18811
17:59:26,442 --> 17:59:29,472
out your roommate from accessing\nthat same variable, too
18812
17:59:29,472 --> 17:59:32,262
but for a slightly shorter amount of time.
18813
17:59:32,262 --> 17:59:34,222
There are solutions to these problems.
18814
17:59:34,222 --> 17:59:37,991
So for instance, in Python,\nthe same code now in green
18815
17:59:37,991 --> 17:59:39,701
might look a little something like this.
18816
17:59:39,701 --> 17:59:42,491
When you know that something\nhas to happen all at once
18817
17:59:42,491 --> 17:59:46,362
altogether, you first begin a\n
18818
17:59:46,362 --> 17:59:49,048
and then you commit the\ntransaction at the very end.
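As a rough sketch of that begin-and-commit pattern, assuming Python's built-in `sqlite3` module instead of CS50's library, the read and the write can be wrapped in one transaction. The `posts` table and the `like` function are hypothetical, and `BEGIN IMMEDIATE` is a SQLite-specific choice made here so that the write lock is taken before the read; other databases use different locking syntax.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones written out by hand.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def like(post_id):
    """Read and update the like count as one all-or-nothing unit."""
    db.execute("BEGIN IMMEDIATE")  # SQLite-specific: lock for writing now
    try:
        likes = db.execute(
            "SELECT likes FROM posts WHERE id = ?", (post_id,)
        ).fetchone()[0]
        db.execute(
            "UPDATE posts SET likes = ? WHERE id = ?", (likes + 1, post_id)
        )
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")  # undo any partial work on failure
        raise


like(1)
like(1)

final_likes = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]
print(final_likes)  # 12
```

For a simple counter like this, a single statement such as `UPDATE posts SET likes = likes + 1 WHERE id = ?` is another common option, since the database performs the read and the write in one step.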
18819
17:59:49,048 --> 17:59:51,131
Here, too, though, there's\ngoing to be a downside.
18820
17:59:51,131 --> 17:59:55,243
Typically, the more you use\ntransactions in this way
18821
17:59:55,243 --> 17:59:56,951
potentially the higher\nthe probability is
18822
17:59:56,951 --> 18:00:00,531
that you're going to box someone out or\n
18823
18:00:01,031 --> 18:00:02,824
Because we can't interact\nat the same time.
18824
18:00:02,824 --> 18:00:05,262
Or you might make his request\nfail if he tries to update
18825
18:00:05,262 --> 18:00:07,211
something that's already been updated.
18826
18:00:07,211 --> 18:00:10,061
So you generally want to\nhave as few lines of code
18827
18:00:10,061 --> 18:00:13,182
together in between these transactions\n
18828
18:00:13,182 --> 18:00:16,122
And you go to CVS and you get\nback really fast so as to not
18829
18:00:16,122 --> 18:00:17,891
cause these kinds of performance problems.
18830
18:00:17,891 --> 18:00:20,411
So things indeed\nescalated quickly today.
18831
18:00:20,411 --> 18:00:23,613
The original goal was just to solve\n
18832
18:00:23,614 --> 18:00:24,822
more effectively than Python.
18833
18:00:24,822 --> 18:00:27,114
But as soon as you have these\nmore powerful techniques
18834
18:00:27,114 --> 18:00:28,866
a whole new set of problems arises.
18835
18:00:28,866 --> 18:00:30,491
Takes practice to get comfortable with.
18836
18:00:30,491 --> 18:00:34,511
But ultimately, this is all leading\n
18837
18:00:34,512 --> 18:00:37,601
of web programming with HTML,\nCSS, and some JavaScript.
18838
18:00:37,601 --> 18:00:40,432
The week after, bringing Python\nand SQL back into the mix.
18839
18:00:40,432 --> 18:00:42,281
So that by term's end,\nwe've really now used
18840
18:00:42,281 --> 18:00:45,101
all of these different languages\nfor what they're best at.
18841
18:00:45,101 --> 18:00:48,184
And over the next few weeks, the goal\n
18842
18:00:48,184 --> 18:00:51,273
and comfortable with what each of\n
18843
18:00:51,273 --> 18:00:52,481
Let's go ahead and wrap here.
18844
18:00:52,482 --> 18:00:53,815
I'll stick around for questions.
18845
18:02:15,171 --> 18:02:19,222
This is CS50, and this\nis already week 8.
18846
18:02:19,222 --> 18:02:21,705
And if we think back to\nthe past several weeks now
18847
18:02:21,705 --> 18:02:24,622
recall that things started pretty\n
18848
18:02:24,622 --> 18:02:27,049
in like week 0, when\nwe were using Scratch
18849
18:02:27,048 --> 18:02:29,631
because with Scratch we had a\nGUI, a graphical user interface.
18850
18:02:29,631 --> 18:02:33,171
So even as we explored variables and\n
18851
18:02:33,171 --> 18:02:36,083
you had kind of a fun environment\n
18852
18:02:36,084 --> 18:02:37,792
And then in week 1,\nwe sort of took a lot
18853
18:02:37,792 --> 18:02:41,692
of that away, when we introduced C, and\n
18854
18:02:41,692 --> 18:02:46,372
because now, all of your programs became\n
18855
18:02:46,372 --> 18:02:49,432
and gone was the mouse, the\nanimations, the menus, and so forth.
18856
18:02:49,432 --> 18:02:51,622
And so now, fast\nforward to week 8, we're
18857
18:02:51,622 --> 18:02:54,952
going to bring those kinds of\nuser interface, UI, elements back
18858
18:02:54,951 --> 18:02:56,511
in the form of web programming.
18859
18:02:56,512 --> 18:02:58,851
And this goes beyond\njust laying out websites.
18860
18:02:58,851 --> 18:03:03,069
This will, to this week and next week,\n
18861
18:03:03,069 --> 18:03:05,362
stuff that we've been doing\nfor the past several weeks
18862
18:03:05,362 --> 18:03:07,972
using Python, using\nSQL, and now introducing
18863
18:03:07,972 --> 18:03:10,972
a couple of other languages,\non the so-called client side
18864
18:03:10,972 --> 18:03:13,461
on your own Mac, your own PC,\nyour own phone, that's going
18865
18:03:13,461 --> 18:03:15,472
to talk to those back-end services.
18866
18:03:15,472 --> 18:03:18,772
So indeed, at this end of\nCS50, does everything rather
18867
18:03:18,771 --> 18:03:22,131
come together into a user interface\nthat's just super familiar.
18868
18:03:22,131 --> 18:03:24,951
All of us are on our phones,\ndesktops, laptops, every day.
18869
18:03:24,951 --> 18:03:28,252
And increasingly, even the mobile\napps that you all are using
18870
18:03:28,252 --> 18:03:32,512
are implemented, not necessarily\n
18871
18:03:32,512 --> 18:03:34,641
if you're familiar with\nthose, but with languages
18872
18:03:34,641 --> 18:03:38,552
called HTML, CSS, and JavaScript,\n
18873
18:03:38,552 --> 18:03:43,372
But before we do that, let's provide a\n
18874
18:03:43,372 --> 18:03:46,822
because indeed, we'll start to look\n
18875
18:03:46,822 --> 18:03:51,105
itself works, albeit quickly, so that\n
18876
18:03:51,105 --> 18:03:54,022
all of this code is running, how you\n
18877
18:03:54,021 --> 18:03:56,752
really, ultimately, after\nCS50, you can learn, by just
18878
18:03:56,752 --> 18:03:59,156
poking around other actual websites.
18879
18:03:59,156 --> 18:04:00,531
So the internet, we're all on it.
18880
18:04:00,531 --> 18:04:05,101
Literally, right now, what\nis it, in your own words?
18881
18:04:07,671 --> 18:04:10,371
It's this utility nowadays, that\nwe all rather take for granted.
18882
18:04:12,654 --> 18:04:14,572
SPEAKER 1: OK, big\nstorage, and indeed, that's
18883
18:04:14,572 --> 18:04:18,451
how the cloud is described, which is\n
18884
18:04:18,451 --> 18:04:21,771
for a whole lot of wires\nand cables and hardware.
18885
18:04:21,771 --> 18:04:24,941
And the internet, other\nformulations of the term, how else?
18886
18:04:24,942 --> 18:04:26,692
AUDIENCE: Bunch of\ndata that we can reach.
18887
18:04:26,692 --> 18:04:28,609
SPEAKER 1: OK, a bunch\nof data that we can all
18888
18:04:28,608 --> 18:04:32,241
reach, by way of being interconnected\n
18889
18:04:32,241 --> 18:04:35,002
And so really, the internet,\ntoo, is a hardware thing.
18890
18:04:35,002 --> 18:04:38,811
There's a whole lot of servers out\n
18891
18:04:38,811 --> 18:04:41,331
via physical cables, via\ninternet service providers
18892
18:04:41,332 --> 18:04:43,382
via wireless connectivity, and the like.
18893
18:04:43,381 --> 18:04:46,161
And once you start to have\nnetworks of networks of networks
18894
18:04:47,362 --> 18:04:50,332
Indeed, Harvard has its own network\n
18895
18:04:50,332 --> 18:04:52,687
and your own home probably\nhas its own network.
18896
18:04:52,686 --> 18:04:54,561
But once you start\nconnecting those networks
18897
18:04:54,561 --> 18:04:58,921
do you get the interconnected network\n
18898
18:04:58,921 --> 18:05:01,551
So there's this whole\nalphabet soup that goes
18899
18:05:01,552 --> 18:05:03,802
with the internet, some of\nwhose acronyms and terms
18900
18:05:03,802 --> 18:05:04,732
you've probably seen before.
18901
18:05:04,732 --> 18:05:06,532
But let's at least peel\nback some of those layers
18902
18:05:06,531 --> 18:05:08,752
and consider what some of\nthe building blocks are.
18903
18:05:08,752 --> 18:05:11,752
So here's a picture of the internet\n
18904
18:05:11,752 --> 18:05:14,182
back in 1969, when it\nwas something called
18905
18:05:14,182 --> 18:05:17,482
ARPANET, from the Advanced\nResearch Projects Agency.
18906
18:05:17,482 --> 18:05:20,872
And the intent, originally, was just\n
18907
18:05:20,872 --> 18:05:25,702
in Utah and California, literally\nservers, or computers, in each
18908
18:05:25,701 --> 18:05:27,891
of those areas, somehow\ninterconnected with wires
18909
18:05:27,891 --> 18:05:29,616
so that people could\nstart to share data.
18910
18:05:29,616 --> 18:05:33,182
A year later, it expanded to\ninclude MIT and Harvard and others.
18911
18:05:33,182 --> 18:05:36,201
And now fast forward to\ntoday, you have a huge number
18912
18:05:36,201 --> 18:05:39,322
of systems around the world\nthat are on this same network.
18913
18:05:39,322 --> 18:05:41,841
And, in fact, if I\njust pull up a web page
18914
18:05:41,841 --> 18:05:44,301
here, that's sort of\nconstantly changing
18915
18:05:44,302 --> 18:05:48,802
a visualization of the internet as\n
18916
18:05:48,802 --> 18:05:52,072
in the abstract, all of these\nlines and interconnections
18917
18:05:52,072 --> 18:05:56,008
represent just how interconnected\nthe world is today.
18918
18:05:56,008 --> 18:05:59,091
And it just means that there's all the\n
18919
18:05:59,091 --> 18:06:02,822
all of the more hardware giving\n
18920
18:06:02,822 --> 18:06:07,402
But if we focus, really, on just\n
18921
18:06:07,402 --> 18:06:11,872
whether back in 1970, or now in 2021,\n
18922
18:06:11,872 --> 18:06:15,385
yes, a server, but a certain type\n
18923
18:06:15,385 --> 18:06:17,302
And a router, as the\nname implies, just routes
18924
18:06:17,302 --> 18:06:21,302
data left to right, top to\nbottom, from one point to another.
18925
18:06:21,302 --> 18:06:25,132
And so there's all these servers here\n
18926
18:06:25,131 --> 18:06:28,651
in Comcast's network, Verizon's\nnetwork, your own home network
18927
18:06:28,652 --> 18:06:31,461
you have your own routers out\nthere, whose purpose in life
18928
18:06:31,461 --> 18:06:34,277
is to take in data and then\ndecide, should I send it this way
18929
18:06:34,277 --> 18:06:36,652
or this way, or this way, so\nto speak, assuming there are
18930
18:06:36,652 --> 18:06:38,512
multiple options with multiple cables.
18931
18:06:38,512 --> 18:06:41,692
You, in your home, probably have just\n
18932
18:06:41,692 --> 18:06:46,162
But certainly, if you're a place like\n
18933
18:06:46,161 --> 18:06:49,221
there's probably a whole\nbunch of interconnections
18934
18:06:49,222 --> 18:06:51,891
that the data can then\ntravel across ultimately.
18935
18:06:51,891 --> 18:06:54,981
So how do we get data\namong these routers?
18936
18:06:54,982 --> 18:06:57,652
For instance, if you want\nto send an email to someone
18937
18:06:57,652 --> 18:07:00,502
at Stanford, in California,\nfrom here, on the East Coast
18938
18:07:00,502 --> 18:07:04,771
or if you want to visit\nwww.stanford.edu, how does your laptop
18939
18:07:04,771 --> 18:07:08,557
your phone, your desktop, actually\n
18940
18:07:08,557 --> 18:07:12,112
Well, essentially, your\nlaptop or phone knows
18941
18:07:12,112 --> 18:07:16,141
when it boots up at the beginning of\n
18942
18:07:16,141 --> 18:07:17,682
the address of that local router is.
18943
18:07:17,682 --> 18:07:20,491
So if you want to send an\nemail from my laptop over here
18944
18:07:20,491 --> 18:07:23,701
my laptop is essentially going to\n
18945
18:07:23,701 --> 18:07:25,711
And then, from there, I\ndon't know, I don't care
18946
18:07:25,711 --> 18:07:27,254
how it gets the rest of the distance.
18947
18:07:27,254 --> 18:07:29,822
But hopefully, within some\nsmall number of steps later
18948
18:07:29,822 --> 18:07:32,461
Harvard's router is going to\nsend it to maybe Boston's router
18949
18:07:32,461 --> 18:07:34,586
is going to send it to\nCalifornia's router is going
18950
18:07:34,586 --> 18:07:37,771
to send it to Stanford's router, until\n
18951
18:07:38,372 --> 18:07:41,822
And we can depict this, actually,\na bit playfully.
18952
18:07:41,822 --> 18:07:44,072
Thankfully, the course's\nstaff kindly volunteered
18953
18:07:44,072 --> 18:07:48,792
to create a visualization for\nthis, using a familiar technology.
18954
18:07:48,792 --> 18:07:52,322
So here we have some of our TFs\n
18955
18:07:52,322 --> 18:07:55,472
Let me go ahead and full\nscreen this window here.
18956
18:07:55,472 --> 18:07:58,391
Give me just a moment to\npull it up on my screen here.
18957
18:07:58,391 --> 18:08:03,182
And we'll consider what happens if we\n
18958
18:08:03,182 --> 18:08:07,502
from one person or router,\nnamely Phyllis in this case
18959
18:08:07,502 --> 18:08:10,561
in the bottom right hand corner,\nup to Brian, in this case
18960
18:08:10,561 --> 18:08:11,831
in the top left hand corner.
18961
18:08:11,832 --> 18:08:14,461
So each of the staff members\nhere represents exactly one
18962
18:08:14,461 --> 18:08:17,051
of these routers on the internet.
18963
18:08:47,612 --> 18:08:49,891
It actually took us a\nsignificant number of attempts
18964
18:08:49,891 --> 18:08:51,481
to get that ultimately right.
18965
18:08:51,482 --> 18:08:54,632
So what was it the\nstaff were all passing here?
18966
18:08:54,631 --> 18:08:57,714
Here we have just, physically, what\n
18967
18:08:57,714 --> 18:08:59,881
So Phyllis started with an\nenvelope, inside of which
18968
18:08:59,881 --> 18:09:01,951
was that email, presumably,\non the East Coast
18969
18:09:01,951 --> 18:09:05,136
and she wanted to send it to Brian on\n
18970
18:09:05,137 --> 18:09:08,012
And so she had all of these different\n
18971
18:09:08,012 --> 18:09:11,112
between her and point B, namely Brian.
18972
18:09:11,112 --> 18:09:14,671
She could go up, down, in her case, and\n
18973
18:09:14,671 --> 18:09:17,461
could go up, down, left, or right,\n
18974
18:09:17,461 --> 18:09:19,442
And long story short,\nthere's algorithms that
18975
18:09:19,442 --> 18:09:22,812
figure out how you decide\nto send a packet up, down
18976
18:09:22,811 --> 18:09:24,661
left, or right, so to speak.
18977
18:09:24,661 --> 18:09:29,716
But they do so by taking an input, and\n
18978
18:09:29,716 --> 18:09:32,341
And there's at least a couple of\nthings on the outside of this
18979
18:09:32,341 --> 18:09:35,281
because all of these routers and,\n
18980
18:09:35,281 --> 18:09:38,012
and phones these days,\nspeak something called
18981
18:09:38,012 --> 18:09:41,132
TCP/IP, a set of\nacronyms you've probably
18982
18:09:41,131 --> 18:09:44,011
seen somewhere on your\nphone, your Mac or PC
18983
18:09:44,012 --> 18:09:48,422
in print somewhere, which refers\n
18984
18:09:48,421 --> 18:09:51,194
that computers use to\ninter-communicate these days.
18985
18:09:52,112 --> 18:09:54,631
A protocol is like a set\nof rules for how you behave.
18986
18:09:54,631 --> 18:09:57,091
In healthier times, I might\nextend my hand and someone
18987
18:09:57,091 --> 18:10:00,631
like Carter might extend his hand,\n
18988
18:10:00,631 --> 18:10:03,871
on a human protocol of like\nliterally physically shaking hands.
18989
18:10:03,872 --> 18:10:07,082
Nowadays, we have mask protocols,\nwhereby what you need to do
18990
18:10:08,591 --> 18:10:11,641
But that, too, is just a set of rules\n
18991
18:10:11,641 --> 18:10:13,951
that's somewhere\nstandardized and documented.
18992
18:10:13,951 --> 18:10:16,711
So computers use protocols\nall the time to govern
18993
18:10:16,711 --> 18:10:19,622
how they are sending information\nand receiving information.
18994
18:10:19,622 --> 18:10:24,482
And TCP and IP are two such protocols\n
18995
18:10:24,482 --> 18:10:27,212
What TCP/IP tells someone\nlike Phyllis to do
18996
18:10:27,211 --> 18:10:31,152
if she wants to send an email to Brian,\n
18997
18:10:31,902 --> 18:10:37,082
But on the outside of that virtual\n
18998
18:10:37,082 --> 18:10:41,132
And I'll describe this as destination\n
18999
18:10:41,131 --> 18:10:44,011
just like in our human world,\nyou would write the destination
19000
18:10:45,332 --> 18:10:49,202
And then she's going to put her own\n
19001
18:10:49,201 --> 18:10:51,750
corner, just like you, the\nsender, would put your own source
19002
18:10:51,750 --> 18:10:53,171
address in the human world.
19003
18:10:53,171 --> 18:10:56,432
But, instead of these addresses\n
19004
18:10:56,432 --> 18:11:01,141
Cambridge, Massachusetts 02138, USA,\n
19005
18:11:01,141 --> 18:11:05,641
on the internet have unique addresses\n
19006
18:11:05,641 --> 18:11:08,192
And an IP address is\njust a numeric identifier
19007
18:11:08,192 --> 18:11:11,461
on the internet, that allows\ncomputers, like Phyllis and Brian
19008
18:11:11,461 --> 18:11:14,432
to address these envelopes\nto and from each other.
19009
18:11:14,432 --> 18:11:16,741
And you've probably seen\nthe format at some point.
19010
18:11:16,741 --> 18:11:20,311
Typically, the format of IP\naddresses is something dot something
19011
18:11:20,311 --> 18:11:22,261
dot something dot something.
19012
18:11:22,262 --> 18:11:24,872
Each of those somethings,\nrepresented here with a hash symbol
19013
18:11:24,872 --> 18:11:29,852
is a number from 0 through 255.
19014
18:11:29,851 --> 18:11:32,911
And, based on that little\nhint, if each of these hashes
19015
18:11:32,911 --> 18:11:36,631
represents a number from 0\nto 255, each of those hashes
19016
18:11:36,631 --> 18:11:39,542
is represented with\nhow many bytes or bits?
19017
18:11:39,542 --> 18:11:43,442
Eight bits or one byte, which is to\n
19018
18:11:43,442 --> 18:11:47,072
an IP address must use\n32 bits or 4 bytes
19019
18:11:47,072 --> 18:11:50,641
if we rewind now to some of the\n
19020
18:11:50,641 --> 18:11:52,991
And what that means is,\nat least at a glance
19021
18:11:52,991 --> 18:11:57,153
it looks like we have 4 billion some\n
19022
18:11:57,154 --> 18:11:58,862
Now, unfortunately,\nthere's a huge number
19023
18:11:58,862 --> 18:12:01,772
of humans in the world these days,\n
19024
18:12:01,771 --> 18:12:05,221
have multiple devices, certainly\n
19025
18:12:05,222 --> 18:12:08,974
a laptop, and a phone, and you have\n
19026
18:12:08,974 --> 18:12:10,391
all of which need to be addressed.
19027
18:12:10,391 --> 18:12:12,332
So there's another type\nof IP address that's
19028
18:12:12,332 --> 18:12:14,012
starting to be used more commonly.
19029
18:12:16,411 --> 18:12:19,232
There's also version 6\nwhich, instead of 32 bits
19030
18:12:19,232 --> 18:12:23,552
uses 128 bits, which gives us a\n
19031
18:12:23,551 --> 18:12:27,872
for computers, so we can at least handle\n
19032
18:12:28,812 --> 18:12:31,772
So this is to say, what ultimately\nis going on this envelope
19033
18:12:31,771 --> 18:12:35,941
is the destination address, that is\n
19034
18:12:35,941 --> 18:12:40,021
address, that is Phyllis's IP address,\n
19035
18:12:40,021 --> 18:12:43,051
to point B, and if need\nbe, back, by just flipping
19036
18:12:43,051 --> 18:12:44,641
the source and the destination.
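As a rough sketch of the addressing just described, the Python below works out how many version 4 and version 6 addresses exist and models the outside of the envelope. The two addresses are hypothetical, taken from ranges reserved for documentation, and the names `envelope` and `reply_envelope` are made up for illustration.

```python
import ipaddress

# Hypothetical addresses from ranges reserved for documentation.
source = ipaddress.ip_address("192.0.2.10")         # stands in for Phyllis
destination = ipaddress.ip_address("198.51.100.7")  # stands in for Brian

# Four numbers from 0 through 255 need 8 bits each: 4 * 8 = 32 bits.
ipv4_bits = 4 * 8
ipv4_count = 2 ** ipv4_bits   # 4,294,967,296 possible addresses
ipv6_count = 2 ** 128         # version 6 uses 128 bits instead

# The outside of the envelope: where it is going and where it came from.
envelope = {"destination": str(destination), "source": str(source)}


def reply_envelope(env):
    """Address a reply by flipping the source and the destination."""
    return {"destination": env["source"], "source": env["destination"]}


print(ipv4_count)
print(len(destination.packed))   # number of raw bytes in a version 4 address
print(reply_envelope(envelope))
```

The standard `ipaddress` module is used here only to check that the strings are well-formed addresses and to show that a version 4 address packs into 4 bytes.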
19037
18:12:44,642 --> 18:12:49,652
But on the internet, you presumably know\n
19038
18:12:49,652 --> 18:12:53,191
There's web servers, there's chat\n
19039
18:12:53,191 --> 18:12:56,191
Like there's all of these different\n
19040
18:12:56,191 --> 18:12:58,771
And so, when Brian\nreceives that envelope
19041
18:12:58,771 --> 18:13:05,432
how does he know it's an email, versus\n
19042
18:13:05,432 --> 18:13:07,622
versus something else altogether.
19043
18:13:07,622 --> 18:13:10,771
Well, it turns out that we\ncan look at the other part
19044
18:13:10,771 --> 18:13:14,281
of this acronym, the TCP in TCP/IP.
19045
18:13:14,282 --> 18:13:17,172
And what TCP allows us\nto do, for instance
19046
18:13:17,172 --> 18:13:18,822
is specify a couple of things.
19047
18:13:18,822 --> 18:13:23,792
One, the type of service whose\n
19048
18:13:23,792 --> 18:13:26,881
it does this with a numeric identifier.
19049
18:13:26,881 --> 18:13:31,652
And I'm going to go ahead and write down\n
19050
18:13:31,652 --> 18:13:35,222
And I'm going to write that in the\n
19051
18:13:35,221 --> 18:13:37,381
So technically, now,\nwhat's on this envelope
19052
18:13:37,381 --> 18:13:40,111
is not just the addresses,\nbut also a unique number
19053
18:13:40,111 --> 18:13:45,182
that represents what kind of service\n
19054
18:13:45,182 --> 18:13:48,792
whether it's email, or web traffic,\nor Skype, or something else.
19055
18:13:48,792 --> 18:13:53,342
These numbers are standardized, and here\n
19056
18:13:53,342 --> 18:13:56,192
not even in the context of email,\nbut in the context of the web.
19057
18:13:56,191 --> 18:13:59,131
Port 80 is typically used\nwhenever an envelope contains
19058
18:13:59,131 --> 18:14:03,361
a web page, or a request\ntherefor, or the number 443
19059
18:14:03,361 --> 18:14:07,201
when that request is actually\n
19060
18:14:07,202 --> 18:14:11,192
know, in URLs, known as HTTPS,\n
19061
18:14:11,191 --> 18:14:13,081
More on what the HTTP means later.
19062
18:14:13,081 --> 18:14:16,955
If it's email, the number\nmight be 25 or 465, or 587.
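One way to picture these standardized numbers is as a lookup table that the receiver consults. This is only a sketch: the table holds just the ports mentioned here, and the names `WELL_KNOWN_PORTS` and `service_for` are invented for the example.

```python
# Port numbers mentioned in the lecture, mapped to the kind of service.
WELL_KNOWN_PORTS = {
    80: "web (HTTP)",
    443: "secure web (HTTPS)",
    25: "email",
    465: "email",
    587: "email",
}


def service_for(port):
    """Return the kind of service a receiver would expect on this port."""
    return WELL_KNOWN_PORTS.get(port, "unknown")


print(service_for(443))
print(service_for(587))
```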
19063
18:14:16,955 --> 18:14:19,872
These are the kinds of things you\n
19064
18:14:19,872 --> 18:14:24,271
But if you've ever had to configure,\nlike, Outlook or even Gmail
19065
18:14:24,271 --> 18:14:26,671
to talk to another account,\nyou might very well
19066
18:14:26,672 --> 18:14:30,902
have seen these numbers, by typing\n
19067
18:14:30,902 --> 18:14:33,892
and then a number, which is only to\n
19068
18:14:33,892 --> 18:14:35,642
But they're typically\nnot things you and I
19069
18:14:35,642 --> 18:14:38,582
have to care about, because\nservers and computers nowadays
19070
18:14:38,581 --> 18:14:40,811
automate much of this process.
19071
18:14:40,812 --> 18:14:45,211
But that's all it takes, ultimately, for\n
19072
18:14:45,211 --> 18:14:47,042
But what if it's a really big message?
19073
18:14:47,042 --> 18:14:50,351
If it's a short email, it might\n
19074
18:14:50,941 --> 18:14:54,241
But suppose that Phyllis wants\nto send Brian a picture of a cat
19075
18:14:54,241 --> 18:14:56,581
like this, or worse, a video of a cat.
19076
18:14:56,581 --> 18:15:00,811
It would be kind of inequitable\nif no one else could do anything
19077
18:15:00,812 --> 18:15:03,062
on the internet, just\nbecause Phyllis wants
19078
18:15:03,062 --> 18:15:06,601
to send Brian a really big picture,\na really big video of a cat.
19079
18:15:06,601 --> 18:15:10,351
It would be nice if we could kind\n
19080
18:15:10,351 --> 18:15:13,529
across these routers, so that\nwe can give a little bit of time
19081
18:15:13,529 --> 18:15:15,572
to Phyllis, a little bit\nof time to someone else
19082
18:15:15,572 --> 18:15:18,779
a little bit of time to someone else,\n
19083
18:15:19,892 --> 18:15:25,472
But in terms of fairness, she\ndoesn't monopolize the bandwidth
19084
18:15:27,842 --> 18:15:31,982
And this, then, allows us to\ndo one other feature of TCP/IP
19085
18:15:31,982 --> 18:15:35,072
which is fragmentation,\nwhere we can temporarily
19086
18:15:35,072 --> 18:15:37,952
and Phyllis's computer would\ndo this automatically, fragment
19087
18:15:37,952 --> 18:15:41,582
the big packet in question,\nor the big file in question
19088
18:15:41,581 --> 18:15:46,561
and then use, not just a single\n
19089
18:15:48,369 --> 18:15:50,161
If we do that, though,\nwe're probably going
19090
18:15:50,161 --> 18:15:54,031
to need one other piece of information,\n
19091
18:15:54,032 --> 18:15:57,332
Like, if you were implementing this,\n
19092
18:15:57,331 --> 18:16:00,301
into four parts, like,\nintuitively, what might you
19093
18:16:00,301 --> 18:16:03,911
want to put virtually on the\noutside of this envelope now?
19094
18:16:05,259 --> 18:16:06,842
SPEAKER 1: The order of them, somehow.
19095
18:16:06,842 --> 18:16:10,472
So probably something like part\none of four, part two of four
19096
18:16:10,471 --> 18:16:12,161
part three of four, and so forth.
19097
18:16:12,161 --> 18:16:14,792
So I'm going to write one more thing\n
19098
18:16:15,301 --> 18:16:17,551
I put some kind of\nsequence number, that's
19099
18:16:17,551 --> 18:16:20,191
just a little bit of a\nclue to Brian, to know
19100
18:16:20,191 --> 18:16:22,171
in what order to\nreassemble these things.
19101
18:16:22,172 --> 18:16:25,202
And even more powerfully\nthan that, this actually
19102
18:16:25,202 --> 18:16:29,252
gives us this simple primitive of\n
19103
18:16:30,452 --> 18:16:35,881
If Brian receives envelopes like these,\n
19104
18:16:35,881 --> 18:16:39,691
field, what other feature\ndoes TCP apparently
19105
18:16:39,691 --> 18:16:42,542
enable Brian and Phyllis to implement?
19106
18:16:44,282 --> 18:16:46,612
But it's not just the\nordering of the packets.
19107
18:16:46,611 --> 18:16:50,701
What else might be useful about\nputting numbers on these things
19108
18:16:55,187 --> 18:16:56,932
AUDIENCE: How about if you like missed.
19109
18:16:56,932 --> 18:16:58,851
SPEAKER 1: If you missed something\nthat was intended to be sent
19110
18:16:59,851 --> 18:17:04,110
So short answer, exactly, yes, TCP,\n
19111
18:17:04,110 --> 18:17:06,652
that we're including, can quote\nunquote "guarantee" delivery.
19112
18:17:07,161 --> 18:17:10,341
Because if Brian receives one\nout of four, two out of four
19113
18:17:10,342 --> 18:17:12,562
four out of four, but\nnot three out of four
19114
18:17:12,562 --> 18:17:16,402
he now knows, predictably, that\n
19115
18:17:17,911 --> 18:17:21,801
And so this is why pretty much\nalways, if you receive an email
19116
18:17:21,801 --> 18:17:24,411
you either receive the whole\nthing, or nothing at all.
19117
18:17:24,411 --> 18:17:27,741
Like sentences and words and\nparagraphs should never really
19118
18:17:29,361 --> 18:17:31,478
Or if you download a\nphotograph on the web
19119
18:17:31,479 --> 18:17:33,562
it shouldn't just have a\nblank hole in the middle
19120
18:17:33,562 --> 18:17:36,652
just because that packet of\ninformation happened to be lost.
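The "part one of four" idea can be sketched in a few lines of Python. This is a toy model of the envelopes in the demonstration, not how TCP is actually implemented (real TCP numbers bytes and uses acknowledgments); the function names `fragment`, `missing`, and `reassemble` are assumptions made for illustration.

```python
import random


def fragment(data, parts):
    """Split non-empty data into numbered pieces: (sequence, total, payload)."""
    size = -(-len(data) // parts)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [(n, len(chunks), chunk) for n, chunk in enumerate(chunks, start=1)]


def missing(received):
    """Return the sequence numbers that never arrived."""
    total = received[0][1]
    seen = {seq for seq, _, _ in received}
    return [n for n in range(1, total + 1) if n not in seen]


def reassemble(received):
    """Put the pieces back in order, or refuse if any are missing."""
    lost = missing(received)
    if lost:
        raise ValueError(f"need a resend of parts {lost}")
    return b"".join(payload for _, _, payload in sorted(received))


packets = fragment(b"a really big picture of a cat", 4)
random.shuffle(packets)                       # packets may arrive out of order
print(reassemble(packets))

arrived = [p for p in packets if p[0] != 3]   # pretend part 3 of 4 was lost
print(missing(arrived))
```

Because each piece carries its own number and the total, the receiver can both restore the order and notice exactly which piece to ask for again.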
19121
18:17:36,652 --> 18:17:40,851
TCP, if it is the protocol being used to\n
19122
18:17:40,851 --> 18:17:44,792
ensures that it either all gets there,\n
19123
18:17:44,792 --> 18:17:48,112
So this is an important property,\nbut, just as a teaser there's
19124
18:17:49,312 --> 18:17:52,732
There's something called UDP,\nwhich is an alternative to TCP
19125
18:17:52,732 --> 18:17:54,472
that doesn't guarantee delivery.
19126
18:17:54,471 --> 18:17:58,072
And just as a taste of why you might\n
19127
18:17:58,072 --> 18:18:03,366
maybe you're watching like a streaming\n
19128
18:18:03,366 --> 18:18:05,241
You probably don't\nnecessarily want the thing
19129
18:18:05,241 --> 18:18:08,476
to buffer and buffer and buffer, just\n
19130
18:18:08,476 --> 18:18:10,351
because you're going to\nstart to miss things.
19131
18:18:10,351 --> 18:18:12,471
And then you're going to be the\nonly one in the world watching
19132
18:18:12,471 --> 18:18:15,849
the game that ended 20 minutes ago, when\n
19133
18:18:15,850 --> 18:18:18,142
Similarly for a voice call,\nit would be really annoying
19134
18:18:18,142 --> 18:18:19,942
if our voice is constantly buffered.
19135
18:18:19,941 --> 18:18:22,678
So UDP might be a good\nprotocol for making sure
19136
18:18:22,679 --> 18:18:25,762
that, even if the person on the other\n
19137
18:18:26,991 --> 18:18:29,781
It's not pausing and\nresending and resending
19138
18:18:29,782 --> 18:18:33,961
because that would really slow down\n
19139
18:18:33,961 --> 18:18:36,861
So, in short, IP handles the\naddressing of these packets
19140
18:18:36,861 --> 18:18:40,341
and standardizes numbers that every\n
19141
18:18:40,342 --> 18:18:44,542
and TCP handles the standardization\nof like what services
19142
18:18:44,542 --> 18:18:50,661
can be used, between points A and\n
19143
18:18:50,661 --> 18:18:54,652
but presumably, when Phyllis\nsends a message to Brian
19144
18:18:54,652 --> 18:18:56,782
she doesn't really know\nand probably shouldn't
19145
18:18:56,782 --> 18:18:58,912
care what his IP address is, right?
19146
18:18:58,911 --> 18:19:01,491
These days it's, like, I don't\nknow most of the phone numbers
19147
18:19:02,661 --> 18:19:04,614
I instead look them up in some way.
19148
18:19:04,614 --> 18:19:07,072
And, indeed, when you visit a\nwebsite, what do you type in?
19149
18:19:07,072 --> 18:19:10,048
It's typically not something\ndot something dot something dot
19150
18:19:10,048 --> 18:19:12,298
something, where each of\nthose somethings is a number.
19151
18:19:12,298 --> 18:19:14,182
What do you typically\ntype in to a browser?
19152
18:19:15,622 --> 18:19:20,031
Something like Stanford.edu,\nHarvard.edu, Yale.edu, gmail.com
19153
18:19:20,031 --> 18:19:22,072
or any other such domain name.
19154
18:19:22,072 --> 18:19:24,622
And so, thankfully,\nthere's another system
19155
18:19:24,622 --> 18:19:29,551
on the internet, one more acronym for\n
19156
18:19:29,551 --> 18:19:33,742
And pretty much every network on the\n
19157
18:19:33,741 --> 18:19:37,838
your own home network, somewhere,\nsomehow has a DNS server.
19158
18:19:37,839 --> 18:19:39,922
You probably didn't have\nto configure it yourself.
19159
18:19:39,922 --> 18:19:44,342
Someone else did, your campus, your\n
19160
18:19:44,342 --> 18:19:48,112
But there is some server connected\n
19161
18:19:48,111 --> 18:19:52,701
via wires or wirelessly, that just\n
19162
18:19:52,702 --> 18:19:55,672
a big spreadsheet, if you\nwill, or, if you prefer
19163
18:19:55,672 --> 18:19:59,992
a hash table, that has at least\ntwo columns of keys and values
19164
18:20:00,812 --> 18:20:02,241
Where on the left hand\nside is what we'll
19165
18:20:02,241 --> 18:20:04,262
call domain name,\nsomething like Harvard.edu
19166
18:20:04,262 --> 18:20:08,902
Yale.edu, and an IP address on the\nright hand side, that is to say
19167
18:20:08,902 --> 18:20:13,612
a DNS server's purpose in life\nis just to translate domain names
19168
18:20:14,896 --> 18:20:17,271
And vice versa, if you want\nto go in the other direction
19169
18:20:17,271 --> 18:20:22,072
and technically, just to be precise, it\n
19170
18:20:23,432 --> 18:20:25,441
And we'll see what those\nare in just a moment.
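The spreadsheet or hash table picture of a DNS server can be sketched as a Python dictionary. The IP addresses below are made up, from a range reserved for documentation, and are not the real addresses of these domains; `DNS_TABLE`, `resolve`, and `reverse_lookup` are hypothetical names.

```python
# A toy DNS table: domain names are keys, IP addresses are values.
# The addresses are invented, from a range reserved for documentation.
DNS_TABLE = {
    "harvard.edu": "192.0.2.1",
    "yale.edu": "192.0.2.2",
    "stanford.edu": "192.0.2.3",
}


def resolve(name):
    """Translate a domain name to an IP address, ignoring letter case."""
    return DNS_TABLE.get(name.lower())


def reverse_lookup(address):
    """Go in the other direction: find the name for an IP address."""
    for name, ip in DNS_TABLE.items():
        if ip == address:
            return name
    return None


print(resolve("Harvard.edu"))
print(reverse_lookup("192.0.2.2"))
```

A real DNS server answers these questions over the network rather than from one small in-memory table, but the key and value relationship is the same.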
19171
18:20:25,441 --> 18:20:27,652
But again, all of this just\nkind of happens magically
19172
18:20:27,652 --> 18:20:29,402
when you turn on your\nphone or your laptop
19173
18:20:29,402 --> 18:20:33,292
today, because all of these things\n
19174
18:20:33,292 --> 18:20:38,752
So how can we actually start to\n
19175
18:20:38,751 --> 18:20:44,911
Well, let's go ahead and poke around,\n
19176
18:20:44,911 --> 18:20:48,531
Let's see what we can actually do\n
19177
18:20:48,532 --> 18:20:52,672
If we now have the ability to\nmove data from point A to point B
19178
18:20:52,672 --> 18:20:55,252
and what can be in that envelope\ncould be, yes, an email
19179
18:20:55,251 --> 18:20:58,011
but today, onward, it's really\ngoing to be web content.
19180
18:20:58,012 --> 18:21:00,202
There's going to be content\nthat you're requesting
19181
18:21:00,202 --> 18:21:01,584
like give me today's home page.
19182
18:21:01,584 --> 18:21:03,292
And there's content\nyou're sending, which
19183
18:21:03,292 --> 18:21:05,512
would be the contents of\nthat actual home page.
19184
18:21:05,512 --> 18:21:10,642
And so, just to go one level deeper,\n
19185
18:21:10,642 --> 18:21:14,602
are getting from point A to\npoint B using TCP/IP, let's
19186
18:21:14,601 --> 18:21:19,221
put something specific inside of them,\n
19187
18:21:19,221 --> 18:21:24,149
but something called HTTP, which\n
19188
18:21:24,149 --> 18:21:25,941
You've seen this for\ndecades now, probably
19189
18:21:25,941 --> 18:21:29,331
in the form of URLs, so much so that you\n
19190
18:21:29,331 --> 18:21:31,521
Your browser just adds\nit for you automatically
19191
18:21:31,521 --> 18:21:35,101
and you just type in Harvard.edu,\nor Yale.edu, or the like.
19192
18:21:35,101 --> 18:21:38,391
But HTTP is just a final\nprotocol that we'll
19193
18:21:38,392 --> 18:21:42,412
talk about here, that just\nstandardizes how web browsers and web
19194
18:21:44,611 --> 18:21:47,601
So this is a distinction now\nbetween the internet and the web.
19195
18:21:47,601 --> 18:21:50,001
The internet is really like\nthe low-level plumbing
19196
18:21:50,001 --> 18:21:53,091
all of the cables, all of the\ntechnology that just moves packets
19197
18:21:53,092 --> 18:21:57,322
from left to right, right to left, top\n
19198
18:21:57,322 --> 18:22:01,941
to point B. You can do anything you\n
19199
18:22:01,941 --> 18:22:06,121
email and web and video and chat\nand gaming, and all of that.
19200
18:22:06,122 --> 18:22:09,652
So HTTP, or the web,\nis just one application
19201
18:22:09,652 --> 18:22:13,695
that is conceptually on top of,\nbuilt on top of the internet.
19202
18:22:13,695 --> 18:22:15,862
Once you take for granted\nthat there is an internet
19203
18:22:15,861 --> 18:22:17,451
you can do really\ninteresting things with it
19204
18:22:17,452 --> 18:22:19,942
just like in our physical world,\nonce you have electricity
19205
18:22:19,941 --> 18:22:22,941
you can just assume you can do really\n
19206
18:22:22,941 --> 18:22:25,101
without even knowing\nor caring how it works.
19207
18:22:25,101 --> 18:22:28,801
But now that you'll be\nprogramming for the web
19208
18:22:28,801 --> 18:22:32,131
it's useful to understand how\nsome of these things indeed work.
19209
18:22:32,131 --> 18:22:36,202
So let's take a peek at the\nformat of the things that
19210
18:22:37,402 --> 18:22:39,922
These days, it's usually\nactually HTTPS that's
19211
18:22:39,922 --> 18:22:42,172
in play, where, again,\nthe S just means secure.
19212
18:22:42,172 --> 18:22:47,092
More on that later, but the HTTP is\n
19213
18:22:47,092 --> 18:22:48,472
go inside of these envelopes.
19214
18:22:48,471 --> 18:22:52,281
And wonderfully, it's just\ntextual information, typically.
19215
18:22:52,282 --> 18:22:56,992
There is a simple text format\nthat humans decided on years ago
19216
18:22:56,991 --> 18:23:00,441
that goes inside of these\nenvelopes, that tells a browser how
19217
18:23:00,441 --> 18:23:04,762
to request information from a server,\n
19218
18:23:04,762 --> 18:23:06,392
to that client with information.
19219
18:23:06,392 --> 18:23:12,592
So here's, for instance, a canonical\n
19220
18:23:12,592 --> 18:23:14,362
What might you see at the end of this?
19221
18:23:14,361 --> 18:23:15,829
You might sometimes see a slash.
19222
18:23:15,830 --> 18:23:18,622
Browsers nowadays kind of simplify\n
19223
18:23:18,622 --> 18:23:21,771
But slash, as we'll see, just\nrepresents like the default
19224
18:23:21,771 --> 18:23:24,652
folder, the root of the\nweb server's hard drive
19225
18:23:24,652 --> 18:23:26,077
like whatever the base is of it.
19226
18:23:26,077 --> 18:23:32,182
It's like C colon backslash on\n
19227
18:23:32,182 --> 18:23:34,432
But a URL can have more than that.
19228
18:23:34,432 --> 18:23:36,982
It can have slash path,\nwhere path is just a word
19229
18:23:36,982 --> 18:23:40,552
or multiple words, that sort of\n
19230
18:23:40,551 --> 18:23:43,072
That path could actually be\na specific file, we'll see
19231
18:23:43,072 --> 18:23:45,262
like something called file.html.
19232
18:23:45,262 --> 18:23:48,682
More on HTML in just a bit, or\nit can even be slash folder
19233
18:23:48,682 --> 18:23:52,911
maybe with another slash, or\nmaybe it can be /folder/file.html.
19234
18:23:52,911 --> 18:23:57,081
Now these days Safari, and even Chrome\n
19235
18:23:57,081 --> 18:24:00,621
are in the habit of trying to hide\n
19236
18:24:02,361 --> 18:24:05,121
Ultimately, though, it'll\nbe useful to understand
19237
18:24:05,122 --> 18:24:08,812
what URLs you're at, because\nit maps directly to the code
19238
18:24:08,812 --> 18:24:10,642
that we're ultimately going to write.
19239
18:24:10,642 --> 18:24:13,282
But this is only to say that\nall this stuff in yellow
19240
18:24:13,282 --> 18:24:18,322
refers to, presumably, a specific\nfile and/or folder on the web
19241
18:24:18,322 --> 18:24:20,214
server, on which you're programming.
19242
18:24:21,172 --> 18:24:26,002
Example.com, this is the domain\n
19243
18:24:26,001 --> 18:24:28,621
Example.com is the\nso-called domain name.
19244
18:24:28,622 --> 18:24:33,532
This whole thing, www.example.com,\n
19245
18:24:33,532 --> 18:24:37,432
And what the WWW is referring\nto is specifically the name
19246
18:24:37,432 --> 18:24:40,292
of a specific server in that domain.
19247
18:24:40,292 --> 18:24:44,842
So back in the day, there was\na www.example.com web server.
19248
18:24:44,842 --> 18:24:48,442
There might have been a\nmail.example.com mail server.
19249
18:24:48,441 --> 18:24:51,501
There might have been a\nchat.example.com chat server.
19250
18:24:51,501 --> 18:24:56,331
Nowadays, this hostname, or\nsubdomain, depending on the context
19251
18:24:56,331 --> 18:24:58,581
can actually refer to a whole\nbunch of servers, right?
19252
18:24:58,581 --> 18:25:01,792
When you go to www.facebook.com,\nthat's not one server
19253
18:25:01,792 --> 18:25:03,652
that's thousands of servers nowadays.
19254
18:25:03,652 --> 18:25:05,631
So long story short,\nthere's technology that
19255
18:25:05,631 --> 18:25:08,122
somehow gets your data\nto one of those servers
19256
18:25:08,122 --> 18:25:11,491
but this whole thing is what we\n
19257
18:25:11,491 --> 18:25:13,822
This thing here, hostname,\nin the context of an email
19258
18:25:13,822 --> 18:25:16,702
address it might alternatively\nbe called a subdomain.
19259
18:25:16,702 --> 18:25:20,062
This thing here, top\nlevel domain, you probably
19260
18:25:20,062 --> 18:25:23,422
know that dot com means commercial,\n
19261
18:25:23,422 --> 18:25:25,762
Dot org is similar, dot net.
19262
18:25:25,762 --> 18:25:29,182
Some of them are a bit restricted,\n
19263
18:25:29,182 --> 18:25:31,711
dot edu is just for accredited\neducational institutions.
19264
18:25:31,711 --> 18:25:35,211
But there are hundreds, if\nnot more, top level domains
19265
18:25:35,211 --> 18:25:37,702
nowadays, some more popular than others.
19266
18:25:37,702 --> 18:25:41,542
CS50's tools, for instance, use CS50.io.
19267
18:25:41,542 --> 18:25:44,182
IO sort of connotes input-output.
19268
18:25:44,182 --> 18:25:49,131
It actually belongs, though, to\n
19269
18:25:49,131 --> 18:25:55,366
whose country code is .io, and you see\n
19270
18:25:56,241 --> 18:26:00,399
Indeed, it's something.uk,\nsomething.jp, and the like typically
19271
18:26:01,191 --> 18:26:04,042
But some of them have been\nrather co-opted, .tv as well
19272
18:26:04,042 --> 18:26:06,932
because they have these\nmeanings in English as well.
19273
18:26:06,932 --> 18:26:08,991
Lastly, this is what\nwe'll call the protocol.
19274
18:26:08,991 --> 18:26:13,042
That specifies how the server uses\n
19275
18:26:13,042 --> 18:26:16,351
to point B. So what is\ninside of this envelope?
19276
18:26:16,351 --> 18:26:18,622
Let's now start poking\naround a little bit more.
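To make the parts of a URL concrete, here is a small sketch that pulls one apart with Python's standard `urllib.parse`. The splitting of the name into hostname, domain name, and top level domain is deliberately naive: it assumes exactly one label such as www in front of the domain, and the function name `dissect` is made up for the example.

```python
from urllib.parse import urlsplit


def dissect(url):
    """Label the parts of a URL using the terms from the lecture."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,
        "fully_qualified_domain_name": parts.hostname,
        "hostname": labels[0],                  # naive: first label only
        "domain_name": ".".join(labels[1:]),    # naive: everything after it
        "top_level_domain": labels[-1],
        "path": parts.path,
    }


for key, value in dissect("https://www.example.com/folder/file.html").items():
    print(f"{key}: {value}")
```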
19277
18:26:18,622 --> 18:26:20,211
What is inside of this envelope?
19278
18:26:20,211 --> 18:26:23,782
It's essentially, for our\npurposes today, one of two verbs
19279
18:26:25,792 --> 18:26:28,612
And if any of you have dabbled\n
19280
18:26:28,611 --> 18:26:30,569
you might have seen some\nof these terms before.
19281
18:26:30,570 --> 18:26:34,702
But these two verbs describe\njust how to send information
19282
18:26:36,952 --> 18:26:39,172
Long story short, more\non this next week
19283
18:26:39,172 --> 18:26:43,472
GET means put any user input\nin the URL, POST means hide it
19284
18:26:43,471 --> 18:26:46,641
so that things you're searching for,\n
19285
18:26:46,642 --> 18:26:49,634
usernames and passwords you're\n
19286
18:26:49,634 --> 18:26:51,592
and are therefore visible\nto anyone with access
19287
18:26:51,592 --> 18:26:53,672
to your computer and\nyour search history
19288
18:26:53,672 --> 18:26:57,802
but rather they're somehow provided\n
19289
18:26:57,801 --> 18:27:00,292
But for now, we'll focus\nalmost entirely on GET
19290
18:27:00,292 --> 18:27:03,741
which is perhaps the most common\n
19291
18:27:03,741 --> 18:27:05,251
And what we're going to do is this.
19292
18:27:05,251 --> 18:27:07,461
Let me switch over just\nto a blank screen here.
19293
18:27:07,461 --> 18:27:11,182
And if we assume that little\nold me is this laptop here
19294
18:27:11,182 --> 18:27:16,191
and I'm connected to the cloud, and\n
19295
18:27:16,191 --> 18:27:20,182
want to request the web page\nof Harvard.edu or Yale.edu
19296
18:27:20,182 --> 18:27:22,982
it's really going to\nbe a two-step process.
19297
18:27:22,982 --> 18:27:27,892
There's going to be a request,\n
19298
18:27:27,892 --> 18:27:29,752
and then, hopefully,\nthe server that hears
19299
18:27:29,751 --> 18:27:34,011
that request is going to reply with\n
19300
18:27:34,012 --> 18:27:37,042
And other terms that are\nrelevant here: my laptop
19301
18:27:37,042 --> 18:27:40,642
is the so-called client,\nHarvard.edu, Yale.edu, whatever
19302
18:27:40,642 --> 18:27:42,352
it is, is the so-called server.
19303
18:27:42,351 --> 18:27:45,441
And just like in a restaurant, where\n
19304
18:27:45,441 --> 18:27:47,091
the server might bring it to you.
19305
18:27:47,092 --> 18:27:49,372
It's, again, that kind of\nbidirectional relationship.
19306
18:27:49,372 --> 18:27:54,381
One request, one response, for\neach such web page we request.
19307
18:27:54,381 --> 18:27:58,042
All right, so what's inside these\n
19308
18:27:58,042 --> 18:28:01,012
Well, this arrow, this line I\njust drew from left to right
19309
18:28:01,012 --> 18:28:05,301
representing the request, technically\n
19310
18:28:05,301 --> 18:28:08,031
When you visit a web\npage, using your browser
19311
18:28:08,032 --> 18:28:11,452
on your phone, laptop, or desktop,\n
19312
18:28:11,452 --> 18:28:14,794
and the textual message your Mac or PC\n
19313
18:28:14,793 --> 18:28:16,251
looks a little something like this.
19314
18:28:16,251 --> 18:28:20,111
The verb GET, the URL, or rather\nthe path that you want to get
19315
18:28:20,111 --> 18:28:22,941
slash represents the\ndefault page on the website.
19316
18:28:22,941 --> 18:28:27,682
HTTP/1.1 is just some mention of\n
19317
18:28:27,682 --> 18:28:31,221
Now we're up to version 2, and\n
19318
18:28:31,221 --> 18:28:35,811
And the envelope contains some\n
19319
18:28:35,812 --> 18:28:37,502
the fully qualified domain name.
19320
18:28:37,501 --> 18:28:42,331
This is because single servers can\n
19321
18:28:42,331 --> 18:28:46,011
If you're using Squarespace or Wix or\n
19322
18:28:46,012 --> 18:28:49,282
nowadays, you don't get your own\npersonal server, most likely.
19323
18:28:49,282 --> 18:28:52,652
You're on the same server as\n
19324
18:28:52,652 --> 18:28:56,241
But when your customers,\nyour users' browsers
19325
18:28:56,241 --> 18:29:00,471
include a little mention of your\n
19326
18:29:00,471 --> 18:29:02,841
name in the envelope,\nSquarespace and Wix just
19327
18:29:02,842 --> 18:29:06,442
know to send it to your web page or\n
19328
18:29:07,142 --> 18:29:09,032
Dot dot dot, there's\nsome other stuff there.
19329
18:29:09,032 --> 18:29:12,532
But that's really the essence\nof what's in these requests.
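To make the shape of that request concrete, here is a minimal Python sketch that only builds the text and sends nothing. The function name `build_get_request` is hypothetical, and a real browser would add many more headers than the single Host line shown here.

```python
def build_get_request(host: str, path: str = "/") -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # verb, path, protocol version
        f"Host: {host}",         # which website on this server we want
        "",                      # a blank line ends the headers
        "",
    ]
    # HTTP separates lines with carriage return plus line feed.
    return "\r\n".join(lines)


request_text = build_get_request("www.harvard.edu")
print(request_text)
```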
19330
18:29:12,532 --> 18:29:16,312
Hopefully, then, when your browser\n
19331
18:29:17,361 --> 18:29:21,951
Well, hopefully, a response\nthat looks like this, HTTP/1.1
19332
18:29:21,952 --> 18:29:25,822
so the same version, some\nstatus code, like a number 200
19333
18:29:25,822 --> 18:29:30,021
and then literally a short phrase like\n
19334
18:29:31,682 --> 18:29:35,062
Then it contains some other\n
19335
18:29:35,872 --> 18:29:37,789
And we'll see that this,\ntoo, is standardized.
19336
18:29:37,789 --> 18:29:41,932
text/html means here comes some\n
19337
18:29:41,932 --> 18:29:48,142
It could instead be image/jpeg\nor image/png, or video/mp4
19338
18:29:48,142 --> 18:29:51,982
there are these different content\n
19339
18:29:51,982 --> 18:29:54,711
that uniquely identify types\nof files, that come back
19340
18:29:54,711 --> 18:29:57,322
similar in spirit to file\nextensions, but a little more
19341
18:29:59,232 --> 18:30:00,982
Then there's some more\nstuff, dot dot dot.
19342
18:30:00,982 --> 18:30:05,842
But in general, what you see here, are\n
19343
18:30:05,842 --> 18:30:09,652
These keys and values are\notherwise known as HTTP headers.
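As a rough illustration of headers being key value pairs, this Python sketch splits a hand-written sample response into its status line and a dictionary of headers. The sample text and the function name `parse_response_head` are assumptions for the example, not output captured from a real server.

```python
def parse_response_head(raw: str):
    """Split a response's status line and headers into Python values."""
    # Everything before the first blank line is the status line plus headers.
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[key.strip().lower()] = value.strip()
    return version, int(code), phrase, headers


sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
version, code, phrase, headers = parse_response_head(sample)
print(version, code, phrase, headers)
```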
19344
18:30:09,652 --> 18:30:14,092
And your browser has been sending\n
19345
18:30:14,092 --> 18:30:16,422
And, indeed, we can see\nthis right now ourselves.
19346
18:30:16,422 --> 18:30:20,422
Let me go over, in just a\nsecond, to Chrome on my computer
19347
18:30:20,422 --> 18:30:23,552
though you can do this kind of\n
19348
18:30:23,551 --> 18:30:28,641
I'll go ahead and visit\nHTTP://Harvard.edu, Enter.
19349
18:30:28,642 --> 18:30:31,459
And, voila, I'm at Harvard's\nhome page for today.
19350
18:30:32,542 --> 18:30:34,982
But this is what it\nlooks like right now.
19351
18:30:34,982 --> 18:30:38,002
Well, I typed in the URL, but\nnotice it changed a little bit.
19352
18:30:38,001 --> 18:30:41,011
It actually sent me to\nHTTPS and added the www
19353
18:30:41,012 --> 18:30:42,682
even though I didn't type that.
19354
18:30:42,682 --> 18:30:46,702
But it turns out we can poke around\n
19355
18:30:47,812 --> 18:30:50,542
I'm going to start to use incognito\n
19356
18:30:50,542 --> 18:30:52,500
care that people know\nI'm visiting Harvard.edu
19357
18:30:52,500 --> 18:30:56,471
but because it throws away\nany history that I just did.
19358
18:30:56,471 --> 18:30:58,971
So that every request is going\nto look like a brand new one
19359
18:30:58,971 --> 18:31:01,429
and that's just useful\ndiagnostically, because we're always
19360
18:31:01,429 --> 18:31:02,781
going to see fresh information.
19361
18:31:02,782 --> 18:31:06,082
My browser is not going to remember\n
19362
18:31:06,081 --> 18:31:09,441
But I'm going to go\nup to View, Developer
19363
18:31:09,441 --> 18:31:13,042
Developer Tools, which is something\n
19364
18:31:13,042 --> 18:31:15,652
And there's something\nanalogous for Firefox and Edge
19365
18:31:15,652 --> 18:31:17,631
and Safari and other browsers.
19366
18:31:17,631 --> 18:31:20,301
Developer tools is going to\nopen up these tabs down here.
19367
18:31:20,301 --> 18:31:23,509
I don't really care what's new, so I'm\n
19368
18:31:23,509 --> 18:31:26,182
And I'm going to hover over the\nNetwork tab for just a moment.
19369
18:31:26,182 --> 18:31:31,252
And now I'm going to go and\nsay HTTP://Harvard.edu, so
19370
18:31:32,422 --> 18:31:36,022
I'm going to hit Enter,\nand a whole bunch of stuff
19371
18:31:36,021 --> 18:31:37,342
just flew across the screen.
19372
18:31:38,601 --> 18:31:43,702
And if I zoom in down here, my God,\n
19373
18:31:43,702 --> 18:31:48,082
is downloading, what 17, 18,\n19 megabytes, 20 megabytes
19374
18:31:48,081 --> 18:31:53,182
millions of bytes of information,\nover 111 HTTP requests.
19375
18:31:53,182 --> 18:31:56,062
In other words, a bit of a\nsimplification, but my browser
19376
18:31:56,062 --> 18:31:59,692
unbeknownst to me, sent one\nenvelope initially with the request.
19377
18:31:59,691 --> 18:32:01,611
Then the server said,\nOK, by the way, there's
19378
18:32:01,611 --> 18:32:05,241
110 other things you need, 112\nother things you need to get.
19379
18:32:05,241 --> 18:32:09,441
So my computer went back and forth,\n
19380
18:32:10,191 --> 18:32:13,581
Well, inside of Harvard's web\npage is a whole bunch of images
19381
18:32:13,581 --> 18:32:16,341
and maybe sound files and\nvideos and other stuff
19382
18:32:16,342 --> 18:32:18,741
that all need to be\ndownloaded and to compose
19383
18:32:18,741 --> 18:32:20,161
what is ultimately the web page.
19384
18:32:20,161 --> 18:32:22,881
But I don't care about like\n100 plus of these things.
19385
18:32:22,881 --> 18:32:25,161
Let's focus on the very first one first.
19386
18:32:25,161 --> 18:32:27,711
The very first request\nI sent was up here.
19387
18:32:27,711 --> 18:32:30,771
And I'm going to click on this\nrow, under the Network tab.
19388
18:32:30,771 --> 18:32:33,501
And then I'm going to see a\nbit of diagnostic information.
19389
18:32:33,501 --> 18:32:36,741
To an average person using the\n
19390
18:32:36,741 --> 18:32:39,292
just as you probably didn't\ncare about it until right now.
19391
18:32:41,001 --> 18:32:44,991
But if I scroll down to\nrequest headers, you will see
19392
18:32:44,991 --> 18:32:48,952
if I click View source, literally\n
19393
18:32:48,952 --> 18:32:51,282
my Mac just sent to Harvard.edu.
19394
18:32:51,282 --> 18:32:56,482
Two of the lines are familiar,\nGET / HTTP/1.1, Host: harvard.edu
19395
18:32:56,482 --> 18:32:59,942
and then other stuff that, for now,\n
19396
18:32:59,941 --> 18:33:03,381
But let's look at the response\nthat came back from the server.
19397
18:33:03,381 --> 18:33:08,421
I'm going to scroll up now and\n
19398
18:33:11,211 --> 18:33:14,032
There's no 200, there's no word OK.
19399
18:33:14,032 --> 18:33:18,112
Curiously, harvard.edu\nhas moved permanently.
19400
18:33:18,953 --> 18:33:20,661
Well, there's a whole\nbunch of stuff here
19401
18:33:20,661 --> 18:33:22,119
that's not that interesting for us.
19402
18:33:22,119 --> 18:33:24,781
But this line, location, is interesting.
19403
18:33:24,782 --> 18:33:28,552
This is an HTTP header, a\nstandardized key value pair
19404
18:33:28,551 --> 18:33:32,301
that's part of the HTTP\nprotocol, that is, conventions.
19405
18:33:32,301 --> 18:33:34,881
And if I highlight just\nthis one, it's telling me
19406
18:33:34,881 --> 18:33:38,421
mm-mmm, Harvard is not\nat HTTP://Harvard.edu
19407
18:33:38,422 --> 18:33:44,242
Harvard's website is now, and perhaps\n
19408
18:33:47,631 --> 18:33:50,641
Probably someone at Harvard wants\n
19409
18:33:50,642 --> 18:33:53,092
So they redirected you\nfrom HTTP to HTTPS.
19410
18:33:53,092 --> 18:33:57,392
Maybe the marketing people want you to\n
19411
18:33:57,892 --> 18:34:00,350
Just to standardize things,\nbut there are technical reasons
19412
18:34:00,350 --> 18:34:03,442
to use a hostname, and not\njust the raw domain name.
19413
18:34:03,441 --> 18:34:06,331
And all this other stuff is sort\n
19414
18:34:06,331 --> 18:34:11,361
now, because a browser that\nreceives a 301 response knows
19415
18:34:11,361 --> 18:34:16,461
by design, by the definition of HTTP,\n
19416
18:34:16,461 --> 18:34:20,065
And that's why, in my browser, all of\n
19417
18:34:20,065 --> 18:34:22,732
because I didn't really know or\ncare about all of those headers.
19418
18:34:22,732 --> 18:34:26,482
But that's why and how I\nended up at this URL here.
19419
18:34:26,482 --> 18:34:29,842
My browser was told to go\nelsewhere via that new location.
19420
18:34:29,842 --> 18:34:32,392
And the browser just\nfollowed those breadcrumbs
19421
18:34:32,392 --> 18:34:35,122
if you will, at which point it\n
19422
18:34:35,122 --> 18:34:39,562
and files, and so forth, that\ncompose this particular page.
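The breadcrumb following can be sketched in a few lines of Python. This is a simulation under stated assumptions: `FAKE_WEB` is a made-up table standing in for real servers, so no network traffic happens, and the function name `follow_redirects` is hypothetical. The logic is the part that matters: on a redirect status, read the Location header and ask again.

```python
# A made-up table standing in for real servers: URL -> (status code, headers).
FAKE_WEB = {
    "http://harvard.edu/": (301, {"Location": "https://www.harvard.edu/"}),
    "https://www.harvard.edu/": (200, {"Content-Type": "text/html"}),
}


def follow_redirects(url: str, max_hops: int = 5):
    """Return the list of URLs visited and the final status code."""
    visited = [url]
    for _ in range(max_hops):
        status, headers = FAKE_WEB[url]
        if status not in (301, 302, 307, 308):
            return visited, status  # not a redirect, so we have arrived
        url = headers["Location"]   # go where the server told us to go
        visited.append(url)
    raise RuntimeError("too many redirects")


path, final_status = follow_redirects("http://harvard.edu/")
print(path, final_status)
```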
19423
18:34:40,638 --> 18:34:42,471
And let me actually go\ninto VS Code, if only
19424
18:34:42,471 --> 18:34:45,596
because it's a little more pleasant to\n
19425
18:34:45,596 --> 18:34:49,201
without actually using\na full-fledged browser.
19426
18:34:49,202 --> 18:34:51,381
So now let's just use\nan equivalent program.
19427
18:34:51,381 --> 18:34:54,351
It's called Curl, for\nconnecting to a URL, that's
19428
18:34:54,351 --> 18:34:57,262
going to allow me to play with\n
19429
18:34:57,262 --> 18:34:59,872
without bothering to download\nall the images and text
19430
18:34:59,872 --> 18:35:01,312
and so forth from the website.
19431
18:35:01,312 --> 18:35:03,902
It's going to allow me to\ndo something like this.
19432
18:35:03,902 --> 18:35:10,941
Let me go ahead and run, for instance,\n
19433
18:35:10,941 --> 18:35:14,572
line arguments that says\nsimulate a GET request textually
19434
18:35:15,831 --> 18:35:20,661
And let's go to\nHTTP://Harvard.edu Enter.
19435
18:35:20,661 --> 18:35:23,834
Now, by way of how Curl works,\nI'm just seeing the headers.
19436
18:35:23,834 --> 18:35:25,792
It didn't bother downloading\nthe whole website.
19437
18:35:25,792 --> 18:35:28,792
And you see exactly the same\nthing, 301 moved permanently.
19438
18:35:28,792 --> 18:35:31,562
Location is, indeed, this one here.
19439
18:35:31,562 --> 18:35:33,202
So that's kind of interesting.
19440
18:35:33,202 --> 18:35:34,881
But let's follow it manually now.
19441
18:35:34,881 --> 18:35:37,682
Let's now do what it's telling me to do.
19442
18:35:37,682 --> 18:35:42,381
Let's go to the location, with\nHTTPS and the www and hit Enter.
19443
18:35:42,381 --> 18:35:46,402
And now, what's a good\nsign with this output?
19444
18:35:48,831 --> 18:35:51,741
SPEAKER 1: 200 OK, that\nmeans I'm seeing, presumably
19445
18:35:51,741 --> 18:35:55,221
if I were using a real browser,\n
19446
18:35:55,221 --> 18:35:58,911
Looks like Harvard's version of HTTP\n
19447
18:35:58,911 --> 18:36:01,051
It's using HTTP version\n2, which is fine.
19448
18:36:01,051 --> 18:36:04,611
But 200 is indeed indicative\nof things being OK.
19449
18:36:04,611 --> 18:36:07,822
Well, what if I try\nvisiting some bogus URL
19450
18:36:07,822 --> 18:36:14,331
like Harvard.edu, when this file does\n
19451
18:36:14,331 --> 18:36:17,331
probably doesn't exist, and hit Enter.
19452
18:36:17,331 --> 18:36:20,991
What do you see now, that's perhaps\nfamiliar, in the real world?
19453
18:36:24,620 --> 18:36:26,600
All of us have seen\nthis probably endlessly
19454
18:36:26,600 --> 18:36:29,301
from time to time, when you\nscrew up by mistyping a URL
19455
18:36:29,301 --> 18:36:31,221
or someone deletes the\nweb page in question.
19456
18:36:31,221 --> 18:36:34,221
But all that is is a\nstatus code that a browser
19457
18:36:34,221 --> 18:36:37,521
is being sent from the server,\nthat's a little clue as to what
19458
18:36:37,521 --> 18:36:40,383
the actual problem is,\nunderneath the hood.
19459
18:36:40,384 --> 18:36:42,092
So instead of getting\nback, for instance
19460
18:36:42,092 --> 18:36:45,202
something like OK, or moved permanently,\n
19461
18:36:47,902 --> 18:36:51,711
Well, it turns out there's\nother types of status codes
19462
18:36:51,711 --> 18:36:55,211
that you'll start to see over time,\n
19463
18:36:57,881 --> 18:37:01,421
302, 304, 307 are all similar in spirit.
19464
18:37:01,422 --> 18:37:04,782
They're related to redirecting the\n
19465
18:37:04,782 --> 18:37:08,802
401, 403, unauthorized or forbidden.
19466
18:37:08,801 --> 18:37:11,021
If you ever mess up\nyour password, or you
19467
18:37:11,021 --> 18:37:13,134
try visiting a URL you're\nnot supposed to look at
19468
18:37:13,134 --> 18:37:15,551
you might see one of these\ncodes, indicating that you just
19469
18:37:15,551 --> 18:37:17,441
don't have authorization for those.
19470
18:37:17,441 --> 18:37:21,671
404, not found. 418, I'm a\nteapot, was an April Fools' joke
19471
18:37:21,672 --> 18:37:24,072
by the tech community years ago.
19472
18:37:25,331 --> 18:37:27,281
And, unfortunately,\nall of you are probably
19473
18:37:27,282 --> 18:37:30,972
on a path now to creating\nHTTP 500 errors, once
19474
18:37:30,971 --> 18:37:33,114
next week, we start writing\ncode, because all of us
19475
18:37:34,032 --> 18:37:37,632
We're going to have typos, logical\n
19476
18:37:37,631 --> 18:37:42,042
just like segfaults were in the world of\n
19477
18:37:42,042 --> 18:37:45,254
503, service unavailable, means\nmaybe the server is overloaded
19478
18:37:46,211 --> 18:37:47,721
And there's other codes there.
19479
18:37:47,721 --> 18:37:51,011
But those are perhaps some\nof the most common ones.
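For a quick way to look these codes up, here is a small Python sketch using the standard library's `http.HTTPStatus`. The helper name `describe` and the category labels are assumptions for this example. The categories follow the first digit of the code: 2xx success, 3xx redirection, 4xx client error, 5xx server error.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase (category)' for a status code."""
    categories = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one the standard library knows about.
        phrase = "Unknown"
    return f"{code} {phrase} ({categories.get(code // 100, 'other')})"


for code in (200, 301, 401, 403, 404, 500, 503):
    print(describe(code))
```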
19480
18:37:51,012 --> 18:37:54,642
Has anyone, we can get away with\n
19481
18:37:54,642 --> 18:37:58,972
has anyone ever visited\nSafetySchool.org?
19482
18:38:01,542 --> 18:38:08,381
HTTP://SafetySchool.org,\ndare we do this, Enter.
19483
18:38:17,262 --> 18:38:20,051
--so this has been like a\njoke for like 10 or 20 years.
19484
18:38:20,051 --> 18:38:22,811
Someone out there has been\npaying for the domain name
19485
18:38:22,812 --> 18:38:26,202
safetyschool.org, just for\nthis two second demonstration.
19486
18:38:26,202 --> 18:38:28,542
But we can now infer, how did this work?
19487
18:38:28,542 --> 18:38:31,452
The person who bought that domain\nname and somehow configured
19488
18:38:31,452 --> 18:38:35,502
DNS to point to their web server,\n
19489
18:38:35,501 --> 18:38:37,871
what is their web server\npresumably spitting out
19490
18:38:37,872 --> 18:38:41,021
whenever a browser requests the page?
19491
18:38:48,221 --> 18:38:49,911
Let me increase my terminal window.
19492
18:38:49,911 --> 18:38:58,271
Let me do curl -I -X GET\nHTTP://safetyschool.org, Enter
19493
18:38:58,271 --> 18:39:00,012
and that's all this website does.
19494
18:39:00,012 --> 18:39:02,322
There's not even an\nactual website there.
19495
18:39:02,322 --> 18:39:04,842
No HTML, no CSS, languages\nwe're about to see.
19496
18:39:04,842 --> 18:39:09,702
It literally just exists on the\n
19497
18:39:09,702 --> 18:39:13,362
In fairness, there are others.
19498
18:39:13,361 --> 18:39:15,611
Let me actually do another one here.
19499
18:39:15,611 --> 18:39:19,511
Instead of safetyschool.org,\nturns out someone
19500
18:39:19,512 --> 18:39:25,572
some years ago, bought\nHarvardSucks.org Enter.
19501
18:39:25,572 --> 18:39:30,072
And when we do this, you'll see that,\n
19502
18:39:36,762 --> 18:39:38,785
This demo actually\nworked for so many years.
19503
18:39:38,785 --> 18:39:41,202
But someone has stopped paying\nfor the Squarespace account
19504
18:39:46,471 --> 18:39:52,111
OK, so, fortunately, we\ndid save the YouTube video
19505
18:39:53,672 --> 18:39:56,255
And so, just to put this\ninto context, since it's
19506
18:39:56,255 --> 18:39:58,422
been quite a few years,\nHarvard and Yale, of course
19507
18:39:58,422 --> 18:40:00,272
have this long-standing rivalry.
19508
18:40:00,271 --> 18:40:02,891
There is this tradition\nof pranking each other.
19509
18:40:02,892 --> 18:40:06,822
And, honestly, hands down, one of the\n
19510
18:40:09,062 --> 18:40:10,741
It's about a three-minute retrospective.
19511
18:40:10,741 --> 18:40:13,074
It's one of the earliest\nvideos, I dare say, on YouTube
19512
18:40:13,074 --> 18:40:15,512
so the quality is\nrepresentative of that.
19513
18:40:15,512 --> 18:40:18,402
But let me go ahead and\nfull screen my page here.
19514
18:40:18,402 --> 18:40:22,892
And what used to live at\nHarvardSucks.org is this video here.
19515
18:40:22,892 --> 18:40:25,738
If we could dim the lights\nfor about three minutes.
19516
18:40:53,975 --> 18:40:55,892
- Actually we're going\nall the way to the top.
19517
18:41:01,142 --> 18:41:04,322
- We're here to trip up Harvard.
19518
18:41:06,750 --> 18:41:08,402
- Pass from the top one, pass it down.
19519
18:41:09,714 --> 18:41:12,822
- It's nice to say the ERA sucks.
19520
18:41:19,732 --> 18:41:22,402
It's going to have to happen.
19521
18:41:22,402 --> 18:41:25,032
- It's actually going to happen.
19522
18:41:25,032 --> 18:41:26,452
I can't [BEEP] believe this.
19523
18:41:26,452 --> 18:41:28,512
- What do you think of Yale?
19524
18:41:33,334 --> 18:41:34,542
- Because they don't have it.
19525
18:41:37,013 --> 18:41:40,341
- Probably that's going\nto be legible, very small.
19526
18:41:45,801 --> 18:41:47,512
- Says, are we in boats now?
19527
18:41:48,262 --> 18:41:49,387
- How many extra are there?
19528
18:41:54,445 --> 18:41:55,903
- You guys are from Harvard, right?
19529
18:41:58,611 --> 18:42:00,039
- Just make sure everyone has one.
19530
18:42:35,631 --> 18:42:38,891
- What do you think of Yale, sir?
19531
18:42:38,892 --> 18:42:41,376
- Going to be, do one more time.
19532
18:43:08,812 --> 18:43:13,222
SPEAKER 1: All right, so thanks to\n
19533
18:43:13,221 --> 18:43:17,129
Let's go ahead here and consider, in\n
19534
18:43:17,130 --> 18:43:18,922
down inside of the\nenvelope, because we now
19535
18:43:18,922 --> 18:43:24,202
have the ability to get data from,\n
19536
18:43:25,342 --> 18:43:30,082
Let's consider for just a moment\n
19537
18:43:30,081 --> 18:43:33,652
that we now have this ability to\n
19538
18:43:33,652 --> 18:43:37,490
And we have the ability to\nspecify in those envelopes what
19539
18:43:37,490 --> 18:43:38,782
it is we want from the website.
19540
18:43:38,782 --> 18:43:40,192
We want to get the home page.
19541
18:43:40,191 --> 18:43:41,932
We want to get back the HTML.
19542
18:43:43,247 --> 18:43:46,372
In fact, we don't yet have the language\n
19543
18:43:46,372 --> 18:43:48,494
are written, namely HTML and CSS.
19544
18:43:48,494 --> 18:43:50,702
But let's go ahead and take\na five minute break here.
19545
18:43:50,702 --> 18:43:54,182
And when we come back, we'll\nlearn those two languages.
19546
18:43:56,072 --> 18:43:58,372
So we've got three\nlanguages to look at today.
19547
18:43:58,372 --> 18:44:00,832
But two of them are not\nactually programming languages.
19548
18:44:00,831 --> 18:44:05,152
What makes something a programming\n
19549
18:44:05,152 --> 18:44:08,472
is that there are these constructs via\n
19550
18:44:08,471 --> 18:44:10,971
You might have variables, you\nmight have looping constructs.
19551
18:44:10,971 --> 18:44:13,341
You have the ability,\nultimately, to express logic.
19552
18:44:13,342 --> 18:44:16,912
HTML and CSS aren't so much about\n
19553
18:44:16,911 --> 18:44:18,441
and the aesthetics of a page.
19554
18:44:18,441 --> 18:44:21,262
And so we're going to create\nthe skeleton of a web page using
19555
18:44:21,262 --> 18:44:23,190
this pair of languages, HTML and CSS.
19556
18:44:23,190 --> 18:44:24,982
And then toward the\nend of today, we'll
19557
18:44:24,982 --> 18:44:26,872
introduce an actual\nprogramming language
19558
18:44:26,872 --> 18:44:30,232
that actually is pretty similar\nin spirit and syntactically
19559
18:44:30,232 --> 18:44:34,101
to both C and Python, but that's going\n
19560
18:44:34,101 --> 18:44:38,251
just static, things that you look at,\n
19561
18:44:38,251 --> 18:44:42,771
And then next week again, in week 9,\n
19562
18:44:42,771 --> 18:44:46,251
tie all of this together, so that you\n
19563
18:44:46,251 --> 18:44:49,221
talking to a back-end\nserver, and creating
19564
18:44:49,221 --> 18:44:53,151
the experience that you and I now take\n
19565
18:44:53,812 --> 18:44:55,187
Well, let's go ahead and do this.
19566
18:44:55,187 --> 18:44:58,402
Let's quickly whip up something\nin this language called HTML.
19567
18:44:59,601 --> 18:45:04,101
I'm going to go ahead and create a\n
19568
18:45:04,101 --> 18:45:07,491
The convention is typically to\nend your file names in dot html.
19569
18:45:07,491 --> 18:45:09,783
And I'm going to go ahead\nand bang this out real quick.
19570
18:45:09,783 --> 18:45:12,842
But then we'll more slowly step\n
19571
18:45:12,842 --> 18:45:17,482
So I'm going to say doctype\nhtml open bracket html
19572
18:45:17,482 --> 18:45:22,072
and then notice I'm going to do open\n
19573
18:45:22,072 --> 18:45:25,672
And I'm leveraging a feature of VS\n
19574
18:45:25,672 --> 18:45:27,601
generally, to do a bit of autocomplete.
19575
18:45:27,601 --> 18:45:30,831
So you'll see that there's this symmetry\n
19576
18:45:30,831 --> 18:45:32,631
but I'm not typing all of these things.
19577
18:45:32,631 --> 18:45:37,611
VS Code is automatically generating the\n
19578
18:45:37,611 --> 18:45:41,511
Let me go ahead and\nsay, Open the head tag.
19579
18:45:42,831 --> 18:45:44,932
I'll say something cute\nlike, Hello, title.
19580
18:45:44,932 --> 18:45:47,661
And then down here, I'm going to\n
19581
18:45:47,661 --> 18:45:49,461
and say something like Hello, body.
19582
18:45:49,461 --> 18:45:53,241
And let me specify at the very top,\n
19583
18:45:54,592 --> 18:46:00,562
So at this moment, I have a file in my\n
19584
18:46:00,562 --> 18:46:03,484
VS Code as we're using it,\nof course, is cloud-based.
19585
18:46:03,483 --> 18:46:05,691
We're using it in a browser,\neven though you can also
19586
18:46:05,691 --> 18:46:07,792
download it and run it on a Mac and PC.
19587
18:46:07,792 --> 18:46:10,491
So we are in this weird\nsituation where I'm
19588
18:46:10,491 --> 18:46:12,771
using the cloud to\ncreate a web page, and I
19589
18:46:12,771 --> 18:46:17,781
want that web page to also live in\n
19590
18:46:17,782 --> 18:46:21,082
But the thing about VS\nCode, or really any website
19591
18:46:21,081 --> 18:46:24,741
that you might use in a browser, by\n
19592
18:46:24,741 --> 18:46:29,361
TCP port number 80 or\nTCP port number 443
19593
18:46:29,361 --> 18:46:32,061
which is HTTP and HTTPS respectively.
19594
18:46:32,062 --> 18:46:34,912
But here I am, sort of\na programmer myself
19595
18:46:34,911 --> 18:46:39,241
trying to create my own\nwebsite on an existing website.
19596
18:46:39,241 --> 18:46:40,732
So it's a bit of a weird situation.
19597
18:46:40,732 --> 18:46:43,222
But that's OK, because\nwhat's nice about TCP
19598
18:46:43,221 --> 18:46:47,572
is that you and I can just pick port\n
19599
18:46:49,312 --> 18:46:51,802
That is, we can control\nthe environment entirely
19600
18:46:51,801 --> 18:46:57,711
by just running our own web server\n
19601
18:46:58,732 --> 18:47:01,445
This is a command that we\npreinstalled in VS Code here.
19602
18:47:01,445 --> 18:47:03,112
And you'll notice a pop-up just came up.
19603
18:47:03,111 --> 18:47:05,841
Your application running\non port 8080 is available.
19604
18:47:05,842 --> 18:47:09,172
That's a commonly used TCP port\nnumber, when 80 is already used
19605
18:47:09,172 --> 18:47:11,932
and 443 is already used,\nyou can run your own server
19606
18:47:11,932 --> 18:47:14,521
on your own port, 8080 in this case.
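The same idea can be tried with Python's built-in `http.server` module. To be clear, this is a swapped-in stand-in and not the command used in the lecture. The function name `start_server`, the throwaway folder, and the file contents are all assumptions for the demo. The default port is 8080, but the demo passes port 0, which asks the operating system for any free port so it cannot collide with something already listening.

```python
import functools
import http.client
import http.server
import tempfile
import threading
from pathlib import Path


def start_server(directory: str, port: int = 8080):
    """Serve files from `directory` on localhost in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Demo: serve a throwaway folder that contains one small file.
folder = tempfile.mkdtemp()
Path(folder, "hello.html").write_text("Hello, body")
server = start_server(folder, port=0)  # 0 means "pick any free port"
port = server.server_address[1]

# Act as the client: one request, one response.
connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/hello.html")
response = connection.getresponse()
status, body = response.status, response.read().decode()
connection.close()

server.shutdown()
server.server_close()
print(status, body)
```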
19607
18:47:14,521 --> 18:47:18,592
I've opened that tab in advance, and\n
19608
18:47:18,592 --> 18:47:22,252
here I see a so-called directory\n
19609
18:47:22,251 --> 18:47:24,322
So I don't see any of my other files.
19610
18:47:24,322 --> 18:47:27,351
I don't see anything\nbelonging to VS Code itself.
19611
18:47:27,351 --> 18:47:30,562
I only see the file that I've created\n
19612
18:47:31,851 --> 18:47:36,232
And so if I click on this file\nnow, I should see Hello, body.
19613
18:47:37,312 --> 18:47:39,262
But that's because the\ntitle of a web page
19614
18:47:39,262 --> 18:47:41,699
nowadays is typically\nembedded in the tab.
19615
18:47:41,698 --> 18:47:44,031
And if I'm full screen in my\nbrowser, there are no tabs.
19616
18:47:44,032 --> 18:47:45,652
So let me minimize the window a bit.
19617
18:47:45,652 --> 18:47:50,032
And now you can see just in this\n
19618
18:47:50,032 --> 18:47:52,382
here, that Hello, body, is\nin the top left hand corner.
19619
18:47:52,381 --> 18:47:54,801
And if I zoom in, there's Hello, title.
19620
18:47:56,271 --> 18:48:01,491
I have gone ahead and created\nmy own web page in HTML
19621
18:48:01,491 --> 18:48:04,851
in a file called Hello.html.
19622
18:48:04,851 --> 18:48:09,251
And then I have opened up\na web server of my own
19623
18:48:09,251 --> 18:48:11,591
configured it to listen\non TCP port 8080
19624
18:48:11,592 --> 18:48:14,982
which just says to the internet, hey,\n
19625
18:48:14,982 --> 18:48:18,612
not on the standard port number,\n80 or 443, listen on 8080.
19626
18:48:18,611 --> 18:48:22,481
And this means I can develop a website\n
19627
18:48:22,482 --> 18:48:24,851
here, which is\nincreasingly common today.
19628
18:48:24,851 --> 18:48:28,691
All right, so now let's consider\n
19629
18:48:28,691 --> 18:48:32,891
HTML is characterized really by just\n
19630
18:48:33,732 --> 18:48:36,972
Most of what I just typed were tags,\n
19631
18:48:37,732 --> 18:48:41,872
Here's the same source code that I\n
19632
18:48:41,872 --> 18:48:43,122
Let's consider what this is.
19633
18:48:43,122 --> 18:48:46,631
The very first line of\ncode here, doctype html
19634
18:48:47,892 --> 18:48:51,522
It's the only one that starts with\n
19635
18:48:52,782 --> 18:48:55,392
There's no more exclamation\npoints thereafter, for now.
19636
18:48:55,392 --> 18:48:58,842
This is the document type declaration,\n
19637
18:48:58,842 --> 18:49:00,432
it's just got to be there nowadays.
19638
18:49:00,432 --> 18:49:02,974
It's like a little breadcrumb\nat the beginning of a file that
19639
18:49:02,974 --> 18:49:08,021
says to the browser, you are about to\n
19640
18:49:08,021 --> 18:49:11,000
That line of code has changed\nover time, over the years.
19641
18:49:11,000 --> 18:49:13,542
The most recent version of it\nis nice and succinct like this
19642
18:49:13,542 --> 18:49:16,572
and it's just a clue to the\nbrowser as to what version of HTML
19643
18:49:16,572 --> 18:49:18,881
is being used by you, the programmer.
19644
18:49:18,881 --> 18:49:20,781
All right, what comes after that?
19645
18:49:20,782 --> 18:49:23,592
Well, after that, and I've\nhighlighted two things in yellow
19646
18:49:23,592 --> 18:49:26,442
this is what we're going to\nstart calling an open tag
19647
18:49:26,441 --> 18:49:30,911
or a start tag, open bracket HTML\nthen something, close bracket
19648
18:49:30,911 --> 18:49:32,991
is the so-called start or open tag.
19649
18:49:32,991 --> 18:49:35,741
Then the corresponding close\nor end tag is down here.
19650
18:49:37,092 --> 18:49:40,252
You use the same tag name, you\nuse the same angled brackets.
19651
18:49:40,251 --> 18:49:43,121
But you do add a slash, and\nyou don't repeat yourself
19652
18:49:43,122 --> 18:49:46,032
with any of the things\ncalled attributes
19653
18:49:46,032 --> 18:49:47,772
because, what is this thing here?
19654
18:49:47,771 --> 18:49:50,891
Lang equals quote unquote\n"en," means the language
19655
18:49:50,892 --> 18:49:53,632
of my page is written\nin the English language.
19656
18:49:53,631 --> 18:49:56,771
The humans have standardized\ntwo and three letter codes
19657
18:49:56,771 --> 18:49:59,451
for every human language, right now.
19658
18:49:59,452 --> 18:50:03,131
And so this is just a clue to the\n
19659
18:50:03,131 --> 18:50:06,402
and accessibility purposes,\nwhat language the web page
19660
18:50:07,331 --> 18:50:10,661
Not the tags, but the words, like\nHello, title and Hello, body
19661
18:50:10,661 --> 18:50:13,182
which while minimalist,\nare indeed in English.
19662
18:50:13,182 --> 18:50:16,142
So when you close a tag, you close\nthe name of it with the slash
19663
18:50:17,111 --> 18:50:19,319
You don't repeat the attribute.
19664
18:50:19,320 --> 18:50:21,862
That would just be annoying to\nhave to type everything again.
19665
18:50:21,861 --> 18:50:23,028
But notice the pattern here.
19666
18:50:24,262 --> 18:50:27,551
But this is another example of\nkey value pairs in computing.
19667
18:50:27,551 --> 18:50:31,631
The key is Lang, the\nvalue is E-N for English.
19668
18:50:31,631 --> 18:50:34,182
The attribute is called\nLang, the value is
19669
18:50:34,182 --> 18:50:38,142
called, it is E-N. So again,\nit's just key value pairs
19670
18:50:38,142 --> 18:50:39,552
in just yet another context.
19671
18:50:39,551 --> 18:50:41,682
Probably the browser's using a\nhash table underneath the hood
19672
18:50:41,682 --> 18:50:44,807
to keep track of this stuff, like a\n
19673
18:50:44,807 --> 18:50:48,292
Again, humans keep using the same\n
19674
18:50:49,732 --> 18:50:52,241
The nesting is important\nvisually, not to the computer
19675
18:50:52,241 --> 18:50:54,491
but to us, the humans,\nbecause it implies
19676
18:50:54,491 --> 18:50:55,971
that there's some hierarchy here.
19677
18:50:55,971 --> 18:50:59,781
And, indeed, what is inside\nof the HTML tag here?
19678
18:50:59,782 --> 18:51:02,892
Well, we have what\nwe'll call the head tag.
19679
18:51:02,892 --> 18:51:06,252
The head tag says, hey, browser,\n
19680
18:51:06,251 --> 18:51:08,519
And then the body tag\nsays, hey, browser
19681
18:51:08,519 --> 18:51:09,851
here comes the body of the page.
19682
18:51:09,851 --> 18:51:13,902
The body is like 99% of the user's\n
19683
18:51:13,902 --> 18:51:17,532
The head is really just the address\n
19684
18:51:17,532 --> 18:51:20,292
like the title that we saw a moment ago.
19685
18:51:20,292 --> 18:51:24,732
Just to introduce the vernacular,\n
19686
18:51:24,732 --> 18:51:29,202
as an element, has two children,\n
19687
18:51:29,202 --> 18:51:31,961
which is to say that head\nand body are now siblings.
19688
18:51:31,961 --> 18:51:35,111
So you can use the same kind of\n
19689
18:51:35,111 --> 18:51:37,421
when talking about trees, weeks ago.
19690
18:51:37,422 --> 18:51:41,802
If we look at the head tag, how\n
19691
18:51:41,801 --> 18:51:44,591
I'm seeing one, and,\nindeed, at least if we
19692
18:51:44,592 --> 18:51:48,851
ignore all the white space, the\n
19693
18:51:48,851 --> 18:51:51,072
there's just one child, a title element.
19694
18:51:51,072 --> 18:51:55,032
And an element is the terminology that\n
19695
18:51:56,422 --> 18:51:58,242
So this is the title element.
19696
18:51:58,241 --> 18:52:01,812
And the title element has one\nchild, which is just pure text
19697
18:52:01,812 --> 18:52:03,792
otherwise known as a text node.
19698
18:52:03,792 --> 18:52:06,911
Recall, node, from our discussions\nof data structures weeks ago.
19699
18:52:06,911 --> 18:52:10,961
If we jump then to the body, which\n
19700
18:52:10,961 --> 18:52:14,922
it too has one child, which is just\n
19701
18:52:14,922 --> 18:52:17,292
that says, quote unquote, "Hello, body."
19702
18:52:17,292 --> 18:52:21,042
What's nice about this indentation,\n
19703
18:52:21,042 --> 18:52:25,312
is not going to care, is that it\nimplies this kind of structure.
19704
18:52:25,312 --> 18:52:28,902
And this is where we connect,\nlike week 5 and now week 8, here
19705
18:52:28,902 --> 18:52:33,282
is the tree structure we began to\n
19706
18:52:33,282 --> 18:52:35,585
It's not a binary tree,\neven though this one happens
19707
18:52:35,585 --> 18:52:37,002
to have no more than two children.
19708
18:52:37,001 --> 18:52:40,961
It's an arbitrary tree that can\n
19709
18:52:40,961 --> 18:52:43,842
But if we have a special node\nhere that refers to the document
19710
18:52:43,842 --> 18:52:47,082
the root node, so to speak, is\nHTML, drawn with a rectangle
19711
18:52:47,081 --> 18:52:48,671
here, just for discussion's sake.
19712
18:52:48,672 --> 18:52:51,522
It has two children, head\nand body, also rectangles.
19713
18:52:51,521 --> 18:52:54,851
Head has a title child,\nand then it and body
19714
18:52:54,851 --> 18:52:57,861
have text nodes, which I've\ndrawn with ovals instead.
19715
18:52:57,861 --> 18:53:01,301
Which is only to say that when your\n
19716
18:53:01,301 --> 18:53:03,762
downloads a web page,\nopens up that envelope
19717
18:53:03,762 --> 18:53:06,771
and sees the contents that\nhave come back from the server
19718
18:53:06,771 --> 18:53:10,221
it essentially reads the\ncode that someone wrote
19719
18:53:10,221 --> 18:53:12,631
the HTML code, top to\nbottom, left to right
19720
18:53:12,631 --> 18:53:16,402
and creates in the browser's\nmemory, in your Mac or your PC
19721
18:53:16,402 --> 18:53:20,241
or your phone's memory or RAM,\nthis kind of data structure.
19722
18:53:20,241 --> 18:53:22,131
That's what's going on\nunderneath the hood.
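Editor's sketch: a rough picture of the tree a browser might build from the hello page, using Python's standard html.parser. The names TreeBuilder and show are hypothetical, the page source is reconstructed from the lecture's description, and a real browser's document tree is far richer than these nested dicts.

```python
from html.parser import HTMLParser

# The page as described in the lecture (reconstructed, so treat it as an assumption).
HELLO_HTML = """<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Hello, title</title>
    </head>
    <body>
        Hello, body
    </body>
</html>"""


class TreeBuilder(HTMLParser):
    """Read tags top to bottom and build a tree of nested dicts."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "document", "children": []}
        self.stack = [self.root]  # path from the root to the currently open element

    def handle_starttag(self, tag, attrs):
        node = {"name": tag, "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only text between tags
            self.stack[-1]["children"].append({"name": "text: " + text, "children": []})


def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = ["    " * depth + node["name"]]
    for child in node["children"]:
        lines.extend(show(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed(HELLO_HTML)
builder.close()
print("\n".join(show(builder.root)))
```

The printed indentation mirrors the indentation of the HTML itself: html has two children, head and body, and the text nodes sit at the leaves.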
19723
18:53:22,131 --> 18:53:24,592
And that's why aesthetically,\nit's just nice, as a human
19724
18:53:24,592 --> 18:53:27,802
to indent things stylistically,\nbecause it's very clear then
19725
18:53:27,801 --> 18:53:32,701
to you, and to other programmers,\n
19726
18:53:32,702 --> 18:53:36,112
So that's it for like\nthe fundamentals of HTML.
19727
18:53:36,111 --> 18:53:38,451
We'll see a bunch of tags\nand a bunch of examples now.
19728
18:53:38,452 --> 18:53:40,682
But HTML is just tags and attributes.
19729
18:53:40,682 --> 18:53:43,432
And it's the kind of thing that\n
19730
18:53:43,432 --> 18:53:45,142
Eventually, many of them get ingrained.
19731
18:53:45,142 --> 18:53:47,939
I constantly check the reference\nguides or Stack Overflow
19732
18:53:47,938 --> 18:53:50,271
if I'm trying to figure out,\nhow do I lay something out.
19733
18:53:50,271 --> 18:53:52,063
It's really just these\nbuilding blocks that
19734
18:53:52,063 --> 18:53:55,131
allow you to assemble the\nstructure of a web page.
19735
18:53:55,131 --> 18:53:58,551
This one is being super simple,\n
19736
18:53:58,551 --> 18:54:01,101
Any questions on this\nframework, before we
19737
18:54:01,101 --> 18:54:05,489
start to add more tags, more\nvocabulary, if you will?
19738
18:54:06,322 --> 18:54:09,301
AUDIENCE: What would happen\nif we put the title tag?
19739
18:54:09,301 --> 18:54:13,091
SPEAKER 1: If we put the hello tag\n
19740
18:54:13,751 --> 18:54:20,251
So let me actually go to this,\nand say open bracket title
19741
18:54:20,251 --> 18:54:23,461
whoops, sometimes you don't want\n
19742
18:54:24,601 --> 18:54:27,031
I've gone ahead and changed the file.
19743
18:54:27,032 --> 18:54:30,362
Let me go and open up, give me a\n
19744
18:54:30,361 --> 18:54:34,411
and go back to the URL that has my page.
19745
18:54:41,161 --> 18:54:45,891
And let me go ahead now\nand click on Hello.html.
19746
18:54:45,892 --> 18:54:49,352
And in this case, it looks like\nwe don't actually see anything.
19747
18:54:49,351 --> 18:54:50,631
So the browser is hiding it.
19748
18:54:50,631 --> 18:54:54,051
Technically speaking, browsers\ntend to be pretty generous.
19749
18:54:54,051 --> 18:54:56,331
And half the time, when\nyou make mistakes in HTML
19750
18:54:56,331 --> 18:54:58,101
it will display, it might display--
19751
18:54:58,101 --> 18:54:59,437
not display as you intend it.
19752
18:54:59,437 --> 18:55:03,022
It might not display the same on\n
19753
18:55:03,021 --> 18:55:05,211
There is a tool, though,\nthat we'll see, that
19754
18:55:05,211 --> 18:55:07,042
can help answer this question for you.
19755
18:55:07,042 --> 18:55:11,032
For instance, if I go\nto validator.w3.org
19756
18:55:11,032 --> 18:55:13,072
W3 is the World Wide\nWeb Consortium, a group
19757
18:55:13,072 --> 18:55:15,232
of people that standardize\nthis kind of stuff
19758
18:55:15,232 --> 18:55:17,631
I can click on Validate\nby direct input, and just
19759
18:55:17,631 --> 18:55:21,411
copy paste my sample HTML into\nthis box, and click Check.
19760
18:55:21,411 --> 18:55:24,051
And I should see,\nhopefully, that indeed, it's
19761
18:55:24,051 --> 18:55:25,671
an error, what you proposed that I do.
19762
18:55:25,672 --> 18:55:27,744
The browser just did its\nbest to do something
19763
18:55:27,744 --> 18:55:30,952
which was to show me nothing at least,\n
19764
18:55:30,952 --> 18:55:34,491
But if I revert that change, and\nlet me undo what we just did
19765
18:55:34,491 --> 18:55:39,411
let me copy my original code back\n
19766
18:55:39,411 --> 18:55:41,831
now you can see, conversely,\nmy code is now correct.
19767
18:55:41,831 --> 18:55:43,581
And there's automated\ntools to check that.
19768
18:55:43,581 --> 18:55:45,873
But we'll encourage you, for\nproblem sets and projects
19769
18:55:45,873 --> 18:55:48,831
to use that particular manual tool.
19770
18:55:48,831 --> 18:55:51,621
All right, so let's go ahead\nand enhance this a little bit
19771
18:55:51,622 --> 18:55:53,572
by introducing a whole\nbunch of tags, just
19772
18:55:53,572 --> 18:55:55,952
to give you a sense of some\nof the building blocks here.
19773
18:55:55,952 --> 18:56:01,292
So I'm going to go ahead and create\n
19774
18:56:01,292 --> 18:56:04,292
And I'm just going to do a bunch of\n
19775
18:56:04,292 --> 18:56:07,432
so I'm not constantly typing all\n
19776
18:56:07,432 --> 18:56:09,411
because I want everything\nto be the same here
19777
18:56:09,411 --> 18:56:12,441
except I'm going to change my title\n
19778
18:56:12,441 --> 18:56:16,028
And inside of the body, I need a\n
19779
18:56:16,028 --> 18:56:18,111
And I don't really want\nto come up with some text.
19780
18:56:18,111 --> 18:56:21,682
So let me go to some random website\n
19781
18:56:21,682 --> 18:56:25,051
which if you're involved in like\n
19782
18:56:25,051 --> 18:56:28,581
this is placeholder text, kind of looks\n
19783
18:56:28,581 --> 18:56:32,121
Here, though, I have a handy way of\n
19784
18:56:32,122 --> 18:56:33,801
in something that looks like Latin.
19785
18:56:33,801 --> 18:56:36,012
And I've put those,\nnotice, inside of the body.
19786
18:56:37,012 --> 18:56:40,112
Look how long the\nmade-up words here are.
19787
18:56:40,111 --> 18:56:46,251
So let me go now into\nmy browser tab here.
19788
18:56:46,251 --> 18:56:50,301
Let me reload this page, and you'll\n
19789
18:56:50,301 --> 18:56:53,001
Paragraphs.html, which is\nmy new one, and Hello.html.
19790
18:56:53,001 --> 18:56:58,351
Let me click on Paragraphs.html,\n
19791
18:57:00,172 --> 18:57:03,172
SPEAKER 1: Yeah, it's obviously one\n
19792
18:57:03,172 --> 18:57:07,762
So that's interesting, but it's just a\n
19793
18:57:07,762 --> 18:57:09,142
It will only do what you say.
19794
18:57:09,142 --> 18:57:12,412
And each of these tags tells the\n
19795
18:57:12,411 --> 18:57:15,801
and then maybe stop doing something,\n
19796
18:57:15,801 --> 18:57:17,676
Hey, browser, here comes\nthe head of my page.
19797
18:57:17,676 --> 18:57:20,211
Hey, browser, here comes the\ntitle of my page, Hello, title.
19798
18:57:20,211 --> 18:57:22,042
Hey, browser, that's it for the title.
19799
18:57:22,042 --> 18:57:24,661
That's it for the head,\nhere comes the body tag.
19800
18:57:24,661 --> 18:57:27,411
So it's kind of having this\nconversation between the browser
19801
18:57:27,411 --> 18:57:30,471
between the HTML and the browser,\ndoing literally what it says.
19802
18:57:30,471 --> 18:57:32,721
So if you want a\nparagraph, you're probably
19803
18:57:32,721 --> 18:57:35,481
going to want to use\nthe P tag for paragraph.
19804
18:57:35,482 --> 18:57:38,642
And I'm going to go ahead\nand add this to my code.
19805
18:57:38,642 --> 18:57:41,392
I'm going to keep things neat,\n
19806
18:57:43,072 --> 18:57:47,032
Let me create another paragraph\ntag here, and close it
19807
18:57:47,032 --> 18:57:49,522
right after that one,\nindenting again, and I'm
19808
18:57:49,521 --> 18:57:51,232
keeping everything nice and orderly.
19809
18:57:53,402 --> 18:57:58,892
Let me indent that, and then let me\n
19810
18:57:58,892 --> 18:58:02,672
So again, a little tedious, but now I\n
19811
18:58:02,672 --> 18:58:04,101
hey, browser, start a paragraph.
19812
18:58:04,101 --> 18:58:05,721
Hey, browser, stop that paragraph.
19813
18:58:07,682 --> 18:58:09,961
Let me go back to the\nbrowser window here.
19814
18:58:09,961 --> 18:58:13,491
Let me hit Command R or\nControl R to reload the page.
19815
18:58:13,491 --> 18:58:16,732
And voila, now I have three\ncleaner paragraphs, all right?
19816
18:58:16,732 --> 18:58:18,562
So there's a P tag for paragraphs.
19817
18:58:18,562 --> 18:58:20,572
So now we have that\nparticular building block.
19818
18:58:20,572 --> 18:58:23,839
What if I want to add, for instance,\nsome headings to this page?
19819
18:58:23,839 --> 18:58:25,672
Well, that's something\nthat's possible, too.
19820
18:58:25,672 --> 18:58:29,152
Let me go ahead and create a\nnew file called Headings.html.
19821
18:58:29,152 --> 18:58:31,911
Let me copy and paste\nthat same code as before.
19822
18:58:31,911 --> 18:58:36,021
But now, let's preface each\nparagraph with maybe H1.
19823
18:58:36,021 --> 18:58:38,482
And I'm going to just\nwrite the word one.
19824
18:58:38,482 --> 18:58:41,452
And here I'm going to say H2, two.
19825
18:58:41,452 --> 18:58:44,572
And down here I might say H3, three.
19826
18:58:44,572 --> 18:58:49,184
So this is another tag,\nanother three tags, H1, H2, H3.
19827
18:58:49,184 --> 18:58:51,351
As you might have inferred\nby the file name I chose
19828
18:58:51,351 --> 18:58:55,342
this just gives you headings, like in\n
19829
18:58:55,342 --> 18:58:57,412
or subsections, or in\nan academic paper, you
19830
18:58:57,411 --> 18:59:00,182
have different hierarchies to\nthe text that you're writing.
19831
18:59:00,182 --> 18:59:04,461
So now that I've added an H1 tag,\n
19832
18:59:04,461 --> 18:59:07,491
two, H3 tag and the word three,\nlet's go back to the browser
19833
18:59:07,491 --> 18:59:12,232
reload the page again,\nand voila, once the page
19834
18:59:12,232 --> 18:59:17,152
reloads, I'll do it with the\nmanual button, reload the page.
19835
18:59:20,331 --> 18:59:21,623
AUDIENCE: Not in headings file.
19836
18:59:21,623 --> 18:59:23,581
SPEAKER 1: Right, I'm\nnot in the headings file.
19837
18:59:27,601 --> 18:59:29,301
OK, now we see some evidence of this.
19838
18:59:29,301 --> 18:59:30,921
Again, it's nonsensical content.
19839
18:59:30,922 --> 18:59:33,982
But you can kind of see that\nH1 is apparently big and bold
19840
18:59:33,982 --> 18:59:36,922
H2 is slightly less big, but still bold.
19841
18:59:36,922 --> 18:59:38,752
H3 is the same but a little smaller.
19842
18:59:38,751 --> 18:59:40,366
And it goes all the way down to H6.
19843
18:59:40,366 --> 18:59:42,741
After that, you should probably\nreorganize your thoughts.
19844
18:59:42,741 --> 18:59:44,631
But there are six\ndifferent hierarchies here
19845
18:59:44,631 --> 18:59:48,831
as you might use for chapters, sections,\n
19846
18:59:48,831 --> 18:59:53,061
So those are headings, as an\nHTML tag, in our vocabulary.
19847
18:59:53,062 --> 18:59:59,902
What's a common thing, too, well, let\n
19848
18:59:59,902 --> 19:00:04,851
and get some boilerplate here,\ncreate a file called List.html.
19849
19:00:04,851 --> 19:00:07,641
Let's create a simple\nlist inside of my body
19850
19:00:07,642 --> 19:00:10,522
and I'll give this a title of List.
19851
19:00:10,521 --> 19:00:13,531
And let me fix the title of this\none to be Headings, as well.
19852
19:00:13,532 --> 19:00:19,172
So in List.html, suppose I want to have\n
19853
19:00:19,172 --> 19:00:21,172
they're like a computer\nscientist's go-to words
19854
19:00:21,172 --> 19:00:23,302
just like a mathematician might say x, y, z.
19855
19:00:23,301 --> 19:00:26,122
Foo, bar, baz is in List.html.
19856
19:00:26,122 --> 19:00:29,991
Let me go back to my\nbrowser, hit the Back button.
19857
19:00:29,991 --> 19:00:33,921
There's List.html, and, hopefully,\n
19858
19:00:33,922 --> 19:00:37,892
on each line like a nice little\nlist, but, of course, I do not.
19859
19:00:38,961 --> 19:00:41,001
Chrome thinks it might be Arabic.
19860
19:00:41,001 --> 19:00:44,748
But that's curious, too,\nbecause the Lang attribute
19861
19:00:45,831 --> 19:00:48,023
So Google is trying to override it.
19862
19:00:48,024 --> 19:00:49,732
All right, what's the\nobvious explanation
19863
19:00:49,732 --> 19:00:52,762
why we're seeing foo, bar, and\nbaz on the same line, and not
19864
19:00:53,932 --> 19:00:55,342
AUDIENCE: We didn't tell it.
19865
19:00:55,342 --> 19:00:57,009
SPEAKER 1: We didn't tell it to do that.
19866
19:00:57,009 --> 19:00:59,272
So we need paragraph tags,\nor maybe something else.
19867
19:00:59,271 --> 19:01:00,921
Turns out there is something else.
19868
19:01:00,922 --> 19:01:04,972
There is a UL tag, for an\nunordered list in HTML
19869
19:01:04,971 --> 19:01:08,596
inside of which you can\nhave LI tags, for list item
19870
19:01:08,596 --> 19:01:10,221
inside of which you can put your words.
19871
19:01:10,221 --> 19:01:13,861
So there's my foo, there's\nmy bar, there's my baz.
19872
19:01:13,861 --> 19:01:16,671
And, again, notice that VS Code\nis finishing my thought for me.
19873
19:01:16,672 --> 19:01:21,412
But notice the hierarchy, open\nUL, open LI, close LI, open LI
19874
19:01:21,411 --> 19:01:24,771
close LI, open LI, close LI, close UL.
19875
19:01:24,771 --> 19:01:27,141
So it's sort of done\nin reverse order here.
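Editor's sketch: the "close in reverse order" rule is what a stack checks. The helper below is hypothetical and deliberately simple: it assumes every tag has an end tag, so it would misjudge tags such as img that never close.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open tags on a stack; each end tag must match the most recent open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.well_nested = True

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.well_nested = False


def is_well_nested(html):
    """Return True if tags close in the reverse of the order they opened."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.well_nested and not checker.open_tags


print(is_well_nested("<ul><li>foo</li><li>bar</li><li>baz</li></ul>"))
print(is_well_nested("<ul><li>foo</ul></li>"))
```

The first list closes each li before the ul, so it passes; the second closes the ul while an li is still open, so it fails.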
19876
19:01:27,142 --> 19:01:33,172
Let me go back to my browser, reload\n
19877
19:01:33,172 --> 19:01:36,382
a default bulleted list, that\nstill seems to be in Arabic.
19878
19:01:36,381 --> 19:01:38,152
What if I want this list to be numbered?
19879
19:01:38,152 --> 19:01:39,801
Well, you can probably guess.
19880
19:01:39,801 --> 19:01:43,372
If you don't want an unordered list, but\n
19881
19:01:44,661 --> 19:01:46,641
SPEAKER 1: OL, sure, so let's try that.
19882
19:01:46,642 --> 19:01:49,312
Not always as easy as just\nguessing, but in this case
19883
19:01:49,312 --> 19:01:51,062
OL is going to do the trick.
19884
19:01:51,062 --> 19:01:52,582
Let me go back to my other browser.
19885
19:01:52,581 --> 19:01:55,664
Let me reload the page, and now it's\n
19886
19:01:55,664 --> 19:01:58,111
It's a tiny thing, but\nthis is actually useful
19887
19:01:58,111 --> 19:02:00,451
if you have a very long\nlist of data, and maybe you
19888
19:02:00,452 --> 19:02:02,612
might add some things in the\nmiddle, the beginning, or the end.
19889
19:02:02,611 --> 19:02:04,944
It would just be annoying to\nhave to go and renumber it.
19890
19:02:04,945 --> 19:02:07,772
The computer is doing it\nfor us by, instead, just
19891
19:02:07,771 --> 19:02:10,152
numbering from top to bottom here.
19892
19:02:10,152 --> 19:02:12,211
All right, what about\nanother type of layout
19893
19:02:12,211 --> 19:02:14,959
not just paragraphs, not just\n
19894
19:02:14,959 --> 19:02:17,042
You've got some research\ndata you want to present
19895
19:02:17,042 --> 19:02:20,334
some financial data you want to present,\n
19896
19:02:20,334 --> 19:02:23,222
How might we go about laying\nout data, a la a table?
19897
19:02:23,221 --> 19:02:26,221
Well, let me create a\nfile called Table.html
19898
19:02:26,221 --> 19:02:28,921
and I'll just copy paste\nwhere we started earlier.
19899
19:02:28,922 --> 19:02:31,292
Let me start to close\nsome of these other files.
19900
19:02:31,292 --> 19:02:34,688
And in Table.html, this is\ngoing to be a bit more HTML
19901
19:02:34,688 --> 19:02:36,271
but I'm going to go ahead and do this.
19902
19:02:36,271 --> 19:02:40,592
Table and close table, tables\ncan have table headings.
19903
19:02:40,592 --> 19:02:45,881
So T head is the name of that tag, and\n
19904
19:02:45,881 --> 19:02:47,161
So I'm going to add that tag.
19905
19:02:47,161 --> 19:02:49,619
And this is a common technique,\nsort of start your thought
19906
19:02:49,619 --> 19:02:52,721
finish your thought, and then go\n
19907
19:02:52,721 --> 19:02:54,301
What do I want to put in this table?
19908
19:02:54,301 --> 19:02:58,391
How about a bunch of names and numbers.
19909
19:02:58,392 --> 19:03:02,352
So, for instance, like left\ncolumn name, right column number.
19910
19:03:02,351 --> 19:03:05,641
So let's create a table row,\nwith what's called the TR tag.
19911
19:03:05,642 --> 19:03:10,382
Let's create a table heading with\n
19912
19:03:10,381 --> 19:03:14,262
Let's create another table\nheading called number here.
19913
19:03:14,262 --> 19:03:17,612
And all of that, to be\nclear, is in one table row.
19914
19:03:17,611 --> 19:03:21,331
Meanwhile, in the table body,\nlet me create another table row
19915
19:03:21,331 --> 19:03:23,101
but this time, it's not a heading.
19916
19:03:23,101 --> 19:03:24,581
Now I'm in the guts of my table.
19917
19:03:24,581 --> 19:03:28,171
Let's do table data, which is synonymous\n
19918
19:03:28,172 --> 19:03:30,872
in like an Excel spreadsheet\nor Google spreadsheet.
19919
19:03:30,872 --> 19:03:33,722
In this TD, I'm going to\nsay like Carter's name
19920
19:03:33,721 --> 19:03:39,481
and then let's grab Carter's number\n
19921
19:03:39,482 --> 19:03:43,211
Then let's put me into the mix, and\n
19922
19:03:44,342 --> 19:03:47,461
But we'll see that there's a lot\nof shared structure with HTML.
19923
19:03:47,461 --> 19:03:53,771
Let me go ahead and do mine,\n
19924
19:03:53,771 --> 19:03:56,191
So we're getting to be\na lot of indentation.
19925
19:03:56,191 --> 19:03:59,711
I'm using four spaces by default.\n
19926
19:03:59,711 --> 19:04:02,191
So long as you're consistent,\nthat's considered good style.
19927
19:04:02,191 --> 19:04:04,981
But let me go back to my\nbrowser here, and hit back.
19928
19:04:04,982 --> 19:04:07,232
That then brings me to my\ndirectory listing again.
19929
19:04:07,232 --> 19:04:10,562
Here's Table.html, and this\nis not that interesting yet.
19930
19:04:10,562 --> 19:04:13,532
But you can see that there's\ntwo columns, name and number.
19931
19:04:13,532 --> 19:04:18,302
Because it's a table heading, TH,\n
19932
19:04:18,301 --> 19:04:22,111
In there, in the table, are two\n
19933
19:04:22,111 --> 19:04:25,261
It's a little, oh, I forgot my\nnumber one, sorry about that.
19934
19:04:25,262 --> 19:04:28,197
One and one, it's not the\nprettiest table, right?
19935
19:04:28,197 --> 19:04:30,072
I feel like I kind of\nwant to separate things
19936
19:04:30,072 --> 19:04:32,154
a little more, maybe put\nsome borders or the like.
19937
19:04:32,154 --> 19:04:36,281
But with HTML alone, I'm really\nfocusing on the structure alone.
19938
19:04:36,282 --> 19:04:37,902
So we'll make this prettier soon.
19939
19:04:37,902 --> 19:04:41,331
But for now, this is how you\nmight lay out tabular data.
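Editor's sketch: the table structure described above (table, thead, tbody, tr, th, td) written out by a small Python helper. The function render_table is hypothetical, and the phone numbers are placeholders, since the transcript does not preserve the ones typed in the lecture.

```python
from html import escape


def render_table(headings, rows):
    """Build an HTML table: one heading row in thead, one tr per data row in tbody."""
    lines = ["<table>", "    <thead>", "        <tr>"]
    for heading in headings:
        lines.append(f"            <th>{escape(heading)}</th>")
    lines += ["        </tr>", "    </thead>", "    <tbody>"]
    for row in rows:
        lines.append("        <tr>")
        for cell in row:
            lines.append(f"            <td>{escape(cell)}</td>")
        lines.append("        </tr>")
    lines += ["    </tbody>", "</table>"]
    return "\n".join(lines)


# Placeholder data, not the numbers used in the lecture.
table_html = render_table(
    ["Name", "Number"],
    [["Carter", "+1-555-0100"], ["David", "+1-555-0101"]],
)
print(table_html)
```

Each tr holds one row, th cells sit in the heading row, and td cells hold the data, using four spaces per level of nesting.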
19940
19:04:41,331 --> 19:04:44,081
All right, let me pause here just\n
19941
19:04:44,081 --> 19:04:46,411
But, again, the goal right now\nis just to kind of throw at you
19942
19:04:46,411 --> 19:04:49,801
some basic building blocks, that, again,\n
19943
19:04:49,801 --> 19:04:53,671
But we're going to start\nstylizing these things soon, too.
19944
19:04:55,691 --> 19:04:57,387
SPEAKER 1: How do you indent paragraphs?
19945
19:04:58,262 --> 19:04:59,822
For that, we're probably\ngoing to want something
19946
19:04:59,822 --> 19:05:01,308
called CSS, Cascading Style Sheets.
19947
19:05:01,308 --> 19:05:03,391
So let me come back to\nthat, in just a little bit.
19948
19:05:03,392 --> 19:05:06,782
For the stylization of these things,\n
19949
19:05:06,782 --> 19:05:10,172
we're going to need a\ndifferent language altogether.
19950
19:05:10,172 --> 19:05:13,172
All right, well, let's\nnow create what the web
19951
19:05:13,172 --> 19:05:17,702
is full of, which is like\nphotographs and images and the like.
19952
19:05:17,702 --> 19:05:22,832
Let me go ahead and create a new file\n
19953
19:05:22,831 --> 19:05:25,781
and change the title\nhere to be, say, Image.
19954
19:05:25,782 --> 19:05:29,189
And then, in the body of this page,\n
19955
19:05:29,188 --> 19:05:31,771
The interesting thing about an\nimage is that it's actually not
19956
19:05:31,771 --> 19:05:35,251
going to have a start tag and an end\n
19957
19:05:35,251 --> 19:05:38,281
Like, how can you start an image\nand then eventually finish it?
19958
19:05:38,282 --> 19:05:39,902
It's either there or it isn't.
19959
19:05:39,902 --> 19:05:42,542
So some tags do not have end tags.
19960
19:05:42,542 --> 19:05:49,081
So let me do image, IMG,\nsource equals Harvard.jpeg.
19961
19:05:49,081 --> 19:05:51,902
And let me go ahead, and,\nin my terminal window
19962
19:05:51,902 --> 19:05:54,301
I actually came with a photo of Harvard.
19963
19:05:54,301 --> 19:05:57,751
Let me grab this for just a second.
19964
19:05:57,751 --> 19:06:01,171
Let me grab Harvard.jpeg and\nput it into my directory
19965
19:06:01,172 --> 19:06:03,822
pretend that I downloaded\nthat in advance.
19966
19:06:03,822 --> 19:06:06,422
And so I'm referring\nto now a file called
19967
19:06:06,422 --> 19:06:12,032
Harvard.jpeg, that apparently is in\n
19968
19:06:12,032 --> 19:06:15,211
If this image were on the\ninternet, like Harvard's server
19969
19:06:15,211 --> 19:06:21,402
I could also say like\nHTTPS://www.Harvard.edu/FolderName
19970
19:06:21,402 --> 19:06:25,562
whatever it is, /Harvard.jpeg, but\n
19971
19:06:25,562 --> 19:06:28,772
to your own VS Code environment,\nlike I did before class
19972
19:06:28,771 --> 19:06:32,072
by dragging and dropping this\nwhole file, this photo of Harvard
19973
19:06:32,072 --> 19:06:35,402
you can just refer to it\nrelatively, so to speak.
19974
19:06:35,402 --> 19:06:38,521
This would be the same thing\nas saying ./Harvard.jpeg
19975
19:06:38,521 --> 19:06:41,941
go to the current directory and\n
19976
19:06:41,941 --> 19:06:43,861
But that's unnecessary to type.
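Editor's sketch: how a relative reference resolves against the page that contains it, using urljoin from Python's standard library. The page URL and the absolute image URL are hypothetical placeholders.

```python
from urllib.parse import urljoin

# Hypothetical address of the page containing the img tag.
PAGE_URL = "https://example.com/site/image.html"

# A bare file name and a ./ prefix both mean "look in the page's own directory".
plain = urljoin(PAGE_URL, "harvard.jpeg")
dotted = urljoin(PAGE_URL, "./harvard.jpeg")

# An absolute URL ignores the page's location entirely (placeholder address).
absolute = urljoin(PAGE_URL, "https://example.org/photos/harvard.jpeg")

print(plain)
print(dotted)
print(absolute)
```

The first two lines print the same address, which is why the ./ prefix is unnecessary to type.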
19977
19:06:43,861 --> 19:06:47,461
For accessibility purposes, though,\n
19978
19:06:47,461 --> 19:06:51,122
it's ideal if we also give this\nan alternative text, something
19979
19:06:51,122 --> 19:06:55,021
like Harvard University,\nin the so-called alt attribute
19980
19:06:55,021 --> 19:06:57,331
and this is so that\nscreen readers will recite
19981
19:06:57,331 --> 19:06:59,581
what it is the photo is,\nfor folks who can't see it.
19982
19:06:59,581 --> 19:07:02,072
And if you're just on a slow\nconnection, sometimes you'll
19983
19:07:02,072 --> 19:07:04,051
see the text of what\nyou're about to see
19984
19:07:04,051 --> 19:07:07,601
before the image itself downloads,\n
19985
19:07:07,601 --> 19:07:12,241
So let's now go back to my open browser\n
19986
19:07:12,241 --> 19:07:16,531
I now have Harvard.jpeg, which I\n
19987
19:07:16,532 --> 19:07:20,252
Let me click on Image.html,\nand here we have
19988
19:07:20,251 --> 19:07:25,171
a really big picture of Memorial\n
19989
19:07:25,172 --> 19:07:29,762
Suffice it to say I should probably fix\n
19990
19:07:29,762 --> 19:07:33,271
But to do that, we're going to probably\n
19991
19:07:33,271 --> 19:07:36,842
There are some historical\nattributes that you can still
19992
19:07:36,842 --> 19:07:39,283
use to control width and\nheight, and so forth.
19993
19:07:39,283 --> 19:07:41,491
But we're going to do it\nthe better way, so to speak
19994
19:07:41,491 --> 19:07:44,042
with a language designed for just that.
19995
19:07:45,482 --> 19:07:50,402
I also came prepared with,\nlet me grab another file here
19996
19:07:50,402 --> 19:07:55,782
let me grab a file called\nHalloween.mp4, which is an MPEG file.
19997
19:07:55,782 --> 19:08:02,822
And let me go ahead and change this\n
19998
19:08:02,822 --> 19:08:04,592
I'll change my title to be Video.
19999
19:08:04,592 --> 19:08:07,982
And let's go ahead and now\nintroduce another tag, a video tag
20000
19:08:07,982 --> 19:08:13,172
open bracket video, and then let me go\n
20001
19:08:13,172 --> 19:08:17,461
And then inside of the video tag,\n
20002
19:08:17,461 --> 19:08:22,501
is going to be specifically\n
20003
19:08:22,501 --> 19:08:27,511
I know, is video/mp4, because I looked\n
20004
19:08:27,512 --> 19:08:29,702
And the video tag actually\nhas a few attributes.
20005
19:08:29,702 --> 19:08:31,711
I can have this thing autoplay.
20006
19:08:33,392 --> 19:08:36,602
I can mute it, so that there's no\n
20007
19:08:36,601 --> 19:08:41,381
Most browsers, to prevent ads, don't\n
20008
19:08:41,381 --> 19:08:44,581
So if you mute your video, it\nwill autoplay, but presumably not
20009
19:08:45,601 --> 19:08:49,932
And let me set the width of this thing\n
20010
19:08:49,932 --> 19:08:51,581
But I can make it any size I want.
20011
19:08:51,581 --> 19:08:55,292
So I know this just from having\n
20012
19:08:57,122 --> 19:08:59,822
Sometimes attributes don't have values.
20013
19:09:01,051 --> 19:09:04,387
They're just single words,\nautoplay, loop, muted
20014
19:09:04,387 --> 19:09:06,512
and that kind of makes\nsense for any attribute that
20015
19:09:08,342 --> 19:09:11,372
Like, it doesn't make sense\nto say muted equals something.
20016
19:09:11,372 --> 19:09:12,722
Like it's either muted or not.
20017
19:09:12,721 --> 19:09:14,322
The attribute is there or not.
20018
19:09:14,322 --> 19:09:16,361
Similarly, for these others, as well.
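Editor's sketch: attributes without values show up with no value at all when the tag is parsed. The class name VideoAttributes is hypothetical, and the width of 1280 is an assumed figure, since the transcript does not preserve the value used in the lecture.

```python
from html.parser import HTMLParser

# Width value is an assumption made for this illustration.
VIDEO_HTML = (
    '<video autoplay loop muted width="1280">'
    '<source src="halloween.mp4" type="video/mp4">'
    "</video>"
)


class VideoAttributes(HTMLParser):
    """Capture the attributes on the video tag."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.attrs = dict(attrs)


parser = VideoAttributes()
parser.feed(VIDEO_HTML)
parser.close()

# Attributes written as single words are parsed with a value of None.
boolean_attributes = sorted(name for name, value in parser.attrs.items() if value is None)
print(boolean_attributes)
print(parser.attrs["width"])
```

Presence alone carries the meaning for autoplay, loop, and muted, while width still needs a value.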
20019
19:09:16,361 --> 19:09:19,981
So let me go back to my other browser\n
20020
19:09:19,982 --> 19:09:24,122
There is both my mp4\nand also Video.html
20021
19:09:24,122 --> 19:09:26,202
which is the web page that embeds it.
20022
19:09:26,202 --> 19:09:29,792
And this is actually a video that was\n
20023
19:09:30,941 --> 19:09:35,221
So we included it in this demo here.
20024
19:09:35,221 --> 19:09:41,792
This is the video that was on\n
20025
19:09:41,792 --> 19:09:44,009
But you can see here that\nan image alone probably
20026
19:09:44,009 --> 19:09:45,301
would not have the same effect.
20027
19:09:45,301 --> 19:09:48,494
This is actually a movie, a small\nvideo file that's now looping.
20028
19:09:48,494 --> 19:09:51,661
Now there's some artifacts here, like\n
20029
19:09:51,661 --> 19:09:53,494
I feel like it'd be\nnice to fill the screen.
20030
19:09:53,494 --> 19:09:58,031
But again, we'll come back to a language\n
20031
19:09:58,032 --> 19:10:00,272
Well, it's not just\nvideos like this, that you
20032
19:10:00,271 --> 19:10:01,982
might want to put into a web page.
20033
19:10:01,982 --> 19:10:06,283
Let me create another\nfile called iFrame.html.
20034
19:10:06,283 --> 19:10:09,241
If you've ever poked around with, if\n
20035
19:10:09,241 --> 19:10:12,565
or if you had your own blog or\n
20036
19:10:12,565 --> 19:10:14,732
you might have been in the\nhabit of embedding videos
20037
19:10:14,732 --> 19:10:17,672
in websites, using like\nembedded YouTube players.
20038
19:10:17,672 --> 19:10:21,362
Well, this is possible, too, using\n
20039
19:10:22,411 --> 19:10:25,531
And an iFrame is just a tag\nthat is literally iFrame.
20040
19:10:25,532 --> 19:10:28,960
It has source equals,\nand then a URL, and if it
20041
19:10:28,960 --> 19:10:32,252
happens to be a YouTube video, there's\n
20042
19:10:32,251 --> 19:10:33,811
per YouTube's documentation.
20043
19:10:33,812 --> 19:10:41,652
So you might do www.youtube.com, embed,\n
20044
19:10:41,652 --> 19:10:47,191
So this is essentially what we do, if\n
20045
19:10:47,191 --> 19:10:50,402
videos, in the course's website, or\n
20046
19:10:50,402 --> 19:10:53,911
If I want to allow full screen,\nI can add this attribute, too
20047
19:10:53,911 --> 19:10:56,661
that I know exists, by just\nhaving checked the documentation.
20048
19:10:56,661 --> 19:11:00,720
And if I now go back to my browser\n
20049
19:11:02,039 --> 19:11:03,872
It's not going to fill\nthe screen, because I
20050
19:11:03,872 --> 19:11:05,760
haven't customized the aesthetics yet.
20051
19:11:05,760 --> 19:11:10,111
But it does seem to embed a tiny little\n
20052
19:11:10,721 --> 19:11:13,801
So we could change the width, change\n
20053
19:11:14,471 --> 19:11:19,171
But an iFrame is a way of embedding\n
20054
19:11:19,171 --> 19:11:21,961
page, if they allow\nit, so as to create all
20055
19:11:21,961 --> 19:11:25,652
the more of an interactive experience\n
20056
19:11:25,652 --> 19:11:28,652
All right, well, the web is, of\n
20057
19:11:28,652 --> 19:11:32,072
Let's go ahead and create\na file called Link.html.
20058
19:11:32,072 --> 19:11:35,581
And if we want to create a web page that\n
20059
19:11:35,581 --> 19:11:40,100
else, let's go ahead and do this,\n
20060
19:11:44,551 --> 19:11:48,182
Now, in like Facebook, Instagram, a lot\n
20061
19:11:48,182 --> 19:11:50,851
in a domain name, or a\nfully qualified domain name
20062
19:11:50,851 --> 19:11:52,771
it automatically becomes a link.
20063
19:11:52,771 --> 19:11:56,161
That's because those websites have\n
20064
19:11:56,161 --> 19:12:00,550
detects something that looks like a\n
20065
19:12:00,551 --> 19:12:02,452
HTML itself does not do that for you.
20066
19:12:02,452 --> 19:12:06,822
And so if I go back to my web\npage here, click on Link.html
20067
19:12:06,822 --> 19:12:09,851
if you type visit\nHarvard.edu period, that's
20068
19:12:09,851 --> 19:12:11,592
all you're literally going to see.
20069
19:12:11,592 --> 19:12:15,192
But instinctively, even if you've\n
20070
19:12:15,191 --> 19:12:19,326
we probably do here\nto solve this problem?
20071
19:12:19,327 --> 19:12:20,952
What could we do to solve this problem?
20072
19:12:20,952 --> 19:12:22,244
What do I probably want to add?
20073
19:12:24,461 --> 19:12:27,702
SPEAKER 1: Yeah, so I want to surround\n
20074
19:12:27,702 --> 19:12:30,285
And you wouldn't necessarily\nknow this until someone told you
20075
19:12:30,285 --> 19:12:33,342
or you looked it up, but the tag for\n
20076
19:12:33,342 --> 19:12:35,832
called the A tag for anchor.
20077
19:12:35,831 --> 19:12:39,251
It has an attribute called\nHREF for hypertext reference
20078
19:12:39,251 --> 19:12:42,791
which is like a link in\nthe virtual world to a URL.
20079
19:12:42,792 --> 19:12:45,881
So let me type in Harvard's\nfull and proper URL here.
20080
19:12:45,881 --> 19:12:48,101
Then I'm going to close the tag.
20081
19:12:48,101 --> 19:12:53,411
And then I can still say Harvard.edu,\n
20082
19:12:53,411 --> 19:12:58,961
But the place they're going to go\n
20083
19:13:00,971 --> 19:13:03,281
Now if I go back here\nand reload the page
20084
19:13:03,282 --> 19:13:05,322
now it automatically gets underlined.
20085
19:13:05,322 --> 19:13:07,092
It happens to be purple by default. Why?
20086
19:13:07,092 --> 19:13:09,292
Because we visited\nHarvard.edu a few minutes ago.
20087
19:13:09,292 --> 19:13:12,642
So my browser, by default, is indicating\n
20088
19:13:12,642 --> 19:13:14,592
But now I have a link\nthat I can click on
20089
19:13:14,592 --> 19:13:18,881
and if I hover over it but don't click,\n
20090
19:13:18,881 --> 19:13:22,842
there's a little clue as to where\n
20091
19:13:24,012 --> 19:13:26,082
And without going too\nfar down a rabbit hole
20092
19:13:26,081 --> 19:13:29,471
but to tie together our discussion\nof cybersecurity recently
20093
19:13:29,471 --> 19:13:32,861
what if I were to do\nsomething like this?
20094
19:13:32,861 --> 19:13:37,331
Right now you have the beginnings\nof a phishing attack of sorts
20095
19:13:37,331 --> 19:13:42,281
P-H-I-S-H-I-N-G, whereby you can\n
20096
19:13:42,282 --> 19:13:46,542
even an email using HTML, that tells\n
20097
19:13:46,542 --> 19:13:49,612
but they're really going to\ngo someplace else altogether.
20098
19:13:49,611 --> 19:13:52,121
And that is the essence of\nphishing attacks these days.
20099
19:13:52,122 --> 19:13:55,452
If you've ever gotten a bogus\nemail pretending to be from PayPal
20100
19:13:55,452 --> 19:13:57,912
or your bank or some\nother website, odds are
20101
19:13:57,911 --> 19:14:00,432
they've just written HTML\nthat says whatever they want
20102
19:14:00,432 --> 19:14:04,042
but the underlying tags might\ndo something very different.
20103
19:14:04,042 --> 19:14:06,792
And so having the instinct to look\n
20104
19:14:06,792 --> 19:14:10,211
or be a little suspicious when you're\n
20105
19:14:10,211 --> 19:14:13,601
it's this easy to socially\nengineer people, that is
20106
19:14:13,601 --> 19:14:18,342
deceive them, by just saying one\nthing and linking to another.
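To make the mismatch concrete, here is a minimal sketch in which the visible text and the destination disagree. The destination example.com is a placeholder chosen for illustration, not the address used in the demo:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- The text claims Harvard.edu, but the href (placeholder) points somewhere else -->
        Visit <a href="https://example.com/">Harvard.edu</a>.
    </body>
</html>
```

Hovering over the link without clicking shows the real destination in the browser's status area, which is the habit being recommended.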
20107
19:14:18,342 --> 19:14:22,572
Well, what if I want to link my page\n
20108
19:14:22,572 --> 19:14:25,422
Well, if I want to link\nto that photo of Harvard
20109
19:14:25,422 --> 19:14:28,722
I can just do HREF equals quote\n
20110
19:14:28,721 --> 19:14:32,121
in my same account, that\nis itself a web page.
20111
19:14:32,122 --> 19:14:35,682
So this is how you can create\nrelative links, multi-page web
20112
19:14:35,682 --> 19:14:38,271
pages, multi-page websites, yourself.
20113
19:14:38,271 --> 19:14:41,652
So if I now reload this\npage, hover over Harvard.edu
20114
19:14:41,652 --> 19:14:45,191
you'll see in the bottom left\nhand corner a very long URL.
20115
19:14:45,191 --> 19:14:48,101
But that's because I'm in\nCodespaces right now, VS Code
20116
19:14:48,101 --> 19:14:51,762
and it's appending automatically\nto the end of my current URL
20117
19:14:54,822 --> 19:14:57,251
When I click on this, I\ngo immediately to that
20118
19:14:57,251 --> 19:15:01,151
file we created earlier, with a\ncrazy, big version of the image.
20119
19:15:01,152 --> 19:15:03,822
But that's just a way\nthat one page on a website
20120
19:15:03,822 --> 19:15:07,721
can link to another page on a website.
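A relative link might look roughly like this. The file name photo.html is hypothetical and stands in for whichever page sits in the same folder:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- No scheme or domain: the browser resolves photo.html (hypothetical name)
             relative to the URL of the current page -->
        Visit <a href="photo.html">Harvard.edu</a>.
    </body>
</html>
```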
20121
19:15:07,721 --> 19:15:11,711
Let's do one other thing here,\nmaking things more responsive
20122
19:15:11,711 --> 19:15:14,622
because, in fact, that wasn't a\nparticularly responsive website.
20123
19:15:14,622 --> 19:15:17,702
Responsive means responding to the\n
20124
19:15:17,702 --> 19:15:20,202
is so important when someone\nmight be on a screen like this
20125
19:15:20,202 --> 19:15:22,032
or on a screen like this these days.
20126
19:15:22,032 --> 19:15:26,892
There are special tags we can use to\n
20127
19:15:28,271 --> 19:15:32,381
So let me create a file\ncalled Responsive.html.
20128
19:15:32,381 --> 19:15:36,131
I'm going to copy/paste some starting\n
20129
19:15:36,131 --> 19:15:40,751
And let me go ahead and just grab, let\n
20130
19:15:40,751 --> 19:15:46,341
from before, just so that we have a\n
20131
19:15:46,342 --> 19:15:50,172
And let me go ahead and\ngrab this text here.
20132
19:15:50,172 --> 19:15:53,632
And I'm just going to paste\nthis into the body of this page.
20133
19:15:54,411 --> 19:15:57,762
So I just have a big paragraph,\n
20134
19:15:57,762 --> 19:15:59,532
Let me go back to my browser.
20135
19:15:59,532 --> 19:16:02,622
Let me open up this file,\ncalled Responsive.html
20136
19:16:02,622 --> 19:16:05,442
to make the point that\nit is not yet responsive.
20137
19:16:05,441 --> 19:16:07,781
Let me go ahead and\nclick on Responsive.html.
20138
19:16:08,961 --> 19:16:12,381
But here's another trick you can do,\n
20139
19:16:12,881 --> 19:16:14,921
You can pretend to be another device.
20140
19:16:14,922 --> 19:16:19,332
Let me go to View, Developer,\nDeveloper Tools again.
20141
19:16:19,331 --> 19:16:21,779
Last time we used this to\nuse the Network tab, which
20142
19:16:21,779 --> 19:16:24,822
was kind of interesting, because we\n
20143
19:16:25,822 --> 19:16:29,111
But notice, we can also click on\nthis icon, in Chrome, at least
20144
19:16:29,111 --> 19:16:31,151
that looks like a mobile phone.
20145
19:16:31,152 --> 19:16:36,199
I can turn my laptop into what looks\n
20146
19:16:36,198 --> 19:16:39,281
I'm going to click the dot dot dot\n
20147
19:16:39,282 --> 19:16:41,502
Instead of on the bottom,\nwhere it might be by default
20148
19:16:41,501 --> 19:16:43,281
I'm going to move it\nto the right hand side.
20149
19:16:43,282 --> 19:16:45,312
So that now on the left,\nyou see what looks more
20150
19:16:45,312 --> 19:16:47,082
like the shape of a vertical phone.
20151
19:16:47,081 --> 19:16:49,841
And, in fact, if I go\nto my dimensions here
20152
19:16:49,842 --> 19:16:53,741
I'll choose something like\niPhone X, so a few years back.
20153
19:16:53,741 --> 19:16:57,661
Here's what that same website might\n
20154
19:16:57,661 --> 19:17:01,322
that looks pretty damn\nsmall, to be able to read it.
20155
19:17:01,322 --> 19:17:03,842
And that's because the\nwebsite has not automatically
20156
19:17:03,842 --> 19:17:07,771
responded to the fairly narrow\ndimensions of the iPhone
20157
19:17:07,771 --> 19:17:09,971
in question, or Android\ndevice, or whatnot.
20158
19:17:09,971 --> 19:17:11,411
So let me go ahead and do this.
20159
19:17:11,411 --> 19:17:13,391
Let me go back into my code.
20160
19:17:13,392 --> 19:17:16,382
And let me go into the head of\nthe page, and for the first time
20161
19:17:18,149 --> 19:17:20,191
This word is now all over\nthe internet, but there
20162
19:17:20,191 --> 19:17:23,131
is a metatag that is\ncalled, that allows you
20163
19:17:23,131 --> 19:17:27,009
to specify the name of some kind\nof configuration detail here
20164
19:17:28,051 --> 19:17:31,081
Viewport is the technical term\nfor the rectangular region
20165
19:17:31,081 --> 19:17:32,741
that the human sees in a browser.
20166
19:17:32,741 --> 19:17:35,491
It's essentially the body of the\n
20167
19:17:37,051 --> 19:17:39,751
And you can specify the\ncontent of the viewport
20168
19:17:39,751 --> 19:17:41,761
should have an initial scale of 1.
20169
19:17:41,762 --> 19:17:43,652
So it shouldn't be zoomed in or out.
20170
19:17:43,652 --> 19:17:46,292
And the width that the\nbrowser should assume
20171
19:17:46,292 --> 19:17:48,812
should be equal to the device's width.
20172
19:17:48,812 --> 19:17:51,722
These are sort of magical statements\nthat you just have to know
20173
19:17:51,721 --> 19:17:55,771
or copy/paste or transcribe, that\njust express, to the browser
20174
19:17:55,771 --> 19:17:59,611
assume that the width of the page is the\n
20175
19:17:59,611 --> 19:18:03,451
Don't assume the luxury of a\nbig laptop or desktop computer.
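Written out, the viewport line described above would look something like this. The title and the paragraph text are placeholders:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <!-- Start at a zoom of 1 and treat the page as exactly as wide as the device -->
        <meta name="viewport" content="initial-scale=1, width=device-width">
        <title>responsive</title>
    </head>
    <body>
        <p>A long paragraph of placeholder text goes here.</p>
    </body>
</html>
```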
20176
19:18:03,452 --> 19:18:08,432
Now, making only that change, let\n
20177
19:18:08,432 --> 19:18:10,682
here, using Chrome's developer tools.
20178
19:18:13,232 --> 19:18:18,722
And now, it's not very effective on this\n
20179
19:18:23,501 --> 19:18:28,351
So if I zoom in to 100%, this would be\n
20180
19:18:28,351 --> 19:18:30,372
readable than it would\nhave been a moment ago
20181
19:18:30,372 --> 19:18:33,039
even though I realized that demo\nwas not necessarily persuasive.
20182
19:18:33,039 --> 19:18:34,832
But it's as simple as\ntelling the browser
20183
19:18:34,831 --> 19:18:38,801
to resize the thing to\nthe width of the page.
20184
19:18:38,801 --> 19:18:41,971
All right, let me pause here to see\n
20185
19:18:41,971 --> 19:18:43,591
feels like enough HTML tags.
20186
19:18:43,592 --> 19:18:45,312
We'll add just a couple of more in.
20187
19:18:45,312 --> 19:18:47,402
But for the most part,\nlike HTML tags are
20188
19:18:47,402 --> 19:18:51,241
things you Google and figure out over\n
20189
19:18:51,241 --> 19:18:54,182
The basic building blocks\nare tags, attributes.
20190
19:18:54,182 --> 19:18:55,652
Some attributes have values.
20191
19:18:56,461 --> 19:19:00,175
And that's sort of the\nstructure of HTML in essence.
20192
19:19:00,175 --> 19:19:01,592
Questions on any of these, though?
20193
19:19:02,330 --> 19:19:03,961
AUDIENCE: Do attributes have an order?
20194
19:19:03,961 --> 19:19:05,611
SPEAKER 1: Do attributes have an order?
20195
19:19:05,611 --> 19:19:08,701
No, attributes can be in any\norder, from left to right.
20196
19:19:08,702 --> 19:19:11,702
I tend to be a little nit-picky,\nand so I alphabetize them
20197
19:19:11,702 --> 19:19:14,917
if only because then I can easily\nspot if something's missing
20198
19:19:14,917 --> 19:19:16,292
if it's not there alphabetically.
20199
19:19:16,292 --> 19:19:21,952
Most people on the internet\ndon't seem to do that.
20200
19:19:24,801 --> 19:19:26,811
I mentioned that HTML\nis starting to replace
20201
19:19:26,812 --> 19:19:28,851
other languages for user interfaces.
20202
19:19:28,851 --> 19:19:30,292
And it's not just HTML alone.
20203
19:19:30,292 --> 19:19:34,101
It's HTML with CSS, with JavaScript,\n
20204
19:19:35,211 --> 19:19:37,911
That rather has been the\ntrend for portability
20205
19:19:37,911 --> 19:19:40,581
and the ability for companies,\nfor individual programmers
20206
19:19:40,581 --> 19:19:42,951
to write one version\nof an app and have it
20207
19:19:42,952 --> 19:19:47,092
work on Android devices and iPhones\n
20208
19:19:48,351 --> 19:19:51,411
It is very time-consuming to\nlearn a language like Java
20209
19:19:51,411 --> 19:19:54,322
and write an Android app, learn\nanother language called Swift
20210
19:19:54,322 --> 19:19:56,991
and make an iOS app, not to\nmention make them look and behave
20211
19:19:56,991 --> 19:19:59,123
the same, not to\nmention fix a bug in one
20212
19:19:59,123 --> 19:20:00,831
and then remember to\nfix it in the other.
20213
19:20:00,831 --> 19:20:05,061
I mean, this is just very painful\nand time-consuming and costly.
20214
19:20:05,062 --> 19:20:09,351
So this standardization on\nHTML, CSS, and JavaScript
20215
19:20:09,351 --> 19:20:13,432
even for mobile apps and web apps,\n
20216
19:20:13,432 --> 19:20:16,012
because it solves problems like that.
20217
19:20:16,012 --> 19:20:19,851
All right, so let's go ahead and now do\n
20218
19:20:19,851 --> 19:20:22,161
All of these pages thus\nfar are really just tastes
20219
19:20:22,161 --> 19:20:25,042
of static content, content\nthat does not change.
20220
19:20:25,042 --> 19:20:27,991
Well, let's go ahead and do this.
20221
19:20:27,991 --> 19:20:31,012
Let me introduce one other\nformat of URLs, which looks
20222
19:20:31,012 --> 19:20:33,051
a little something like it did before.
20223
19:20:33,051 --> 19:20:36,451
So slash path, but it could\nactually be something like this
20224
19:20:36,452 --> 19:20:40,012
slash path question\nmark, key equals value.
20225
19:20:40,012 --> 19:20:42,021
You might not have noticed,\nor cared to notice
20226
19:20:42,021 --> 19:20:44,641
the URLs in your URL bar every day.
20227
19:20:44,642 --> 19:20:46,282
But these things are everywhere.
20228
19:20:46,282 --> 19:20:49,222
Often when you type into a\nsearch engine like Google
20229
19:20:49,221 --> 19:20:52,491
a search query, whatever you\njust typed ends up in the URL.
20230
19:20:52,491 --> 19:20:55,099
When you click on a link that\ncontains some information
20231
19:20:55,099 --> 19:20:57,682
there might be a question mark,\nand then some keys and values.
20232
19:20:57,682 --> 19:21:00,471
There might be an ampersand\nand more keys and values.
20233
19:21:00,471 --> 19:21:02,901
Here, again, is that very\ncommon programming paradigm
20234
19:21:02,902 --> 19:21:05,062
of just associating keys with values.
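One way to see the key=value shape is in ordinary links. In this sketch, example.com, /path, and the key names are all made up for illustration:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>query strings</title>
    </head>
    <body>
        <!-- One pair: everything after the question mark is key=value -->
        <a href="https://example.com/path?key=value">one pair</a>

        <!-- Two pairs, separated by an ampersand (written &amp; inside HTML) -->
        <a href="https://example.com/path?key1=value1&amp;key2=value2">two pairs</a>
    </body>
</html>
```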
20235
19:21:07,021 --> 19:21:11,391
Let me actually go to\ngoogle.com, in a browser
20236
19:21:11,392 --> 19:21:16,042
here, and let me search for something\n
20237
19:21:16,042 --> 19:21:20,482
Enter, notice now that my\nURL changed from google.com
20238
19:21:20,482 --> 19:21:23,991
to google.com slash\nsearch question mark
20239
19:21:23,991 --> 19:21:27,081
Q equals cats, ampersand\nand then a bunch of stuff
20240
19:21:27,081 --> 19:21:28,581
that I don't understand or know.
20241
19:21:28,581 --> 19:21:33,501
So let's just delete it for now, and\n
20242
19:21:34,559 --> 19:21:37,101
If I zoom out here, years ago\nyou would get pictures of cats.
20243
19:21:37,101 --> 19:21:41,122
Now you get videos of the movie.
20244
19:21:41,122 --> 19:21:44,332
And then that top query\nthere, is Cats a bad movie.
20245
19:21:44,331 --> 19:21:46,822
But we can also, of\ncourse, click on Images.
20246
19:21:46,822 --> 19:21:49,521
And there are the\nadorable cat, creepy cats.
20247
19:21:49,521 --> 19:21:52,342
All right, this didn't used to\nhappen when we searched for cats.
20248
19:21:52,342 --> 19:21:57,922
But anyhow, the point is that the URL\n
20249
19:21:57,922 --> 19:22:00,562
And this is such a simple,\nbut such a powerful thing.
20250
19:22:00,562 --> 19:22:04,822
This is how humans\nprovide input to servers.
20251
19:22:04,822 --> 19:22:07,741
They don't manually create the\nURLs, like I sort of just did.
20252
19:22:07,741 --> 19:22:10,792
But when you fill out a form\non the web and you hit Enter
20253
19:22:10,792 --> 19:22:13,672
typically the URL suddenly\nchanges to include
20254
19:22:13,672 --> 19:22:16,552
whatever you typed in,\nin the URL, assuming
20255
19:22:16,551 --> 19:22:18,891
the form is using the verb GET.
20256
19:22:20,331 --> 19:22:22,123
If you're typing in a\nusername, a password
20257
19:22:22,123 --> 19:22:25,331
a credit card information, because you\n
20258
19:22:25,331 --> 19:22:27,951
at your laptop to see literally\neverything you typed in
20259
19:22:29,331 --> 19:22:32,061
So there's another verb, POST,\nthat can hide all of that.
20260
19:22:32,062 --> 19:22:33,772
And it's just sent a little differently.
20261
19:22:33,771 --> 19:22:36,501
But things like this are\ntypically sent via GET
20262
19:22:36,501 --> 19:22:39,921
and what that means underneath the\n
20263
19:22:39,922 --> 19:22:43,432
making a request like this, GET /search?
20264
19:22:43,432 --> 19:22:47,752
Q equals, whatever you typed in, the\n
20265
19:22:47,751 --> 19:22:52,822
And hopefully what comes back is a page\n
20266
19:22:52,822 --> 19:22:57,961
And what's interesting here now is, if\n
20267
19:22:57,961 --> 19:23:04,792
and let me go ahead and create a\n
20268
19:23:04,792 --> 19:23:09,682
In Search.html, I'm going to start\n
20269
19:23:11,032 --> 19:23:14,612
And in the body of this page, I'm\ngoing to introduce a form tag.
20270
19:23:14,611 --> 19:23:18,531
And in this form tag, I'm going\nto have a couple of inputs.
20271
19:23:18,532 --> 19:23:25,252
And the types of inputs are going to\n
20272
19:23:28,448 --> 19:23:30,531
And this isn't that\ninteresting yet, but let's see
20273
19:23:30,532 --> 19:23:32,422
what is happening in the page itself.
20274
19:23:32,422 --> 19:23:34,762
Let me go back to my directory listing.
20275
19:23:34,762 --> 19:23:36,922
Let me click on Search.html.
20276
19:23:36,922 --> 19:23:39,442
I seem to have the beginning\nof my own search engine.
20277
19:23:40,551 --> 19:23:43,201
It's just a text box\nand a submit button.
20278
19:23:43,202 --> 19:23:45,002
But let's finish my thoughts here.
20279
19:23:45,001 --> 19:23:49,911
So let's specifically give\nthis text box a name of Q
20280
19:23:49,911 --> 19:23:53,751
which, if you roll back to the late '90s\n
20281
19:23:53,751 --> 19:23:58,041
created Google.com, Q represented query,\n
20282
19:23:58,042 --> 19:24:01,822
So the name of this\ntext box shall be text
20283
19:24:01,822 --> 19:24:05,842
shall be Q. The form is\ngoing to use what method?
20284
19:24:05,842 --> 19:24:07,702
Technically it uses GET\nby default, but I'll
20285
19:24:07,702 --> 19:24:10,282
be explicit and say method\nequals quote unquote "get."
20286
19:24:10,282 --> 19:24:14,632
Stupidly, it's lowercase in HTML, even\n
20287
19:24:16,851 --> 19:24:21,491
The action of this form, specifically,\n
20288
19:24:21,491 --> 19:24:24,351
But we don't really have time\ntoday to implement Google itself.
20289
19:24:24,351 --> 19:24:28,941
So we're just going to send the\n
20290
19:24:28,941 --> 19:24:31,152
So I'm creating a form,\nthe action of which
20291
19:24:31,152 --> 19:24:35,711
is to send the data to Google's slash\n
20292
19:24:35,711 --> 19:24:41,021
It's going to send an input called Q,\n
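Putting those pieces together, the form might look roughly like this. The full action URL is an assumption based on the google.com/search address seen earlier:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <!-- Submitting sends the browser to the action URL with ?q=... appended -->
        <form action="https://www.google.com/search" method="get">
            <input name="q" type="text">
            <input type="submit">
        </form>
    </body>
</html>
```

Typing cats and submitting should produce a URL ending in /search?q=cats, because the text box's name becomes the key and its contents become the value.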
20293
19:24:41,021 --> 19:24:44,081
Let me go back to the\nbrowser, reload the page.
20294
19:24:44,081 --> 19:24:48,911
Nothing seems to have changed yet,\n
20295
19:24:50,351 --> 19:24:53,952
Right now I'm in Search.html.
20296
19:24:53,952 --> 19:24:57,282
If I zoom out and search for\ncats now and click Submit
20297
19:24:57,282 --> 19:24:59,262
I'm whisked away to google.com.
20298
19:24:59,262 --> 19:25:02,952
But notice that the URL is\nparameterized, with those key value
20299
19:25:04,572 --> 19:25:06,432
And I get back a whole\nbunch of cat results.
20300
19:25:06,432 --> 19:25:08,771
And I can very easily now\nmake this a little prettier.
20301
19:25:08,771 --> 19:25:11,801
Right now, it's not ideal that like\n
20302
19:25:13,116 --> 19:25:15,491
And it's a little obnoxious\nthat autocomplete is enabled.
20303
19:25:15,491 --> 19:25:17,292
If I don't want to\nsearch for cats anymore
20304
19:25:17,292 --> 19:25:21,282
well, according to HTML's documentation,\n
20305
19:25:21,282 --> 19:25:25,362
Autocomplete equals off, to turn\noff autocomplete, auto focus
20306
19:25:25,361 --> 19:25:28,521
to automatically put the\ncursor inside of that text box.
20307
19:25:28,521 --> 19:25:32,292
If I want some explanatory text, I can\n
20308
19:25:33,521 --> 19:25:35,652
And now if I go back to\nthis page and reload
20309
19:25:35,652 --> 19:25:37,461
now it's a little more user-friendly.
20310
19:25:37,461 --> 19:25:39,641
You see query in kind of gray text.
20311
19:25:39,642 --> 19:25:41,472
The cursor is already\nthere and blinking.
20312
19:25:41,471 --> 19:25:43,001
I don't have to even move my cursor.
20313
19:25:43,001 --> 19:25:46,121
I can search for dogs now, and you\n
20314
19:25:46,122 --> 19:25:48,911
Hit enter to submit, and\nnow I'm searching for
20315
19:25:48,911 --> 19:25:51,831
there we go, adorable dogs, instead.
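A sketch of the friendlier text box. The placeholder attribute and its value "Query" are inferred from the gray text described above rather than shown in the captions:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <form action="https://www.google.com/search" method="get">
            <!-- autocomplete="off" hides past entries, autofocus puts the cursor here,
                 placeholder shows gray hint text until the user types -->
            <input autocomplete="off" autofocus name="q" placeholder="Query" type="text">
            <input type="submit">
        </form>
    </body>
</html>
```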
20316
19:25:52,762 --> 19:25:56,851
I've implemented the front end of\n
20317
19:25:56,851 --> 19:25:58,601
To implement the back\nend, we're obviously
20318
19:25:58,601 --> 19:26:01,812
going to need like a really big\n
20319
19:26:01,812 --> 19:26:05,484
We're going to need some code that like\n
20320
19:26:06,191 --> 19:26:08,091
We're going to need Python\nfor something like that.
20321
19:26:08,092 --> 19:26:09,912
And in fact, that's the direction\nwe're steering next week
20322
19:26:09,911 --> 19:26:11,262
when we implement that back end.
20323
19:26:11,262 --> 19:26:14,592
But today it's all about this front end.
20324
19:26:14,592 --> 19:26:20,682
Or any question, then, about forms,\n
20325
19:26:20,682 --> 19:26:24,680
transition to making things look\na little prettier, with CSS?
20326
19:26:24,679 --> 19:26:27,221
And then we'll end by making\nthings a little more functional
20327
19:26:31,941 --> 19:26:35,121
All right, so let's start to answer\n
20328
19:26:35,122 --> 19:26:40,232
came up, by making these pages a\n
20329
19:26:40,232 --> 19:26:45,872
Let's go ahead now and introduce to\n
20330
19:26:45,872 --> 19:26:48,562
Let me go ahead and create\na file called Home.html
20331
19:26:48,562 --> 19:26:51,292
as though I'm making a home\npage for the very first time.
20332
19:26:51,292 --> 19:26:53,961
And in this page, I'm going\nto give a title of Home.
20333
19:26:53,961 --> 19:26:55,891
And I'm just going to\nhave like three things.
20334
19:26:55,892 --> 19:27:00,292
First I'm going to have maybe\na paragraph of text up here
20335
19:27:00,292 --> 19:27:03,322
at the top, that says something\nwelcoming for my home page
20336
19:27:03,322 --> 19:27:06,741
like my name, John Harvard, for\n
20337
19:27:06,741 --> 19:27:10,072
Then in the middle of the page,\n
20338
19:27:10,072 --> 19:27:12,982
welcome to my home\npage exclamation point!
20339
19:27:12,982 --> 19:27:16,042
And at the bottom of the page, I'm\n
20340
19:27:16,042 --> 19:27:19,672
says something like copyright,\nthe copyright symbol, John
20341
19:27:19,672 --> 19:27:21,172
Harvard, or something like that.
20342
19:27:21,172 --> 19:27:25,432
All right, so it's like a web page\n
20343
19:27:26,721 --> 19:27:27,981
This isn't that interesting.
20344
19:27:27,982 --> 19:27:32,422
If I open this page called\nHome.html, let me go ahead
20345
19:27:32,422 --> 19:27:35,932
and create three quick paragraphs,\n
20346
19:27:35,932 --> 19:27:39,292
Inside the middle, I'm going to say\n
20347
19:27:40,232 --> 19:27:42,472
And at the bottom,\nwhoops, at the bottom
20348
19:27:42,471 --> 19:27:45,651
a little footer that says\nsomething like copyright
20349
19:27:45,652 --> 19:27:50,331
a little simple copyright\nsymbol, and John Harvard's name.
20350
19:27:50,331 --> 19:27:52,364
All right, now let me reload the page.
20351
19:27:53,032 --> 19:27:57,292
It's a very simple, very underwhelming\n
20352
19:27:57,292 --> 19:28:00,188
Let's start to now stylize\nthis in an interesting way
20353
19:28:00,188 --> 19:28:02,271
so that it's a little more\naesthetically pleasing.
20354
19:28:02,271 --> 19:28:04,251
First, these aren't really paragraphs.
20355
19:28:04,251 --> 19:28:08,104
They're sort of like areas of the page,\n
20356
19:28:08,104 --> 19:28:09,771
There's like the main part of my screen.
20357
19:28:09,771 --> 19:28:11,521
And then there's the\nfooter of my screen.
20358
19:28:11,521 --> 19:28:13,813
So paragraphs isn't quite\nright, if these aren't really
20359
19:28:15,152 --> 19:28:17,722
I might more properly call\nthem divs or divisions
20360
19:28:17,721 --> 19:28:21,831
of the page, which is a very commonly\n
20361
19:28:21,831 --> 19:28:24,121
this generic rectangular region to it.
20362
19:28:24,122 --> 19:28:28,192
It does not do anything aesthetically,\n
20363
19:28:28,191 --> 19:28:32,601
It just creates an invisible\nrectangular region, inside of which
20364
19:28:32,601 --> 19:28:34,342
you can start to style the text.
20365
19:28:34,342 --> 19:28:36,262
Or I can take this one step further.
20366
19:28:36,262 --> 19:28:40,461
There's some other tags in HTML,\n
20367
19:28:40,461 --> 19:28:42,869
have names that describe the\ntypes of your page, which
20368
19:28:42,869 --> 19:28:45,411
is all the more compelling these\ndays for accessibility, too
20369
19:28:45,411 --> 19:28:49,881
for screen readers, for search engines,\n
20370
19:28:49,881 --> 19:28:52,584
engine can realize that footer\nis probably a little fluffy.
20371
19:28:52,584 --> 19:28:54,292
The header might be\na little interesting.
20372
19:28:54,292 --> 19:28:57,202
The main part of the page\nis probably the juicy part
20373
19:28:57,202 --> 19:29:01,762
that I want users to be able to search\n
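With semantic tags, the three regions of the page might be marked up like this. The "(c)" is a stand-in for whatever copyright symbol was typed on screen:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <body>
        <!-- Same structure as three divs, but each tag name says what the region is for -->
        <header>
            John Harvard
        </header>
        <main>
            Welcome to my home page!
        </main>
        <footer>
            Copyright (c) John Harvard
        </footer>
    </body>
</html>
```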
20374
19:29:01,762 --> 19:29:04,672
So let's start to stylize\nthis page somehow.
20375
19:29:04,672 --> 19:29:08,482
Let's introduce a style\nattribute in HTML
20376
19:29:08,482 --> 19:29:13,101
inside of which is going to be\ntext like this, font size colon
20377
19:29:13,101 --> 19:29:16,851
large, text align colon center.
20378
19:29:16,851 --> 19:29:20,182
On Main, I'm going to add a\nstyle attribute and say font size
20379
19:29:24,021 --> 19:29:27,741
And then on the footer, I'm going\nto say style equals font size
20380
19:29:34,232 --> 19:29:36,592
Well, in blue is the\nlanguage we promised
20381
19:29:36,592 --> 19:29:39,112
called CSS, for Cascading Style Sheets.
20382
19:29:39,111 --> 19:29:42,231
We're not really seeing the\nCascading Style Sheet of it yet.
20383
19:29:42,232 --> 19:29:46,192
But in blue here, notice is\nanother very common paradigm.
20384
19:29:46,191 --> 19:29:48,861
It's different syntax\nnow, but how would you
20385
19:29:48,861 --> 19:29:52,461
describe what you're\nlooking at here in blue?
20386
19:29:52,461 --> 19:29:56,902
This is another example of what\nkind of programming convention?
20387
19:29:57,789 --> 19:30:00,081
SPEAKER 1: Yeah, it's just\nmore key value pairs, right?
20388
19:30:00,081 --> 19:30:03,031
It'd be nice if the world standardized\n
20389
19:30:03,032 --> 19:30:07,260
because we've now seen equal signs\n
20390
19:30:07,801 --> 19:30:10,009
But it's just different\nlanguages, different choices.
20391
19:30:10,009 --> 19:30:13,121
The key here is font-size,\nthe value is large.
20392
19:30:13,122 --> 19:30:16,922
The other key is text-align,\nthe colon, the value is center.
20393
19:30:16,922 --> 19:30:20,642
The semicolon just separates\none key value pair from another.
20394
19:30:20,642 --> 19:30:24,482
Just like in the URL, the ampersand\ndid, in the context of HTTP.
20395
19:30:24,482 --> 19:30:27,392
The designers of CSS\nused semicolons instead.
20396
19:30:27,392 --> 19:30:30,149
Strictly speaking, this\nsemicolon isn't necessary.
20397
19:30:30,149 --> 19:30:32,732
I tend to include it just for\nsymmetry, but it doesn't matter
20398
19:30:32,732 --> 19:30:34,322
because there's nothing after that.
20399
19:30:34,322 --> 19:30:36,301
This is a bit of a weird example.
20400
19:30:36,301 --> 19:30:41,341
This is the co-mingling of\nCSS inside of HTML.
20401
19:30:41,342 --> 19:30:46,682
So as of now, you can use the CSS\n
20402
19:30:46,682 --> 19:30:49,441
in the value of a style attribute.
20403
19:30:49,441 --> 19:30:52,542
We did something a little\nsimilarly last two weeks
20404
19:30:52,542 --> 19:30:57,032
a week plus ago, when we included\nsome SQL inside of Python.
20405
19:30:57,032 --> 19:30:59,764
So again, languages can kind\nof cross barriers together.
20406
19:30:59,764 --> 19:31:01,682
But we're going to clean\nthis up, because this
20407
19:31:01,682 --> 19:31:04,599
is going to get messy quickly,\n
20408
19:31:04,599 --> 19:31:07,211
of Harvard's or Yale's, or the like.
20409
19:31:07,211 --> 19:31:09,372
So let's see what this looks like.
20410
19:31:09,372 --> 19:31:13,142
Let me go back to my browser\nwindow here, reload the page.
20411
19:31:13,142 --> 19:31:14,922
And it's not that different.
20412
19:31:14,922 --> 19:31:19,412
But it's indeed centered, and it's\n
20413
19:31:19,411 --> 19:31:20,732
And let me make one refinement.
20414
19:31:20,732 --> 19:31:22,771
The copyright symbol\nactually can be expressed
20415
19:31:22,771 --> 19:31:25,021
but there's no key on\nmy US keyboard here.
20416
19:31:25,021 --> 19:31:30,902
I can actually magically say\nampersand hash 169 semicolon
20417
19:31:30,902 --> 19:31:33,167
using what's called an HTML entity.
20418
19:31:33,167 --> 19:31:36,991
It turns out there are numeric\n
20419
19:31:36,991 --> 19:31:40,351
allow you to specify symbols that\n
20420
19:31:40,351 --> 19:31:42,361
but that don't exist on most keyboards.
20421
19:31:42,361 --> 19:31:46,051
If I reload the page now, now\nit's a proper copyright symbol.
20422
19:31:46,051 --> 19:31:50,221
So minor aesthetic, but it\nintroduces us to these HTML entities.
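For reference, the entity sits in the markup like any other text. A minimal sketch:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <body>
        <footer>
            <!-- &#169; is the numeric entity for the copyright sign -->
            Copyright &#169; John Harvard
        </footer>
    </body>
</html>
```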
20423
19:31:50,221 --> 19:31:53,671
So even if you've never\nseen CSS before, you
20424
19:31:53,672 --> 19:31:56,522
can probably find something\nkind of dumb about what
20425
19:31:56,521 --> 19:31:58,201
I did here, like poor design.
20426
19:31:58,202 --> 19:32:02,582
It is correct, if my goal was small,\n
20427
19:32:02,581 --> 19:32:07,081
looks like a bad design,\nperhaps, even if you've never
20428
19:32:09,872 --> 19:32:12,122
SPEAKER 1: Yeah, I've used\nthe same style three times
20429
19:32:12,122 --> 19:32:14,942
like copy/paste, or typing the\nexact same thing again and again.
20430
19:32:14,941 --> 19:32:17,221
It has rarely been a good thing.
20431
19:32:17,221 --> 19:32:21,691
Well, here's where we can take\nadvantage of the design of CSS
20432
19:32:21,691 --> 19:32:24,121
because it supports what\nwe might call inheritance
20433
19:32:24,122 --> 19:32:29,461
whereby children inherit the properties,\n
20434
19:32:30,572 --> 19:32:32,471
And what that means is, I can do this.
20435
19:32:32,471 --> 19:32:34,231
Let me get rid of this text align.
20436
19:32:34,232 --> 19:32:36,032
Let me get rid of this text align.
20437
19:32:37,232 --> 19:32:40,142
I could get rid of the semicolon,\n
20438
19:32:40,142 --> 19:32:46,502
And let me add all of that style\n
20439
19:32:46,501 --> 19:32:51,511
so that it sort of cascades down to the\n
20440
19:32:52,331 --> 19:32:54,451
And let me close my quotes there, too.
20441
19:32:54,452 --> 19:32:58,381
Now, if I go back to my browser\nand hit reload, nothing changes.
20442
19:32:58,381 --> 19:33:00,241
But it's a little\nbetter designed, right?
20443
19:33:00,241 --> 19:33:03,601
Because if I want to change the text\n
20444
19:33:03,601 --> 19:33:06,491
I can now reload the page, and\nvoila, now it's over there.
20445
19:33:06,491 --> 19:33:08,952
I change it in one place, not\nin three different places.
20446
19:33:08,952 --> 19:33:11,822
So that would seem to be\nmarginally better design.
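A sketch of the refactored page, with the shared property moved up to the parent. The font sizes are the same assumed values as before:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <!-- text-align is set once here and inherited by header, main, and footer -->
    <body style="text-align: center;">
        <header style="font-size: large;">
            John Harvard
        </header>
        <main style="font-size: medium;">
            Welcome to my home page!
        </main>
        <footer style="font-size: small;">
            Copyright &#169; John Harvard
        </footer>
    </body>
</html>
```

Changing center to right in that one place moves all three regions at once.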
20447
19:33:11,822 --> 19:33:14,441
And could we do this\nany more differently?
20448
19:33:14,441 --> 19:33:20,251
Well, it's not that elegant that\n
20449
19:33:20,251 --> 19:33:22,621
This generally tends to\nbe bad practice, where
20450
19:33:22,622 --> 19:33:26,012
you co-mingle your HTML and your\n
20451
19:33:26,012 --> 19:33:28,622
might be really good at laying\nout the structure of web pages
20452
19:33:28,622 --> 19:33:31,411
and the content and the data, and you\n
20453
19:33:31,411 --> 19:33:32,822
or just not care about the aesthetics.
20454
19:33:32,822 --> 19:33:34,572
You might work with a\ndesigner, an artist
20455
19:33:34,572 --> 19:33:37,741
who's much better at all of\nthese fine tunings aesthetically.
20456
19:33:37,741 --> 19:33:41,921
Wouldn't it be nice if you could work\n
20457
19:33:41,922 --> 19:33:43,982
And you don't have to\nsomehow like literally
20458
19:33:43,982 --> 19:33:46,112
edit the same lines\nof code as each other.
20459
19:33:46,111 --> 19:33:50,131
Well, just like we can move\nstuff into header files in C
20460
19:33:50,131 --> 19:33:53,895
or packages in Python, we\ncan do the same in CSS.
20461
19:33:53,895 --> 19:33:55,812
So I'm actually going\nto go ahead and do this.
20462
19:33:55,812 --> 19:33:58,382
Let me get rid of all of\nthese style attributes
20463
19:33:58,381 --> 19:34:03,842
and let me now start to practice a\n
20464
19:34:05,221 --> 19:34:10,051
Let me instead move it into the\n
20465
19:34:11,611 --> 19:34:14,011
This is one of the rare\nexamples where there
20466
19:34:14,012 --> 19:34:16,862
are attributes that have the\nsame names as tags, and vice versa.
20467
19:34:16,861 --> 19:34:19,441
It's not very common,\nbut this one does exist.
20468
19:34:19,441 --> 19:34:22,891
Here's a slightly different syntax for\n
20469
19:34:22,892 --> 19:34:26,912
If I want to apply CSS properties,\nthat is, key value pairs
20470
19:34:26,911 --> 19:34:31,201
to the header of the page, I say\n
20471
19:34:31,202 --> 19:34:38,101
and inside of those I say\nfont-size large, text-align center.
20472
19:34:38,101 --> 19:34:42,131
Then, if I want to apply some properties\n
20473
19:34:42,131 --> 19:34:47,072
I again do font-size, say, medium,\n
20474
19:34:47,072 --> 19:34:49,622
Then, lastly, on the\nfooter of the page, I
20475
19:34:49,622 --> 19:34:55,021
can assign some properties like\n
20476
19:34:57,532 --> 19:35:00,442
And I don't have to do\nanything more in my HTML.
20477
19:35:00,441 --> 19:35:03,871
It all just represents\nthe structure of my page.
20478
19:35:03,872 --> 19:35:06,313
But, because of this style\ntag in the head of the page
20479
19:35:06,313 --> 19:35:08,271
the browser knows in\nadvance that the moment it
20480
19:35:08,271 --> 19:35:10,851
encounters a header tag, a\nmain tag, or a footer tag
20481
19:35:10,851 --> 19:35:14,062
it should apply those\nproperties, those styles.
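A minimal sketch of what the page might look like at this point, with type selectors inside a style tag in the head. The title, the body text, and the footer's `small` value are assumptions for illustration; the transcript does not show them in full.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
        <style>
            /* Type selectors: each rule applies to every tag with that name */
            header
            {
                font-size: large;
                text-align: center;
            }

            main
            {
                font-size: medium;
                text-align: center;
            }

            footer
            {
                font-size: small; /* assumed value */
                text-align: center;
            }
        </style>
    </head>
    <body>
        <!-- No style attributes here: the HTML only describes structure -->
        <header>John Harvard</header>
        <main>Welcome to my home page!</main>
        <footer>Copyright &#169; John Harvard</footer>
    </body>
</html>
```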
20482
19:35:14,062 --> 19:35:17,032
If I reload the page, other\nthan it being recentered now
20483
19:35:18,111 --> 19:35:21,171
All we're doing is sort of\n
20484
19:35:21,172 --> 19:35:24,322
But now everything's\nin the top of the file.
20485
19:35:24,322 --> 19:35:26,241
But there's still a bad design here.
20486
19:35:26,241 --> 19:35:30,202
What could I now do\nthat would be smarter?
20487
19:35:33,952 --> 19:35:35,932
SPEAKER 1: OK, create a\nnew file with just the CSS.
20488
19:35:36,411 --> 19:35:37,828
Let's go there in just one second.
20489
19:35:37,828 --> 19:35:40,161
But even as we're here,\nthere's still a redundancy
20490
19:35:40,161 --> 19:35:41,902
we can probably chip away at.
20491
19:35:41,902 --> 19:35:45,232
Yeah, get rid of the text-align center\n
20492
19:35:45,232 --> 19:35:47,601
doesn't seem necessary,\nand perhaps someone
20493
19:35:47,601 --> 19:35:53,991
else, if I get rid of text-align center,\n
20494
19:35:53,991 --> 19:35:56,991
in order to bring it back, but\n
20495
19:35:56,991 --> 19:35:59,822
And the page, if I scroll\ndown, looks like this, in HTML.
20496
19:36:01,282 --> 19:36:02,792
SPEAKER 1: Yeah, so the body tag.
20497
19:36:02,792 --> 19:36:04,711
So let me go ahead and say body.
20498
19:36:04,711 --> 19:36:07,282
And then in here, put text-align center.
20499
19:36:07,282 --> 19:36:10,282
And that, now, if I reload the\npage, has no visual effect
20500
19:36:10,282 --> 19:36:12,142
but it's just better\ndesign, because now I
20501
19:36:12,142 --> 19:36:14,242
factored out that kind of commonality.
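As a rough sketch, the factored-out version of the style tag might read as follows. Only `text-align: center` on the body comes from the lecture; the font sizes are assumed from context.

```html
<style>
    /* Centering is declared once on the parent and inherited by its children */
    body
    {
        text-align: center;
    }

    header
    {
        font-size: large;
    }

    main
    {
        font-size: medium;
    }

    footer
    {
        font-size: small; /* assumed value */
    }
</style>
```

This works because `text-align` is an inherited property, so header, main, and footer pick it up from the body.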
20502
19:36:14,241 --> 19:36:16,521
And so, just to make clear\nwhat we've been doing here
20503
19:36:16,521 --> 19:36:19,491
these are all, again, CSS\nproperties, these key value pairs.
20504
19:36:19,491 --> 19:36:22,461
And there's different types\nof ways of using them.
20505
19:36:22,461 --> 19:36:24,797
And there's this whole taxonomy.
20506
19:36:24,797 --> 19:36:27,922
What we've been doing thus far are what\n
20507
19:36:27,922 --> 19:36:30,172
where the type is the name of a tag.
20508
19:36:30,172 --> 19:36:33,172
And so it turns out there's\nother ways, though, to do this.
20509
19:36:33,172 --> 19:36:35,211
And let's head in this direction.
20510
19:36:35,211 --> 19:36:38,782
Let's go ahead and maybe write\nour CSS slightly differently
20511
19:36:38,782 --> 19:36:40,282
because you know what would be nice.
20512
19:36:40,282 --> 19:36:44,512
I bet, after today, once I start\n
20513
19:36:44,512 --> 19:36:46,222
or John Harvard's home\npage, I might want
20514
19:36:46,221 --> 19:36:48,741
to have centered text on other pages.
20515
19:36:48,741 --> 19:36:52,012
And I might want to have large\n
20516
19:36:52,012 --> 19:36:55,012
It'd be nice if I could reuse\nthese properties again and again
20517
19:36:55,012 --> 19:36:57,292
and kind of create my\nown library, maybe even
20518
19:36:57,292 --> 19:36:59,812
ultimately putting it\nin a separate file.
20519
19:37:00,562 --> 19:37:04,042
Instead of explicitly applying\ntext-align center to the body
20520
19:37:04,042 --> 19:37:07,101
let me create a new\nnoun, or an adjective
20521
19:37:07,101 --> 19:37:09,441
rather, for myself, called centered.
20522
19:37:09,441 --> 19:37:12,021
It has to start with a\ndot, because what I'm doing
20523
19:37:12,021 --> 19:37:14,781
is inventing my own class, so to speak.
20524
19:37:14,782 --> 19:37:17,662
This has nothing to do with\nclasses in Java or Python.
20525
19:37:17,661 --> 19:37:20,271
Class here is this aesthetic feature.
20526
19:37:20,271 --> 19:37:22,971
And, actually, let me rename\nthese, to be dot large
20527
19:37:26,001 --> 19:37:29,931
What this is doing for me\nis it's inventing new words
20528
19:37:29,932 --> 19:37:33,232
well-named words, that I\ncan now use in this file
20529
19:37:33,232 --> 19:37:36,211
or potentially in other web\npages I make, as follows.
20530
19:37:36,211 --> 19:37:39,322
I can now say, if I want\nto center the whole body
20531
19:37:39,322 --> 19:37:41,631
I can say class equals centered.
20532
19:37:41,631 --> 19:37:45,171
On the header tag, I can\nsay class equals large.
20533
19:37:45,172 --> 19:37:48,222
On the main tag I can\nsay class equals medium.
20534
19:37:48,221 --> 19:37:50,631
On the footer tag, I can\nsay class equals small.
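A sketch of the class-based version, under the assumption that the four classes are named `centered`, `large`, `medium`, and `small` as spoken. The page text is made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
        <style>
            /* Class selectors start with a dot and match any tag carrying that class */
            .centered
            {
                text-align: center;
            }

            .large
            {
                font-size: large;
            }

            .medium
            {
                font-size: medium;
            }

            .small
            {
                font-size: small;
            }
        </style>
    </head>
    <body class="centered">
        <header class="large">John Harvard</header>
        <main class="medium">Welcome to my home page!</main>
        <footer class="small">Copyright &#169; John Harvard</footer>
    </body>
</html>
```

The names describe the look rather than the tag, so the same classes can be reused on any element of any page.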
20535
19:37:50,631 --> 19:37:53,391
But let me take this one step further.
20536
19:37:53,392 --> 19:37:56,212
As you suggested, why\ndon't I go ahead now
20537
19:37:56,211 --> 19:37:59,452
and let me actually get rid\nof-- let me grab all of the CSS
20538
19:38:01,312 --> 19:38:08,362
Let me get rid of the style tag here,\n
20539
19:38:08,361 --> 19:38:13,432
and let me just save all of that same\n
20540
19:38:13,432 --> 19:38:15,622
nothing else, no HTML whatsoever.
20541
19:38:15,622 --> 19:38:18,801
But let me go back to my\nHome.html page, and this
20542
19:38:18,801 --> 19:38:21,951
is one of the most annoyingly named\n
20543
19:38:21,952 --> 19:38:30,052
mean what it does, Link HREF\nHome.css rel equals stylesheet.
20544
19:38:30,051 --> 19:38:33,292
So ideally we would have used the\n
20545
19:38:33,292 --> 19:38:36,112
but this is link in the\nsort of conceptual sense.
20546
19:38:36,111 --> 19:38:39,981
We're linking this file to this other\n
20547
19:38:39,982 --> 19:38:43,612
using this hyper-reference,\nHome.css, the relationship
20548
19:38:43,611 --> 19:38:46,281
of that file to this one\nis that of stylesheet.
20549
19:38:46,282 --> 19:38:48,472
A stylesheet is a file\ncontaining a whole bunch
20550
19:38:48,471 --> 19:38:52,581
of stylizations, a whole bunch\nof properties, as we just did.
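A hedged sketch of the link tag in context. It assumes the stylesheet is saved as `home.css` in the same folder as the page, and that it holds the same class rules that used to sit inside the style tag.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <!-- href: where the stylesheet lives; rel: how that file relates to this page -->
        <link href="home.css" rel="stylesheet">
        <title>home</title>
    </head>
    <body class="centered">
        <header class="large">John Harvard</header>
        <main class="medium">Welcome to my home page!</main>
        <footer class="small">Copyright &#169; John Harvard</footer>
    </body>
</html>
```

The CSS file itself contains only rules such as `.centered { text-align: center; }`, with no style tag and no HTML around them.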
20551
19:38:52,581 --> 19:38:54,621
So here, too, the effect\nis underwhelming.
20552
19:38:54,622 --> 19:38:57,002
If I reload the page, nothing changed.
20553
19:38:57,001 --> 19:39:01,881
But now, I not only have\na better design here
20554
19:39:01,881 --> 19:39:06,921
because I can now use those same classes\n
20555
19:39:06,922 --> 19:39:11,211
my third page, my fourth page, my bio,\n
20556
19:39:11,211 --> 19:39:15,021
making on my website here, I\ncan reuse those styles by just
20557
19:39:15,021 --> 19:39:19,251
including one line of code, instead of\n
20558
19:39:19,251 --> 19:39:21,841
stuff into file after file after file.
20559
19:39:21,842 --> 19:39:24,322
And heck, if the rest\nof the world is really
20560
19:39:24,322 --> 19:39:28,197
impressed by my centered class, and\n
20561
19:39:28,197 --> 19:39:31,072
I could bundle this up, let other\n
20562
19:39:31,072 --> 19:39:35,482
and I have my own library, my own CSS\n
20563
19:39:35,482 --> 19:39:37,881
Why should you ever invent\na centered class again
20564
19:39:37,881 --> 19:39:41,331
if I already did it for you,\nstupid and small as this one is?
20565
19:39:41,331 --> 19:39:43,191
But it would be nice\nnow to package this up
20566
19:39:43,191 --> 19:39:47,911
in a way that's usable\nby other people as well.
20567
19:39:47,911 --> 19:39:51,351
So this is perhaps the best\ndesign, when it comes to CSS.
20568
19:39:51,351 --> 19:39:56,601
Use classes where you can, use\n
20569
19:39:56,601 --> 19:40:00,971
but don't use the style attribute\n
20570
19:40:00,971 --> 19:40:06,531
starts to get messy quickly,\nespecially for large files.
20571
19:40:06,532 --> 19:40:08,617
All right, any questions, then, on this?
20572
19:40:11,361 --> 19:40:13,731
No, all right, so\nthat's class selectors.
20573
19:40:13,732 --> 19:40:16,131
When you specify dot\nsomething, that means
20574
19:40:16,131 --> 19:40:20,211
you're selecting all of the tags in the\n
20575
19:40:20,211 --> 19:40:21,661
and applying those properties.
20576
19:40:21,661 --> 19:40:23,828
So there's a couple of\nothers here, just to give you
20577
19:40:23,828 --> 19:40:25,432
a taste now of what's possible.
20578
19:40:25,432 --> 19:40:29,072
There's so much more that you can\n
20579
19:40:29,072 --> 19:40:33,501
Let me go ahead and open up a few\n
20580
19:40:33,501 --> 19:40:35,391
Let me go ahead and open up VS Code.
20581
19:40:35,392 --> 19:40:43,502
And let me go ahead and copy\nmy source eight directory.
20582
19:40:43,501 --> 19:40:47,041
Give me one second to grab the source\n
20583
19:40:47,042 --> 19:40:51,872
so that I can now go into\nmy browser, go into some
20584
19:40:51,872 --> 19:40:53,822
of the pre-made examples\nin source eight
20585
19:40:53,822 --> 19:40:57,461
and let me open up paragraphs one here.
20586
19:40:57,461 --> 19:41:00,691
So here's something,\nit's a little subtle.
20587
19:41:00,691 --> 19:41:04,292
But does anyone notice\nhow this is stylized?
20588
19:41:04,292 --> 19:41:07,142
This is just some generic\nlorem ipsum text again.
20589
19:41:07,142 --> 19:41:12,972
But what's noteworthy\nstylistically, a book might do this.
20590
19:41:14,432 --> 19:41:15,932
SPEAKER 1: Yeah, the first\nparagraph's a little bigger.
20591
19:41:16,322 --> 19:41:18,691
Who knows, it's just a stylistic\n
20592
19:41:18,691 --> 19:41:19,981
The first paragraph is bigger.
20593
19:41:21,072 --> 19:41:23,562
Well, we can actually explore\nthis in a couple of ways.
20594
19:41:23,562 --> 19:41:26,101
One, I can obviously go into\nVS Code and show you the code.
20595
19:41:26,101 --> 19:41:29,281
But, now, that we're using Chrome and\n
20596
19:41:30,572 --> 19:41:33,661
View developer, developer\ntools, and now notice
20597
19:41:33,661 --> 19:41:37,201
let me turn off the mobile feature,\n
20598
19:41:37,202 --> 19:41:39,752
to the bottom, just so\nthat it's fully wide.
20599
19:41:39,751 --> 19:41:41,641
We looked at the Network tab before.
20600
19:41:41,642 --> 19:41:44,072
We looked at the mobile button before.
20601
19:41:44,072 --> 19:41:45,971
Now let me click on Elements.
20602
19:41:45,971 --> 19:41:49,651
What's nice about the Elements tab\n
20603
19:41:49,652 --> 19:41:54,752
version of the web page's HTML,\n
20604
19:41:54,751 --> 19:41:58,831
for you, so that you can now henceforth\n
20605
19:41:58,831 --> 19:42:02,461
code, the HTML source code, of\nany web page on the internet.
20606
19:42:02,461 --> 19:42:05,221
Notice that my own web page\nhere, it's not that interesting.
20607
19:42:05,221 --> 19:42:08,072
There's a bunch of paragraph\ntags of lorem ipsum text.
20608
19:42:09,482 --> 19:42:12,452
The very first one, I gave an ID to.
20609
19:42:12,452 --> 19:42:14,851
This is something that you,\nas a web designer, can do.
20610
19:42:14,851 --> 19:42:18,452
You can give an ID attribute\nto any tag in a page
20611
19:42:18,452 --> 19:42:20,222
to give it a unique identifier.
20612
19:42:20,221 --> 19:42:22,951
The onus is on you not to\nreuse the word anywhere else.
20613
19:42:22,952 --> 19:42:24,932
If you reuse it, you've screwed up.
20614
19:42:26,402 --> 19:42:30,362
But I chose an ID of\nfirst, just so that I
20615
19:42:30,361 --> 19:42:34,271
have some way of referring to the\n
20616
19:42:34,271 --> 19:42:37,572
If I look in the head of the\npage, and the style tag here
20617
19:42:37,572 --> 19:42:40,292
notice that I have hash first.
20618
19:42:40,292 --> 19:42:43,411
So just as I use dot for\nclasses, the world of CSS
20619
19:42:43,411 --> 19:42:46,741
uses a hash symbol to\nrepresent IDs, unique IDs.
20620
19:42:46,741 --> 19:42:51,721
And what this is telling the browser,\n
20621
19:42:51,721 --> 19:42:57,182
F-I-R-S-T, without the hash,\napply font-size larger to it.
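A minimal sketch of an ID selector along these lines. The ID `first` and the `larger` value come from the lecture; the paragraph text is placeholder text.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>paragraphs</title>
        <style>
            /* ID selector: the hash matches the one tag whose id is "first" */
            #first
            {
                font-size: larger;
            }
        </style>
    </head>
    <body>
        <!-- In the attribute, the ID is written without the hash -->
        <p id="first">Lorem ipsum dolor sit amet.</p>
        <p>Consectetur adipiscing elit.</p>
        <p>Sed do eiusmod tempor incididunt.</p>
    </body>
</html>
```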
20622
19:42:57,182 --> 19:43:00,542
And that's why the first paragraph,\n
20623
19:43:01,952 --> 19:43:04,502
If I actually go into\nVS Code now, and let
20624
19:43:04,501 --> 19:43:06,121
me go into my source eight directory.
20625
19:43:06,122 --> 19:43:09,002
Let me open up Paragraphs1.html.
20626
19:43:10,622 --> 19:43:14,521
If I want to change the color of that\n
20627
19:43:14,521 --> 19:43:16,351
I can do color colon: green.
20628
19:43:16,351 --> 19:43:19,861
Let me close the developer\ntools, reload the page.
20629
19:43:19,861 --> 19:43:22,572
And now that page is green as well.
20630
19:43:22,572 --> 19:43:24,152
You don't have to just use words.
20631
19:43:27,271 --> 19:43:31,081
What was the hex code for green in RGB?
20632
19:43:31,081 --> 19:43:34,691
Like no red, lots of green, no blue.
20633
19:43:34,691 --> 19:43:38,881
So you could do 00 FF 00, using\na hash, which, coincidentally
20634
19:43:38,881 --> 19:43:41,131
is the same symbol, but it\nhas nothing to do with IDs.
20635
19:43:41,131 --> 19:43:44,792
This is just how Photoshop and\nweb pages represent colors.
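For illustration, the same rule with a hexadecimal color might look like this; it assumes the `#first` rule from the paragraphs example.

```html
<style>
    #first
    {
        /* Two hex digits each for red, green, blue: 00 red, ff green, 00 blue */
        color: #00ff00;
        font-size: larger;
    }
    /* Pure red would be #ff0000; the keyword green is a darker shade, #008000 */
</style>
```

The first hash marks an ID selector and the second marks a color value, and the two uses are unrelated.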
20636
19:43:44,792 --> 19:43:46,050
Let's go back here and reload.
20637
19:43:46,050 --> 19:43:48,842
It's the same, although it's a\n
20638
19:43:50,251 --> 19:43:56,131
If I want to change it to red, that\n
20639
19:43:56,131 --> 19:43:58,081
and here I can go and reload.
20640
19:43:58,081 --> 19:43:59,932
Now it's first paragraph red.
20641
19:43:59,932 --> 19:44:01,682
This actually gets\npretty tedious quickly.
20642
19:44:01,682 --> 19:44:04,622
Like, if you're a web designer trying\n
20643
19:44:04,622 --> 19:44:06,961
it actually might be fun\nto tinker with the website
20644
19:44:06,961 --> 19:44:09,422
before you open up your editor\nand you start making changes
20645
19:44:11,771 --> 19:44:14,611
So notice what you can\ndo with developer tools
20646
19:44:14,611 --> 19:44:16,591
too, in Chrome and other browsers.
20647
19:44:16,592 --> 19:44:19,982
When I highlight over this\nparagraph, under the Elements tab
20648
19:44:19,982 --> 19:44:22,502
notice that, one, it\ngets highlighted in blue.
20649
19:44:22,501 --> 19:44:24,714
If I move my cursor, it\ndoesn't get highlighted.
20650
19:44:24,714 --> 19:44:26,131
If I move it, it gets highlighted.
20651
19:44:26,131 --> 19:44:29,641
So it's showing me what\nthat tag represents.
20652
19:44:29,642 --> 19:44:32,342
But notice over here on\nthe right, you can also
20653
19:44:32,342 --> 19:44:35,938
see all of the stylizations\nof that particular element.
20654
19:44:37,021 --> 19:44:40,561
The italicized ones here at the\n
20655
19:44:40,562 --> 19:44:44,772
That means this is what Google makes\n
20656
19:44:44,771 --> 19:44:48,432
But in non-italicized\nhere, you see hash first
20657
19:44:48,432 --> 19:44:50,441
which is my code, that I just changed.
20658
19:44:50,441 --> 19:44:55,981
And if I want to start tinkering with\n
20659
19:44:57,211 --> 19:45:02,101
But notice, if I go back to VS Code, I\n
20660
19:45:02,101 --> 19:45:04,141
This is now purely client side.
20661
19:45:05,221 --> 19:45:08,281
When I drew that picture\nearlier of the browser going
20662
19:45:08,282 --> 19:45:11,252
making a request to the cloud, the\n
20663
19:45:11,251 --> 19:45:14,491
coming back, the browser,\nyour Mac, your PC, your phone
20664
19:45:14,491 --> 19:45:18,641
has a copy of all the HTML and\nCSS, so you can change it here
20665
19:45:20,664 --> 19:45:22,831
And, for instance, you can\ndo this with any website.
20666
19:45:22,831 --> 19:45:29,011
Let's go, say, on a field trip\nhere, to how about Stanford.edu.
20667
19:45:29,012 --> 19:45:31,771
So here's Stanford's\nwebsite as of today.
20668
19:45:31,771 --> 19:45:34,141
Let's go ahead here\nand let's see, there's
20669
19:45:34,142 --> 19:45:36,642
their admissions page,\ncampus life, and so forth.
20670
19:45:36,642 --> 19:45:41,282
Let me go ahead and view developer\ntools on Stanford's page
20671
19:45:41,282 --> 19:45:45,242
developer tools, elements,\nyou can see all of their HTML.
20672
19:45:45,241 --> 19:45:48,221
And notice it's collapsed,\nso here is their header.
20673
19:45:48,221 --> 19:45:51,121
Here's their main part, and\nI'm using my keyboard shortcuts
20674
19:45:51,122 --> 19:45:54,203
to just open and close the tags,\nto dive in deeper and deeper.
20675
19:45:54,203 --> 19:45:56,161
Suppose you want to kind\nof mess with Stanford
20676
19:45:56,161 --> 19:45:58,861
you can actually like right\nclick on any element of a page
20677
19:45:58,861 --> 19:46:03,121
or control click, Inspect, and that's\n
20678
19:46:03,122 --> 19:46:07,062
to the tag in the Elements\ntab that shows you that link.
20679
19:46:07,062 --> 19:46:11,882
And notice, if I hover over this\n
20680
19:46:11,881 --> 19:46:13,921
as an unordered list from left to right.
20681
19:46:13,922 --> 19:46:16,255
But it doesn't have to be a\nbulleted list top to bottom.
20682
19:46:16,255 --> 19:46:20,202
They've used CSS to change it to be\n
20683
19:46:20,202 --> 19:46:22,862
research, health care,\ncampus admission, about.
20684
19:46:22,861 --> 19:46:25,691
Well, so much for\nadmission, that's gone.
20685
19:46:25,691 --> 19:46:29,911
So now, if I close developer tools,\n
20686
19:46:29,911 --> 19:46:32,971
But, of course, what have I really done?
20687
19:46:32,971 --> 19:46:35,404
I've just like mutated\nmy own local copy.
20688
19:46:35,404 --> 19:46:37,322
So this is not hacking,\neven though this might
20689
19:46:37,322 --> 19:46:38,947
be how they do it in TV and the movies.
20690
19:46:38,947 --> 19:46:40,691
It's still there if I reload the page.
20691
19:46:40,691 --> 19:46:44,601
But it's a wonderfully powerful way\n
20692
19:46:44,601 --> 19:46:46,351
different things\nstylistically, figure out
20693
19:46:46,351 --> 19:46:48,902
how you want to design\nsomething, and two, just learn
20694
19:46:50,381 --> 19:46:53,339
So, for instance, if I right click\n
20695
19:46:53,339 --> 19:46:56,831
go to inspect, and let\nme go to the LI tag.
20696
19:46:56,831 --> 19:47:00,211
Let me keep going up, up,\nup, up, up to the UL tag.
20697
19:47:00,211 --> 19:47:02,322
There's going to be a lot going on here.
20698
19:47:02,322 --> 19:47:06,271
But notice, they have applied\nall of these CSS properties
20699
19:47:08,702 --> 19:47:11,912
But notice, here, this is\nhow, it's something like this.
20700
19:47:11,911 --> 19:47:15,872
And we'd have to read more to learn\n
20701
19:47:15,872 --> 19:47:18,122
this is how they probably\ngot rid of the bullets.
20702
19:47:18,122 --> 19:47:19,682
And what you can do is just tinker.
20703
19:47:19,682 --> 19:47:21,389
Like, all right, well,\nwhat does this do?
20704
19:47:22,771 --> 19:47:26,101
All right, didn't really change\n
20705
19:47:26,922 --> 19:47:30,572
So now the margin is changed, the\npadding around it has changed.
20706
19:47:31,801 --> 19:47:34,081
We can just start turning\nthings on and off, just
20707
19:47:34,081 --> 19:47:35,792
to get a sense of how\nthe web page works.
20708
19:47:35,792 --> 19:47:37,831
I'm not really learning\nanything here so far.
20709
19:47:37,831 --> 19:47:43,801
Let me go to the LI here for, let's\n
20710
19:47:47,432 --> 19:47:50,229
So when there's a\ndisplay property in CSS
20711
19:47:50,229 --> 19:47:53,312
that's apparently effectively changing\n
20712
19:47:53,312 --> 19:47:56,491
if I turn that off, now Stanford's\nlinks all look like this.
20713
19:47:56,491 --> 19:47:57,721
And there are those bullets.
20714
19:47:57,721 --> 19:48:00,841
So again, just default styles,\nthat they've somehow overridden
20715
19:48:00,842 --> 19:48:03,782
and a good web designer\njust knows ultimately
20716
19:48:03,782 --> 19:48:06,500
how to do these kinds of things.
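A rough sketch of how a bulleted list can be turned into a horizontal menu. This is not Stanford's actual stylesheet; the selectors, property values, and link targets are assumptions chosen to show the general technique.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>menu</title>
        <style>
            /* Remove the default bullets and indentation of the unordered list */
            ul
            {
                list-style: none;
                margin: 0;
                padding: 0;
            }

            /* Lay the items out left to right instead of top to bottom */
            li
            {
                display: inline-block;
                margin-right: 1em;
            }
        </style>
    </head>
    <body>
        <ul>
            <li><a href="#">Research</a></li>
            <li><a href="#">Health Care</a></li>
            <li><a href="#">Admission</a></li>
            <li><a href="#">About</a></li>
        </ul>
    </body>
</html>
```

Toggling either rule off in the developer tools brings back the default list styling.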
20717
19:48:06,500 --> 19:48:08,792
All right, how about a couple\nof final building blocks
20718
19:48:08,792 --> 19:48:09,991
before we'll take one more break.
20719
19:48:09,991 --> 19:48:12,616
And then we'll dive in with\nJavaScript to manipulate this stuff
20720
19:48:13,771 --> 19:48:17,432
Let me go ahead and open up,\nhow about Paragraphs2 here.
20721
19:48:17,432 --> 19:48:21,631
Let me close this tab, let me go\n
20722
19:48:21,631 --> 19:48:25,051
And this one looks\nthe same, except, when
20723
19:48:25,051 --> 19:48:27,661
I go ahead and inspect\nthis first paragraph
20724
19:48:27,661 --> 19:48:29,822
notice that I was able\nto get rid of the ID
20725
19:48:29,822 --> 19:48:32,202
somehow, which is just\nto say, there are many
20726
19:48:32,202 --> 19:48:34,562
many ways to solve\nproblems in HTML and CSS
20727
19:48:34,562 --> 19:48:36,422
just like there is in C and Python.
20728
19:48:36,422 --> 19:48:39,422
Let me look in the head and\nthe style of the page now.
20729
19:48:39,422 --> 19:48:45,812
This is what we might call\nanother type of selector
20730
19:48:45,812 --> 19:48:49,382
that allows us to specify\nthe paragraph tag
20731
19:48:49,381 --> 19:48:52,111
that itself happens to\nbe the first child only.
20732
19:48:52,111 --> 19:48:56,001
So you can apply CSS to a very\n
20733
19:48:56,001 --> 19:48:58,686
There's also syntax for last\nchild, if just the first one
20734
19:48:58,687 --> 19:49:00,312
is supposed to look a little different.
20735
19:49:00,312 --> 19:49:02,432
So, here, I've just\ngotten out of the business
20736
19:49:02,432 --> 19:49:05,282
of creating my own unique\nidentifier and, instead, I'm
20737
19:49:05,282 --> 19:49:08,142
using this type of selector as well.
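A sketch of the same effect without an ID, assuming the paragraphs are direct children of the body and that the rule still sets `font-size: larger`.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>paragraphs</title>
        <style>
            /* Matches a p only if it is the first child of its parent */
            p:first-child
            {
                font-size: larger;
            }
        </style>
    </head>
    <body>
        <p>Lorem ipsum dolor sit amet.</p>
        <p>Consectetur adipiscing elit.</p>
        <p>Sed do eiusmod tempor incididunt.</p>
    </body>
</html>
```

If some other tag, such as a heading, came before the first paragraph, this selector would match nothing; `p:first-of-type` covers that case.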
20738
19:49:09,282 --> 19:49:14,072
Let me go into another example\nhere, called Link1.html
20739
19:49:14,072 --> 19:49:17,411
and here we have a very simple\n
20740
19:49:17,411 --> 19:49:19,411
But notice it's purple\nby default, because we've
20741
19:49:21,271 --> 19:49:24,331
Let's see if we can't maybe\nstylize Harvard's links
20742
19:49:25,812 --> 19:49:30,702
Let me go into Link version\n2, now, which looks like this.
20743
19:49:30,702 --> 19:49:33,012
And now Harvard is very red.
20744
19:49:33,971 --> 19:49:36,451
Well, let me right click\non it, click Inspect
20745
19:49:36,452 --> 19:49:37,952
and I can start to poke around.
20746
19:49:37,952 --> 19:49:40,381
It looks like my HTML is\nnot at all noteworthy.
20747
19:49:40,381 --> 19:49:43,952
It's just very simple HTML,\nanchor tag with an HREF.
20748
19:49:46,629 --> 19:49:48,461
And we can look at it\nin two different ways.
20749
19:49:48,461 --> 19:49:51,422
We can literally look at\nthe style, contents here
20750
19:49:51,422 --> 19:49:55,472
or we can look at Chrome's\npretty version of it, over here.
20751
19:49:55,471 --> 19:49:59,521
It looks like my style\nsheet, in the style tag
20752
19:49:59,521 --> 19:50:04,001
has changed the color to be red, and the\n
20753
19:50:04,001 --> 19:50:06,121
but it's another CSS property, to none.
20754
19:50:06,122 --> 19:50:08,911
Notice, if I turn that\noff, links on the internet
20755
19:50:08,911 --> 19:50:10,831
are underlined by\ndefault, which tends to be
20756
19:50:10,831 --> 19:50:13,591
good for familiarity, for\nvisibility, for accessibility.
20757
19:50:13,592 --> 19:50:18,152
But, if it's very obvious what\nis text and what is a link
20758
19:50:18,152 --> 19:50:21,062
maybe you change text\ndecoration to none.
20759
19:50:21,062 --> 19:50:25,142
But maybe, watch this, maybe the\nlink comes, the line comes back
20760
19:50:26,611 --> 19:50:29,161
Well, let's look at how\nI did this in style.
20761
19:50:29,161 --> 19:50:34,111
Notice that I have stylization, and I\n
20762
19:50:34,111 --> 19:50:36,572
here, as tends to be convention in CSS.
20763
19:50:36,572 --> 19:50:38,581
Color is red, text decoration is none.
20764
19:50:38,581 --> 19:50:42,961
But, whenever an anchor\ntag is hovered over
20765
19:50:42,961 --> 19:50:47,881
you can change the text decoration\n
20766
19:50:47,881 --> 19:50:51,251
So, again, just little ways of playing\n
20767
19:50:51,251 --> 19:50:52,831
once you understand\nthat, really, there's
20768
19:50:52,831 --> 19:50:54,091
just different types of selectors.
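A minimal sketch of the hover behavior described here. The colors and the `text-decoration` values follow the lecture; the URL is an assumption.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
        <style>
            /* Default look of every anchor tag */
            a
            {
                color: red;
                text-decoration: none;
            }

            /* Applies only while the cursor is over the link */
            a:hover
            {
                text-decoration: underline;
            }
        </style>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/">Harvard</a>.
    </body>
</html>
```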
20769
19:50:54,092 --> 19:50:56,131
And you might have to remind\n
20770
19:50:57,361 --> 19:51:02,551
But it's just another way of scoping\n
20771
19:51:02,551 --> 19:51:06,451
Let's look at version 3 of this\n
20772
19:51:06,452 --> 19:51:11,702
If I go to Link3.html, maybe I\nwant to have Harvard links red
20773
19:51:14,471 --> 19:51:17,011
Well, let's right click,\nand click Inspect.
20774
19:51:17,012 --> 19:51:21,872
And here we might have two links,\nwith a couple of techniques
20775
19:51:21,872 --> 19:51:24,631
just to, again, emphasize, you can\n
20776
19:51:24,631 --> 19:51:30,061
I gave my Harvard link an ID of\n
20777
19:51:30,062 --> 19:51:34,802
In my CSS, if we go to the head\nof the page, I then did this.
20778
19:51:34,801 --> 19:51:37,441
The tag with the Harvard ID, a.k.a.
20779
19:51:37,441 --> 19:51:41,671
#Harvard, should be red,\n#Yale should be blue
20780
19:51:41,672 --> 19:51:45,572
and then any anchor tag should\nhave no text decoration
20781
19:51:45,572 --> 19:51:48,721
unless you hover over it, at which\n
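A sketch of the two-link version with IDs. The lowercase ID names `harvard` and `yale`, and both URLs, are assumptions for illustration.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
        <style>
            #harvard
            {
                color: red;
            }

            #yale
            {
                color: blue;
            }

            /* Shared rules for every anchor tag, whatever its ID */
            a
            {
                text-decoration: none;
            }

            a:hover
            {
                text-decoration: underline;
            }
        </style>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/" id="harvard">Harvard</a>
        or <a href="https://www.yale.edu/" id="yale">Yale</a>.
    </body>
</html>
```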
20782
19:51:48,721 --> 19:51:52,231
And so, if I hover over Harvard,\nit's red underlined, Yale
20783
19:51:53,262 --> 19:51:56,262
If I want to get rid of the IDs, I\n
20784
19:51:57,542 --> 19:52:00,601
Same effect, but notice,\nI got rid of the IDs now.
20785
19:52:00,601 --> 19:52:02,281
How else can I express myself?
20786
19:52:02,282 --> 19:52:03,992
Well, let's look at the CSS here.
20787
19:52:03,991 --> 19:52:06,512
The anchor tag has no text\ndecoration by default
20788
19:52:06,512 --> 19:52:08,051
unless you're hovering over it.
20789
19:52:09,461 --> 19:52:11,881
This is what we would\ncall, on our list here
20790
19:52:11,881 --> 19:52:16,801
an attribute selector, where you\nselect tags using CSS notation
20791
19:52:18,372 --> 19:52:21,692
So this is saying, go ahead\nand find any anchor tag
20792
19:52:21,691 --> 19:52:26,341
whose HREF value happens to\nequal this URL, and make it red.
20793
19:52:26,342 --> 19:52:28,327
Do the same for Yale, and make it blue.
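A sketch of attribute selectors with an exact match. The URLs are assumptions; the point is that the value in the selector must equal the whole `href` value.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
        <style>
            a
            {
                text-decoration: none;
            }

            a:hover
            {
                text-decoration: underline;
            }

            /* Attribute selector: matches only when href equals this exact string */
            a[href="https://www.harvard.edu/"]
            {
                color: red;
            }

            a[href="https://www.yale.edu/"]
            {
                color: blue;
            }
        </style>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/">Harvard</a>
        or <a href="https://www.yale.edu/">Yale</a>.
    </body>
</html>
```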
20794
19:52:28,327 --> 19:52:31,452
Now, this might not be ideal, because\n
20795
19:52:31,452 --> 19:52:33,601
these equal signs don't\nwork, because if it's
20796
19:52:33,601 --> 19:52:37,202
a different Harvard or different Yale\n
20797
19:52:37,202 --> 19:52:40,412
So let me look at version\n5 here, of Link.html.
20798
19:52:40,411 --> 19:52:43,951
Look at this style, and I\ndid this a little smarter.
20799
19:52:44,941 --> 19:52:46,951
And, again, just the kind\nof thing you look up.
20800
19:52:46,952 --> 19:52:54,542
Star equals means, change any anchor\n
20801
19:52:54,542 --> 19:52:59,551
Harvard.edu to red, and do the same\n
20802
19:52:59,551 --> 19:53:01,141
So star here connotes wildcard.
20803
19:53:01,142 --> 19:53:04,652
So search for Harvard.edu or\nYale.edu anywhere in the HREF
20804
19:53:04,652 --> 19:53:07,652
and if it's there, colorize the link.
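A sketch of the substring form. The second Harvard URL is a made-up example of a link that an exact match would miss.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
        <style>
            /* *= matches when the attribute value contains the text anywhere */
            a[href*="harvard.edu"]
            {
                color: red;
            }

            a[href*="yale.edu"]
            {
                color: blue;
            }
        </style>
    </head>
    <body>
        <a href="https://www.harvard.edu/">Harvard</a>,
        <a href="https://college.harvard.edu/admissions">Harvard College</a>,
        <a href="https://www.yale.edu/">Yale</a>
    </body>
</html>
```

Related operators exist as well: `^=` matches the start of the value and `$=` matches the end.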
20805
19:53:07,652 --> 19:53:11,732
And, again, we could do this all\n
20806
19:53:11,732 --> 19:53:15,012
to actually achieve the same kind\n
20807
19:53:15,012 --> 19:53:17,115
And as projects just\nget larger and larger
20808
19:53:17,115 --> 19:53:19,032
you just have more and\nmore decisions to make.
20809
19:53:19,032 --> 19:53:21,752
And so you have certain\nconventions you start to adopt.
20810
19:53:21,751 --> 19:53:25,171
And, indeed, if I may,\nyou have the introduction
20811
19:53:25,172 --> 19:53:28,142
of what are called\nframeworks, ultimately.
20812
19:53:28,142 --> 19:53:30,512
If you're a full-time\nweb developer, or you're
20813
19:53:30,512 --> 19:53:33,889
working for a company doing the same,\n
20814
19:53:34,682 --> 19:53:38,372
For instance, the company might say,\n
20815
19:53:38,372 --> 19:53:41,222
Or always use attribute\nselectors, or don't use this.
20816
19:53:41,221 --> 19:53:44,221
And it wouldn't be necessarily\nas draconian as that.
20817
19:53:44,221 --> 19:53:46,471
But they might have a\nstyle guide of sorts.
20818
19:53:46,471 --> 19:53:49,441
But, what many people, and\nmany companies, do nowadays
20819
19:53:49,441 --> 19:53:53,402
is they do not come up with all\nof their own CSS properties.
20820
19:53:53,402 --> 19:53:57,241
They start with something off the shelf,\n
20821
19:53:57,241 --> 19:54:00,781
source framework, that just gives\n
20822
19:54:00,782 --> 19:54:04,172
for free, just by using\na third party library.
20823
19:54:04,172 --> 19:54:05,912
And one of the most\npopular ones nowadays
20824
19:54:05,911 --> 19:54:07,861
is something called\nBootstrap, that CS50 uses
20825
19:54:07,861 --> 19:54:11,161
on all of its websites,\nsuper-popular in industry as well.
20826
19:54:11,161 --> 19:54:17,161
It's at getbootstrap.com, and this\n
20827
19:54:17,161 --> 19:54:20,611
a website that documents\nthe library that they offer.
20828
19:54:20,611 --> 19:54:24,751
And there's so much documentation here,\n
20829
19:54:26,521 --> 19:54:30,122
It just gives you, out of the\nbox, the CSS with which you
20830
19:54:31,322 --> 19:54:33,392
If you've ever noticed\non CS50's website
20831
19:54:33,392 --> 19:54:36,002
little colorful warnings at\nthe top of the page, or
20832
19:54:36,001 --> 19:54:37,861
callouts, to draw your attention to things.
20833
19:54:38,851 --> 19:54:41,491
It's probably a paragraph\ntag or a div tag
20834
19:54:41,491 --> 19:54:43,111
and maybe we changed the font color.
20835
19:54:43,111 --> 19:54:44,731
We changed the background color.
20836
19:54:44,732 --> 19:54:47,282
Or it's a lot of stuff we could\nabsolutely do from scratch
20837
19:54:47,282 --> 19:54:49,747
but, you know what,\nwhy would we reinvent
20838
19:54:49,747 --> 19:54:51,372
the wheel if we can just use Bootstrap?
20839
19:54:51,372 --> 19:54:53,252
So, for instance, let\nme just scroll down.
20840
19:54:53,251 --> 19:54:57,851
If you've ever seen on CS50's website\n
20841
19:54:57,851 --> 19:55:00,542
let me just zoom in on this.
20842
19:55:00,542 --> 19:55:03,271
We are just using HTML like this.
20843
19:55:03,271 --> 19:55:06,241
We're using a div tag, which,\nagain, is an invisible division
20844
19:55:06,241 --> 19:55:07,812
a rectangular region of the page.
20845
19:55:07,812 --> 19:55:12,812
But we're using classes called alert\n
20846
19:55:12,812 --> 19:55:17,461
Those are classes that the\nfolks at Bootstrap invented.
20847
19:55:17,461 --> 19:55:19,982
They associated certain\ntext colors and background
20848
19:55:19,982 --> 19:55:23,042
colors and padding and margin\nand like other aesthetics with
20849
19:55:23,042 --> 19:55:25,411
so all we have to do\nis use those classes.
20850
19:55:25,411 --> 19:55:28,441
Role equals alert, just makes clear\n
20851
19:55:28,441 --> 19:55:30,512
is an alert, that should\nprobably be recited
20852
19:55:30,512 --> 19:55:33,601
and whatever's in between\nthe open tag and close tag
20853
19:55:33,601 --> 19:55:35,312
is what the human would see.
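A sketch of the kind of markup being described. The transcript cuts off after the class name `alert`, so the second class, `alert-warning`, is an assumption (it is one of Bootstrap's documented alert variants), and the message text is made up. The classes have no effect unless Bootstrap's stylesheet is loaded on the page.

```html
<!-- Both classes are defined by Bootstrap, not by this page -->
<div class="alert alert-warning" role="alert">
    This is a warning that draws the reader's attention.
</div>
```

`role="alert"` does not change the look. It tells assistive technology such as screen readers that the content is an alert.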
20854
19:55:35,312 --> 19:55:37,262
How do you use something like Bootstrap?
20855
19:55:37,262 --> 19:55:39,032
Well, you just read the documentation.
20856
19:55:39,032 --> 19:55:44,272
Under Getting Started, there is a\n
20857
19:55:45,232 --> 19:55:49,312
So in Table.html, we had code like this.
20858
19:55:49,312 --> 19:55:52,182
Let me actually read Bootstrap's\ndocumentation really fast.
20859
19:55:55,542 --> 19:55:57,551
I'm going to put this\ninto the head of my page.
20860
19:55:57,551 --> 19:55:59,981
And it's quite long, but\nnotice, it's a link tag
20861
19:55:59,982 --> 19:56:03,851
which I used earlier for my\nown CSS file, the HREF of which
20862
19:56:03,851 --> 19:56:06,491
is this CDN link, content\ndelivery network, that's
20863
19:56:06,491 --> 19:56:09,732
referring to a specific version of\n
20864
19:56:09,732 --> 19:56:13,752
And the file that I'm including\nis called bootstrap.min.css.
20865
19:56:13,751 --> 19:56:17,291
This is an actual file I\ncan visit with my browser.
20866
19:56:17,292 --> 19:56:20,622
If I open this in a separate\ntab, this is the CSS
20867
19:56:20,622 --> 19:56:23,532
that Bootstrap has made\nfreely available to us.
20868
19:56:25,221 --> 19:56:27,221
That's because it's been\nminimized, just to not
20869
19:56:27,221 --> 19:56:29,771
waste space by adding lots\nof white space and comments.
20870
19:56:29,771 --> 19:56:33,221
But this contains a whole lot,\nhundreds, of CSS properties
20871
19:56:33,221 --> 19:56:36,851
that we can reuse, thanks to\nclasses that they invented.
20872
19:56:36,851 --> 19:56:40,285
If I want to use some JavaScript\n
20873
19:56:40,285 --> 19:56:41,952
But we'll come back to that before long.
20874
19:56:41,952 --> 19:56:45,522
Let me now just make a couple\nof tweaks to this table.
20875
19:56:45,521 --> 19:56:48,551
If I go into my browser\nfrom before, this
20876
19:56:48,551 --> 19:56:51,461
is what it looked like previously,\nwhere name and number were
20877
19:56:51,461 --> 19:56:54,086
bold, but centered, and then\nCarter and David were on the left
20878
19:56:54,086 --> 19:56:55,504
and the numbers were to the right.
20879
19:56:56,202 --> 19:56:59,662
It's not that pretty, but it'd be nice\n
20880
19:56:59,661 --> 19:57:03,101
So if we add Bootstrap into it,\nnotice one thing happens first
20881
19:57:04,872 --> 19:57:08,262
No longer are Chrome's\ndefault styles used.
20882
19:57:08,262 --> 19:57:10,482
Now Bootstrap's default\nstyles are used, which
20883
19:57:10,482 --> 19:57:14,112
is a way of enforcing similarity\nacross Chrome, Edge, Firefox
20884
19:57:15,461 --> 19:57:18,042
Notice it went from a\nserif font to a sans serif
20885
19:57:18,042 --> 19:57:19,792
font, and something cleaner like this.
20886
19:57:19,792 --> 19:57:24,221
It still looks pretty ugly, but let\n
20887
19:57:24,221 --> 19:57:30,311
Let me go under their\ncontent tab, for tables.
20888
19:57:30,312 --> 19:57:32,532
And if I just kind of\nstart skimming this
20889
19:57:32,532 --> 19:57:34,392
these are some good\nlooking tables, right?
20890
19:57:34,392 --> 19:57:38,442
Like, there's some underlining\nhere, some bolder font.
20891
19:57:39,672 --> 19:57:41,790
If I keep going, ooh,\nthat's getting pretty, too
20892
19:57:41,789 --> 19:57:44,831
if I want to have a colorful table,\n
20893
19:57:44,831 --> 19:57:47,981
out myself if I want\nsome dark mode here
20894
19:57:47,982 --> 19:57:51,472
if I want to have alternating\nhighlights, and so forth.
20895
19:57:51,471 --> 19:57:54,491
There's so many different stylizations\n
20896
19:57:54,491 --> 19:57:58,211
But I care about making a phone book,\n
20897
19:57:58,211 --> 19:58:02,982
So if I read the documentation closely,\n
20898
19:58:02,982 --> 19:58:06,881
is add Bootstrap's table\nclass to my table tag
20899
19:58:06,881 --> 19:58:11,592
and watch with a simple reload, what\n
20900
19:58:13,331 --> 19:58:16,391
Might not be what you want, but, my\n
20901
19:58:16,392 --> 19:58:18,262
I just really prettied things up.
20902
19:58:18,262 --> 19:58:21,112
And so here, then, is the value of\n
20903
19:58:21,111 --> 19:58:27,311
It allows you to actually\ncreate much prettier, much more
20904
19:58:27,312 --> 19:58:32,232
user-friendly websites than you might\n
20905
19:58:34,211 --> 19:58:37,842
In fact, let's iterate one\nmore time on one other example
20906
19:58:37,842 --> 19:58:39,862
before we introduce a bit of that code.
20907
19:58:39,861 --> 19:58:44,691
Let me go ahead and open\nup Search.html from before
20908
19:58:44,691 --> 19:58:49,331
which, recall, looks like this,\nand Search.html on my browser
20909
19:58:49,331 --> 19:58:52,511
was this very simple Google search.
20910
19:58:52,512 --> 19:58:56,801
And suppose I want to reinvent\nGoogle.com's UI a bit more.
20911
19:58:56,801 --> 19:58:59,711
Here's a screenshot of\nGoogle.com on a typical day.
20912
19:58:59,711 --> 19:59:03,581
It's got an about link, a store\n
20913
19:59:05,360 --> 19:59:07,152
It's not appearing well\non the screen here
20914
19:59:07,152 --> 19:59:09,851
but there's a big text box in\nthe middle, and then two buttons
20915
19:59:09,851 --> 19:59:12,131
Google search, and I'm feeling lucky.
20916
19:59:12,131 --> 19:59:15,851
Well, could I maybe go about\nimplementing this UI myself
20917
19:59:15,851 --> 19:59:19,872
using some HTML, some CSS,\nand maybe Bootstrap's help
20918
19:59:19,872 --> 19:59:23,110
just so I don't have to figure out\n
20919
19:59:23,110 --> 19:59:24,402
Well, here's my starting point.
20920
19:59:24,402 --> 19:59:29,631
In Search.html, let's go and add\n
20921
19:59:29,631 --> 19:59:34,342
so that we have access to all of\n
20922
19:59:34,342 --> 19:59:37,732
And let me go ahead and\nfigure out how to do this.
20923
19:59:37,732 --> 19:59:43,392
Well, just like Stanford's site had\n
20924
19:59:43,392 --> 19:59:46,362
but they changed it from being a\n
20925
19:59:46,361 --> 19:59:48,322
I bet I can do something\nlike this myself.
20926
19:59:48,322 --> 19:59:50,562
So let me go into the body\nof my page and, first
20927
19:59:50,562 --> 19:59:54,851
based on Bootstrap's documentation,\nlet me add a div called
20928
19:59:54,851 --> 19:59:57,641
a div with a class of container-fluid.
20929
19:59:57,642 --> 19:59:59,832
Container fluid is\njust a class that comes
20930
19:59:59,831 --> 20:00:03,491
with Bootstrap that says, make\nyour web page fluid, that is
20931
20:00:04,961 --> 20:00:07,032
So that way it's going to resize nicely.
20932
20:00:07,032 --> 20:00:09,392
I'm going to go ahead and\nfix my indentation here.
20933
20:00:09,392 --> 20:00:11,142
If you haven't discovered\nthis yet, if you
20934
20:00:11,142 --> 20:00:13,032
highlight multiple lines\nin VS Code, you can
20935
20:00:13,032 --> 20:00:15,082
hit Tab and indent them all at once.
20936
20:00:15,081 --> 20:00:17,451
So now, I have all of\nthat inside of this div.
20937
20:00:17,452 --> 20:00:23,052
Now, just like in Stanford's site, let's\n
20938
20:00:23,051 --> 20:00:32,561
an LI, called with a class of NAV item,\n
20939
20:00:32,562 --> 20:00:42,702
let me go ahead and say, A\nHREF=https://about.google
20940
20:00:42,702 --> 20:00:44,972
which is the real URL\nof Google's about page.
20941
20:00:44,971 --> 20:00:46,771
And I'll put the about text in there.
20942
20:00:46,771 --> 20:00:50,911
Then I'm going to close my LI tag\n
20943
20:00:50,911 --> 20:00:52,381
because I'm using Bootstrap.
20944
20:00:52,381 --> 20:00:54,391
Bootstrap's documentation,\nif I read it closely
20945
20:00:54,392 --> 20:00:59,642
says to add a class to your links,\n
20946
20:00:59,642 --> 20:01:04,712
to make it dark, like black or dark\n
20947
20:01:04,711 --> 20:01:08,551
All right, so I think I\nhave now an about link
20948
20:01:08,551 --> 20:01:10,661
in a navigation part of my screen.
20949
20:01:10,661 --> 20:01:14,161
Let me go ahead and\nsave this and reload.
20950
20:01:14,161 --> 20:01:16,351
All right, so not exactly what I wanted.
20951
20:01:16,351 --> 20:01:19,741
It's a bulleted list, still, so\nI need to override this somehow.
20952
20:01:19,741 --> 20:01:22,831
Let me read Bootstrap's\ndocumentation a little more clearly.
20953
20:01:22,831 --> 20:01:25,021
And let me pretend to do\nthat, for time's sake.
20954
20:01:25,021 --> 20:01:28,682
If I go under content, oops,\nif I go under components
20955
20:01:28,682 --> 20:01:32,801
and I go to Navs and\nTabs, long story short
20956
20:01:32,801 --> 20:01:35,701
if you want to create a pretty menu\n
20957
20:01:35,702 --> 20:01:37,802
from the left to the\nright, just like Stanford
20958
20:01:37,801 --> 20:01:40,001
I essentially need HTML like this.
20959
20:01:40,001 --> 20:01:42,271
And this is subtle, but\nI left off this class.
20960
20:01:42,271 --> 20:01:45,902
I should have added a\nclass called nav on my ul.
20961
20:01:46,831 --> 20:01:49,322
Let me go in here and\nsay add class equals
20962
20:01:49,322 --> 20:01:53,702
nav, and then again, this class\nnav-item, Bootstrap told me to
20963
20:01:53,702 --> 20:01:56,461
nav-link text-dark,\nBootstrap told me to.
20964
20:01:56,461 --> 20:02:02,521
Let me go back to my page here,\n
20965
20:02:02,521 --> 20:02:05,161
But at least the About link is\nin the top left hand corner
20966
20:02:05,161 --> 20:02:07,891
just like it should be\nin the real google.com.
20967
20:02:07,892 --> 20:02:10,142
Now let me whip up a couple\nof more links real fast.
20968
20:02:10,142 --> 20:02:13,472
Let me go and do a little\ncopy/paste, though I bet next week
20969
20:02:13,471 --> 20:02:15,811
we can avoid this kind of copy/paste.
20970
20:02:15,812 --> 20:02:20,252
Let me change this link\nto be Store.google.com.
20971
20:02:22,471 --> 20:02:26,941
Let me go ahead and create\nanother one here for Gmail.
20972
20:02:26,941 --> 20:02:31,682
So this one's going to go\nto, officially, how about
20973
20:02:31,682 --> 20:02:35,642
technically it's www.google.com/gmail.
20974
20:02:37,172 --> 20:02:39,332
And let me grab one more of these.
20975
20:02:39,331 --> 20:02:42,691
And for Google Images, and I'm\ngoing to paste this, whoops
20976
20:02:44,042 --> 20:02:45,631
I'm going to put this here, too.
20977
20:02:45,631 --> 20:02:53,251
This is going to be images, and\nthat URL is imghp, is the URL.
20978
20:02:53,251 --> 20:02:56,281
All right, let me go ahead\nand reload the browser page.
20979
20:02:56,282 --> 20:02:57,842
Now it's coming along, right?
20980
20:02:57,842 --> 20:02:59,491
About, store, Gmail, images.
20981
20:02:59,491 --> 20:03:01,057
It's not quite what I want.
20982
20:03:01,057 --> 20:03:03,182
So I'd have to read the\ndocumentation to figure out
20983
20:03:03,182 --> 20:03:07,395
how to maybe nudge one of these\n
20984
20:03:07,395 --> 20:03:09,062
And there's a couple of ways to do this.
20985
20:03:09,062 --> 20:03:13,351
But one way is if I want Gmail to move\n
20986
20:03:13,351 --> 20:03:23,161
else, I can say that add some margin to\n
20987
20:03:23,161 --> 20:03:27,721
This is in Bootstrap's documentation, a\n
20988
20:03:27,721 --> 20:03:29,941
just automatically\nshove everything apart.
20989
20:03:29,941 --> 20:03:34,441
And now, if I reload the page\n
20990
20:03:35,461 --> 20:03:37,336
All right, so now we're\nkind of moving along.
20991
20:03:37,336 --> 20:03:40,422
Let me go ahead and add the\nbig blue button to sign in.
20992
20:03:40,422 --> 20:03:45,422
So here with sign in, let me go\n
20993
20:03:45,422 --> 20:03:49,202
so let's go ahead and do one\nmore LI, class equals NAV item.
20994
20:03:49,202 --> 20:03:52,442
And then, inside of this LI\ntag, what am I going to do?
20995
20:03:52,441 --> 20:03:58,111
Turns out there is a class that can turn\n
20996
20:03:58,111 --> 20:04:01,411
for button, and then button\nprimary, makes it blue
20997
20:04:01,411 --> 20:04:04,554
the HREF for this one is going\nto be
20998
20:04:04,554 --> 20:04:08,491
https://accounts.google.com/service/login, which is\n
20999
20:04:09,631 --> 20:04:11,941
The role of this link is that of button.
21000
20:04:11,941 --> 20:04:15,241
And then sign in, is going\nto be the text on it.
21001
20:04:15,241 --> 20:04:18,744
If I now reload the page, now\nwe're getting even closer
21002
20:04:18,744 --> 20:04:20,161
although it looks a little stupid.
21003
20:04:20,161 --> 20:04:22,921
Notice that sign in is way\nin the top right hand corner
21004
20:04:22,922 --> 20:04:26,492
whereas the real google.com has\n
21005
20:04:26,491 --> 20:04:28,021
OK, that's an easy fix, too.
21006
20:04:28,021 --> 20:04:30,061
Let me go back into my HTML here.
21007
20:04:31,849 --> 20:04:33,182
This, too, is a Bootstrap thing.
21008
20:04:33,182 --> 20:04:35,702
They have a class called m-something.
21009
20:04:35,702 --> 20:04:38,461
The something is a\nnumber from like 1 to 5
21010
20:04:38,461 --> 20:04:41,982
I believe, that adds just\nsome amount of white space.
21011
20:04:41,982 --> 20:04:45,672
So if I reload now, OK,\nit's just a little prettier.
21012
20:04:47,191 --> 20:04:50,671
Just to demonstrate how I can\ntake this home, let me go ahead
21013
20:04:50,672 --> 20:04:55,232
and open up my premade\nversion of this, whereby
21014
20:04:55,232 --> 20:04:58,862
I added to this some final flourishes.
21015
20:04:58,861 --> 20:05:02,281
If I go to Search2.html, I\ndecided to replace their logo
21016
20:05:02,282 --> 20:05:05,582
with just this out of a\ncat, and notice that I
21017
20:05:05,581 --> 20:05:07,664
re-implemented essentially google.com.
21018
20:05:07,664 --> 20:05:10,081
Here's a text box, here's two\nbuttons, even though they're
21019
20:05:10,081 --> 20:05:11,641
a little washed out on the screen.
21020
20:05:11,642 --> 20:05:14,882
I even figured out how to get dots\n
21021
20:05:14,881 --> 20:05:18,421
And if we view source, you can see\n
21022
20:05:18,422 --> 20:05:25,322
If I go to view developer tools, and I\n
21023
20:05:25,322 --> 20:05:31,081
and I go into this div, you'll see\n
21024
20:05:31,081 --> 20:05:35,531
And I added some classes there to make\n
21025
20:05:35,532 --> 20:05:38,652
If I go into the form tag, this\nis the same form tag as before.
21026
20:05:38,652 --> 20:05:40,952
But, notice, I used\nbutton tags this time
21027
20:05:40,952 --> 20:05:43,672
with button and button light classes.
21028
20:05:43,672 --> 20:05:45,961
And then I stylized\nthem in a certain way.
21029
20:05:45,961 --> 20:05:49,221
And so in the end result, if I want\n
21030
20:05:49,221 --> 20:05:51,981
and click Google search,\nvoila, I've implemented
21031
20:05:51,982 --> 20:05:54,982
something that's pretty\ndarn close to Google.com
21032
20:05:54,982 --> 20:05:57,952
without even touching raw CSS myself.
21033
20:05:57,952 --> 20:06:00,502
And now here's the value,\nthen, of a framework.
21034
20:06:00,501 --> 20:06:02,811
You can just start to use\noff the shelf functionality
21035
20:06:02,812 --> 20:06:04,622
that someone else created for you.
21036
20:06:04,622 --> 20:06:07,101
But if you want to make\nrefinements, you don't really
21037
20:06:07,101 --> 20:06:09,831
like the shade of blue that\nBootstrap chose, or the gray button
21038
20:06:09,831 --> 20:06:11,748
or you want to curve\nthings a bit more, that's
21039
20:06:11,748 --> 20:06:14,122
where you can create\nyour own CSS file, and do
21040
20:06:14,122 --> 20:06:16,372
the last mile, sort\nof fine tuning things.
21041
20:06:16,372 --> 20:06:17,961
And that tends to be best practice.
21042
20:06:17,961 --> 20:06:21,051
Stand on the shoulders of others as\n
21043
20:06:21,051 --> 20:06:23,551
And then if you really don't\nlike what the library is doing
21044
20:06:23,551 --> 20:06:26,301
then use your own skills and\nunderstanding of HTML and CSS
21045
20:06:26,301 --> 20:06:29,551
to refine things a bit further.
21046
20:06:29,551 --> 20:06:33,262
But still, after all of that, all of\n
21047
20:06:33,262 --> 20:06:35,752
are still static, other\nthan the Google one
21048
20:06:35,751 --> 20:06:37,671
which searches on the real Google.com.
21049
20:06:37,672 --> 20:06:39,532
Let's take a final 5\nminute break and we'll
21050
20:06:39,532 --> 20:06:43,282
give you a sense of what we can next\n
21051
20:06:44,822 --> 20:06:48,381
All right, so I think\nit's fair to say, we're
21052
20:06:48,381 --> 20:06:50,819
about to see our very last language.
21053
20:06:50,820 --> 20:06:52,612
Next week and final\nprojects are ultimately
21054
20:06:52,611 --> 20:06:55,101
going to be about\nsynthesizing so many of these.
21055
20:06:55,101 --> 20:06:58,042
Thankfully, this language called\nJavaScript is quite similar
21056
20:06:58,042 --> 20:07:00,229
syntactically to both C and Python.
21057
20:07:00,229 --> 20:07:03,021
And, indeed, if you can imagine\n
21058
20:07:03,021 --> 20:07:05,391
you can probably do it in\nsome form in JavaScript.
21059
20:07:05,392 --> 20:07:07,742
The most fundamental\ndifference today, though
21060
20:07:07,741 --> 20:07:11,122
is that when you have written C\ncode and Python code thus far
21061
20:07:11,122 --> 20:07:12,379
you've done it on the server.
21062
20:07:12,379 --> 20:07:14,461
You've done it in the\nterminal window environment.
21063
20:07:14,461 --> 20:07:17,812
And when you run the code, it's\n
21064
20:07:17,812 --> 20:07:20,122
The difference now today\nwith JavaScript is
21065
20:07:20,122 --> 20:07:23,362
even though you're going to write\nit in the cloud using VS Code
21066
20:07:23,361 --> 20:07:27,981
recall that, when a browser gets\nthe page containing this code
21067
20:07:27,982 --> 20:07:32,012
it's going to get a copy of the HTML,\n
21068
20:07:32,012 --> 20:07:37,012
So JavaScript, that we see today, is\n
21069
20:07:37,012 --> 20:07:40,732
on users' own Macs, PCs, and\nphones, not in the server.
21070
20:07:40,732 --> 20:07:44,062
JavaScript can be used on the server,\n
21071
20:07:44,062 --> 20:07:47,752
It's an alternative to Python or\n
21072
20:07:47,751 --> 20:07:51,361
We are using it today client\nside, which is a key difference.
21073
20:07:51,361 --> 20:07:53,932
So in Scratch, let's\ndo this one last time.
21074
20:07:53,932 --> 20:07:56,991
If you wanted to create a variable\n
21075
20:07:56,991 --> 20:07:59,331
In JavaScript, it's\ngoing to look like this.
21076
20:07:59,331 --> 20:08:02,241
You don't specify the type,\nbut you do use the keyword let
21077
20:08:02,241 --> 20:08:07,581
and there's a few others as well, that\n
21078
20:08:07,581 --> 20:08:11,631
If you want to increment that\nvariable by one, you in JavaScript
21079
20:08:11,631 --> 20:08:14,932
could say something like,\ncounter equals counter plus 1
21080
20:08:14,932 --> 20:08:17,512
or you can do it more\nsuccinctly, with plus equals
21081
20:08:17,512 --> 20:08:20,092
or the plus plus is back in JavaScript.
21082
20:08:20,092 --> 20:08:23,122
You can now say counter\nplus plus semicolon again.
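As a minimal sketch of the three increment forms just described, assuming a starting value of 0 (the console.log call is added here only to show the result):

```javascript
// Declare a variable with let; no type is specified.
let counter = 0;

// Three equivalent ways to add 1 to counter.
counter = counter + 1;
counter += 1;
counter++;

console.log(counter); // 3
```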
21083
20:08:23,122 --> 20:08:26,122
In Scratch, if you wanted to\ndo a conditional like this
21084
20:08:26,122 --> 20:08:30,741
asking if x less than y, it looks\n
21085
20:08:32,361 --> 20:08:35,691
The curly braces here are back, if you\n
21086
20:08:35,691 --> 20:08:39,951
But, syntactically, it's pretty much\n
21087
20:08:39,952 --> 20:08:42,502
and even for if, else if, else.
21088
20:08:42,501 --> 20:08:45,621
Unlike Python, it's two\nwords again, else if.
21089
20:08:45,622 --> 20:08:49,347
So quite, quite like C,\nnothing new beyond that.
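A short sketch of that conditional, wrapped in a hypothetical function named compare so it can be tried with different sample values (the function name and the returned strings are assumptions, not from the lecture):

```javascript
// Compare two values with if, else if, else, using curly braces as in C.
function compare(x, y) {
    if (x < y) {
        return "x is less than y";
    } else if (x > y) {
        return "x is greater than y";
    } else {
        return "x is equal to y";
    }
}

console.log(compare(1, 2));
```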
21090
20:08:49,346 --> 20:08:52,221
If you want to do something forever\n
21091
20:08:52,221 --> 20:08:55,731
In JavaScript, you can do it a few\n
21092
20:08:57,172 --> 20:09:00,652
In JavaScript, Booleans are\nlowercase again, just like in C.
21093
20:09:02,073 --> 20:09:04,281
If you want to do something\na finite number of times
21094
20:09:04,282 --> 20:09:08,042
like repeat three times,\nlooks almost like C as well.
21095
20:09:08,042 --> 20:09:12,452
The only difference, really, is using\n
21096
20:09:12,452 --> 20:09:15,142
And, again, you'll use let to\ncreate a string, or an int
21097
20:09:15,142 --> 20:09:17,362
or any other type of\nvariable in JavaScript.
21098
20:09:17,361 --> 20:09:21,021
The browser will figure out\nwhat type you mean from context.
21099
20:09:21,021 --> 20:09:24,232
In C we would have said int instead.
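The repeat-three-times loop might look roughly like this in JavaScript. The array named lines and the "meow" string are illustrative assumptions; the part that matters is let in place of C's type keyword:

```javascript
// Collect output in an array so the effect of the loop is easy to inspect.
const lines = [];

// Same shape as a C for loop, but the counter is declared with let.
for (let i = 0; i < 3; i++) {
    lines.push("meow");
}

console.log(lines.length); // 3
```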
21100
20:09:24,232 --> 20:09:27,952
Ultimately, this language, and that's\n
21101
20:09:27,952 --> 20:09:30,502
There's bunches of other\nfeatures, but syntactically it's
21102
20:09:30,501 --> 20:09:32,841
going to be that accessible,\nrelatively speaking.
21103
20:09:32,842 --> 20:09:35,572
The power of JavaScript\nrunning in the user's browser
21104
20:09:35,572 --> 20:09:39,232
is going to be that you can\nchange this thing in memory.
21105
20:09:39,232 --> 20:09:43,432
Think about most any website, that's\n
21106
20:09:43,432 --> 20:09:45,771
It's typically very\ninteractive and dynamic.
21107
20:09:45,771 --> 20:09:49,581
If you're sitting in front of Gmail on\n
21108
20:09:49,581 --> 20:09:52,141
open, and someone sends you\nan email, all of a sudden
21109
20:09:52,142 --> 20:09:55,402
another row appears in your\ninbox, another row, another row.
21110
20:09:56,721 --> 20:09:58,711
Honestly, it could be an HTML table.
21111
20:09:58,711 --> 20:10:00,741
Maybe it's a bunch of\ndivs top to bottom.
21112
20:10:00,741 --> 20:10:04,072
The point, though, is, you don't\n
21113
20:10:04,072 --> 20:10:06,532
R to reload the page to see more email.
21114
20:10:06,532 --> 20:10:10,072
It automatically appears\nevery few seconds or minutes.
21115
20:10:11,751 --> 20:10:14,811
When you visit Gmail.com,\nyou are downloading not just
21116
20:10:14,812 --> 20:10:18,322
HTML and CSS with your\ninitial inbox, presumably.
21117
20:10:18,322 --> 20:10:20,392
You're downloading some\nJavaScript code, that
21118
20:10:20,392 --> 20:10:24,592
is designed to keep talking every\n
21119
20:10:24,592 --> 20:10:28,282
to Gmail servers, and they,\nthen, are using their code
21120
20:10:28,282 --> 20:10:31,492
to add another element, another\nelement, another element
21121
20:10:31,491 --> 20:10:36,771
to the existing DOM, document object\n
21122
20:10:36,771 --> 20:10:40,851
in memory that represents HTML,\n
21123
20:10:43,232 --> 20:10:46,101
If you click and drag and drag\nand drag, your browser did not
21124
20:10:46,101 --> 20:10:49,351
download the entire world to\nyour Mac or PC by default.
21125
20:10:49,351 --> 20:10:52,491
It only downloaded what's in your\n
21126
20:10:52,491 --> 20:10:55,491
But when you click and drag, it's\n
21127
20:10:55,491 --> 20:10:58,702
some more images, some more\nimages, as you keep dragging, using
21128
20:10:58,702 --> 20:11:01,312
JavaScript, again, behind the scenes.
21129
20:11:01,312 --> 20:11:04,672
So let's actually use JavaScript\n
21130
20:11:05,979 --> 20:11:08,271
We can put the JavaScript\ncode in the head of the page
21131
20:11:08,271 --> 20:11:12,182
in the body of the page, or even\n
21132
20:11:13,131 --> 20:11:17,601
Here is a new version of\nHello.html, that, during the break
21133
20:11:17,601 --> 20:11:20,781
I just added a form to, because it'd\n
21134
20:11:20,782 --> 20:11:24,472
say Hello, title, Hello, body, it said,\n
21135
20:11:25,672 --> 20:11:28,672
I've got a form that I borrowed\nfrom some of our earlier code
21136
20:11:28,672 --> 20:11:34,882
and that form has an input whose ID is\n
21137
20:11:34,881 --> 20:11:36,572
But there's no code in this yet.
21138
20:11:36,572 --> 20:11:39,601
So let's add a little bit of\nJavaScript code as follows.
21139
20:11:39,601 --> 20:11:43,101
Suppose that, when this form is\n
21140
20:11:44,282 --> 20:11:47,302
Well, let's do it the\nsomewhat messy way first.
21141
20:11:47,301 --> 20:11:51,351
I can add an attribute called\nonsubmit to the form element
21142
20:11:51,351 --> 20:11:56,452
and I can say on submit, call the\n
21143
20:11:56,452 --> 20:11:58,552
Unfortunately, this\nfunction doesn't yet exist.
21144
20:11:59,769 --> 20:12:01,101
But there's another detail here.
21145
20:12:01,101 --> 20:12:04,851
When the user clicks submit, normally\n
21146
20:12:04,851 --> 20:12:06,381
I don't want to do that today.
21147
20:12:06,381 --> 20:12:10,372
I want to just submit the form to\n
21148
20:12:10,372 --> 20:12:13,142
and just print to the screen,\nHello, David, or so forth.
21149
20:12:13,142 --> 20:12:15,862
So I'm also going to go\nahead and say, return false.
21150
20:12:15,861 --> 20:12:20,301
And this is a JavaScript way of telling\n
21151
20:12:20,301 --> 20:12:21,831
to submit the form, return false.
21152
20:12:21,831 --> 20:12:24,171
Like, no, don't let them\nactually submit the form.
21153
20:12:24,172 --> 20:12:26,362
But do call this function called greet.
21154
20:12:26,361 --> 20:12:30,951
In the head of my page, I'm going to add\n
21155
20:12:30,952 --> 20:12:33,682
implicitly JavaScript,\nand has no relationship
21156
20:12:33,682 --> 20:12:37,402
for those of you who took APCS with\n
21157
20:12:37,402 --> 20:12:41,542
but no relation, I'm going to\nname a function called Greet.
21158
20:12:41,542 --> 20:12:44,631
Apparently in JavaScript, the\nway you create a function is you
21159
20:12:44,631 --> 20:12:47,361
literally say the word\nfunction instead of Def.
21160
20:12:47,361 --> 20:12:49,701
You don't specify a return type.
21161
20:12:49,702 --> 20:12:54,381
And in this function, I could do\n
21162
20:12:54,381 --> 20:12:57,801
unquote, how about, Hello, there.
21163
20:12:57,801 --> 20:13:01,501
Initially I'm going to keep it simple,\n
21164
20:13:01,501 --> 20:13:03,031
which is not a good user interface.
21165
20:13:03,032 --> 20:13:04,407
There are better ways to do this.
21166
20:13:04,407 --> 20:13:06,382
But we're doing something simple first.
21167
20:13:06,381 --> 20:13:09,232
Let me now go ahead and\nload this page again.
21168
20:13:09,232 --> 20:13:12,562
It still looks as simple as before,\nwith just a simple text box.
21169
20:13:12,562 --> 20:13:13,942
I'll zoom in to make it bigger.
21170
20:13:13,941 --> 20:13:15,471
I'm going to type my name,\nbut I think it's going
21171
20:13:15,471 --> 20:13:17,241
to be ignored when I click Submit.
21172
20:13:18,831 --> 20:13:21,111
And this is, again, this\nis an ugly user interface.
21173
20:13:21,111 --> 20:13:24,381
It literally says the whole\ncode space URL of the web page
21174
20:13:25,581 --> 20:13:29,456
It's really just meant for simple\n
21175
20:13:29,456 --> 20:13:31,581
All right, let's have it\nsay Hello, David, somehow.
21176
20:13:33,032 --> 20:13:37,222
Well, if this element on the\npage was given by me a unique ID
21177
20:13:37,221 --> 20:13:41,331
it'd be nice if, just like in CSS, I\n
21178
20:13:45,572 --> 20:13:49,461
Let me store, in a variable\ncalled name, the result
21179
20:13:49,461 --> 20:13:54,381
of calling a special function\ncalled document.querySelector.
21180
20:13:54,381 --> 20:13:57,682
This query selector function\nis JavaScript's version
21181
20:13:57,682 --> 20:14:00,142
of what we were doing\nin CSS, to select nodes
21182
20:14:00,142 --> 20:14:02,962
using hashes or dots or other syntax.
21183
20:14:04,441 --> 20:14:09,391
So if I want to select the\nelement whose unique ID is name
21184
20:14:09,392 --> 20:14:12,712
I can literally just pass, in\n
21185
20:14:14,211 --> 20:14:17,032
That gives me the actual\nnode from the tree.
21186
20:14:17,032 --> 20:14:21,202
It gives me one of these rectangles\n
21187
20:14:21,202 --> 20:14:24,351
If I actually want to get at\nthe specific value therein
21188
20:14:24,351 --> 20:14:27,301
I need to go one step\nfurther and say .value.
21189
20:14:27,301 --> 20:14:29,301
So, similar in spirit\nto Python, where we
21190
20:14:29,301 --> 20:14:32,211
saw a lot of dot notation, where\nyou can go inside an object
21191
20:14:32,211 --> 20:14:34,191
inside of an object,\nthat's what's going on.
21192
20:14:34,191 --> 20:14:38,091
Long story short, in JavaScript,\n
21193
20:14:38,092 --> 20:14:41,392
called document, that lets you just do\n
21194
20:14:42,021 --> 20:14:44,361
One of those functions\nis called querySelector.
21195
20:14:44,361 --> 20:14:47,781
That function returns to you\nwhatever it is you're selecting.
21196
20:14:47,782 --> 20:14:51,032
And dot value means go\ninside of that rectangle
21197
20:14:51,032 --> 20:14:54,272
and grab the actual text\nthat the human typed in.
21198
20:14:54,271 --> 20:14:58,251
So if I want to now say,\nHello, to that person
21199
20:14:58,251 --> 20:15:01,011
the syntax is a little\ndifferent from C and Python.
21200
20:15:01,012 --> 20:15:03,741
I can use concatenation, which\nactually does exist in Python
21201
20:15:04,941 --> 20:15:10,371
I can go ahead and say hello,\nquote unquote "Hello," plus name.
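Here is a rough sketch of the greet function described so far. In a browser it would use the real document and alert; the stand-ins named page, show, and messages are hypothetical and exist only so the sketch can also run outside a browser:

```javascript
// Outside a browser there is no document or alert, so these stand-ins
// mimic just enough of both for the sketch to run anywhere.
const page = typeof document !== "undefined" ? document : {
    querySelector: (selector) => (selector === "#name" ? { value: "David" } : null),
};
const messages = [];
const show = typeof alert !== "undefined" ? alert : (text) => messages.push(text);

function greet() {
    // Select the element whose id is name, then read the text typed into it.
    let name = page.querySelector("#name").value;

    // Concatenate the greeting and the name with the plus operator.
    show("Hello, " + name);
}

greet();
```

The "#name" argument uses the same hash syntax as a CSS id selector, which is why querySelector feels familiar after writing CSS.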
21202
20:15:10,372 --> 20:15:13,131
All right, now, if I go\nback to the browser window
21203
20:15:13,131 --> 20:15:16,911
reload the page, to get the latest\n
21204
20:15:16,911 --> 20:15:20,031
and click Submit, now\nI see, Hello, David.
21205
20:15:20,032 --> 20:15:23,422
Not the best website, but\nit does demonstrate how
21206
20:15:23,422 --> 20:15:25,832
I can start to interact with the page.
21207
20:15:25,831 --> 20:15:29,211
But let me stipulate that\nthis co-mingling of languages
21208
20:15:30,661 --> 20:15:33,171
It's fine to use\nclasses, but using style
21209
20:15:33,172 --> 20:15:35,345
equals quote unquote and\na whole bunch of CSS
21210
20:15:35,345 --> 20:15:38,512
that was not going to scale well, once\n
21211
20:15:38,512 --> 20:15:40,672
Same here, once you\nhave more and more code
21212
20:15:40,672 --> 20:15:44,461
you don't want to just put your code\n
21213
20:15:45,751 --> 20:15:48,301
Let's get rid of that\nonsubmit attribute
21214
20:15:48,301 --> 20:15:50,131
and literally never use it again.
21215
20:15:50,131 --> 20:15:52,411
That was for demonstration's sake only.
21216
20:15:53,801 --> 20:15:57,661
Let me move the script tag,\nactually, just below the form
21217
20:15:57,661 --> 20:16:01,591
but still inside the body, so\nthat the script tag exists only
21218
20:16:01,592 --> 20:16:04,482
after the form tag exists, logically.
21219
20:16:04,482 --> 20:16:08,411
Just like in Python, your code is\n
21220
20:16:10,751 --> 20:16:13,591
Let me define this function\ncalled Greet, and then
21221
20:16:13,592 --> 20:16:18,902
let me do this, document.querySelector,\n
21222
20:16:18,902 --> 20:16:20,232
It doesn't have a unique ID.
21223
20:16:21,062 --> 20:16:24,302
I can just reference it by name, form,\n
21224
20:16:24,301 --> 20:16:28,411
And let me call this special\nfunction, addEventListener.
21225
20:16:28,411 --> 20:16:31,471
This is a function that\nlistens for events.
21226
20:16:31,471 --> 20:16:34,201
Now this is actually a term\nof art within programming.
21227
20:16:34,202 --> 20:16:37,172
Many different languages\nare governed by events.
21228
20:16:37,172 --> 20:16:40,652
And pretty much any user interface is\n
21229
20:16:40,652 --> 20:16:45,264
On phones, you have touches, and you\n
21230
20:16:45,264 --> 20:16:47,432
and you have pinch, and all\nof these other gestures.
21231
20:16:47,432 --> 20:16:49,711
On your Mac or PC you\nhave click, you have drag
21232
20:16:49,711 --> 20:16:52,982
you have key down, key up, as\n
21233
20:16:53,861 --> 20:16:57,121
This is a non-exhaustive\nlist of all of the events
21234
20:16:57,122 --> 20:16:59,919
that you can listen for in the\ncontext of web programming.
21235
20:16:59,918 --> 20:17:02,251
And this might be a throwback\nto Scratch, where, recall
21236
20:17:02,251 --> 20:17:04,411
Scratch let you broadcast events.
21237
20:17:04,411 --> 20:17:07,741
And we had the two puppets sort of\n
21238
20:17:07,741 --> 20:17:11,611
In the world of web programming, game\n
21239
20:17:11,611 --> 20:17:14,131
these days, they're\njust governed by events.
21240
20:17:14,131 --> 20:17:16,961
And you write code that listens\nfor these events happening.
21241
20:17:16,961 --> 20:17:18,301
So what do I want to listen for?
21242
20:17:18,301 --> 20:17:21,461
Well, I want to add an event\nlistener for the Submit event.
21243
20:17:21,461 --> 20:17:26,771
And when that happens, I want to\n
21244
20:17:26,771 --> 20:17:28,801
So this is kind of interesting.
21245
20:17:28,801 --> 20:17:32,432
Thank you, I have my Greet\nfunction as before, no changes.
21246
20:17:32,432 --> 20:17:34,952
But I'm adding one\nline of code down here.
21247
20:17:34,952 --> 20:17:38,072
I'm telling the browser to\nuse document.querySelector
21248
20:17:39,331 --> 20:17:42,551
Then I'm adding an event listener,\n
21249
20:17:42,551 --> 20:17:44,671
And when that happens, I call Greet.
21250
20:17:44,672 --> 20:17:47,461
Notice I am not using\nparentheses after Greet.
21251
20:17:47,461 --> 20:17:49,381
I don't want to call Greet right away.
21252
20:17:49,381 --> 20:17:55,891
I want to tell the browser to call\n
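As a rough illustration of passing a function without calling it, here is the same idea sketched in Python rather than JavaScript. The `listeners` dictionary and the `add_event_listener` and `dispatch` helpers are hypothetical stand-ins for what the browser does internally, not a real API.

```python
# Hypothetical mini event system; the browser's real machinery is far richer.
listeners = {}


def add_event_listener(event, callback):
    """Remember `callback` so it can be called later when `event` happens."""
    listeners.setdefault(event, []).append(callback)


def dispatch(event, payload):
    """Call every callback registered for `event` and collect the results."""
    return [callback(payload) for callback in listeners.get(event, [])]


def greet(name):
    return f"hello, {name}"


# No parentheses after greet: the function itself is handed over, not its result.
add_event_listener("submit", greet)

# Later, when the event actually happens, the registered function is called.
results = dispatch("submit", "David")
```

Writing `greet("David")` on the registration line would instead run the function immediately and register its return value, a string, rather than the function.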
21253
20:17:55,892 --> 20:18:03,992
Now let me go ahead and deliberately,\n
21254
20:18:03,991 --> 20:18:08,012
here, let me type in my name,\nDavid, submit, and there we go.
21255
20:18:09,361 --> 20:18:13,231
All right, but let's now make\nthis slightly better designed.
21256
20:18:13,232 --> 20:18:16,572
Right now, I'm defining a\nfunction Greet, which is fine.
21257
20:18:16,572 --> 20:18:18,551
But I'm only using it in one place.
21258
20:18:18,551 --> 20:18:21,122
And you might recall, we\nstumbled on this in Python
21259
20:18:21,122 --> 20:18:24,211
where I was like, why are we creating\n
21260
20:18:24,211 --> 20:18:26,461
when we're only using\nit like one line later?
21261
20:18:26,461 --> 20:18:29,842
And we introduced what type of\nfunction in Python the other day?
21262
20:18:30,601 --> 20:18:33,221
SPEAKER 1: Yeah, so lambda\nfunctions, anonymous functions.
21263
20:18:33,221 --> 20:18:35,611
You can actually do this\nin JavaScript as well.
21264
20:18:35,611 --> 20:18:40,001
If I want to define a function all\n
21265
20:18:40,001 --> 20:18:43,661
Let me cut this onto my\nclipboard, paste it over here.
21266
20:18:43,661 --> 20:18:45,451
Let me fix all of the alignment.
21267
20:18:46,982 --> 20:18:50,672
And I can actually, now, do this.
21268
20:18:50,672 --> 20:18:52,351
The syntax is a little weird.
21269
20:18:52,351 --> 20:18:56,021
But using now just these four\nlines of code, I can do this.
21270
20:18:56,021 --> 20:18:59,531
I can tell the browser to add an\n
21271
20:18:59,532 --> 20:19:03,331
And then when it hears that, call\n
21272
20:19:03,331 --> 20:19:06,151
And unlike Python, this function\ncan have multiple lines
21273
20:19:06,150 --> 20:19:07,650
which is actually a nice thing.
21274
20:19:08,714 --> 20:19:11,131
There's a lot of indentation\nand curly braces going on now.
21275
20:19:11,131 --> 20:19:15,452
But you can think of this as just\n
21276
20:19:15,452 --> 20:19:18,149
when the form is submitted.
21277
20:19:18,149 --> 20:19:20,732
But if I want to block the form\nfrom actually being submitted
21278
20:19:20,732 --> 20:19:21,722
I've got to do one other thing.
21279
20:19:21,721 --> 20:19:24,929
And you would only know this from being\n
21280
20:19:24,929 --> 20:19:28,291
I need to call this\nfunction, preventDefault
21281
20:19:28,292 --> 20:19:32,122
passing in this E argument, which\nis a variable that represents
21282
20:19:32,122 --> 20:19:33,872
the event, more on\nthat another time, that
21283
20:19:33,872 --> 20:19:36,482
just allows us to prevent\nwhatever the default
21284
20:19:36,482 --> 20:19:39,512
handling of that particular event is.
21285
20:19:39,512 --> 20:19:42,782
So long story short, this is\nrepresentative of the type of code
21286
20:19:42,782 --> 20:19:45,961
you might write in JavaScript,\nwhereby you can actually interact
21287
20:19:45,961 --> 20:19:48,422
with your code, the user's actual form.
21288
20:19:48,422 --> 20:19:50,192
And we can do interesting things, too.
21289
20:19:50,191 --> 20:19:53,531
Built into browsers nowadays\nis functionality like this.
21290
20:19:53,532 --> 20:19:57,251
So here's a very simple example, that\n
21291
20:19:58,381 --> 20:20:00,751
Well, it turns out using\nJavaScript, you can control
21292
20:20:00,751 --> 20:20:02,881
the CSS of a page programmatically.
21293
20:20:02,881 --> 20:20:05,281
I can change the background\nof the body of the page
21294
20:20:05,282 --> 20:20:10,382
to red, to green, to blue, just by\n
21295
20:20:10,381 --> 20:20:12,482
and then changing CSS properties.
21296
20:20:12,482 --> 20:20:15,271
Just to give you a taste of this,\nif I view the page's source
21297
20:20:15,271 --> 20:20:19,711
similar code here, I can\nselect the red button by an ID
21298
20:20:19,711 --> 20:20:22,441
that I apparently defined\non it, right up here.
21299
20:20:22,441 --> 20:20:25,871
I can add an event listener, this\n
21300
20:20:25,872 --> 20:20:28,199
And when it's clicked, I\nexecute this one line of code.
21301
20:20:28,198 --> 20:20:30,240
And this one line of code\nwe haven't seen before
21302
20:20:30,240 --> 20:20:33,871
but you can go into the body of\nthe page, its style property
21303
20:20:33,872 --> 20:20:36,331
and you can change its\nbackground color to red.
21304
20:20:36,331 --> 20:20:38,851
This is one example of\ntwo different groups
21305
20:20:38,851 --> 20:20:40,562
not talking to one another in advance.
21306
20:20:40,562 --> 20:20:43,911
In CSS, properties that have two\nwords are usually hyphenated
21307
20:20:45,592 --> 20:20:48,501
Unfortunately, in JavaScript, if\n
21308
20:20:48,501 --> 20:20:52,142
that's subtraction, which is\nlogically nonsensical here.
21309
20:20:52,142 --> 20:20:56,932
So in CSS, you can convert\nbackground-color to, in JavaScript
21310
20:20:56,932 --> 20:20:59,601
backgroundColor, where\nyou capitalize the C
21311
20:20:59,601 --> 20:21:01,851
and you get rid of the minus sign.
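The naming rule just described can be sketched as a small Python helper. The function name `css_to_js` is made up for illustration, and the sketch only covers the simple hyphenated case discussed here.

```python
def css_to_js(prop):
    """Convert a hyphenated CSS property name to its camelCase form."""
    first, *rest = prop.split("-")
    # Keep the first word as-is, capitalize each later word, drop the hyphens.
    return first + "".join(word.capitalize() for word in rest)


converted = css_to_js("background-color")
```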
21312
20:21:03,182 --> 20:21:05,900
Well, back in the day, there\nused to be a blink tag.
21313
20:21:05,900 --> 20:21:07,971
And it's one of the\nfew historical examples
21314
20:21:07,971 --> 20:21:12,682
of a tag that was removed from HTML,\n
21315
20:21:12,682 --> 20:21:14,122
this is what the web looked like.
21316
20:21:14,122 --> 20:21:16,176
There was a lot of this kind of stuff.
21317
20:21:16,176 --> 20:21:18,051
There was even a marquee\nthat would move text
21318
20:21:18,051 --> 20:21:19,509
from left to right over the screen.
21319
20:21:19,509 --> 20:21:21,781
And the web was a very ugly place.
21320
20:21:21,782 --> 20:21:25,071
I will admit, my very first web page\n
21321
20:21:25,070 --> 20:21:26,542
But how can we bring it back?
21322
20:21:26,542 --> 20:21:29,691
Well, this is a version of the\n
21323
20:21:30,202 --> 20:21:35,722
I wrote some code in this example, that\n
21324
20:21:35,721 --> 20:21:39,471
the CSS of the page to be\nvisible, invisible, visible
21325
20:21:39,471 --> 20:21:42,891
invisible, because built into\nJavaScript is support for a clock.
21326
20:21:42,892 --> 20:21:45,452
So you can just do something\non some sort of schedule.
21327
20:21:45,452 --> 20:21:47,782
Let me go ahead and open up\nthis example, autocomplete.
21328
20:21:49,251 --> 20:21:53,751
In autocomplete.html, I whipped up as\n
21329
20:21:53,751 --> 20:21:56,482
but I also grabbed the\ndictionary from problem
21330
20:21:56,482 --> 20:22:00,711
set 5 speller, so that if I want\n
21331
20:22:00,711 --> 20:22:04,491
this searches those 140,000\nwords, using JavaScript
21332
20:22:04,491 --> 20:22:07,380
to create what we know in the\nworld of the web as autocomplete.
21333
20:22:07,380 --> 20:22:09,172
When you start searching\nfor something, you
21334
20:22:09,172 --> 20:22:11,632
should start to see words\nthat start with that phrase.
21335
20:22:11,631 --> 20:22:14,164
And sure enough, if I search\nfor something like banana
21336
20:22:14,164 --> 20:22:17,331
here's the three variants of bananas\n
21337
20:22:18,381 --> 20:22:21,811
Just JavaScript, when\nit finds matching words
21338
20:22:21,812 --> 20:22:24,862
it's just updating the DOM, the\ntree in the computer's memory
21339
20:22:24,861 --> 20:22:27,921
to show more and more text, or less.
21340
20:22:27,922 --> 20:22:34,162
And for one final example, this is how\n
21341
20:22:35,872 --> 20:22:40,042
You have built into browsers today some\n
21342
20:22:40,042 --> 20:22:44,152
interfaces, whereby you can ask for\n
21343
20:22:44,152 --> 20:22:47,601
For instance, here, I wrote a\n
21344
20:22:47,601 --> 20:22:49,448
apparently asking to know my location.
21345
20:22:49,448 --> 20:22:51,531
All right, let me go ahead\nand allow it this time
21346
20:22:51,532 --> 20:22:54,202
if that's something you're\ncomfortable with on your own device.
21347
20:22:54,202 --> 20:22:57,202
It's taking a moment, because sometimes\n
21348
20:22:58,081 --> 20:23:02,601
But, hopefully, in just a moment, there\n
21349
20:23:02,601 --> 20:23:06,232
and as a final flourish today, for what\n
21350
20:23:06,232 --> 20:23:08,601
for your structure, CSS\nfor your style, and now
21351
20:23:08,601 --> 20:23:13,372
JavaScript for your logic, which we'll\n
21352
20:23:13,372 --> 20:23:15,982
and search Google for\nthose GPS coordinates.
21353
20:23:15,982 --> 20:23:21,472
Zoom in here on Google Maps,\nand if we zoom in, in, in
21354
20:23:22,461 --> 20:23:26,122
We're not on that street, but\nthere, oh, there it is, actually.
21355
20:23:26,122 --> 20:23:27,832
There is the marker it had put for us.
21356
20:23:27,831 --> 20:23:30,118
We're indeed here in Memorial Hall.
21357
20:23:30,119 --> 20:23:32,452
So all that with JavaScript,\nbut the basic understanding
21358
20:23:32,452 --> 20:23:34,252
of the DOM, the\ndocument object model
21359
20:23:34,251 --> 20:23:36,481
we'll pick up where\nwe left off next week.
21360
20:24:57,172 --> 20:25:01,222
So this is CS50 and this is\nweek nine, and this is it
21361
20:25:01,221 --> 20:25:03,471
in terms of programming fundamentals.
21362
20:25:03,471 --> 20:25:06,250
Today, we come rather full circle\nwith so many of the languages
21363
20:25:06,250 --> 20:25:08,542
that we've been looking at\nover the past several weeks.
21364
20:25:08,542 --> 20:25:11,512
And with HTML and CSS\nand JavaScript last week
21365
20:25:11,512 --> 20:25:14,782
we're going to add back into\nthe mix, Python and SQL.
21366
20:25:14,782 --> 20:25:18,182
And with that, do we have the\nability to program for the web.
21367
20:25:18,182 --> 20:25:20,601
And even though this isn't\nthe only user interface out
21368
20:25:20,601 --> 20:25:23,691
there, increasingly-- or people\n
21369
20:25:23,691 --> 20:25:26,721
and a browser to access applications\nthat people have written
21370
20:25:26,721 --> 20:25:31,311
but it's also, increasingly, the way\n
21371
20:25:31,312 --> 20:25:33,532
There are languages\ncalled Swift for iOS
21372
20:25:33,532 --> 20:25:35,692
there are languages\ncalled Java for Android
21373
20:25:35,691 --> 20:25:38,512
but coding applications\nin both of those languages
21374
20:25:38,512 --> 20:25:42,292
means knowing twice as many languages,\n
21375
20:25:42,952 --> 20:25:45,502
So we're increasingly seeing,\nfor better or for worse
21376
20:25:45,501 --> 20:25:47,601
that the world is starting\nto really standardize
21377
20:25:47,601 --> 20:25:51,501
at least for the next some number of\n
21378
20:25:51,501 --> 20:25:55,491
coupled with other languages like\n
21379
20:25:55,491 --> 20:25:57,929
And so today, we'll tie\nall of those together
21380
20:25:57,929 --> 20:26:00,471
and give you the last of the\ntools in your toolkit with which
21381
20:26:00,471 --> 20:26:03,051
to tackle final projects to\ngo off into the real world
21382
20:26:03,051 --> 20:26:05,961
ultimately, and somehow solve\nproblems with programming.
21383
20:26:05,961 --> 20:26:11,072
But we need an additional tool today,\n
21384
20:26:11,072 --> 20:26:14,239
This is just a program that\ncomes on certain computers
21385
20:26:14,239 --> 20:26:16,072
that you can install\nfor free, happens to be
21386
20:26:16,072 --> 20:26:18,741
written in a language called\nJavaScript, but it's a program
21387
20:26:18,741 --> 20:26:22,101
that we've been using to\nrun a web server in VS Code.
21388
20:26:22,101 --> 20:26:24,932
But you can run it on your own\nMac or PC or anywhere else.
21389
20:26:24,932 --> 20:26:28,702
But all this particular\nHTTP server does is
21390
20:26:28,702 --> 20:26:32,512
serve up static content\nlike HTML files, CSS files
21391
20:26:32,512 --> 20:26:36,982
JavaScript files, maybe images, maybe\n
21392
20:26:36,982 --> 20:26:41,362
It has no ability to really interact\n
21393
20:26:41,361 --> 20:26:46,561
You can create a web form and serve\n
21394
20:26:46,562 --> 20:26:50,392
but if the human types in input into\n
21395
20:26:50,392 --> 20:26:53,842
submit it elsewhere to something like\n
21396
20:26:53,842 --> 20:26:56,961
it's not actually going to go anywhere\n
21397
20:26:56,961 --> 20:26:59,221
process the requests that are coming in.
21398
20:26:59,221 --> 20:27:02,151
So today, we're going to introduce\nanother type of server that
21399
20:27:02,152 --> 20:27:06,112
comes with Python that allows\nus to not only serve web pages
21400
20:27:06,111 --> 20:27:07,731
but also process user input.
21401
20:27:07,732 --> 20:27:11,482
And recall that all that input is\n
21402
20:27:11,482 --> 20:27:14,062
or more deeply inside of\nthose virtual envelopes.
21403
20:27:14,062 --> 20:27:17,902
So here's the canonical URL we talked\n
21404
20:27:20,042 --> 20:27:23,782
And I've highlighted the slash to\n
21405
20:27:23,782 --> 20:27:26,302
like the default folder\nwhere, presumably, there's
21406
20:27:26,301 --> 20:27:29,911
a file called index.html\nor something else in there.
21407
20:27:29,911 --> 20:27:32,122
Otherwise, you might have\na more explicit mention
21408
20:27:32,122 --> 20:27:34,551
of the actual file named file.html.
21409
20:27:34,551 --> 20:27:37,634
You can have folders, as you probably\n
21410
20:27:38,134 --> 20:27:40,221
You can have files in\nfolders like this, and these
21411
20:27:40,221 --> 20:27:44,091
are all examples of what a programmer\n
21412
20:27:44,092 --> 20:27:45,922
So it might not just\nbe a single word, it
21413
20:27:45,922 --> 20:27:49,832
might have multiple slashes and multiple\n
21414
20:27:49,831 --> 20:27:51,831
But this is just more\ngenerally known as a path.
21415
20:27:51,831 --> 20:27:54,470
But there's another term of art,\nthat's essentially equivalent
21416
20:27:54,471 --> 20:27:55,652
that we'll introduce today.
21417
20:27:55,652 --> 20:27:59,062
This is also synonymously\ncalled a route, which is maybe
21418
20:27:59,062 --> 20:28:03,022
a better generic description of what\n
21419
20:28:03,021 --> 20:28:05,512
they don't have to\nmap to, that is, refer
21420
20:28:05,512 --> 20:28:08,572
to a specific folder\nor a specific file, you
21421
20:28:08,572 --> 20:28:11,095
can come up with your\nown routes in a website.
21422
20:28:11,095 --> 20:28:13,762
And just make sure that when the\nuser visits that, you give them
21423
20:28:14,661 --> 20:28:17,369
If they visit something else, you\n
21424
20:28:17,369 --> 20:28:20,241
It doesn't have to map to a very\n
21425
20:28:20,241 --> 20:28:23,421
And if you want to get input from\n
21426
20:28:23,422 --> 20:28:28,222
like q=cats, you can add a question\n
21427
20:28:28,221 --> 20:28:32,061
The key, or the HTTP parameter name\n
21428
20:28:32,062 --> 20:28:35,025
and then equals some value that,\npresumably, the human typed in.
21429
20:28:35,024 --> 20:28:37,191
If you have more of these,\nyou can put an ampersand
21430
20:28:37,191 --> 20:28:41,902
and then more key equals value pairs\n
21431
20:28:41,902 --> 20:28:46,532
The catch, though, is that using the\n
21432
20:28:46,532 --> 20:28:51,682
we don't really have the ability to\n
21433
20:28:53,601 --> 20:28:56,661
You could have appended question\n
21434
20:28:56,661 --> 20:29:00,561
to any of the URLs in your home\npage for problem set eight
21435
20:29:00,562 --> 20:29:04,222
but it doesn't actually do\nanything useful, necessarily
21436
20:29:04,221 --> 20:29:06,111
unless you use some fancy JavaScript.
21437
20:29:06,111 --> 20:29:09,241
The server is not going to bother\neven looking in that for you.
21438
20:29:09,241 --> 20:29:12,141
But today, we're going to\nintroduce using a bit of Python.
21439
20:29:12,142 --> 20:29:15,982
And in fact, we're going to use a web\n
21440
20:29:15,982 --> 20:29:19,312
of using HTTP server alone,\nto automatically, for you
21441
20:29:19,312 --> 20:29:22,192
look for any key value pairs\nafter the question mark
21442
20:29:22,191 --> 20:29:25,881
and then hand them to you in\nthe form of a Python dictionary.
21443
20:29:25,881 --> 20:29:29,781
Recall that a dictionary in Python, a\n
21444
20:29:29,782 --> 20:29:33,917
That seems like a perfect fit\nfor these kinds of parameters.
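To make the dictionary idea concrete, here is a minimal sketch using only Python's standard library, not Flask. The URL and the second parameter, `lang=en`, are made up for illustration; only `q=cats` comes from the lecture.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL; everything after the question mark is the query string.
url = "https://www.example.com/search?q=cats&lang=en"

query = urlsplit(url).query  # the part after "?", here "q=cats&lang=en"

# parse_qs maps each key to a list of values, because a key may repeat.
# This sketch keeps only the first value for each key.
params = {key: values[0] for key, values in parse_qs(query).items()}
```

A web framework does this kind of parsing on every request, so the application code only ever sees the finished dictionary.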
21445
20:29:33,917 --> 20:29:36,292
And you're not going to have\nto write that code yourself.
21446
20:29:36,292 --> 20:29:39,902
It's going to be handed to you by\n
21447
20:29:39,902 --> 20:29:41,782
So this will be the\nsecond of two frameworks
21448
20:29:41,782 --> 20:29:43,324
really, that we look at in the class.
21449
20:29:43,323 --> 20:29:46,011
And a framework is essentially\na bunch of libraries
21450
20:29:46,012 --> 20:29:48,502
that someone else wrote\nand a set of conventions
21451
20:29:48,501 --> 20:29:49,951
therefore, for doing things.
21452
20:29:49,952 --> 20:29:52,762
So those of you who really\nstarted dabbling with Bootstrap
21453
20:29:52,762 --> 20:29:56,441
this past week to make your home\n
21454
20:29:58,211 --> 20:30:01,872
Well, you're using libraries, code that\n
21455
20:30:01,872 --> 20:30:04,961
maybe some of the JavaScript that\n
21456
20:30:04,961 --> 20:30:08,741
But it's also a framework in the\n
21457
20:30:08,741 --> 20:30:11,262
You have to use Bootstrap's\nclasses, and you
21458
20:30:11,262 --> 20:30:15,582
have to lay out your divs or\nyour spans or your table tags
21459
20:30:15,581 --> 20:30:17,424
in a sort of Bootstrap-friendly way.
21460
20:30:17,425 --> 20:30:19,842
And it's not too onerous, but\nyou're following conventions
21461
20:30:19,842 --> 20:30:21,682
that a bunch of humans standardized on.
21462
20:30:21,682 --> 20:30:25,422
So similarly, in the world of\nPython, is there another framework
21463
20:30:25,422 --> 20:30:26,802
we're going to start using today.
21464
20:30:26,801 --> 20:30:30,191
And whereas Bootstrap is\nused for CSS and JavaScript
21465
20:30:30,191 --> 20:30:32,231
Flask is going to be used for Python.
21466
20:30:32,232 --> 20:30:34,932
And it just solves a lot\nof common problems for us.
21467
20:30:34,932 --> 20:30:38,142
It's going to make it easier\nfor us to analyze the URLs
21468
20:30:38,142 --> 20:30:40,782
and get key value pairs,\nit's going to make it easier
21469
20:30:40,782 --> 20:30:43,872
for us to find files or\nimages that the human wants
21470
20:30:43,872 --> 20:30:45,312
to see when visiting our website.
21471
20:30:45,312 --> 20:30:48,204
It's even going to make it easier\nto send emails automatically
21472
20:30:48,203 --> 20:30:49,661
like when someone fills out a form.
21473
20:30:49,661 --> 20:30:52,971
You can dynamically, using code,\nsend them an email as well.
21474
20:30:52,971 --> 20:30:55,271
So Flask, and with it\nsome related libraries
21475
20:30:55,271 --> 20:30:58,512
it's just going to make stuff\nlike that easier for us.
21476
20:30:58,512 --> 20:31:03,101
And to do this, all we have to do\n
21477
20:31:03,101 --> 20:31:05,211
requirements of this framework.
21478
20:31:05,211 --> 20:31:08,411
We're going to have to create a\n
21479
20:31:08,411 --> 20:31:11,411
this is where our web app or\napplication is going to live.
21480
20:31:11,411 --> 20:31:15,881
If we have any libraries that we want to\n
21481
20:31:15,881 --> 20:31:19,512
is to have a very simple text\nfile called requirements.txt
21482
20:31:19,512 --> 20:31:21,851
where you list the names\nof those libraries
21483
20:31:21,851 --> 20:31:26,652
top to bottom, in that text file,\n
21484
20:31:26,652 --> 20:31:30,411
or the import statements that we\n
21485
20:31:30,411 --> 20:31:33,461
We're going to have a static\nfolder or static directory, which
21486
20:31:33,461 --> 20:31:36,611
means any files you create that\nare not ever going to change
21487
20:31:36,611 --> 20:31:39,203
like images, CSS files,\nJavaScript files
21488
20:31:39,203 --> 20:31:40,661
they're going to go in this folder.
21489
20:31:40,661 --> 20:31:43,512
And then lastly, any\nHTML that you write
21490
20:31:43,512 --> 20:31:45,911
web pages you want the\nhuman to see, are going
21491
20:31:45,911 --> 20:31:47,751
to go in a folder called templates.
21492
20:31:47,751 --> 20:31:50,711
So this is, again, evidence of\nwhat we mean by a framework.
21493
20:31:50,711 --> 20:31:52,512
Do you have to make a web app like this?
21494
20:31:52,512 --> 20:31:54,702
No, but if you're using\nthis particular framework
21495
20:31:54,702 --> 20:31:58,752
this is what people decided\nwould be the human conventions.
21496
20:31:58,751 --> 20:32:02,961
If you've heard of other frameworks like\n
21497
20:32:02,961 --> 20:32:06,131
there are just different conventions\n
21498
20:32:06,131 --> 20:32:10,152
Flask is a very nice\nmicroframework in that that's it.
21499
20:32:10,152 --> 20:32:13,631
All you have to do is adhere to\n
21500
20:32:13,631 --> 20:32:16,601
to get some code up and running.
21501
20:32:16,601 --> 20:32:19,381
All right, so let's go\nahead and make a web app.
21502
20:32:19,381 --> 20:32:21,131
Let me go ahead and\nswitch over to VS Code
21503
20:32:21,131 --> 20:32:23,173
here, and let me practice\nwhat I'm preaching here
21504
20:32:25,482 --> 20:32:29,172
And let's go ahead and create\nan application that very simply
21505
20:32:29,172 --> 20:32:31,461
maybe, says hello to the user.
21506
20:32:31,461 --> 20:32:35,922
So something that, initially, is not all\n
21507
20:32:35,922 --> 20:32:37,882
But we'll build on that\nas we've always done.
21508
20:32:37,881 --> 20:32:41,861
So in app.py, what I'm going to do\n
21509
20:32:41,861 --> 20:32:43,182
I had on the screen earlier.
21510
20:32:43,182 --> 20:32:49,601
From flask, import Flask, with a capital\n
21511
20:32:49,601 --> 20:32:52,152
And I'm also going to\npreemptively import
21512
20:32:52,152 --> 20:32:56,202
a couple of functions,\nrender_template, and request.
21513
20:32:56,202 --> 20:32:58,122
More on those in just a bit.
21514
20:32:58,122 --> 20:33:00,762
And then below that, I'm going\nto say, go ahead and do this.
21515
20:33:00,762 --> 20:33:03,491
Give me a web-- a\nvariable called app that's
21516
20:33:03,491 --> 20:33:07,691
going to be the result of calling\n
21517
20:33:07,691 --> 20:33:10,161
this weird incantation here, __name__.
21518
20:33:10,161 --> 20:33:13,572
So we've seen this a few weeks back\n
21519
20:33:13,572 --> 20:33:16,301
and we had that if main thing\nat the bottom of the screen.
21520
20:33:16,301 --> 20:33:21,851
For now, just know that __name__\n
21521
20:33:21,851 --> 20:33:26,322
And so this line here, simple as\n
21522
20:33:26,322 --> 20:33:30,072
turn this file into a Flask application.
21523
20:33:30,072 --> 20:33:33,792
Flask is a function that just figures\n
21524
20:33:33,792 --> 20:33:37,392
The last thing I'm going to do for this\n
21525
20:33:37,392 --> 20:33:41,052
I'm going to say that I'm\ngoing to have a function called
21526
20:33:41,051 --> 20:33:42,971
index that takes no arguments.
21527
20:33:42,971 --> 20:33:44,891
And whenever this\nfunction is called, I want
21528
20:33:44,892 --> 20:33:50,322
to return the results of rendering\na template called index.html.
21529
20:33:51,277 --> 20:33:53,652
So let's assume there's a file\nsomewhere, haven't created
21530
20:33:55,482 --> 20:33:57,702
But render_template\nmeans render this file
21531
20:33:57,702 --> 20:34:01,162
that is printed to the\nuser's screen, so to speak.
21532
20:34:01,161 --> 20:34:03,971
The last thing I'm going to\ndo is I have to tell Flask
21533
20:34:03,971 --> 20:34:06,261
when to call this index function.
21534
20:34:06,262 --> 20:34:11,927
And so I'm going to tell it to define\n
21535
20:34:13,029 --> 20:34:15,072
So let's take a look at\nwhat I just created here.
21536
20:34:15,072 --> 20:34:18,042
This is slightly new syntax, and\nit's really the only weirdness
21537
20:34:18,042 --> 20:34:19,812
that we'll have today in Python.
21538
20:34:19,812 --> 20:34:22,512
This is what's known in Python\nas a decorator.
21539
20:34:22,512 --> 20:34:24,521
A decorator is a\nspecial type of function
21540
20:34:24,521 --> 20:34:27,072
that modifies, essentially,\nanother function.
21541
20:34:27,072 --> 20:34:30,111
For our purposes, just know\nthat on line six this says
21542
20:34:30,111 --> 20:34:33,461
hey Python, define a route\nfor slash, the default
21543
20:34:33,461 --> 20:34:35,262
page on my website application.
21544
20:34:35,262 --> 20:34:37,661
The next two lines, seven\nand eight, say, hey Python
21545
20:34:37,661 --> 20:34:40,391
define a function called\nindex, takes no arguments.
21546
20:34:40,392 --> 20:34:45,282
And the only thing you should ever do is\n
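For intuition only, a decorator that registers routes could look roughly like the sketch below. This is not Flask's actual implementation; `routes`, `route`, and `handle` are hypothetical names, and the sketch returns a plain string where the lecture's version renders a template.

```python
# Hypothetical route table: maps a URL path to the function that handles it.
routes = {}


def route(path):
    """Return a decorator that records a function as the handler for `path`."""
    def decorator(func):
        routes[path] = func
        return func  # the function itself is handed back unchanged
    return decorator


@route("/")
def index():
    return "hello, world"


def handle(path):
    """Look up the handler for `path` and call it, as a server might."""
    handler = routes.get(path)
    return handler() if handler else "404 Not Found"
```

The `@route("/")` line is shorthand for `index = route("/")(index)`, which is how a decorator gets the chance to see, and here record, the function defined beneath it.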
21547
20:34:48,411 --> 20:34:51,461
So really, the next question,\n
21548
20:34:55,251 --> 20:34:56,991
Well, let me go ahead and do that next.
21549
20:34:56,991 --> 20:35:00,262
Let me create a directory called\ntemplates, practicing, again
21550
20:35:01,521 --> 20:35:04,281
So I'm going to create a new\nempty directory called templates
21551
20:35:04,282 --> 20:35:08,002
I'm going to go and\nCD into that directory
21552
20:35:08,001 --> 20:35:11,661
and then do code of index.html.
21553
20:35:11,661 --> 20:35:13,281
So here is going to be my index page.
21554
20:35:13,282 --> 20:35:16,227
And I'm going to do a very\nsimple web page, doc type HTML.
21555
20:35:16,226 --> 20:35:18,351
I'm just going to borrow\nsome stuff from last week.
21556
20:35:18,351 --> 20:35:20,601
HTML language equals English.
21557
20:35:21,771 --> 20:35:25,491
I'll then do a head tag, I'll do a meta\n
21558
20:35:25,491 --> 20:35:28,012
This makes my site, recall, responsive.
21559
20:35:28,012 --> 20:35:30,741
That is, it just grows and shrinks\nto fit the size of the device.
21560
20:35:30,741 --> 20:35:34,221
The initial scale for which is going\n
21561
20:35:34,221 --> 20:35:36,261
is going to be device width.
21562
20:35:36,262 --> 20:35:38,301
So I'm typing this out,\nI have it printed here.
21563
20:35:38,301 --> 20:35:40,251
This is stuff I typically copy paste.
21564
20:35:40,251 --> 20:35:42,951
But then lastly, I'm going to\nadd in my title, which will just
21565
20:35:42,952 --> 20:35:44,512
be hello for the name of this app.
21566
20:35:46,402 --> 20:35:50,271
The body of this tag will be--
21567
20:35:51,232 --> 20:35:54,902
The body of this page, rather,\nwill just be hello comma world.
21568
20:35:54,902 --> 20:35:58,441
So very uninteresting and really a\n
21569
20:35:58,441 --> 20:36:01,792
But let's go now and experiment\nwith these two files.
21570
20:36:01,792 --> 20:36:03,652
I'm not going to bother\nwith a static folder
21571
20:36:03,652 --> 20:36:06,652
right now, because I don't have any\n
21572
20:36:06,652 --> 20:36:08,752
No images, no CSS, nothing like that.
21573
20:36:08,751 --> 20:36:11,761
And honestly, requirements.txt\nis going to be pretty simple.
21574
20:36:11,762 --> 20:36:15,562
I'm going to go requirements.txt and\n
21575
20:36:15,562 --> 20:36:18,952
access to the Flask library itself.
21576
20:36:18,952 --> 20:36:21,952
All right, but that's the only\n
21577
20:36:21,952 --> 20:36:27,232
All right, so now I have two files,\n
21578
20:36:27,232 --> 20:36:30,711
But index.html, thank you, is\ninside of my templates directory
21579
20:36:30,711 --> 20:36:33,211
so how do I actually start\na web server? Last week
21580
20:36:33,211 --> 20:36:34,491
I would have said http-server.
21581
20:36:34,491 --> 20:36:36,831
But HTTP server is not a Python thing.
21582
20:36:36,831 --> 20:36:41,152
It has no idea about Flask or\nPython or anything I just wrote.
21583
20:36:41,152 --> 20:36:43,982
HTTP server will just\nspit out static files.
21584
20:36:43,982 --> 20:36:47,286
So if I ran http-server, and\nthen I clicked on app.py
21585
20:36:47,286 --> 20:36:49,251
I would literally see my Python code.
21586
20:36:49,251 --> 20:36:53,421
It would not get executed because HTTP\n
21587
20:36:53,422 --> 20:36:57,502
But today, I'm going to run a\n
21588
20:36:57,501 --> 20:37:01,251
So this framework Flask that I\nactually preinstalled in advance
21589
20:37:01,251 --> 20:37:04,941
so it wasn't strictly necessary that I\n
21590
20:37:04,941 --> 20:37:08,902
yet, comes with a program called Flask,\n
21591
20:37:08,902 --> 20:37:12,472
the word run, and when I do that, you'll\n
21592
20:37:12,471 --> 20:37:14,661
week whereby you'll see the name--
21593
20:37:14,661 --> 20:37:17,629
your URL for your\nunique preview of that.
21594
20:37:17,629 --> 20:37:20,211
You might see a pop up saying\nthat your application is running
21595
20:37:20,211 --> 20:37:22,161
on TCP port, something or other.
21596
20:37:22,161 --> 20:37:24,591
By default, last week,\nwe used port 8080.
21597
20:37:24,592 --> 20:37:27,622
Flask, just because, prefers port 5000.
21598
20:37:28,702 --> 20:37:31,432
I'm going to go ahead\nand open up this URL now.
21599
20:37:31,432 --> 20:37:33,652
And once it authenticates\nand redirects me
21600
20:37:33,652 --> 20:37:37,222
just to make sure I'm allowed to access\n
21601
20:37:37,221 --> 20:37:40,221
Voila, there's the extent\nof this application.
21602
20:37:40,221 --> 20:37:43,911
If I view source by right-clicking\nor control clicking
21603
20:37:43,911 --> 20:37:45,891
there's my HTML that's been spit out.
21604
20:37:45,892 --> 20:37:48,832
So really, I've just reinvented\nthe wheel from last week
21605
20:37:48,831 --> 20:37:51,271
because there's no dynamism\nnow, nothing at all.
21606
20:37:52,968 --> 20:37:54,801
Let me close the source\nand let me zoom out.
21607
20:37:56,422 --> 20:37:59,601
Let me zoom in now, and I have\na very unique cryptic URL.
21608
20:37:59,601 --> 20:38:01,581
But the point is that\nit ends with nothing.
21609
20:38:01,581 --> 20:38:03,801
Or implicitly, it ends with slash.
21610
20:38:03,801 --> 20:38:05,691
This is just Chrome\nbeing a little helpful.
21611
20:38:05,691 --> 20:38:08,762
It doesn't bother showing you a slash,\n
21612
20:38:08,762 --> 20:38:14,872
But let me do something explicit like\n
21613
20:38:14,872 --> 20:38:17,241
So there's a key value\npair that I've manually
21614
20:38:17,241 --> 20:38:20,661
typed into my URL bar and hit Enter.
21615
20:38:20,661 --> 20:38:22,161
Nothing happens, nothing changes.
21616
20:38:23,661 --> 20:38:27,982
But the opportunity today is to\n
21617
20:38:27,982 --> 20:38:31,172
from that URL and start\ndisplaying it to the user.
21618
20:38:31,172 --> 20:38:35,372
So let me go back over here to\nmy terminal window and code.
21619
20:38:35,372 --> 20:38:37,822
Let me move that down\nto the bottom there.
21620
20:38:37,822 --> 20:38:41,042
And what if I want to\nsay, huh, hello, name.
21621
20:38:41,042 --> 20:38:42,771
I ideally want to say something like--
21622
20:38:42,771 --> 20:38:44,391
I don't want to hard code\nDavid because then it's never
21623
20:38:44,392 --> 20:38:46,282
going to say hello to anyone else.
21624
20:38:46,282 --> 20:38:51,772
I want to put like a variable name\n
21625
20:38:51,771 --> 20:38:55,311
But it's not an HTML tag, so I\nneed some kind of placeholder.
21626
20:38:57,441 --> 20:39:02,572
If I go back to my Python code, I can\n
21627
20:39:02,572 --> 20:39:06,562
And I can ask Flask to go\ninto the current request
21628
20:39:06,562 --> 20:39:10,432
into its arguments, that is\nin the URL, as they're called
21629
20:39:10,432 --> 20:39:14,721
and get whatever the value of\nthe parameter called name is.
21630
20:39:14,721 --> 20:39:16,792
That puts that into a variable for me.
21631
20:39:16,792 --> 20:39:19,702
And then, in render_template--\nthis is one of those functions
21632
20:39:19,702 --> 20:39:21,562
that can take more than one argument.
21633
20:39:21,562 --> 20:39:23,542
If it takes another\nargument, you can pass
21634
20:39:23,542 --> 20:39:25,502
in the name of any variable you want.
21635
20:39:25,501 --> 20:39:29,971
So if I want to pass in my name, I\n
21636
20:39:29,971 --> 20:39:34,731
So this is the name of a variable\n
21637
20:39:34,732 --> 20:39:39,682
This is the actual variable that\nI want to get the value from.
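The two uses of `name` can be seen in isolation with a small stand-in written in plain Python. Here `args` is a hand-written dictionary standing in for the parameters parsed from `?name=David`, and `render_template_sketch` is a hypothetical function that only reports what it was given; neither is Flask's own code.

```python
def render_template_sketch(filename, **variables):
    """Report which template was requested and which variables it would receive."""
    return filename, variables


args = {"name": "David"}  # stand-in for the parsed URL parameters
name = args.get("name")   # .get returns None if the key is absent

# Left of "=": the variable name the template will see.
# Right of "=": the Python variable whose value is passed in.
result = render_template_sketch("index.html", name=name)
```

The two names do not have to match; they are the same here only because that is the convention the lecture follows.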
21638
20:39:39,682 --> 20:39:45,682
And now lastly, in my index.html,\n
21639
20:39:45,682 --> 20:39:49,792
is to do two curly braces and\nthen put the name of the variable
21640
20:39:51,354 --> 20:39:53,622
So here's what we mean by a template.
21641
20:39:53,622 --> 20:39:56,711
A template is like a blueprint\nin the real world, where
21642
20:39:56,711 --> 20:39:58,842
it's plans to make something.
21643
20:39:58,842 --> 20:40:02,711
This is the plan to make a web page\n
21644
20:40:02,711 --> 20:40:06,221
but there's this placeholder with\ntwo curly braces here and here
21645
20:40:06,221 --> 20:40:10,641
that says go ahead and plug in the\n
21646
20:40:10,642 --> 20:40:13,212
So in this sense, it's similar\nin spirit to our f strings
21647
20:40:13,211 --> 20:40:14,741
or format strings in Python.
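To make the comparison with f-strings concrete, here is a minimal sketch in plain Python. The `fill` helper is hypothetical and only imitates the double-curly-brace placeholder idea with a regular expression; it is not how Flask's templates are actually implemented.

```python
import re

def fill(template, values):
    # Swap each {{ key }} for values[key]. Illustration only; this is not Jinja.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(values[m.group(1)]), template)

name = "David"

# An f-string plugs the variable in right where the string is written.
greeting_fstring = f"hello, {name}"

# A template keeps the placeholder as text until values are supplied later.
greeting_template = fill("hello, {{ name }}", {"name": name})

print(greeting_fstring)
print(greeting_template)
```

Both lines print the same greeting; the difference is that the template text can live in a separate file and be filled in per request.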
21648
20:40:14,741 --> 20:40:17,658
The syntax is a little different\njust because reasonable people
21649
20:40:17,658 --> 20:40:19,991
disagree, different people,\ndifferent frameworks come up
21650
20:40:19,991 --> 20:40:21,116
with different conventions.
21651
20:40:21,116 --> 20:40:23,982
The convention in Flask,\nin their templates
21652
20:40:23,982 --> 20:40:26,442
is to use two curly braces here.
21653
20:40:26,441 --> 20:40:28,721
The hope is that you, the\nprogrammer, will never
21654
20:40:28,721 --> 20:40:32,241
want to display two curly\nbraces in your actual web page.
21655
20:40:32,241 --> 20:40:34,432
But even if you do,\nthere's a workaround.
21656
20:40:35,572 --> 20:40:39,012
So now let me go ahead and go\nback to my browser tab here.
21657
20:40:39,012 --> 20:40:41,922
Previously, even though\nI added name equals David
21658
20:40:41,922 --> 20:40:44,952
to the end of the URL with a question\n
21659
20:40:44,952 --> 20:40:47,682
But now, hopefully, if\nI made these changes
21660
20:40:47,682 --> 20:40:51,350
let me go ahead and open\nup my terminal window.
21661
20:40:51,350 --> 20:40:55,432
Let me restart Flask so it\nloads my changes by default.
21662
20:40:55,432 --> 20:40:59,380
Let me go back to my hello tab and\n
21663
20:41:01,001 --> 20:41:02,471
And there we go, hello, David.
21664
20:41:02,471 --> 20:41:05,501
I can play around now and I can change\n
21665
20:41:07,842 --> 20:41:10,552
And now we have something more dynamic.
21666
20:41:10,551 --> 20:41:14,741
So the new pieces here are, in\nPython, we have some code here
21667
20:41:14,741 --> 20:41:18,551
that allows us to access,\nprogrammatically, everything
21668
20:41:18,551 --> 20:41:20,711
that's after the\nquestion mark in the URL.
21669
20:41:20,711 --> 20:41:26,622
And the only thing we have to do that\n
21670
20:41:26,622 --> 20:41:28,452
You and I don't have\nto bother figuring out
21671
20:41:28,452 --> 20:41:30,869
where is the question mark,\nwhere is the equal sign, where
21672
20:41:30,869 --> 20:41:32,442
are the ampersands, potentially.
21673
20:41:32,441 --> 20:41:36,221
The framework, Flask,\ndoes all of that for us.
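As a rough idea of the work being done for us, here is a sketch that uses only Python's standard library to pull the pieces out of a URL. The URL is made up for illustration, and this is not a claim about how Flask itself does the parsing.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, similar in shape to the one typed into the browser.
url = "https://example.com/?name=David&course=CS50"

parts = urlsplit(url)

# Everything after the question mark is the query string.
query = parts.query

# parse_qs splits on ampersands and equal signs; each value comes back as a list.
params = parse_qs(query)
name = params["name"][0]

print(parts.path)
print(query)
print(name)
```

With a framework, all of this collapses into a single lookup of the parameter called name.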
21674
20:41:36,221 --> 20:41:43,251
OK, any questions then on\nthese principles thus far?
21675
20:41:44,557 --> 20:41:50,142
AUDIENCE: Why do you say the\nquestion mark in the URL?
21676
20:41:50,142 --> 20:41:52,572
DAVID: Why do you need a\nquestion mark in the URL?
21677
20:41:52,572 --> 20:41:59,152
The short answer is just because that\n
21678
20:41:59,152 --> 20:42:02,752
If you're making a GET request\nfrom a browser to a server
21679
20:42:02,751 --> 20:42:07,451
the convention, standardized by the\n
21680
20:42:07,452 --> 20:42:11,152
after the so-called route or\npath, then a question mark.
21681
20:42:11,152 --> 20:42:13,601
And it delineates what's\npart of the route or the path
21682
20:42:13,601 --> 20:42:17,751
and what's part of the\nhuman input to the right.
21683
20:42:19,226 --> 20:42:22,872
AUDIENCE: Can you go over again why\n
21684
20:42:23,372 --> 20:42:25,502
This is this annoying\nthing about Python.
21685
20:42:25,501 --> 20:42:29,551
When you pass in parameters\nto functions that have names
21686
20:42:29,551 --> 20:42:32,471
you typically say something\nequals something else.
21687
20:42:32,471 --> 20:42:35,042
So let me make a slight tweak here.
21688
20:42:35,042 --> 20:42:39,372
How about I say name of person here.
21689
20:42:39,372 --> 20:42:44,222
This allows me to invent my\nown variable for my template
21690
20:42:44,221 --> 20:42:46,211
and assign it the value of name.
21691
20:42:46,211 --> 20:42:52,676
I now, though, have to go into my\n
21692
20:42:57,072 --> 20:43:00,452
And so this is just stupid because\nit's unnecessarily verbose.
21693
20:43:00,452 --> 20:43:04,442
So what typically people do is they\n
21694
20:43:04,441 --> 20:43:08,361
itself, even though it looks admittedly\n
21695
20:43:08,361 --> 20:43:10,111
The thing to the left\nof the equal sign is
21696
20:43:10,111 --> 20:43:14,221
the name of the variable you plan to use\n
21697
20:43:14,221 --> 20:43:16,231
is the actual value you're assigning it.
21698
20:43:16,232 --> 20:43:18,092
And this is because it's general purpose.
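The something-equals-something pattern is Python's keyword-argument syntax. Below is a minimal sketch with a hypothetical `render` function standing in for a template renderer; it uses `str.format` with single braces and is not Flask's `render_template`.

```python
# A hypothetical stand-in for a template-rendering function, used only to
# illustrate keyword arguments. It is not Flask's render_template.
def render(template, **context):
    # context is a dict whose keys are the names written left of each equal sign.
    return template.format(**context)

name = "David"  # pretend this value came from the URL

# Left of "=": the placeholder's name in the template. Right: the value to plug in.
same_name = render("hello, {name}", name=name)
renamed = render("hello, {name_of_person}", name_of_person=name)
hardcoded = render("hello, {name}", name="Emma")  # ignores the variable entirely

print(same_name)
print(renamed)
print(hardcoded)
```

Writing `name=name` looks redundant, but the two sides play different roles: the left is the template's name for the value, the right is the Python variable holding it.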
21699
20:43:18,092 --> 20:43:21,332
I could override this and I could\nsay something like name always
21700
20:43:21,331 --> 20:43:23,881
equals Emma, no matter\nwhat that variable is.
21701
20:43:23,881 --> 20:43:26,191
And now if I go back to\nmy browser and reload
21702
20:43:26,191 --> 20:43:30,121
no matter what's in the URL,\nDavid or Carter, it's always--
21703
20:43:34,142 --> 20:43:36,602
Oh, I didn't change my template back.
21704
20:43:37,142 --> 20:43:40,412
Let me change that back to be\nname, so that it's name there
21705
20:43:41,342 --> 20:43:43,892
But I've hardcoded\nEmma's name, so now we're
21706
20:43:43,892 --> 20:43:47,492
only ever going to see Emma no\nmatter whose name is in the URL.
21707
20:43:48,521 --> 20:43:51,781
All right, so this is\nbad user interface.
21708
20:43:51,782 --> 20:43:54,692
If, in order to get a greeting\nfor the day, you, the user
21709
20:43:54,691 --> 20:43:57,182
have to manually change the\nURL, which none of us ever do.
21710
20:43:57,182 --> 20:43:58,982
This is not how web pages work.
21711
20:43:58,982 --> 20:44:02,972
What is the more normal mechanism\n
21712
20:44:02,971 --> 20:44:06,661
and putting it in that\nURL automatically?
21713
20:44:06,661 --> 20:44:09,542
How did we do that last week?
21714
20:44:11,262 --> 20:44:15,952
AUDIENCE: We have the search\nbar and we [INAUDIBLE] you have
21715
20:44:15,952 --> 20:44:22,101
to make something in there [INAUDIBLE].
21716
20:44:22,101 --> 20:44:25,732
DAVID: OK, so we did make something in\n
21717
20:44:25,732 --> 20:44:29,601
And specifically, what was the tag\n
21718
20:44:30,831 --> 20:44:32,761
DAVID: Sorry, a little louder?
21719
20:44:36,441 --> 20:44:38,792
DAVID: So the input tag,\ninside of the form tag.
21720
20:44:38,792 --> 20:44:41,536
So in short, forms are, of\ncourse, how the web works
21721
20:44:41,536 --> 20:44:43,411
and how we typically\nget input from the user
21722
20:44:43,411 --> 20:44:46,739
whether it's a button or a text box\n
21723
20:44:46,740 --> 20:44:48,782
So let's go ahead and add\nthat into the mix here.
21724
20:44:48,782 --> 20:44:52,702
So let's enhance this hello app\n
21725
20:44:54,001 --> 20:44:58,101
Let me get rid of this\nname stuff and let me just
21726
20:44:58,101 --> 20:45:03,891
have a very simple index.html file\n
21727
20:45:03,892 --> 20:45:06,242
the user for some input as follows.
21728
20:45:06,241 --> 20:45:10,851
I'm going to go back into my\n
21729
20:45:10,851 --> 20:45:13,851
the user's name, this is the page I'm\n
21730
20:45:14,642 --> 20:45:16,972
So I'm going to create a form tag.
21731
20:45:16,971 --> 20:45:20,631
The method I'm going to use for now\n
21732
20:45:20,631 --> 20:45:23,092
Then, inside of that form, I'm\ngoing to have an input tag.
21733
20:45:23,092 --> 20:45:25,634
And I'm going to turn off\nautocomplete like we did last week.
21734
20:45:25,634 --> 20:45:29,542
I'm going to turn on auto focus, so it\n
21735
20:45:29,542 --> 20:45:33,263
I'm going to give the name\nof this input the name, name.
21736
20:45:33,263 --> 20:45:35,971
Not to be too confusing, but I'm\n
21737
20:45:35,971 --> 20:45:39,531
So it makes sense that the name of the\n
21738
20:45:39,532 --> 20:45:42,232
The placeholder I want the\nhuman to see in light gray text
21739
20:45:42,232 --> 20:45:45,262
will be Name with a capital N,\n
21740
20:45:45,262 --> 20:45:47,512
And then type of this text fiel--
21741
20:45:47,512 --> 20:45:49,372
type of this input is going to be text.
21742
20:45:49,372 --> 20:45:52,312
Then I'm just going to give myself,\n
21743
20:45:52,312 --> 20:45:54,229
And I don't care what\nit says, it's just going
21744
20:45:54,229 --> 20:45:56,582
to say the default submit terminology.
21745
20:45:56,581 --> 20:46:01,072
Let me go ahead, now, and open\nup my terminal window again.
21746
20:46:01,072 --> 20:46:06,232
Let me go to that same URL\nso that I can see-- whoops.
21747
20:46:11,751 --> 20:46:13,301
So that was just cached from earlier.
21748
20:46:13,301 --> 20:46:16,601
Let me go back to that same\nURL, my GitHub preview.dev URL
21749
20:46:18,021 --> 20:46:19,932
And now, I can type in anything I want.
21750
20:46:19,932 --> 20:46:23,601
The catch, though, is when I click\n
21751
20:46:24,642 --> 20:46:28,002
It does have a default value,\nbut let me go into my index.html
21752
20:46:28,001 --> 20:46:30,761
and let me add, just like we\ndid last week for Google.
21753
20:46:30,762 --> 20:46:36,188
Whereas previously, I said something\n
21754
20:46:36,188 --> 20:46:38,021
we're not going to rely\non some third party.
21755
20:46:38,021 --> 20:46:40,482
I'm going to implement\nthe so-called backend
21756
20:46:40,482 --> 20:46:44,752
and I'm going to have the user\n
21757
20:46:44,751 --> 20:46:47,381
not just slash, how about /greet.
21758
20:46:47,381 --> 20:46:48,941
I can make it up, whatever I want.
21759
20:46:48,941 --> 20:46:53,411
Greet feels like a nice operative word,\n
21760
20:46:53,411 --> 20:46:56,411
sent when they click\nSubmit on this form.
21761
20:46:56,411 --> 20:46:59,891
All right, so let's go ahead now\nand go back to my browser tab.
21762
20:46:59,892 --> 20:47:02,592
Let me go ahead, actually,\nand let me reload Flask
21763
20:47:02,592 --> 20:47:05,082
here so that it reloads\nall of my changes.
21764
20:47:05,081 --> 20:47:09,161
Let me reload this tab so that I get\n
21765
20:47:10,271 --> 20:47:13,391
If I view page source, we\nindeed see that my browser
21766
20:47:13,392 --> 20:47:15,252
has downloaded the latest HTML.
21767
20:47:15,251 --> 20:47:16,916
So it definitely has changed.
21768
20:47:16,917 --> 20:47:18,292
Let's go ahead and type in David.
21769
20:47:18,292 --> 20:47:22,062
And when I click Submit\nhere, what's going to happen?
21770
20:47:25,262 --> 20:47:28,381
What's going to happen visually,\nfunctionally, however you
21771
20:47:28,381 --> 20:47:32,301
want to interpret when I click Submit.
21772
20:47:32,801 --> 20:47:34,346
AUDIENCE: [INAUDIBLE] an empty page.
21773
20:47:34,346 --> 20:47:36,471
DAVID: OK, the user's going\nto go to an empty page.
21774
20:47:36,471 --> 20:47:37,763
Pretty good instinct, because--
21775
20:47:37,763 --> 20:47:40,641
nowhere else have I mentioned\n/greet, it doesn't seem to exist.
21776
20:47:40,642 --> 20:47:44,872
How's the URL going to\nchange, just to be clear?
21777
20:47:44,872 --> 20:47:47,122
What's going to appear,\nsuddenly, in the URL?
21778
20:47:53,563 --> 20:47:56,438
Specifically in the URL, something's\n
21779
20:47:56,994 --> 20:47:58,202
AUDIENCE: The key value pair?
21780
20:47:58,202 --> 20:47:59,577
DAVID: The key value pair, right.
21781
20:48:00,661 --> 20:48:02,941
That's why our Google\ntrick last week worked.
21782
20:48:02,941 --> 20:48:05,604
I sort of recreated a\nform on my own website.
21783
20:48:05,604 --> 20:48:08,521
And even though I didn't get around\n
21784
20:48:08,521 --> 20:48:12,182
I can still send the information\n
21785
20:48:13,351 --> 20:48:16,232
to your question earlier, that\nwhenever you submit a form
21786
20:48:16,232 --> 20:48:19,502
it automatically ends up after\na question mark in the URL
21787
20:48:20,432 --> 20:48:23,221
So both of you are\nright, this is going to break.
21788
20:48:23,221 --> 20:48:26,761
And all three of you are right,\nin effect, 404 not found.
21789
20:48:26,762 --> 20:48:28,211
You can see it in the tab here.
21790
20:48:28,211 --> 20:48:29,711
That's the error that has come back.
21791
20:48:29,711 --> 20:48:33,482
But what's interesting, and most\nimportant, the URL did change.
21792
20:48:33,482 --> 20:48:37,652
And it went to /greet?name=david.
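A small sketch of how a GET form's fields end up in the URL, using Python's standard library to build the same string. The variable names are illustrative; the browser does this work itself when the form is submitted.

```python
from urllib.parse import urlencode

action = "/greet"            # the form's action attribute
fields = {"name": "david"}   # input name mapped to what the user typed

# The key-value pairs are appended to the route after a question mark.
url = f"{action}?{urlencode(fields)}"
print(url)
```

The route (`/greet`) sits to the left of the question mark and the human's input sits to the right, which is why the server needs a route defined for `/greet` before the request can succeed.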
21793
20:48:37,652 --> 20:48:40,021
So I just, now, need to add\nsome logic that actually
21794
20:48:40,021 --> 20:48:41,762
looks for that so-called route.
21795
20:48:41,762 --> 20:48:44,282
So let me go back to my app.py.
21796
20:48:44,282 --> 20:48:49,442
Let me define another route for,\nquote unquote, "/greet."
21797
20:48:49,441 --> 20:48:52,961
And then, inside of-- under this,\n
21798
20:48:52,961 --> 20:48:56,042
I'll call it greet, but I\ncould call it anything I want.
21799
20:48:56,042 --> 20:48:58,711
No arguments, for now,\nfor this, and then
21800
20:48:58,711 --> 20:49:02,191
let me go ahead and\ndo this in my app.py.
21801
20:49:02,191 --> 20:49:05,051
This time around, I do want\nto get the human's name.
21802
20:49:05,051 --> 20:49:09,061
So let me say request.args.get\nquote unquote "name"
21803
20:49:09,062 --> 20:49:11,101
and let me store that in\na variable called name.
21804
20:49:11,101 --> 20:49:14,792
Then let me return a\ntemplate, and you know
21805
20:49:14,792 --> 20:49:17,372
what, I'm going to give myself\na new template, greet.html.
21806
20:49:17,372 --> 20:49:19,622
Because this has a different\npurpose, it's not a form.
21807
20:49:19,622 --> 20:49:22,562
I want to say hello to the\nuser in this HTML file
21808
20:49:22,562 --> 20:49:27,722
and I want to pass, into it, the\n
21809
20:49:27,721 --> 20:49:34,081
All right, so now if I go up and\n
21810
20:49:36,241 --> 20:49:39,691
If I go ahead and hit reload or resubmit\n
21811
20:49:46,241 --> 20:49:48,652
Let me try, so let's try this.
21812
20:49:48,652 --> 20:49:50,232
Let's go ahead and reload the page.
21813
20:49:50,232 --> 20:49:51,741
Previously, it was not found.
21814
20:49:51,741 --> 20:49:55,661
Now it's worse, and this is\nthe 500 error, internal server
21815
20:49:55,661 --> 20:49:59,891
error that I promised next week we will\n
21816
20:49:59,892 --> 20:50:01,782
But here we have an\ninternal server error.
21817
20:50:01,782 --> 20:50:05,512
Because it's an internal error, this\n
21818
20:50:05,512 --> 20:50:09,042
So the route was actually found\n
21819
20:50:09,042 --> 20:50:14,502
But if we go into VS Code here and\n
21820
20:50:17,622 --> 20:50:21,042
this is actually a bit misleading.
21821
20:50:32,202 --> 20:50:34,872
OK, here we have this\nerror here, and this
21822
20:50:34,872 --> 20:50:37,241
is where your terminal window\nis going to be helpful.
21823
20:50:37,241 --> 20:50:40,152
In your terminal window,\nby default, is typically
21824
20:50:40,152 --> 20:50:43,271
going to be helpful\nstuff like a log, L-O-G
21825
20:50:43,271 --> 20:50:46,608
of what it is the server\nis seeing from the browser.
21826
20:50:46,608 --> 20:50:48,941
For instance, here's what the\nserver just saw in purple.
21827
20:50:48,941 --> 20:50:54,072
GET /greet?name=david\nusing HTTP version 1.0.
21828
20:50:54,072 --> 20:50:57,551
Here, though, is the status code\nthat the server returned, 500.
21829
20:50:58,601 --> 20:51:02,021
Well, here's where we get these\n
21830
20:51:02,021 --> 20:51:03,851
that help50 might\nultimately help you with
21831
20:51:03,851 --> 20:51:07,164
or here, we might just\nhave a clue at the bottom.
21832
20:51:07,164 --> 20:51:09,581
And this is actually pretty\nclear, even though we've never
21833
20:51:11,892 --> 20:51:14,472
I just didn't create greet.html, right?
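The difference between the earlier 404 and this 500 can be sketched with a toy dispatcher. Everything here (`routes`, `handle`, the handler functions) is hypothetical and far simpler than a real framework; it only shows the idea that a missing route and a crashing route produce different status codes.

```python
def index():
    return "form goes here"

def greet():
    # Simulates the failure from the lecture: the template file does not exist.
    raise FileNotFoundError("greet.html")

# Toy routing table: path mapped to the function that handles it.
routes = {"/": index}

def handle(path):
    handler = routes.get(path)
    if handler is None:
        # No function is registered for this path.
        return 404, "Not Found"
    try:
        return 200, handler()
    except Exception:
        # The route exists, but the code behind it failed.
        return 500, "Internal Server Error"

before = handle("/greet")   # route not defined yet
routes["/greet"] = greet
after = handle("/greet")    # route defined, but its template is missing

print(before)
print(after)
```

In other words, a 500 here is arguably progress: the server found the route, and the remaining problem is in the code or files behind it.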
21834
20:51:15,498 --> 20:51:17,831
All right, so that must be\nthe last piece of the puzzle.
21835
20:51:17,831 --> 20:51:21,221
And again, representative of how you\n
21836
20:51:21,221 --> 20:51:24,311
let me go into my terminal window.
21837
20:51:24,312 --> 20:51:28,812
After hitting Control C, which\ncancels or interrupts a process
21838
20:51:28,812 --> 20:51:30,792
let me go into my templates directory.
21839
20:51:30,792 --> 20:51:33,672
If I type ls, I only have index.html.
21840
20:51:33,672 --> 20:51:36,222
So let's code up greet.html.
21841
20:51:36,221 --> 20:51:40,121
And in this file let's\nquickly do doc type.
21842
20:51:40,122 --> 20:51:44,922
Doc type HTML, open bracket\nHTML, language equals English.
21843
20:51:44,922 --> 20:51:48,851
Inside of this, I'll have the head tag,\n
21844
20:51:48,851 --> 20:51:53,501
The name is viewport,\nthe content of which is--
21845
20:51:55,782 --> 20:52:02,982
The content of which is initial scale\n
21846
20:52:02,982 --> 20:52:05,892
Quote unquote, title\nis still going to be
21847
20:52:05,892 --> 20:52:08,292
I'll call this greet\nbecause this is my template.
21848
20:52:08,292 --> 20:52:13,642
And then here, in the body, I'm\ngoing to have hello comma name.
21849
20:52:13,642 --> 20:52:17,862
So I could have kept around the old\n
21850
20:52:17,861 --> 20:52:19,331
essentially, my second template.
21851
20:52:19,331 --> 20:52:22,961
So index.html now is almost the\nsame, but the title is different
21852
20:52:24,342 --> 20:52:27,552
greet.html is almost the same,\nbut it does not have a form.
21853
20:52:27,551 --> 20:52:29,781
It just has the hello comma name.
21854
20:52:29,782 --> 20:52:34,272
So let me now go ahead and\nrerun in the correct directory.
21855
20:52:34,271 --> 20:52:38,842
You have to run Flask wherever app.py\n
21856
20:52:38,842 --> 20:52:41,802
So let me do Flask run to\nget back to where I was.
21857
20:52:41,801 --> 20:52:43,661
Let me go into my other tab.
21858
20:52:43,661 --> 20:52:46,841
Cross my fingers this time\nthat, when I go back to slash
21859
20:52:46,842 --> 20:52:51,792
and I get index.html's form, now\n
21860
20:52:54,491 --> 20:52:58,001
And now we have a full-fledged web\n
21861
20:52:58,001 --> 20:53:03,191
slash and /greet, the latter of\n
21862
20:53:03,191 --> 20:53:05,231
using a template, spits it out.
21863
20:53:05,232 --> 20:53:08,752
But something could go wrong,\nand let's see what happens here.
21864
20:53:08,751 --> 20:53:11,381
Suppose I don't type anything in.
21865
20:53:11,381 --> 20:53:13,631
Let me go here and just click Submit.
21866
20:53:13,631 --> 20:53:16,423
Now, I mean, it looks stupid.
21867
20:53:16,423 --> 20:53:18,381
So there's bunches of\nways we could solve this.
21868
20:53:18,381 --> 20:53:21,822
I could require that the user\nhave input on the previous page
21869
20:53:21,822 --> 20:53:23,955
I could have some kind\nof error check for this.
21870
20:53:23,955 --> 20:53:26,622
But there's another mechanism I\ncan use that I'll just show you.
21871
20:53:26,622 --> 20:53:30,911
It turns out this GET function,\nin the context of HTTP
21872
20:53:30,911 --> 20:53:33,161
and also in general with\nPython dictionaries
21873
20:53:33,161 --> 20:53:35,572
you can actually supply a default value.
21874
20:53:35,572 --> 20:53:39,941
So if there is no name parameter\n
21875
20:53:39,941 --> 20:53:42,591
you can actually give it\na default value like this.
21876
20:53:42,592 --> 20:53:44,771
So I'll say world, for instance.
21877
20:53:46,211 --> 20:53:48,342
Let me type in nothing\nagain and click Submit.
21878
20:53:48,342 --> 20:53:52,032
And hopefully this time,\nI'll do-- oops, sorry.
21879
20:53:52,032 --> 20:53:54,792
Let me restart Flask\nto reload the template.
21880
20:53:54,792 --> 20:53:57,851
Let me go ahead and type nothing\nthis time, clicking Submit.
21881
20:54:05,622 --> 20:54:09,461
Suppose that the reason this--
21882
20:54:11,021 --> 20:54:15,191
Suppose I just get rid of name\n
21883
20:54:15,191 --> 20:54:17,831
Now I see hello, world,\nand this is a subtlety
21884
20:54:17,831 --> 20:54:19,961
that I didn't intend to get into here.
21885
20:54:19,961 --> 20:54:23,381
When you have question\nmark name equals nothing
21886
20:54:23,381 --> 20:54:25,421
you're passing in\nwhat's called-- whoops.
21887
20:54:25,422 --> 20:54:29,112
When you have greet question\nmark name equals something
21888
20:54:29,111 --> 20:54:31,841
you actually are giving a value to name.
21889
20:54:31,842 --> 20:54:34,482
It is quote unquote\nwith nothing in between.
21890
20:54:34,482 --> 20:54:36,952
That is different from\nhaving no value at all.
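This subtlety can be reproduced with a plain dictionary. The sketch below assumes a hypothetical helper, `name_from`, that parses a query string with the standard library and then applies a default of "world", the way the default value was supplied above.

```python
from urllib.parse import parse_qs

def name_from(query):
    # keep_blank_values=True keeps "name=" as an empty string instead of dropping it.
    parsed = parse_qs(query, keep_blank_values=True)
    params = {key: values[0] for key, values in parsed.items()}
    # dict.get falls back to the default only when the key is absent.
    return params.get("name", "world")

typed = name_from("name=David")   # a value was typed
blank = name_from("name=")        # the key is present, but its value is empty
absent = name_from("")            # the key is not in the URL at all

print(repr(typed))
print(repr(blank))
print(repr(absent))
```

Only the third case picks up the default, which is why submitting the form with an empty text box still produced an empty greeting.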
21891
20:54:36,952 --> 20:54:40,965
So allow me to just propose\nthat the error here, we
21892
20:54:40,964 --> 20:54:42,881
would want to require\nthis in a different way.
21893
20:54:42,881 --> 20:54:45,372
And probably the most\nrobust way to do this
21894
20:54:45,372 --> 20:54:51,021
would be to go in here, in my HTML, and\n
21895
20:54:51,021 --> 20:54:55,751
Now, if I go back to my form\nafter restarting Flask here
21896
20:54:55,751 --> 20:54:59,711
and I go ahead and click reload\non my form and type in nothing
21897
20:54:59,711 --> 20:55:03,342
and click Submit, now the\nbrowser is going to yell at me.
21898
20:55:03,342 --> 20:55:05,502
But just as a teaser\nfor something we'll be
21899
20:55:05,501 --> 20:55:08,231
doing in the next problem set\nin terms of error checking
21900
20:55:08,232 --> 20:55:14,771
you should never, ever, ever rely on\n
21901
20:55:14,771 --> 20:55:19,301
Because we know, from last week, that\n
21902
20:55:19,301 --> 20:55:21,682
and let me poke around the HTML here.
21903
20:55:21,682 --> 20:55:24,042
Let me go into the body, the form.
21904
20:55:24,042 --> 20:55:26,661
OK, you say required,\nI say not required.
21905
20:55:26,661 --> 20:55:29,721
You can just delete what's\nin the DOM, in the browser
21906
20:55:29,721 --> 20:55:32,682
and now I can go ahead\nand submit this form.
21907
20:55:32,682 --> 20:55:34,271
And it appears to be broken.
21908
20:55:34,271 --> 20:55:37,461
Not a big deal with a silly little\n
21909
20:55:37,461 --> 20:55:40,512
But if you're trying to\nrequire that humans actually
21910
20:55:40,512 --> 20:55:43,902
provide input that is necessary for\n
21911
20:55:43,902 --> 20:55:49,452
you don't want to trust that the HTML\n
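One possible shape for a server-side check is sketched below. The function name, the status codes chosen, and the messages are all assumptions for illustration; the point is only that the server re-validates input regardless of what the HTML asked the browser to enforce.

```python
def greeting_for(args):
    # args stands in for the parsed URL parameters (a plain dict here).
    name = args.get("name", "").strip()
    if not name:
        # The browser's "required" attribute can be deleted by the user,
        # so the server makes its own decision about missing input.
        return 400, "missing name"
    return 200, f"hello, {name}"

print(greeting_for({"name": "David"}))
print(greeting_for({"name": ""}))
print(greeting_for({}))
```

This handles both the empty value and the absent key, which the default-value approach alone does not.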
21912
20:55:49,452 --> 20:55:52,332
All right, any questions,\nthen, on this particular app
21913
20:55:52,331 --> 20:55:56,211
before we add another feature here?
21914
20:55:59,947 --> 20:56:01,911
AUDIENCE: Do you guys [INAUDIBLE].
21915
20:56:04,255 --> 20:56:05,422
DAVID: Sorry, little louder.
21916
20:56:14,270 --> 20:56:15,812
DAVID: Would it be a problem if what?
21917
20:56:15,812 --> 20:56:17,285
AUDIENCE: You have to [INAUDIBLE].
21918
20:56:22,122 --> 20:56:24,521
What you should really do is something\n
21919
20:56:24,521 --> 20:56:26,092
where I'm going to start\nerror checking things.
21920
20:56:26,092 --> 20:56:27,792
So let me wave my hands at\nthat and propose that we'll
21921
20:56:27,792 --> 20:56:29,259
solve this better in just a bit.
21922
20:56:29,259 --> 20:56:31,301
But it's not bad to do\nwhat I just did here, it's
21923
20:56:31,301 --> 20:56:34,241
only going to handle one of the\n
21924
20:56:35,751 --> 20:56:38,051
All right, so even though\nthis is new to most of us
21925
20:56:38,051 --> 20:56:43,391
here, consider index.html, my first\n
21926
20:56:45,551 --> 20:56:48,783
What might be arguably badly designed?
21927
20:56:48,783 --> 20:56:50,741
Even though this might\nbe the first time you've
21928
20:56:50,741 --> 20:56:54,461
ever touched web programming like this.
21929
20:56:54,461 --> 20:57:01,902
What's bad or dumb about this\n
21930
20:57:01,902 --> 20:57:05,831
And there's a reason, too, that I bored\n
21931
20:57:06,642 --> 20:57:11,415
AUDIENCE: [INAUDIBLE] you said,\n
21932
20:57:11,414 --> 20:57:13,081
DAVID: Yeah, there's so much repetition.
21933
20:57:13,081 --> 20:57:16,322
I mean, it was deliberately tedious\n
21934
20:57:16,322 --> 20:57:19,351
The doc type, the HTML tag,\nthe head tag, the title tag.
21935
20:57:19,351 --> 20:57:21,991
And little things did change\nalong the way, like the title
21936
20:57:21,991 --> 20:57:24,031
and certainly, the content of the body.
21937
20:57:24,032 --> 20:57:27,302
But so much of this, I\nmean, almost all of the page
21938
20:57:27,301 --> 20:57:30,599
is a copy of itself in multiple files.
21939
20:57:30,599 --> 20:57:33,932
And God forbid we have a third template,\n
21940
20:57:35,072 --> 20:57:37,441
This is going to get very\ntedious very quickly.
21941
20:57:37,441 --> 20:57:39,970
And suppose you want to\nchange something in one place
21942
20:57:39,970 --> 20:57:43,262
you're going to have to change it now in\n
21943
20:57:43,961 --> 20:57:46,741
So just like in\nprogramming more generally
21944
20:57:46,741 --> 20:57:49,230
we have this ability to\nfactor out commonalities.
21945
20:57:49,230 --> 20:57:51,480
So do you in the context\nof web programming
21946
20:57:51,480 --> 20:57:54,271
and specifically\ntemplating, have the ability
21947
20:57:54,271 --> 20:57:56,379
to factor out all of\nthose commonalities.
21948
20:57:56,379 --> 20:57:58,171
The syntax is going to\nbe a little curious
21949
20:57:58,171 --> 20:58:01,092
but it functionally is\npretty straightforward.
21950
20:58:02,292 --> 20:58:06,301
Let me go ahead and copy\nthe contents of index.html.
21951
20:58:06,301 --> 20:58:09,182
Let me go into my templates\ndirectory and code a file that
21952
20:58:09,182 --> 20:58:12,131
by default, is called layout.html.
21953
20:58:12,131 --> 20:58:15,751
And let me go ahead, and per\nyour answer, copy all of those
21954
20:58:15,751 --> 20:58:18,282
commonalities into\nthis file now instead.
21955
20:58:18,282 --> 20:58:21,151
So here I have a file\ncalled layout.html.
21956
20:58:21,150 --> 20:58:26,311
I don't want to give every page the same\n
21957
20:58:26,312 --> 20:58:27,782
I'm going to call everything hello.
21958
20:58:27,782 --> 20:58:30,662
But in the body of the page,\nwhat I'm going to do here is just
21959
20:58:30,661 --> 20:58:35,262
have a placeholder for actual\ncontents that do change.
21960
20:58:35,262 --> 20:58:38,131
So in this layout, I'm\ngoing to go ahead in here
21961
20:58:38,131 --> 20:58:42,822
and just put in the body of my\npage, how about this syntax?
21962
20:58:44,372 --> 20:58:48,782
Block body, and then percent\nsign close curly brace.
21963
20:58:48,782 --> 20:58:50,972
And then I'm going to do end block.
21964
20:58:50,971 --> 20:58:55,891
So a curious syntax here, but\nthis is more template syntax.
21965
20:58:55,892 --> 20:58:59,222
The other template syntax we saw\n
21966
20:58:59,221 --> 20:59:01,051
That's for just plugging in values.
21967
20:59:01,051 --> 20:59:05,311
There's this other syntax with Flask\n
21968
20:59:05,312 --> 20:59:10,112
brace, a percent sign, and then some\n
21969
20:59:10,892 --> 20:59:13,382
And this one's a little weird\nbecause there's literally
21970
20:59:13,381 --> 20:59:16,721
nothing between the close curly\nand the open curly brace here.
21971
20:59:16,721 --> 20:59:19,201
But let's see what this can do for us.
21972
20:59:19,202 --> 20:59:25,682
Let me now go into my index.html, which\n
21973
20:59:25,682 --> 20:59:28,952
from, and let me focus on\nwhat is minimally different.
21974
20:59:28,952 --> 20:59:33,881
The only thing that's really different\n
21975
20:59:33,881 --> 20:59:37,682
So let me go ahead and just cut\nthat form out to my clipboard.
21976
20:59:37,682 --> 20:59:40,441
Let me change the first\nline of index.html
21977
20:59:40,441 --> 20:59:46,741
to say this file is going\nto extend layout.html
21978
20:59:46,741 --> 20:59:48,932
and notice I'm using\nthe curly braces again.
21979
20:59:48,932 --> 20:59:52,381
And this file is going to\nhave its own body block
21980
20:59:52,381 --> 20:59:57,301
inside of which is just\nthe HTML that I actually
21981
20:59:57,301 --> 20:59:59,792
want to make specific to this page.
21982
20:59:59,792 --> 21:00:02,051
And I'll keep my indentation\nnice and neat here.
21983
21:00:02,051 --> 21:00:03,542
And let's consider what I've done.
21984
21:00:03,542 --> 21:00:05,971
This is starting to look\nweird fast, and this is now
21985
21:00:05,971 --> 21:00:09,871
a mix of HTML with templating code.
21986
21:00:09,872 --> 21:00:15,961
Index.html, first line now says, hey,\n
21987
21:00:17,072 --> 21:00:20,402
These next lines, three through\n10, say, hey, Flask, here
21988
21:00:20,402 --> 21:00:23,732
is what I consider my body block to be.
21989
21:00:23,732 --> 21:00:27,362
Plug this into the layout placeholder.
21990
21:00:27,361 --> 21:00:33,611
Therefore, so if I now go back\nto layout.html, and layout.html
21991
21:00:33,611 --> 21:00:35,671
it's almost all HTML by contrast.
21992
21:00:35,672 --> 21:00:38,757
But there is this placeholder, and\n
21993
21:00:39,631 --> 21:00:41,402
If I want to put a\ndefault value, I could
21994
21:00:41,402 --> 21:00:45,002
put a default value there just in case\n
21995
21:00:45,001 --> 21:00:47,171
But in general, that's\nnot going to be relevant.
21996
21:00:47,172 --> 21:00:50,882
So this is just a placeholder,\n
21997
21:00:50,881 --> 21:00:54,491
plug in the page-specific\ncontent right here.
21998
21:00:54,491 --> 21:00:58,262
So if I go now into greet.html,\nthis one's even easier.
21999
21:00:58,262 --> 21:01:01,262
I'm going to cut this content\nand get rid of everything else.
22000
21:01:01,262 --> 21:01:06,842
Greet.html, too, is going to extend\n
22001
21:01:06,842 --> 21:01:13,512
and then I'm going to have my body block\n
22002
21:01:13,512 --> 21:01:15,961
And then I'm going to go\nahead and end that block here.
22003
21:01:15,961 --> 21:01:18,812
These are not HTML tags,\nthis is not HTML syntax.
22004
21:01:18,812 --> 21:01:22,741
Technically, the syntax we keep\nseeing with the curly braces
22005
21:01:22,741 --> 21:01:28,982
and these now curly braces with percent\n
22006
21:01:28,982 --> 21:01:33,812
J-I-N-J-A, which is a language,\nthat some humans invented
22007
21:01:33,812 --> 21:01:35,732
for this purpose of templating.
22008
21:01:35,732 --> 21:01:37,952
And the people who\ninvented Flask decided
22009
21:01:37,952 --> 21:01:40,052
we're not going to come\nup with our own syntax
22010
21:01:40,051 --> 21:01:44,066
we're going to use these other\n
22011
21:01:44,066 --> 21:01:46,441
So again, there starts to be\nat this point in the course
22012
21:01:46,441 --> 21:01:50,371
and really in computing, a lot of\n
22013
21:01:50,952 --> 21:01:54,842
So Flask is using this syntax, but\n
22014
21:01:57,032 --> 21:02:01,832
All right, so now\nindex.html is half HTML
22015
21:02:01,831 --> 21:02:04,261
half templating code, Jinja syntax.
22016
21:02:04,262 --> 21:02:08,012
Greet.html is almost all\nJinja syntax, no tags even
22017
21:02:08,012 --> 21:02:11,582
but because they both\nextend layout.html
22018
21:02:11,581 --> 21:02:14,911
now I think I've improved\nthe design of this thing.
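The design idea, one shared skeleton plus small page-specific pieces, can be sketched loosely in plain Python with `string.Template`. This is only an analogy under stated assumptions: the `$body` placeholder, the constants, and the two render functions are hypothetical, and Jinja's extends and block syntax is not being reproduced here.

```python
import html
from string import Template

# Shared page skeleton, written once (loosely analogous to layout.html).
LAYOUT = Template("""<!DOCTYPE html>
<html lang="en">
    <head>
        <title>hello</title>
    </head>
    <body>
        $body
    </body>
</html>""")

# Each page supplies only what differs (loosely analogous to a body block).
INDEX_BODY = (
    '<form action="/greet" method="get">'
    '<input name="name" type="text">'
    '<button type="submit">Submit</button>'
    '</form>'
)
GREET_BODY = Template("hello, $name")

def render_index():
    return LAYOUT.substitute(body=INDEX_BODY)

def render_greet(name):
    # Escape user input so it is shown as text rather than treated as HTML.
    return LAYOUT.substitute(body=GREET_BODY.substitute(name=html.escape(name)))

print(render_index())
print(render_greet("David"))
```

A change to the skeleton, such as a new title or a stylesheet link, now happens in one place and shows up on every page.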
22019
21:02:14,911 --> 21:02:18,941
If I go back to app.py, none\nof this really needs to change.
22020
21:02:18,941 --> 21:02:22,111
I don't change my templates\nto mention layout.html
22021
21:02:22,111 --> 21:02:25,621
that's already implicit in the fact\n
22022
21:02:25,622 --> 21:02:28,741
So now if I go ahead and\nopen my terminal window
22023
21:02:28,741 --> 21:02:32,822
go back to the same folder\nas app.py and do Flask run
22024
21:02:32,822 --> 21:02:35,732
all right, my application\nis running on port 5000.
22025
21:02:35,732 --> 21:02:39,392
Let me now go back to the / route\nin my browser and hit Enter
22026
21:02:40,682 --> 21:02:44,252
And just as a little check, let\nme view the source of the page
22027
21:02:45,932 --> 21:02:48,361
And there's all of the code.
22028
21:02:48,361 --> 21:02:51,421
No mention of Jinja, no curly\nbraces, no percent signs.
22029
21:02:52,096 --> 21:02:54,721
It's not quite pretty printed in\nthe same way, but that's fine.
22030
21:02:54,721 --> 21:02:57,263
Because now, we're starting to\ndynamically generate websites.
22031
21:02:57,263 --> 21:03:00,512
And by that, I mean this isn't\n
22032
21:03:01,111 --> 21:03:03,871
If it's indented in the\nsource code version
22033
21:03:03,872 --> 21:03:05,911
doesn't matter what the\nbrowser really sees.
22034
21:03:05,911 --> 21:03:08,402
Let me now go ahead and type\nin my name, click Submit.
22035
21:03:08,402 --> 21:03:10,051
I should see, yep, hello, David.
22036
21:03:10,051 --> 21:03:12,152
Let me go ahead and view\nthe source of this page.
22037
21:03:12,152 --> 21:03:16,211
And we'll see almost the same\n
22038
21:03:16,211 --> 21:03:19,741
So this is, now, web programming\nin the literal sense.
22039
21:03:19,741 --> 21:03:23,042
I did not hard code a page that says\n
22040
21:03:24,032 --> 21:03:27,752
I hardcoded a page that has a\ntemplate with a placeholder
22041
21:03:27,751 --> 21:03:31,471
and now I'm using actual\nlogic, some code in app.py
22042
21:03:31,471 --> 21:03:37,751
to actually tell the server\nwhat to send to the browser.
22043
21:03:37,751 --> 21:03:42,901
All right, any questions,\nthen, on where we're at here?
22044
21:03:42,902 --> 21:03:45,271
This is now a web application.
22045
21:03:45,271 --> 21:03:49,051
Simple though it is, it's\nno longer just a web site.
22046
21:03:49,922 --> 21:03:54,521
AUDIENCE: Is what we did just better\n
22047
21:03:54,521 --> 21:03:57,232
DAVID: Is it better for\ndesign or for memory?
22048
21:03:57,732 --> 21:03:59,922
It's definitely better\nfor design because, truly
22049
21:03:59,922 --> 21:04:02,112
if we had a third page,\nfourth page, I would really
22050
21:04:02,111 --> 21:04:03,911
start just resorting to copy paste.
22051
21:04:03,911 --> 21:04:07,031
And as you saw with home page,\noften, in the head of your page
22052
21:04:07,032 --> 21:04:10,632
you might want to include some CSS\n
22053
21:04:10,631 --> 21:04:13,122
You might want to have\nother information up there.
22054
21:04:13,122 --> 21:04:16,754
If you had to upgrade the version of\n
22055
21:04:16,754 --> 21:04:18,461
so you want to change\none of those lines
22056
21:04:18,461 --> 21:04:21,851
you would literally have to go into\n
22057
21:04:26,501 --> 21:04:30,731
Theoretically, the server, because\n
22058
21:04:30,732 --> 21:04:33,394
it can theoretically do some\noptimizations underneath the hood.
22059
21:04:33,394 --> 21:04:36,101
Flask is probably doing that, but\n
22060
21:04:36,101 --> 21:04:38,021
We're using it in\ndevelopment mode, which
22061
21:04:38,021 --> 21:04:41,592
means it's typically\nreloading things each time.
22062
21:04:41,592 --> 21:04:44,802
Other questions on this application?
22063
21:04:48,842 --> 21:04:54,461
All right, so let me ask a question,\n
22064
21:04:54,461 --> 21:04:56,641
What about the implications for privacy?
22065
21:04:56,642 --> 21:05:02,192
Why is this maybe not the best design\n
22066
21:05:05,066 --> 21:05:06,491
AUDIENCE: For some reason,\nyou wanted your name.
22067
21:05:06,491 --> 21:05:08,682
So these private people\ncould just look at the URL.
22068
21:05:09,182 --> 21:05:11,042
I mean, if you have a\nnosy sibling or roommate
22069
21:05:11,042 --> 21:05:13,051
and they have access to\nyour laptop and they just
22070
21:05:13,051 --> 21:05:15,301
go trolling through your\nautocomplete or your history
22071
21:05:15,301 --> 21:05:18,331
like, literally what you typed into\n
22072
21:05:18,331 --> 21:05:21,331
Not a big deal if it's your name, but\n
22073
21:05:21,331 --> 21:05:23,551
card or anything else\nthat's mildly sensitive
22074
21:05:23,551 --> 21:05:26,941
you probably don't want it\nending up in the URL at all
22075
21:05:26,941 --> 21:05:29,402
even if you're in\nincognito mode or whatnot.
22076
21:05:29,402 --> 21:05:34,152
You just don't want to expose yourself\n
22077
21:05:34,152 --> 21:05:35,941
So perhaps, we can do better than that.
22078
21:05:35,941 --> 21:05:38,311
And fortunately, this one\nis actually an easy change.
22079
21:05:38,312 --> 21:05:43,502
Let me go into my\nindex.html where my form is.
22080
21:05:43,501 --> 21:05:48,271
And in my form, I can just change\nthe method from GET to POST.
22081
21:05:48,271 --> 21:05:50,911
It's still going to send key\nvalue pairs to the server
22082
21:05:50,911 --> 21:05:52,951
but it's not going to\nput them in the URL.
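A rough sketch of the difference, using only Python's standard library: the same key value pair is encoded the same way in both cases, and only its location changes. The request lines are simplified illustrations, not complete HTTP messages.

```python
from urllib.parse import urlencode

# The form field the user filled in.
fields = {"name": "David"}
encoded = urlencode(fields)

# GET: the pairs are appended to the URL after a question mark,
# so they end up in history and autocomplete.
get_request_line = f"GET /greet?{encoded} HTTP/1.1"

# POST: the URL stays bare and the same pairs travel in the request body.
post_request_line = "POST /greet HTTP/1.1"
post_body = encoded

print(get_request_line)
print(post_request_line)
print(post_body)
```

POST only keeps the values out of the URL. It does not encrypt them; that is what HTTPS is for.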
22083
21:05:52,952 --> 21:05:56,256
The upside of which is that we\ncan assuage this privacy concern
22084
21:05:56,256 --> 21:05:58,381
but I'm going to have to\nmake one other change too.
22085
21:05:58,381 --> 21:06:02,851
Because now, if I go ahead and run\n
22086
21:06:02,851 --> 21:06:07,331
and I now reload the form to make\n
22087
21:06:07,331 --> 21:06:10,171
You should be in the habit\nof going to View, Developer
22088
21:06:10,172 --> 21:06:12,783
View Source, or Developer\nTools just to make sure
22089
21:06:12,783 --> 21:06:15,241
that what you're seeing in your\nbrowser is what you intend.
22090
21:06:15,241 --> 21:06:18,062
And yes, I do see what I wanted.
22091
21:06:19,652 --> 21:06:22,381
Let me go ahead and type\nin David and click Submit.
22092
21:06:22,381 --> 21:06:24,872
Now I get a different error.
22093
21:06:24,872 --> 21:06:28,902
This one is HTTP 405,\nmethod not allowed.
22094
21:06:30,001 --> 21:06:34,081
Well, in my Flask application, I've\n
22095
21:06:34,081 --> 21:06:37,601
One of which is for slash,\nand that worked fine.
22096
21:06:37,601 --> 21:06:40,751
One of which is for /greet,\nand that used to work fine.
22097
21:06:40,751 --> 21:06:46,301
But apparently, what Flask is doing\n
22098
21:06:46,301 --> 21:06:51,512
So if I want to change this route to\n
22099
21:06:51,512 --> 21:06:56,682
quote unquote "POST" inside\nof this parameter here.
22100
21:06:56,682 --> 21:07:01,592
So that now, I can actually\nsupport POST, not just GET.
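A small sketch of the 405 behavior, assuming Flask is installed. The route name /get-only is hypothetical and exists only for comparison; Flask's built-in test client stands in for the browser.

```python
from flask import Flask

app = Flask(__name__)


# No methods argument: the route answers GET (plus HEAD and OPTIONS) only.
@app.route("/get-only")
def get_only():
    return "ok"


# Explicitly allow POST on this route.
@app.route("/greet", methods=["POST"])
def greet():
    return "ok"


client = app.test_client()

# POSTing to a GET-only route is rejected with 405 Method Not Allowed.
get_only_status = client.post("/get-only").status_code

# POSTing to the route that lists POST succeeds.
greet_status = client.post("/greet").status_code

# The reverse also holds: a POST-only route rejects GET.
greet_get_status = client.get("/greet").status_code

print(get_only_status, greet_status, greet_get_status)
```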
22101
21:07:01,592 --> 21:07:08,642
And if I now restart Flask, so Flask\n
22102
21:07:08,642 --> 21:07:11,282
Let me go back one screen\nto the form, reload
22103
21:07:11,282 --> 21:07:13,202
the page just to make\nsure I have the latest
22104
21:07:13,202 --> 21:07:14,785
even though nothing there has changed.
22105
21:07:14,785 --> 21:07:18,392
Type David and click Submit now,\nnow I should see hello, world.
22106
21:07:18,392 --> 21:07:25,142
Notice that I'm at the greet route,\n
22107
21:07:28,021 --> 21:07:30,331
All right, so that's an\ninteresting takeaway.
22108
21:07:30,331 --> 21:07:35,141
It's a simple change, but whereas GET\n
22109
21:07:35,142 --> 21:07:37,382
But it still works so long\nas you tweak the backend
22110
21:07:37,381 --> 21:07:41,971
to look for a POST request, which\n
22111
21:07:41,971 --> 21:07:44,521
It's not going to be as simple\nas looking at the URL itself.
22112
21:07:44,521 --> 21:07:46,322
Why shouldn't we just always use POST?
22113
21:07:49,491 --> 21:07:53,751
Why not use POST everywhere?
22114
21:07:55,342 --> 21:07:58,822
Right, because it's obnoxious to\n
22115
21:07:58,822 --> 21:08:02,084
if you're leaving these little\n
22116
21:08:02,084 --> 21:08:04,042
can poke around and see\nwhat you've been doing.
22117
21:08:09,138 --> 21:08:11,523
AUDIENCE: You're supposed\nto duplicate [INAUDIBLE].
22118
21:08:15,532 --> 21:08:18,792
I mean, if you get rid of GET\n
22119
21:08:18,792 --> 21:08:22,194
your history, your autocomplete,\ngets pretty less useful.
22120
21:08:22,194 --> 21:08:25,152
Because none of the information is\n
22121
21:08:25,152 --> 21:08:26,569
go through the menu and hit Enter.
22122
21:08:26,569 --> 21:08:28,182
You'd have to re-fill out the form.
22123
21:08:28,182 --> 21:08:30,389
And there's this other\nsymptom that you can see here.
22124
21:08:30,389 --> 21:08:33,014
Let me zoom out and let\nme just reload this page.
22125
21:08:33,014 --> 21:08:34,932
Notice that you'll get\nthis warning, and it'll
22126
21:08:34,932 --> 21:08:39,551
look different in Safari and Firefox\n
22127
21:08:40,342 --> 21:08:43,932
So your browser might remember what\n
22128
21:08:43,932 --> 21:08:45,711
but just while you're on the page.
22129
21:08:45,711 --> 21:08:50,562
And this is in contrast to GET,\nwhere the state, the information
22130
21:08:50,562 --> 21:08:53,957
like key value pairs, is\nembedded in the URL itself.
22131
21:08:53,956 --> 21:08:56,081
And if you looked at an\nemail I sent earlier today
22132
21:08:56,081 --> 21:08:59,277
I deliberately linked\nto https://www.google.com
22133
21:08:59,277 --> 21:09:00,881
/search?q=what+time+is+it.
22134
21:09:06,971 --> 21:09:11,471
This is, by definition, a GET\nrequest when you click on it.
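To make the key value pair in that link concrete, here is a short sketch using Python's standard library to pull the URL apart. Nothing here contacts Google; it only parses the string.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://www.google.com/search?q=what+time+is+it"

# Split the URL into scheme, host, path, and query string.
parts = urlsplit(url)

# Decode the query string into a dictionary of key -> list of values.
# Plus signs in a query string stand for spaces.
params = parse_qs(parts.query)

print(parts.path)
print(params)
```

Because the whole request is described by the URL, it can be bookmarked, emailed, or clicked, which is not possible with a POST.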
22135
21:09:11,471 --> 21:09:14,861
Because it's going to grab the\ninformation, the key value pair
22136
21:09:14,861 --> 21:09:18,171
from the URL, send it to Google\n
22137
21:09:18,172 --> 21:09:20,592
And the reason I sent this\nvia email earlier was I
22138
21:09:20,592 --> 21:09:23,812
wanted people to very quickly be able\n
22139
21:09:23,812 --> 21:09:27,732
And so I can sort of automate the process\n
22140
21:09:27,732 --> 21:09:29,952
but that you induce when\nyou click that link.
22141
21:09:29,952 --> 21:09:35,381
If Google did not support GET, they\n
22142
21:09:35,381 --> 21:09:38,171
is send you all to this\nURL which, unfortunately
22143
21:09:39,762 --> 21:09:42,972
I would have had to add to my\n
22144
21:09:44,812 --> 21:09:46,522
So it's just bad for usability.
22145
21:09:46,521 --> 21:09:49,721
So there, too, we might have design\n
22146
21:09:49,721 --> 21:09:53,261
but also the design when it comes\nto the user experience, or UX
22147
21:09:53,262 --> 21:09:54,941
as a computer scientist would call it.
22148
21:09:54,941 --> 21:09:58,211
Just in terms of what you want\nto optimize for, ultimately.
22149
21:09:58,211 --> 21:10:00,081
So GET and POST both have their roles.
22150
21:10:00,081 --> 21:10:02,581
It depends on what kind of\nfunctionality you want to provide
22151
21:10:02,581 --> 21:10:06,911
and what kind of sensitivity\nthere might be around it.
22152
21:10:06,911 --> 21:10:10,122
All right, any questions, then, on\n
22153
21:10:10,122 --> 21:10:13,722
Super simple, just gets someone's\nname and prints it back out.
22154
21:10:13,721 --> 21:10:16,031
But we now have all\nthe plumbing with which
22155
21:10:16,032 --> 21:10:19,032
to create really most anything we want.
22156
21:10:21,820 --> 21:10:24,112
All right, let's go ahead\nand take a five minute break.
22157
21:10:24,111 --> 21:10:27,951
And when we come back, we'll add to\n
22158
21:10:30,202 --> 21:10:32,182
And recall that the last\nthing we just changed
22159
21:10:32,182 --> 21:10:34,941
was the route to use\nPOST instead of GET.
22160
21:10:34,941 --> 21:10:37,851
So gone is my name and\nany value in the URL.
22161
21:10:37,851 --> 21:10:44,031
But there was a subtle bug or change\n
22162
21:10:44,032 --> 21:10:47,062
I did type David into the\nform and I did click Submit
22163
21:10:47,062 --> 21:10:50,452
and yet here it is\nsaying hello comma world.
22164
21:10:50,452 --> 21:10:53,491
So that seems to be\nbroken all of a sudden
22165
21:10:53,491 --> 21:10:56,572
even though we added support for POST.
22166
21:10:56,572 --> 21:10:58,792
But something must be wrong.
22167
21:10:58,792 --> 21:11:01,342
Logically, it must be the case here.
22168
21:11:01,342 --> 21:11:05,461
Intuitively, that if I'm seeing\n
22169
21:11:07,161 --> 21:11:10,101
It must be that it's\nnot seeing a key called
22170
21:11:10,101 --> 21:11:14,482
name in request.args, which is this.
22171
21:11:14,482 --> 21:11:16,851
Gives you access to\neverything after the ? in the URL.
22172
21:11:16,851 --> 21:11:19,471
That's because there's this\nother thing we should know about
22173
21:11:19,471 --> 21:11:21,741
which is not just\nrequest.args but request.form.
22174
21:11:21,741 --> 21:11:26,211
These are horribly named, but\nrequest.args is for GET requests
22175
21:11:26,211 --> 21:11:28,804
request.form is for POST requests.
22176
21:11:28,804 --> 21:11:31,012
Otherwise, they're pretty\nmuch functionally the same.
22177
21:11:31,012 --> 21:11:33,592
But the onus is on you,\nthe user or the programmer
22178
21:11:33,592 --> 21:11:35,832
to make sure you're using the right one.
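A sketch of that mix-up, assuming Flask is installed. The two route names are hypothetical and exist only to compare the behaviors side by side; the "world" default mirrors the greeting seen in the lecture.

```python
from flask import Flask, request

app = Flask(__name__)


# Reads from the URL's query string, which is empty for a POSTed form.
@app.route("/greet-args", methods=["POST"])
def greet_args():
    name = request.args.get("name", "world")
    return f"hello, {name}"


# Reads from the request body, which is where POSTed form fields live.
@app.route("/greet-form", methods=["POST"])
def greet_form():
    name = request.form.get("name", "world")
    return f"hello, {name}"


client = app.test_client()

# The same form submission is sent to both routes.
from_args = client.post("/greet-args", data={"name": "David"}).get_data(as_text=True)
from_form = client.post("/greet-form", data={"name": "David"}).get_data(as_text=True)

print(from_args)
print(from_form)
```

The first route falls back to its default because request.args never sees the posted field.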
22179
21:11:35,831 --> 21:11:38,671
So I think if we want\nto get rid of the world
22180
21:11:38,672 --> 21:11:41,002
and actually see what\nI, the human, typed in
22181
21:11:41,001 --> 21:11:45,201
I think I can just change\nrequest.args to request.form.
22182
21:11:45,202 --> 21:11:47,782
Still dot get, still\nquote unquote "name"
22183
21:11:47,782 --> 21:11:52,252
and now, if I go ahead and rerun\nFlask in my terminal window
22184
21:11:52,251 --> 21:11:54,776
go back to my browser, go\nback to-- and actually
22185
21:11:54,777 --> 21:11:56,152
I won't even go back to the form.
22186
21:11:56,152 --> 21:11:59,392
I will literally just reload,\nCommand R or Control R
22187
21:11:59,392 --> 21:12:02,572
and what this warning is\nsaying is it's going to submit
22188
21:12:02,572 --> 21:12:05,001
the same information to the website.
22189
21:12:05,001 --> 21:12:08,901
When I click Continue, now I\nshould see hello comma David.
22190
21:12:08,902 --> 21:12:11,182
So again, you, too, are\ngoing to encounter, probably
22191
21:12:11,182 --> 21:12:12,801
all these little subtleties.
22192
21:12:12,801 --> 21:12:15,801
But if you focus on, really, the\nfirst principles of last week
22193
21:12:15,801 --> 21:12:18,682
like what is HTTP, how\ndoes a GET request work
22194
21:12:18,682 --> 21:12:20,786
how does a POST request\nwork now, you should
22195
21:12:20,786 --> 21:12:22,911
have a lot of the mental\nbuilding blocks with which
22196
21:12:22,911 --> 21:12:25,101
to solve problems like these.
22197
21:12:25,101 --> 21:12:28,342
And let me give you one other mental\n
22198
21:12:28,342 --> 21:12:32,662
This framework called Flask is just an\n
22199
21:12:32,661 --> 21:12:36,262
that all implement the same\nparadigm, the same way of thinking
22200
21:12:36,262 --> 21:12:38,482
and the same way of\nprogramming applications.
22201
21:12:38,482 --> 21:12:41,932
And that's known as MVC,\nmodel view controller.
22202
21:12:41,932 --> 21:12:46,432
And here's a very simple diagram\n
22203
21:12:46,432 --> 21:12:48,301
and I have been implementing thus far.
22204
21:12:48,301 --> 21:12:51,292
And actually, this is more than\n
22205
21:12:51,292 --> 21:12:55,123
In app.py is what a programmer\n
22206
21:12:55,123 --> 21:12:57,081
That's the code you're\nwriting, the so-called
22207
21:12:57,081 --> 21:13:01,131
business logic that makes all of the\n
22208
21:13:01,131 --> 21:13:03,471
what values to show, and so forth.
22209
21:13:03,471 --> 21:13:09,861
In layout.html, index.html, greet.html\n
22210
21:13:09,861 --> 21:13:12,651
that is the visualizations\nthat the human actually
22211
21:13:14,271 --> 21:13:18,561
Those things are dumb, they pretty\n
22212
21:13:18,562 --> 21:13:20,991
All of the hard work is done in app.py.
22213
21:13:20,991 --> 21:13:26,001
So controller, AKA app.py, is where\n
22214
21:13:26,001 --> 21:13:31,491
And in your view is where your HTML and\n
22215
21:13:31,491 --> 21:13:35,512
the curly braces, the curly braces\n
22216
21:13:35,512 --> 21:13:39,411
We haven't added an M\nto MVC yet, the model. That's
22217
21:13:39,411 --> 21:13:42,322
going to refer to things\nlike CSV files or databases.
22218
21:13:42,322 --> 21:13:46,116
The model is where you keep\nactual data, typically long term.
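A framework-free sketch of the three roles, in plain Python. Every name here is hypothetical; in the lecture's setup the controller is app.py, the views are the HTML templates, and the model would be a CSV file or database rather than a list in memory.

```python
# Model: where the data lives (an in-memory list standing in for a database).
MODEL = []


# View: turns values into what the human sees; it makes no decisions.
def view(name):
    return f"<!DOCTYPE html><html><body>hello, {name}</body></html>"


# Controller: the business logic that decides what to store and what to show.
def controller(form):
    name = form.get("name") or "world"
    MODEL.append(name)
    return view(name)


page = controller({"name": "David"})
print(page)
print(MODEL)
```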
22219
21:13:46,116 --> 21:13:47,991
So we'll come back to\nthat, but this picture
22220
21:13:47,991 --> 21:13:52,312
where you have one of these-- each of\n
22221
21:13:52,312 --> 21:13:55,222
another is representative of\nhow a lot of frameworks work.
22222
21:13:55,221 --> 21:13:58,792
What we're teaching today, this week,\n
22223
21:13:58,792 --> 21:14:01,581
It's not really specific to Flask,\n
22224
21:14:01,581 --> 21:14:03,471
It really is a very\ncommon paradigm that you
22225
21:14:03,471 --> 21:14:08,211
could implement in Java, C sharp, or\n
22226
21:14:08,211 --> 21:14:12,595
All right, so let's now\npivot back to VS Code here.
22227
21:14:12,595 --> 21:14:14,512
Let me stop running\nFlask, and let me go ahead
22228
21:14:14,512 --> 21:14:19,922
and create a new folder altogether\n
22229
21:14:19,922 --> 21:14:24,592
And let me go ahead and create\na folder called FroshIMS
22230
21:14:24,592 --> 21:14:27,862
representing freshman intramural\n
22231
21:14:29,422 --> 21:14:32,601
And now I'm going to code an app.py.
22232
21:14:32,601 --> 21:14:36,351
And in anticipation, I'm going to\n
22233
21:14:36,351 --> 21:14:38,251
This one in the FroshIMS folder.
22234
21:14:38,251 --> 21:14:42,351
And then in my templates directory,\n
22235
21:14:42,351 --> 21:14:44,482
and I'm just going to\nget myself started here.
22236
21:14:46,221 --> 21:14:48,741
I'm just copying my layout\nfrom earlier because most
22237
21:14:48,741 --> 21:14:53,031
of my interesting work, this time, is\n
22238
21:14:53,032 --> 21:14:54,512
So what is it we're creating?
22239
21:14:54,512 --> 21:14:58,282
So literally, the very first\nthing I wrote as a web application
22240
21:14:58,282 --> 21:15:02,042
20 years ago, was a site that\nliterally looked like this.
22241
21:15:02,042 --> 21:15:04,131
So I was like a sophomore\nor junior at the time.
22242
21:15:04,131 --> 21:15:06,982
I'd taken CS50 and a\nfollow-on class only.
22243
21:15:06,982 --> 21:15:08,889
I had no idea how to do web programming.
22244
21:15:08,888 --> 21:15:11,721
Neither of those two courses taught\n
22245
21:15:11,721 --> 21:15:14,391
So I taught myself, at the\ntime, a language called Perl.
22246
21:15:14,392 --> 21:15:18,000
And I learned a little something about\n
22247
21:15:18,000 --> 21:15:20,542
can't even say googled enough,\nbecause Google didn't come out
22248
21:15:22,142 --> 21:15:26,432
Read enough online to figure out how to\n
22249
21:15:26,432 --> 21:15:29,282
on campus, first years,\ncould actually register
22250
21:15:29,282 --> 21:15:32,162
via a website for intramural sports.
22251
21:15:32,161 --> 21:15:35,161
Back in my day, you would\nliterally fill out a piece of paper
22252
21:15:35,161 --> 21:15:38,491
and then walk it across the yard to\n
22253
21:15:38,491 --> 21:15:40,861
slide it under the door\nof the Proctor or RA
22254
21:15:40,861 --> 21:15:43,381
and thus you were\nregistered for sports.
22255
21:15:46,532 --> 21:15:48,512
There was an internet,\njust wasn't really being
22256
21:15:48,512 --> 21:15:50,801
used much on campus or more generally.
22257
21:15:50,801 --> 21:15:54,301
So background images\nthat repeat infinitely
22258
21:15:54,301 --> 21:15:56,611
were in vogue, apparently, at the time.
22259
21:15:56,611 --> 21:15:58,951
All of this was like images\nthat I had to hand make
22260
21:15:58,952 --> 21:16:03,881
because we did not have the features\n
22261
21:16:03,881 --> 21:16:07,801
So it was really just HTML, and it was\n
22262
21:16:09,512 --> 21:16:12,301
And it was really just\nthe same building blocks
22263
21:16:12,301 --> 21:16:15,131
that we here already today now have.
22264
21:16:15,131 --> 21:16:17,702
So we'll get rid of all of\nthe imagery and focus more
22265
21:16:17,702 --> 21:16:19,952
on the functionality than the\naesthetics, but let's see
22266
21:16:19,952 --> 21:16:23,252
if we can whip up a web\napplication via which someone could
22267
21:16:23,251 --> 21:16:26,281
register for one such intramural sport.
22268
21:16:26,282 --> 21:16:29,972
So in app.py, let me go ahead and\nimport some familiar things now.
22269
21:16:29,971 --> 21:16:33,481
From Flask, let's import\ncapital Flask, which
22270
21:16:33,482 --> 21:16:36,482
is that function we need to kick\n
22271
21:16:36,482 --> 21:16:39,661
render_template, so we have the\n
22272
21:16:39,661 --> 21:16:42,122
those templates, and request\nso that we have the ability
22273
21:16:42,122 --> 21:16:44,850
to get at input from the human.
22274
21:16:44,850 --> 21:16:46,892
Let me go ahead and create\nthe application itself
22275
21:16:46,892 --> 21:16:49,302
using this magical incantation here.
22276
21:16:49,301 --> 21:16:56,141
And then let's go ahead and define a\n
22277
21:16:56,142 --> 21:16:58,022
I'm going to define a\nfunction called index.
22278
21:16:58,021 --> 21:17:00,661
But just to be clear, this\nfunction could be anything.
22279
21:17:00,661 --> 21:17:03,221
Foo, bar, baz, anything else.
22280
21:17:03,221 --> 21:17:05,281
But I tend to name\nthem in a manner that's
22281
21:17:05,282 --> 21:17:07,022
consistent with what\nthe route is called.
22282
21:17:07,021 --> 21:17:09,061
But you could call it\nanything you want, it's
22283
21:17:09,062 --> 21:17:12,601
just the function that will get\n
22284
21:17:12,601 --> 21:17:14,851
Now, let me go ahead here\nand just get things started.
22285
21:17:14,851 --> 21:17:18,294
Return, render template of index.html.
22286
21:17:18,294 --> 21:17:19,711
Just keep it simple, nothing more.
22287
21:17:19,711 --> 21:17:22,622
So there's nothing really\nFroshIMS-specific about this here
22288
21:17:22,622 --> 21:17:25,172
I just want to make sure I'm\ndoing everything correctly.
22289
21:17:25,172 --> 21:17:27,152
Meanwhile, I've got my layout.
22290
21:17:27,152 --> 21:17:31,562
OK, let me go ahead, and in my\ntemplates directory, code a file
22291
21:17:33,751 --> 21:17:40,051
And let's just do extends\nlayout.html at the top
22292
21:17:40,051 --> 21:17:42,152
just so that we get\nbenefit from that template.
22293
21:17:42,152 --> 21:17:44,012
And down here, I'm just\ngoing to say TODO.
22294
21:17:44,012 --> 21:17:47,012
Just so that I have something\ngoing on visually to make sure
22295
21:17:48,452 --> 21:17:51,782
In my FroshIMS directory,\nlet me do Flask run.
22296
21:17:51,782 --> 21:17:54,961
Let me now go back to my previous URL,\n
22297
21:17:54,961 --> 21:17:59,611
But now, I'm serving\nup the FroshIMS site.
22298
21:18:01,351 --> 21:18:04,622
That's because I\nscrewed up accidentally.
22299
21:18:04,622 --> 21:18:07,851
What did I do wrong in index.html?
22300
21:18:13,361 --> 21:18:16,229
This file extends layout.html, but--
22301
21:18:16,229 --> 21:18:17,771
AUDIENCE: You left out the block tag?
22302
21:18:18,271 --> 21:18:22,661
I forgot to tell Flask what\nto plug into that layout.
22303
21:18:22,661 --> 21:18:26,801
So I just need to say block body, and\n
22304
21:18:26,801 --> 21:18:28,841
or whatever I want to\neventually get around to.
22305
21:18:31,751 --> 21:18:34,091
OK, so now it looks ugly, more cryptic.
22306
21:18:34,092 --> 21:18:36,972
But this is, again, the\nessence of doing templating.
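A sketch of that mistake and its fix, using the jinja2 library directly so it runs without a templates directory. The template names broken.html and fixed.html and their contents are assumptions for illustration. In a template that extends another, text outside any block is simply not rendered.

```python
from jinja2 import DictLoader, Environment

TEMPLATES = {
    # The shared layout with one placeholder block.
    "layout.html": "<body>{% block body %}{% endblock %}</body>",
    # Extends the layout but never says which block the text belongs in.
    "broken.html": '{% extends "layout.html" %}\nTODO',
    # Same text, now wrapped in the block the layout expects.
    "fixed.html": (
        '{% extends "layout.html" %}\n'
        "{% block body %}TODO{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(TEMPLATES))

broken = env.get_template("broken.html").render()
fixed = env.get_template("fixed.html").render()

print(repr(broken))
print(repr(fixed))
```

The broken version renders the layout with an empty body, which matches the blank page seen in the browser.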
22307
21:18:36,971 --> 21:18:41,101
Let me now restart Flask up\nhere, let me go back to the page.
22308
21:18:41,733 --> 21:18:43,691
Crossing my fingers this\ntime, and there we go.
22309
21:18:44,081 --> 21:18:46,241
So it's not the application\nI want, but at least I
22310
21:18:46,241 --> 21:18:48,630
know I have some of the\nplumbing there by default.
22311
21:18:48,630 --> 21:18:50,922
All right, so if I want the\nuser to be able to register
22312
21:18:50,922 --> 21:18:53,112
for one of these sports,\nlet's enhance, now
22313
21:18:53,111 --> 21:18:56,051
index.html to actually\nhave a form that's
22314
21:18:56,051 --> 21:18:59,811
maybe got a dropdown menu for all of\n
22315
21:18:59,812 --> 21:19:01,991
So let me go into this template here.
22316
21:19:01,991 --> 21:19:05,171
And instead of TODO, let's\ngo ahead and give myself
22317
21:19:05,172 --> 21:19:08,922
how about an H1 tag that just says\n
22318
21:19:09,732 --> 21:19:13,032
How about a form tag\nthat's going to use POST
22319
21:19:13,032 --> 21:19:16,211
just because it's not really necessary\n
22320
21:19:17,292 --> 21:19:20,021
The action for that, how\nabout we plan to create
22321
21:19:20,021 --> 21:19:24,971
a register route so that we're sending\n
22322
21:19:24,971 --> 21:19:26,531
So we'll have to come back to that.
22323
21:19:26,532 --> 21:19:32,412
In here, let me go ahead and create,\n
22324
21:19:35,782 --> 21:19:37,824
How about a name equals\nname, because I'm
22325
21:19:37,823 --> 21:19:40,781
going to ask the student for their\n
22326
21:19:41,629 --> 21:19:43,211
And the type of this box will be text.
22327
21:19:43,211 --> 21:19:45,372
So this is pretty much\nidentical to before.
22328
21:19:45,372 --> 21:19:48,822
But if you've not seen this\nyet, let's create a select menu
22329
21:19:48,822 --> 21:19:51,042
a so-called dropdown menu in HTML.
22330
21:19:51,042 --> 21:19:54,551
And maybe the first option\nI want to be in there
22331
21:19:54,551 --> 21:19:58,331
is going to be, oh, how\nabout the current three
22332
21:19:58,331 --> 21:20:05,322
sports for the fall, which are\nbasketball, and another option
22333
21:20:05,322 --> 21:20:07,902
is going to be soccer,\nand a third option is
22334
21:20:07,902 --> 21:20:13,461
going to be ultimate frisbee for\n
22335
21:20:13,461 --> 21:20:14,922
So I've got those three options.
22336
21:20:16,062 --> 21:20:20,412
I haven't implemented my route yet,\n
22337
21:20:20,411 --> 21:20:23,682
to go back now and check\nif my form has reloaded.
22338
21:20:23,682 --> 21:20:26,419
So let me go ahead and\nstop and start Flask.
22339
21:20:26,419 --> 21:20:28,961
You'll see there's ways to\nautomate the process of restarting
22340
21:20:28,961 --> 21:20:31,122
the server that we'll do for\nyou for problem set nine
22341
21:20:31,122 --> 21:20:32,832
so you don't have to\nkeep stopping Flask.
22342
21:20:32,831 --> 21:20:36,822
Let me reload my index route\nand OK, it's not that pretty.
22343
21:20:39,762 --> 21:20:41,832
But it now has at least\nsome functionality
22344
21:20:41,831 --> 21:20:44,921
where I can type in my name\nand then type in the sport.
22345
21:20:44,922 --> 21:20:47,502
Now, I might be biasing\npeople toward basketball.
22346
21:20:47,501 --> 21:20:51,971
Like UX wise, user experience\nwise, it's obnoxious to precheck
22347
21:20:51,971 --> 21:20:53,572
basketball but not the others.
22348
21:20:53,572 --> 21:20:55,572
So there's some little\ntweaks we can make there.
22349
21:20:55,572 --> 21:20:57,822
Let me go back into index.html.
22350
21:20:57,822 --> 21:21:04,110
Let me create an empty option up here\n
22351
21:21:04,110 --> 21:21:05,652
going to have the name of any sports.
22352
21:21:05,652 --> 21:21:08,110
But it's just going to have a\nword I want the human to see
22353
21:21:08,110 --> 21:21:12,622
so I'm actually going to disable this\n
22354
21:21:12,622 --> 21:21:15,012
But I'm going to say sport up here.
22355
21:21:15,012 --> 21:21:18,491
And there's different ways to do this,\n
22356
21:21:22,422 --> 21:21:24,762
Creating a placeholder\nsport so that the user
22357
21:21:24,762 --> 21:21:26,862
sees something in the dropdown.
22358
21:21:26,861 --> 21:21:29,932
Let me go ahead and restart\nFlask, reload the page
22359
21:21:29,932 --> 21:21:31,932
and now it's just going\nto be marginally better.
22360
21:21:31,932 --> 21:21:34,122
Now you see sport that's\nchecked by default
22361
21:21:34,122 --> 21:21:37,092
but you have to check one of\nthese other ones ultimately.
22362
21:21:37,092 --> 21:21:38,482
All right, so that's pretty good.
22363
21:21:38,482 --> 21:21:40,692
So let me now type in David.
22364
21:21:40,691 --> 21:21:43,121
I'll register for ultimate frisbee.
22365
21:21:43,122 --> 21:21:46,512
OK, I definitely forgot something.
22366
21:21:48,702 --> 21:21:52,782
All right, so input type equals submit.
22367
21:21:52,782 --> 21:21:54,102
All right, let's put that in.
22368
21:21:57,544 --> 21:21:58,961
Submit could be a little prettier.
22369
21:21:58,961 --> 21:22:03,491
Recall that we can change some of\n
22370
21:22:03,491 --> 21:22:05,652
The value of this button\nshould be register, maybe
22371
21:22:05,652 --> 21:22:07,241
just to make things a little prettier.
22372
21:22:07,241 --> 21:22:10,456
Let me now reload the page and register.
22373
21:22:10,456 --> 21:22:13,331
All right, so now we really have\n
22374
21:22:13,331 --> 21:22:17,741
that I created some years ago to let\n
22375
21:22:17,741 --> 21:22:21,641
So let's go, now, and create maybe\n
22376
21:22:22,691 --> 21:22:25,686
And in here, if we want to\nallow the user to register
22377
21:22:25,687 --> 21:22:28,812
let's do a little bit of error checking\n
22378
21:22:28,812 --> 21:22:30,612
What could the user do wrong?
22379
21:22:30,611 --> 21:22:32,801
Because assume that they will.
22380
21:22:32,801 --> 21:22:34,631
One, they might not type their name.
22381
21:22:34,631 --> 21:22:36,505
Two, they might not choose a sport.
22382
21:22:36,505 --> 21:22:38,172
So they might just submit an empty form.
22383
21:22:38,172 --> 21:22:40,302
So that's two things we\ncould check for, just
22384
21:22:40,301 --> 21:22:43,721
so that we're not storing bogus\n
22385
21:22:43,721 --> 21:22:47,351
So let's create another\nroute called greet, /greet.
22386
21:22:47,351 --> 21:22:50,501
And then in this route, let's\ncreate a function called greet
22387
21:22:50,501 --> 21:22:52,661
but can be called anything we want.
22388
21:22:52,661 --> 21:22:56,054
And then let's go ahead, and in\n
22389
21:22:56,054 --> 21:22:57,221
and validate the submission.
22390
21:22:57,221 --> 21:22:59,351
So a little comment to myself here.
22391
21:22:59,351 --> 21:23:08,031
How about if there is not a\nrequest.form.get("name") value
22392
21:23:08,032 --> 21:23:10,211
so that is if that\nfunction returns nothing
22393
21:23:10,211 --> 21:23:14,062
like quote unquote, or the\nspecial word None in Python.
22394
21:23:14,062 --> 21:23:25,042
Or request.form.get("sport") not\nin quote unquote, what were they?
22395
21:23:25,042 --> 21:23:32,122
Basketball, the other one was soccer,\n
22396
21:23:32,122 --> 21:23:35,961
Getting a little long, but notice\n
22397
21:23:35,961 --> 21:23:39,051
If the user did not\ngive us a name, that is
22398
21:23:39,051 --> 21:23:41,671
if this function returns\nthe equivalent of false
22399
21:23:41,672 --> 21:23:45,592
which is, quote unquote, or literally\n
22400
21:23:45,592 --> 21:23:52,252
Or if the sport the user provided is\n
22401
21:23:52,251 --> 21:23:56,061
or ultimate frisbee, which I've defined\n
22402
21:23:56,062 --> 21:23:57,652
and just yell at the user in some way.
22403
21:23:57,652 --> 21:24:02,961
Let's return render\ntemplate of failure.html.
22404
21:24:02,961 --> 21:24:06,211
And that's just going to be some\n
22405
21:24:06,211 --> 21:24:08,751
Otherwise, if they get\nthis far, let's go ahead
22406
21:24:08,751 --> 21:24:12,111
and confirm registration\nby just returning-- whoops
22407
21:24:12,111 --> 21:24:18,182
returning render template quote\nunquote "success" dot HTML.
22408
21:24:18,182 --> 21:24:20,461
All right, so a couple\nquick things to do.
22409
21:24:20,461 --> 21:24:25,111
Let me first go in and in\nmy templates directory
22410
21:24:25,111 --> 21:24:28,291
let's create this failure.html file.
22411
21:24:28,292 --> 21:24:31,191
And this is just meant to\nbe a message to the user
22412
21:24:31,191 --> 21:24:34,262
that they failed to provide\nthe information correctly.
22413
21:24:34,262 --> 21:24:37,042
So let me go ahead and in failure.html.
22414
21:24:40,012 --> 21:24:44,902
So let me extend layout.html and in\n
22415
21:24:44,902 --> 21:24:47,902
I'll just yell at them like that so\n
22416
21:24:47,902 --> 21:24:52,521
And then let me create one other\nfile called success.html, that
22417
21:24:52,521 --> 21:24:54,904
similarly is mostly just Jinja syntax.
22418
21:24:54,904 --> 21:24:57,322
And I'm just going to say for\nnow, even though they're not
22419
21:24:57,322 --> 21:24:59,902
technically registered in any\ndatabase, you are registered.
22420
21:24:59,902 --> 21:25:02,032
That's what we mean by success.
22421
21:25:02,032 --> 21:25:05,211
All right, so let me go ahead,\nand back in my FroshIMS
22422
21:25:07,131 --> 21:25:09,322
Let me go back to the form and reload.
22423
21:25:10,911 --> 21:25:13,881
All right, so now let me\nnot cooperate and just
22424
21:25:13,881 --> 21:25:17,061
immediately click Register impatiently.
22425
21:25:20,721 --> 21:25:24,769
Register-- oh, I'm\nconfusing our two examples.
22426
21:25:24,770 --> 21:25:26,062
All right, I spotted the error.
22427
21:25:32,241 --> 21:25:35,812
There's where I am, what did\nI actually invent over here?
22428
21:25:44,562 --> 21:25:45,812
AUDIENCE: Register, not greet.
22429
21:25:47,611 --> 21:25:50,731
I had the last example on my mind,\nso the route should be register.
22430
21:25:50,732 --> 21:25:54,122
Ironically, the function could be greet,\n
22431
21:25:54,122 --> 21:25:57,792
But to keep ourselves sane, let's\n
22432
21:25:57,792 --> 21:26:00,218
Let me go ahead now and\nstart Flask as intended.
22433
21:26:00,218 --> 21:26:02,551
Let me reload the form just\nto make sure all is working.
22434
21:26:02,551 --> 21:26:06,902
Now, let me not cooperate and be\na bad user, clicking register--
22435
21:26:07,982 --> 21:26:10,652
OK, other unintended mistake.
22436
21:26:10,652 --> 21:26:12,881
But this one we've seen before.
22437
21:26:12,881 --> 21:26:15,721
Notice that by default,\nroutes only support GET.
22438
21:26:15,721 --> 21:26:18,721
So if I want to\nspecifically support POST
22439
21:26:18,721 --> 21:26:25,471
I have to pass in, by a methods\n
22440
21:26:25,471 --> 21:26:28,351
methods that could be\nGET comma POST, but if I
22441
21:26:28,351 --> 21:26:32,221
don't have any need for a GET in\n
22442
21:26:32,221 --> 21:26:34,741
All right, now let's\ndo this one last time.
22443
21:26:34,741 --> 21:26:37,441
Reload the form to make sure\neverything's OK, click Register
22444
21:26:37,441 --> 21:26:39,194
and you are not registered.
22445
21:26:40,111 --> 21:26:42,528
All right, let me go ahead and\nat least give them my name.
22446
21:26:44,312 --> 21:26:49,662
Fine, I'm going to go ahead and be\n
22447
21:26:53,402 --> 21:26:58,542
What should I-- what\ndid I mean to do here?
22448
21:26:58,542 --> 21:27:00,572
All right, so let's figure this out.
22449
21:27:00,572 --> 21:27:04,711
How to debug something like this,\n
22450
21:27:07,081 --> 21:27:10,351
How can we go about\ntroubleshooting this?
22451
21:27:10,351 --> 21:27:13,202
Turn this into the teachable moment.
22452
21:27:13,202 --> 21:27:15,782
All right, well first,\nsome safety checks.
22453
21:27:17,372 --> 21:27:20,521
Let me go ahead and view page\nsource, a good rule of thumb.
22454
21:27:20,521 --> 21:27:23,081
Look at the HTML that you\nactually sent to the user.
22455
21:27:23,081 --> 21:27:27,251
So here, I have an\ninput with a name name.
22456
21:27:27,251 --> 21:27:29,711
So that's what I\nintended, that looks OK.
22457
21:27:29,711 --> 21:27:32,551
Ah, I see it already, even\nthough you, if you've never
22458
21:27:32,551 --> 21:27:35,792
used a select menu, you might\nnot know what, apparently
22459
21:27:35,792 --> 21:27:41,911
is missing from here that I\ndid have for my text input.
22460
21:27:41,911 --> 21:27:45,466
Just intuitively, logically.
22461
21:27:45,467 --> 21:27:47,342
What's going through my\nhead, embarrassingly
22462
21:27:47,342 --> 21:27:51,692
is, all right, if my form thinks\n
22463
21:27:51,691 --> 21:27:55,471
how did I create a situation in which\n
22464
21:27:55,471 --> 21:27:57,481
Well, name, I don't think\nit's going to be blank
22465
21:27:57,482 --> 21:28:01,262
because I explicitly gave\nthis text field a name name
22466
21:28:01,262 --> 21:28:02,972
and that did work last time.
22467
21:28:02,971 --> 21:28:06,421
I've now given a second input\nin the form of the select menu.
22468
21:28:06,422 --> 21:28:13,112
But what seems to be missing here\nthat I'm assuming exists here?
22469
21:28:13,111 --> 21:28:16,381
It's just a dumb mistake I made.
22470
21:28:16,381 --> 21:28:19,448
What might be missing here?
22471
21:28:19,448 --> 21:28:23,552
If request.form gives you all of\nthe inputs that the user might
22472
21:28:23,551 --> 21:28:26,551
have typed in, let me\ngo into my actual code
22473
21:28:26,551 --> 21:28:30,451
here in my form and name equals sport.
22474
21:28:30,452 --> 21:28:32,472
I just didn't give a name to that input.
22475
21:28:32,471 --> 21:28:34,651
So it exists, and the\nbrowser doesn't care.
22476
21:28:34,652 --> 21:28:36,485
It's still going to\ndisplay the form to you
22477
21:28:36,485 --> 21:28:40,211
it just hasn't given it a unique name\n
22478
21:28:40,211 --> 21:28:42,551
So now, if I'm not going\nto put my foot in my mouth
22479
21:28:42,551 --> 21:28:44,741
I think that's what I did wrong.
22480
21:28:44,741 --> 21:28:46,711
And again, my process\nfor figuring that out
22481
21:28:46,711 --> 21:28:49,043
was looking at my code,\nthinking through logically
22482
21:28:49,043 --> 21:28:50,251
is this right, is this right?
22483
21:28:50,251 --> 21:28:52,421
No, I was missing the name there.
22484
21:28:52,422 --> 21:28:55,382
So let's run Flask,\nlet's reload the form
22485
21:28:55,381 --> 21:28:59,941
just to make sure it's all defaults\n
22486
21:28:59,941 --> 21:29:04,441
in ultimate frisbee, crossing\nmy fingers extra hard this time.
22487
21:29:06,932 --> 21:29:08,664
I did not intend to\nscrew up in that way
22488
21:29:08,664 --> 21:29:10,831
but that's exactly the right\nkind of thought process
22489
21:29:10,831 --> 21:29:12,152
to diagnose issues like this.
22490
21:29:12,152 --> 21:29:16,021
Go back to the basics, go back to what\n
22491
21:29:16,021 --> 21:29:18,432
and just rule things in and out.
22492
21:29:18,432 --> 21:29:21,182
There's only a finite number of\n
22493
21:29:22,054 --> 21:29:23,410
AUDIENCE: Are you [INAUDIBLE].
22494
21:29:25,581 --> 21:29:27,081
DAVID: Excuse-- say a little louder?
22495
21:29:27,081 --> 21:29:30,981
AUDIENCE: I don't understand why\nname equals sport [INAUDIBLE].
22496
21:29:30,982 --> 21:29:33,631
DAVID: Why did name equal\nsport address the problem?
22497
21:29:33,631 --> 21:29:35,872
Well, let's first go back to the HTML.
22498
21:29:35,872 --> 21:29:43,172
Previously, it was just the reality that\n
22499
21:29:44,601 --> 21:29:47,932
But names, or more\ngenerally, key value pairs
22500
21:29:47,932 --> 21:29:51,211
is how information is sent\nfrom a form to the server.
22501
21:29:51,211 --> 21:29:56,422
So if there's no name, there's no key to\n
22502
21:29:56,422 --> 21:30:00,082
It would be like nothing equals ultimate\n
22503
21:30:00,081 --> 21:30:02,432
The browser is just\nnot going to send it.
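A minimal sketch of the point just made, using only Python's standard library rather than a real browser or Flask. The helper name `encode_form` is hypothetical, and a name of `None` stands in for a form control that has no name attribute.

```python
from urllib.parse import urlencode


def encode_form(fields):
    """Encode form fields roughly the way a browser does on submit.

    `fields` is a list of (name, value) pairs. A name of None models an
    input or select tag that was never given a name attribute.
    """
    # A control without a name has no key, so it is left out entirely.
    named = [(name, value) for name, value in fields if name]
    return urlencode(named)


# The select menu has no name, so only the text input is submitted.
print(encode_form([("name", "David"), (None, "Ultimate Frisbee")]))

# Once the select menu is named "sport", both key value pairs are sent.
print(encode_form([("name", "David"), ("sport", "Ultimate Frisbee")]))
```

On the server side, asking the submitted form for "sport" can only succeed in the second case, because in the first case that key was never sent.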
22504
21:30:02,432 --> 21:30:08,331
However, in app.py, I was naively\n
22505
21:30:08,331 --> 21:30:11,061
there would be a name called\nquote unquote "sport."
22506
21:30:11,062 --> 21:30:13,702
It could have been anything,\nbut I was assuming it was sport.
22507
21:30:13,702 --> 21:30:15,532
But I never told the form that.
22508
21:30:15,532 --> 21:30:19,022
And if I really wanted to dig in,\n
22509
21:30:19,021 --> 21:30:22,012
Let me go back to the\nway it was a moment ago.
22510
21:30:22,012 --> 21:30:25,801
Let me get rid of the name\nof the sport dropdown menu.
22511
21:30:25,801 --> 21:30:30,981
Let me rerun Flask down here\nand reload the form itself
22512
21:30:30,982 --> 21:30:33,411
after it finishes being served.
22513
21:30:34,581 --> 21:30:39,277
View Developer Tools, and then let me\n
22514
21:30:39,277 --> 21:30:41,152
we played around with\na little bit last week.
22515
21:30:41,152 --> 21:30:44,334
And we also played around with Curl,\n
22516
21:30:44,334 --> 21:30:46,042
Here's another-- here's\nwhat I would have
22517
21:30:46,042 --> 21:30:49,411
done if I still wasn't seeing the error\n
22518
21:30:49,411 --> 21:30:53,512
I would have typed in my name as before,\n
22519
21:30:53,512 --> 21:30:55,252
I would have clicked register.
22520
21:30:55,251 --> 21:30:59,241
And now, I would have\nlooked at the HTTP request.
22521
21:30:59,241 --> 21:31:01,342
And I would click on Register here.
22522
21:31:01,342 --> 21:31:04,942
And just like we did last week, I\n
22523
21:31:04,941 --> 21:31:07,671
And there's a whole lot of stuff\nthat we can typically ignore.
22524
21:31:07,672 --> 21:31:10,792
But here, let me zoom\nin, way at the bottom
22525
21:31:10,792 --> 21:31:13,131
what Chrome's developer\ntools are doing for me
22526
21:31:13,131 --> 21:31:16,141
it's showing me all of the\nform data that was submitted.
22527
21:31:16,142 --> 21:31:18,712
So this really would have\nbeen my telltale clue.
22528
21:31:18,711 --> 21:31:21,982
I'm just not sending the sport,\neven if the human typed it in.
22529
21:31:21,982 --> 21:31:23,991
And logically, because\nI've done this before
22530
21:31:23,991 --> 21:31:26,932
that must mean I didn't\ngive the thing a name.
22531
21:31:28,101 --> 21:31:31,372
Like good programmers, web developers\n
22532
21:31:32,422 --> 21:31:34,160
They're not writing bug-free code.
22533
21:31:34,160 --> 21:31:35,452
That's not the point to get to.
22534
21:31:35,452 --> 21:31:38,542
The point to get to is\nbeing a good diagnostician
22535
21:31:38,542 --> 21:31:40,551
I would say, in these cases.
22536
21:31:46,812 --> 21:31:52,152
AUDIENCE: What if you want to\n
22537
21:31:52,152 --> 21:31:54,444
DAVID: I'm sorry, a little bit louder?
22538
21:31:54,444 --> 21:31:57,012
AUDIENCE: If you want to\nedit in CSS or anything
22539
21:31:57,012 --> 21:32:01,422
in HTML, once you have to fix\nthe template, how do you do that?
22540
21:32:01,422 --> 21:32:05,532
DAVID: So how would you edit\nCSS if you have these templates?
22541
21:32:05,532 --> 21:32:07,842
That process we'll\nactually see before long.
22542
21:32:07,842 --> 21:32:09,467
It's almost going to be the exact same.
22543
21:32:09,467 --> 21:32:12,550
Just to give you a teaser for this,\n
22544
21:32:12,550 --> 21:32:15,402
but we'll give you some distribution\n
22545
21:32:15,402 --> 21:32:17,862
You can absolutely still\ndo something like this.
22546
21:32:17,861 --> 21:32:22,660
Link href equals quote\nunquote "styles" dot
22547
21:32:22,660 --> 21:32:27,370
CSS rel equals stylesheet, that's one\n
22548
21:32:27,370 --> 21:32:31,422
The only difference today, using Flask,\n
22549
21:32:31,422 --> 21:32:34,041
by convention, should go\nin your static folder.
22550
21:32:34,040 --> 21:32:36,072
So the change you would\nmake in your layout
22551
21:32:36,072 --> 21:32:40,001
would be to say that styles dot\nCSS is in your static folder.
22552
21:32:40,001 --> 21:32:43,990
And then, if I go into\nmy FroshIMS directory
22553
21:32:43,990 --> 21:32:46,330
I can create a static folder.
22554
21:32:46,331 --> 21:32:48,822
I can CD into it,\nnothing's there by default.
22555
21:32:48,822 --> 21:32:51,762
But if I now code a\nfile called styles.css
22556
21:32:51,762 --> 21:32:54,822
I could now do something like this: body.
22557
21:32:54,822 --> 21:33:05,982
And in here, I could say background\n
22558
21:33:05,982 --> 21:33:10,211
Let me go ahead now and restart\nFlask in the FroshIMS directory.
22559
21:33:10,211 --> 21:33:12,822
Cross my fingers because\nI'm doing this on the fly.
22560
21:33:12,822 --> 21:33:16,161
Go back to my form and reload.
22561
21:33:16,160 --> 21:33:19,751
Voila, now we've tied together\nlast week's stuff as well.
22562
21:33:19,751 --> 21:33:22,853
If I answered the right question?
22563
21:33:22,854 --> 21:33:27,202
AUDIENCE: [INAUDIBLE] change\none page and not the other.
22564
21:33:27,202 --> 21:33:30,202
DAVID: If you want to change one page\n
22565
21:33:32,122 --> 21:33:36,952
In that case, you might want to have\n
22566
21:33:38,152 --> 21:33:42,381
You could use different classes in one\n
22567
21:33:42,381 --> 21:33:43,801
There's different ways to do that.
22568
21:33:43,801 --> 21:33:47,751
You could even have a\nplaceholder in your layout
22569
21:33:47,751 --> 21:33:52,221
that allows you to plug in\nthe URL of a specific style
22570
21:33:52,221 --> 21:33:53,661
sheet in your individual files.
22571
21:33:53,661 --> 21:33:56,432
But that starts to get\nmore complicated quickly.
22572
21:33:56,432 --> 21:33:58,172
So in short, you can absolutely do it.
22573
21:33:58,172 --> 21:34:01,942
But typically, I would\nsay most websites try not
22574
21:34:01,941 --> 21:34:03,652
to use different style sheets per page.
22575
21:34:03,652 --> 21:34:06,322
They reuse the styles\nas much as they can.
22576
21:34:06,322 --> 21:34:08,572
All right, let me go ahead\nand revert this real quick.
22577
21:34:08,572 --> 21:34:11,512
And let's start to add a little\nbit more functionality here.
22578
21:34:11,512 --> 21:34:14,392
I'm going to go ahead and just\n
22579
21:34:14,392 --> 21:34:16,162
to not complicate things just yet.
22580
21:34:16,161 --> 21:34:19,251
And let's go ahead and just play\n
22581
21:34:19,851 --> 21:34:23,264
In my form here, the dropdown\nmenu is perfectly fine.
22582
21:34:24,182 --> 21:34:27,411
But suppose that I wanted to\nchange it to checkboxes instead.
22583
21:34:27,411 --> 21:34:31,432
Maybe I want students to be able to\n
22584
21:34:31,432 --> 21:34:34,822
Well, it might make sense to\nclean this up in a couple of ways.
22585
21:34:35,572 --> 21:34:40,762
Before we even get into the checkboxes,\n
22586
21:34:40,762 --> 21:34:45,211
Notice that I've hardcoded basketball,\n
22587
21:34:45,211 --> 21:34:49,592
And if you recall, in app.py, I also\n
22588
21:34:49,592 --> 21:34:52,932
And any time you see copy paste\nor the equivalent thereof
22589
21:34:52,932 --> 21:34:54,601
feels like we could do better.
22590
21:34:54,601 --> 21:34:56,402
So what if I instead do this.
22591
21:34:56,402 --> 21:35:00,711
What if I instead give myself\na global variable of SPORTS
22592
21:35:00,711 --> 21:35:03,111
I'll capitalize the word\njust to connote that it's
22593
21:35:03,111 --> 21:35:07,161
meant to be constant even though\n
22594
21:35:07,161 --> 21:35:09,741
The first sport will be basketball.
22595
21:35:11,721 --> 21:35:15,921
The third will be ultimate frisbee.
22596
21:35:15,922 --> 21:35:19,942
Now I have one convenient\nplace to store all of my sports
22597
21:35:19,941 --> 21:35:22,461
if it changes next semester\nor next year or whatnot.
22598
21:35:22,461 --> 21:35:24,471
But notice what I could do too.
22599
21:35:24,471 --> 21:35:26,101
I could now do something like this.
22600
21:35:26,101 --> 21:35:30,111
Let me pass into my\nindex template a variable
22601
21:35:30,111 --> 21:35:34,461
called sports that's equal to\nthat global variable SPORTS.
22602
21:35:34,461 --> 21:35:37,792
Let me go into my index now,\nand this is really, now
22603
21:35:37,792 --> 21:35:41,782
going to hint at the power of\n
22604
21:35:41,782 --> 21:35:45,452
Let me go ahead and get rid of all\n
22605
21:35:45,452 --> 21:35:50,332
and let me show you some slightly\n
22606
21:35:52,851 --> 21:35:54,872
We've not seen this endfor syntax.
22607
21:35:54,872 --> 21:35:57,692
There's like endblock syntax,\nbut it's as simple as that.
22608
21:35:57,691 --> 21:36:00,951
So you have a start and an end to your\n
22609
21:36:02,392 --> 21:36:08,482
Option curly brace\nsport close curly brace.
22610
21:36:09,831 --> 21:36:12,981
Let me go back into my\nterminal window, do Flask run.
22611
21:36:12,982 --> 21:36:16,161
And if I didn't mess up\nhere, let me go back to this.
22612
21:36:16,161 --> 21:36:18,801
The red's going to go away\nbecause I deleted my CSS.
22613
21:36:18,801 --> 21:36:21,831
And now I still have a sport\ndropdown and all of those sports
22614
21:36:22,771 --> 21:36:24,351
I can make one more improvement now.
22615
21:36:24,351 --> 21:36:27,301
I don't need to mention these\nsame sports manually in app.py.
22616
21:36:27,301 --> 21:36:31,671
I can now just say if the\nuser's inputted sport is not
22617
21:36:31,672 --> 21:36:35,070
in my global variable, SPORTS,\nand ask the same question.
22618
21:36:35,070 --> 21:36:36,862
And this is really\nhandy because if there's
22619
21:36:36,861 --> 21:36:41,001
another sport, for instance, that\ngets added, like say football
22620
21:36:41,001 --> 21:36:43,921
all I have to do is\nchange my global variable.
22621
21:36:43,922 --> 21:36:47,542
And if I reload the form now\nand look in the dropdown, boom
22622
21:36:47,542 --> 21:36:49,732
now I have support for a fourth sport.
22623
21:36:49,732 --> 21:36:51,422
And I can keep adding and adding there.
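A rough sketch of the single source of truth idea, in plain Python rather than Flask and Jinja. The helper names `render_options` and `is_valid_sport` are hypothetical, and the second entry in the list is an assumption, since only the first and third sports are named at this point.

```python
from html import escape

# Capitalized to signal that it is meant to be constant.
# "Soccer" as the second entry is an assumption for illustration.
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]


def render_options(sports):
    """Build one <option> tag per sport, as a template for loop would."""
    return "\n".join(f"<option>{escape(sport)}</option>" for sport in sports)


def is_valid_sport(sport):
    """Validate a submitted sport against the same list that built the menu."""
    return sport in SPORTS


print(render_options(SPORTS))
print(is_valid_sport("Volleyball"))
```

Adding a fourth sport to `SPORTS` changes both the generated menu and the validation, with no other edits.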
22624
21:36:51,422 --> 21:36:55,012
So here's where templating starts\nto get really powerful in that
22625
21:36:55,012 --> 21:37:00,472
now, in this template, I'm using\nJinja's for loop syntax, which
22626
21:37:00,471 --> 21:37:02,961
is almost identical to\nPython here, except you
22627
21:37:02,961 --> 21:37:06,262
need the curly brace and the percent\n
22628
21:37:07,131 --> 21:37:08,811
But it's the same idea as in Python.
22629
21:37:08,812 --> 21:37:13,072
Iterating over something with a for loop\n
22630
21:37:13,072 --> 21:37:14,991
And this is like every\nwebsite out there.
22631
21:37:15,861 --> 21:37:19,851
When you visit your inbox and you\n
22632
21:37:19,851 --> 21:37:22,491
Google has not hardcoded\nyour emails manually.
22633
21:37:22,491 --> 21:37:24,322
They have grabbed them from a database.
22634
21:37:24,322 --> 21:37:26,072
They have some kind\nof for loop like this
22635
21:37:26,072 --> 21:37:32,057
and are just outputting table row after\n
22636
21:37:32,057 --> 21:37:34,432
All right, so now, let's go\nahead and change this, maybe
22637
21:37:34,432 --> 21:37:40,232
to, oh, how about little\ncheckboxes or radio buttons.
22638
21:37:40,232 --> 21:37:41,572
So let me go ahead and do this.
22639
21:37:41,572 --> 21:37:46,432
Instead of a select menu, I'm going to\n
22640
21:37:46,432 --> 21:37:52,351
For each of these sports let me go\n
22641
21:37:52,351 --> 21:37:55,281
but let me go ahead and\noutput an input tag
22642
21:37:55,282 --> 21:37:59,302
the name for which is quote\nunquote "sport," the type of which
22643
21:37:59,301 --> 21:38:05,001
is checkbox, the value of which is\n
22644
21:38:05,001 --> 21:38:09,403
quote unquote, and then afterward\n
22645
21:38:10,111 --> 21:38:11,911
So you see a word next to the checkbox.
22646
21:38:11,911 --> 21:38:14,161
And we'll look at the result\nof this in just a moment.
22647
21:38:14,161 --> 21:38:17,721
So it's actually a little simpler\n
22648
21:38:17,721 --> 21:38:20,961
because now watch what\nhappens if I reload my form.
22649
21:38:20,961 --> 21:38:24,322
Different user interface,\nand it's not as pretty
22650
21:38:24,322 --> 21:38:27,601
but it's going to allow users to sign\n
22651
21:38:28,351 --> 21:38:31,682
Now I can click on basketball\nand football and soccer
22652
21:38:31,682 --> 21:38:34,021
or some other combination thereof.
22653
21:38:34,021 --> 21:38:37,801
If I view the page's source, this\n
22654
21:38:37,801 --> 21:38:41,981
I didn't have to type out four\n
22655
21:38:41,982 --> 21:38:45,302
And these things all have\nthe same name, but that's OK.
22656
21:38:45,301 --> 21:38:48,932
It turns out with Flask, if it sees\n
22657
21:38:48,932 --> 21:38:52,922
it's going to hand them back to you as\n
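A small sketch of what several checkboxes sharing one name look like once submitted, using the standard library's `parse_qs` instead of Flask. The helper name `submitted_values` and the example request body are assumptions for illustration. In Flask itself, `request.form.getlist("sport")` is the call that returns every value for a repeated key.

```python
from urllib.parse import parse_qs


def submitted_values(body, key):
    """Return every value submitted under `key`, or an empty list."""
    # parse_qs groups repeated keys into a list of values.
    return parse_qs(body).get(key, [])


# Two checked boxes, both named "sport", produce a repeated key.
body = "name=David&sport=Basketball&sport=Soccer"
print(submitted_values(body, "sport"))
```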
22658
21:38:52,922 --> 21:38:56,192
All right, but suppose we don't want\n
22659
21:38:57,572 --> 21:39:01,202
Let me go ahead and change this\ncheckbox to radio button, which
22660
21:39:01,202 --> 21:39:03,162
a radio button is mutually exclusive.
22661
21:39:03,161 --> 21:39:04,891
So you can only sign up for one.
22662
21:39:04,892 --> 21:39:09,542
So now, once I reload\nthe page, there we go.
22663
21:39:12,001 --> 21:39:16,771
And because I've given each of these\n
22664
21:39:16,771 --> 21:39:20,171
sport," that's what makes\nthem mutually exclusive.
22665
21:39:20,172 --> 21:39:23,492
The browser knows all four of\nthese things are types of sports
22666
21:39:23,491 --> 21:39:26,732
therefore I'm only going to let\nyou select one of these things.
22667
21:39:26,732 --> 21:39:29,072
And that's simply because\nthey all have the same name.
22668
21:39:29,072 --> 21:39:32,491
Again, if I view page source, notice\n
22669
21:39:32,491 --> 21:39:36,542
name equals sport, name equals\n
22670
21:39:36,542 --> 21:39:39,703
that each one is going to have.
22671
21:39:39,703 --> 21:39:45,251
All right, any questions,\nthen, on this approach?
22672
21:39:45,751 --> 21:39:47,751
Well, let me go ahead and\nopen a version of this
22673
21:39:47,751 --> 21:39:51,451
that I made in advance that's going\n
22674
21:39:51,452 --> 21:39:53,342
So thus far, we're\nnot quite at the point
22675
21:39:53,342 --> 21:39:56,672
of where this website was, which\n
22676
21:39:56,672 --> 21:39:59,192
like in a database, everyone\nwho had registered for sports.
22677
21:39:59,191 --> 21:40:01,891
Now, we're literally telling\nstudents you are registered
22678
21:40:01,892 --> 21:40:03,812
or you are not registered,\nbut we're literally
22679
21:40:03,812 --> 21:40:05,892
doing nothing with this information.
22680
21:40:05,892 --> 21:40:08,532
So how might we go\nabout implementing this?
22681
21:40:08,532 --> 21:40:10,502
Well, let me go ahead\nand close these tabs
22682
21:40:10,501 --> 21:40:15,901
and let me go into what I call version\n
22683
21:40:15,902 --> 21:40:19,711
And let me go into my source\nnine directory, FroshIMS3
22684
21:40:19,711 --> 21:40:22,652
and let me go ahead and open up app.py.
22685
21:40:22,652 --> 21:40:24,211
So this is a premade version.
22686
21:40:24,211 --> 21:40:26,141
I've gotten rid of\nfootball, in this case.
22687
21:40:26,142 --> 21:40:28,862
But I've added one\nthing at the very top.
22688
21:40:28,861 --> 21:40:33,871
What, in English, does\nthis represent on line seven?
22689
21:40:33,872 --> 21:40:35,911
How would you describe\nwhat that thing is?
22690
21:40:41,452 --> 21:40:42,172
AUDIENCE: It's an empty dictionary.
22691
21:40:42,172 --> 21:40:43,372
DAVID: Yeah, it's an\nempty dictionary, right?
22692
21:40:43,372 --> 21:40:45,622
Registrants is apparently\na variable on the left.
22693
21:40:45,622 --> 21:40:47,930
It's being assigned an empty\ndictionary on the right.
22694
21:40:47,929 --> 21:40:49,971
And a dictionary, again,\nis just key value pairs.
22695
21:40:49,971 --> 21:40:53,451
Here, again, is where dictionaries\n
22696
21:40:53,960 --> 21:40:56,752
Because this is going to allow me\n
22697
21:40:56,751 --> 21:40:59,271
for ultimate frisbee, Carter\nregistered for soccer
22698
21:40:59,271 --> 21:41:00,831
Emma registered for something else.
22699
21:41:00,831 --> 21:41:04,311
You can associate keys with\nvalues, names with sports
22700
21:41:04,312 --> 21:41:07,532
assuming a model where you can only\n
22701
21:41:07,532 --> 21:41:12,802
And so let's see what the\nlogic is that handles this.
22702
21:41:12,801 --> 21:41:16,281
Here in my register route\nin the code I've premade
22703
21:41:16,282 --> 21:41:18,202
notice that I'm validating\nthe user's name.
22704
21:41:18,202 --> 21:41:20,392
Slightly differently from\nbefore but same idea.
22705
21:41:20,392 --> 21:41:23,482
I'm using request.form.get\nto get the human's name.
22706
21:41:23,482 --> 21:41:26,512
If not name, so if the\nhuman did not type a name
22707
21:41:26,512 --> 21:41:29,241
I'm going to output error.html.
22708
21:41:29,241 --> 21:41:33,652
But notice I've started to make\n
22709
21:41:33,652 --> 21:41:37,551
I'm telling the user, apparently,\n
22710
21:41:38,631 --> 21:41:41,182
I'm apparently passing\nto my error template
22711
21:41:41,182 --> 21:41:44,535
instead of just failure.html,\na specific message.
22712
21:41:44,535 --> 21:41:45,952
So let's go down this rabbit hole.
22713
21:41:45,952 --> 21:41:52,052
Let me actually go into\ntemplates/error.html, and sure enough
22714
21:41:52,051 --> 21:41:55,762
here's a new file I created here, that\n
22715
21:41:55,762 --> 21:41:59,332
a grumpy cat as part of the error\n
22716
21:41:59,331 --> 21:42:04,581
In my block body I've got an H1 tag\n
22717
21:42:04,581 --> 21:42:07,131
I then have a paragraph\ntag that plugs in whatever
22718
21:42:07,131 --> 21:42:10,881
the error message is that the\ncontroller, app.py, is passing in.
22719
21:42:10,881 --> 21:42:14,211
And then just for fun, I have a\n
22720
21:42:14,211 --> 21:42:15,751
that there was, in fact, an error.
22721
21:42:18,422 --> 21:42:22,792
I do similarly\nrequest.form.get of sport
22722
21:42:22,792 --> 21:42:24,562
and I store it in a\nvariable called sport.
22723
21:42:24,562 --> 21:42:28,351
If there's no such sport, that is the\n
22724
21:42:28,351 --> 21:42:30,771
then I'm going to render\nerror.html too, but I'm
22725
21:42:30,771 --> 21:42:33,361
going to give a different\nmessage, missing sport.
22726
21:42:33,361 --> 21:42:38,451
Else, if the sport they did type in\n
22727
21:42:38,452 --> 21:42:41,991
I'm going to render error.html,\nbut complain differently
22728
21:42:41,991 --> 21:42:44,781
you gave me an invalid sport somehow.
22729
21:42:44,782 --> 21:42:47,032
As if a hacker went into\nthe HTML of the page
22730
21:42:47,032 --> 21:42:49,379
changed it to add their\nown sport like volleyball.
22731
21:42:49,379 --> 21:42:51,711
Even though it's not offered,\nthey submitted volleyball.
22732
21:42:51,711 --> 21:42:55,161
But that's OK, I'm rejecting it, even\n
22733
21:42:55,161 --> 21:42:58,671
tried to send it to me by\nchanging the DOM locally.
22734
21:42:58,672 --> 21:43:00,832
And then really, the magic is just this.
22735
21:43:00,831 --> 21:43:03,441
I remember that this\nperson has registered
22736
21:43:03,441 --> 21:43:06,621
by indexing into the\nregistrants dictionary
22737
21:43:06,622 --> 21:43:11,631
using the name the human typed in as the\n
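A minimal sketch of the validation order just described, written as a plain Python function rather than a Flask route. The names `register` and `REGISTRANTS` are hypothetical, the second sport in the list is assumed, and the exact wording of the missing name message is an assumption, since it is not spelled out here.

```python
# The list of offered sports; "Soccer" is assumed for illustration.
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]

# Maps each name to one sport. It lives only in memory.
REGISTRANTS = {}


def register(name, sport):
    """Return an error message, or None once the registration is stored."""
    if not name:
        return "Missing name"  # assumed wording
    if not sport:
        return "Missing sport"
    if sport not in SPORTS:
        # Catches values added by editing the page's HTML locally.
        return "Invalid sport"
    REGISTRANTS[name] = sport
    return None


print(register("David", "Volleyball"))
print(register("David", "Ultimate Frisbee"))
print(REGISTRANTS)
```

Because the dictionary is keyed by name, registering the same name again replaces the earlier sport, which matches a model of one sport per person.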
22738
21:43:12,902 --> 21:43:15,232
Well, I added one final route here.
22739
21:43:15,232 --> 21:43:19,672
I have a /registrants route with a\n
22740
21:43:19,672 --> 21:43:21,622
a template called registrants.html.
22741
21:43:21,622 --> 21:43:26,582
But it takes as input that\nglobal variable just like before.
22742
21:43:26,581 --> 21:43:32,961
So let's go down this rabbit hole. Let me\n
22743
21:43:34,402 --> 21:43:38,182
It looks a little crazy big,\nbut it extends the layout.
22744
21:43:39,381 --> 21:43:42,322
I've got an H1 tag that says\nregistrants, big and bold.
22745
21:43:42,322 --> 21:43:44,691
Then I've got a table\nthat we saw last week.
22746
21:43:44,691 --> 21:43:48,561
This has a table head that just\nsays name sport for two columns.
22747
21:43:48,562 --> 21:43:53,961
Then it has a table body where in,\n
22748
21:43:53,961 --> 21:43:57,172
I'm saying, for each name\nin the registrants variable
22749
21:43:57,172 --> 21:44:01,162
output a table row, start tag,\nand end tag, inside of which
22750
21:44:01,161 --> 21:44:04,341
two table datas, two\ncells, table data for name
22751
21:44:04,342 --> 21:44:08,552
table data for registrants bracket name.
22752
21:44:08,551 --> 21:44:10,971
So it's very similar to Python syntax.
22753
21:44:10,971 --> 21:44:14,861
It essentially is Python syntax, albeit\n
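For comparison, here is a sketch of the same loop in plain Python, producing one table row per registrant. The function name `render_rows` is hypothetical, and this is ordinary string building, not the template itself.

```python
from html import escape


def render_rows(registrants):
    """Build one <tr> per name, with cells for the name and its sport."""
    rows = []
    for name in registrants:
        sport = registrants[name]  # same lookup as registrants[name] in the template
        rows.append(f"<tr><td>{escape(name)}</td><td>{escape(sport)}</td></tr>")
    return "\n".join(rows)


print(render_rows({"David": "Ultimate Frisbee", "Emma": "Soccer"}))
```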
22754
21:44:15,361 --> 21:44:17,511
So the net effect here is what?
22755
21:44:17,512 --> 21:44:20,812
Let me open up my terminal\nwindow, run Flask run.
22756
21:44:20,812 --> 21:44:24,872
Let me now go into the\nform that I premade here.
22757
21:44:26,032 --> 21:44:27,952
Let me go ahead and type in David.
22758
21:44:27,952 --> 21:44:30,322
Let me choose, oh, no sport.
22759
21:44:33,232 --> 21:44:34,881
And there is the grumpy cat.
22760
21:44:34,881 --> 21:44:37,881
So missing sport, though,\nspecifically was outputted.
22761
21:44:38,672 --> 21:44:41,422
Let me go ahead and say no name.
22762
21:44:44,751 --> 21:44:47,241
All right, and let me\nmaliciously, now, do this.
22763
21:44:49,581 --> 21:44:53,691
I'll type my name, sure, but let\n
22764
21:44:53,691 --> 21:44:57,891
Let me maliciously go down in ultimate\n
22765
21:44:59,542 --> 21:45:04,342
Change that and change\nthis to volleyball.
22766
21:45:05,122 --> 21:45:08,991
So now, I can register for\nany sport I want to create.
22767
21:45:08,991 --> 21:45:11,961
Let me click register,\nbut invalid sport.
22768
21:45:11,961 --> 21:45:14,182
So again, that speaks to\nthe power and the need
22769
21:45:14,182 --> 21:45:17,361
for checking things on the backend\nand not trusting users.
22770
21:45:17,361 --> 21:45:21,591
It is that easy to hack websites\n
22771
21:45:22,292 --> 21:45:24,292
All right, finally, let's\njust do this for real.
22772
21:45:24,292 --> 21:45:26,292
David is going to register\nfor ultimate frisbee.
22773
21:45:27,292 --> 21:45:30,562
And now, the output is not\nvery pretty, but notice
22774
21:45:30,562 --> 21:45:32,552
I'm at the registrants route.
22775
21:45:32,551 --> 21:45:34,641
And if I zoom out, I have an HTML table.
22776
21:45:34,642 --> 21:45:38,132
Two columns, name and sport,\nDavid and ultimate frisbee.
22777
21:45:38,131 --> 21:45:41,881
Let me go back to the form, letting me\n
22778
21:45:41,881 --> 21:45:43,322
and registered for basketball.
22779
21:45:44,221 --> 21:45:49,111
Now we see two rows in this table,\n
22780
21:45:49,622 --> 21:45:51,414
And if we do this one\nmore time, maybe Emma
22781
21:45:51,414 --> 21:45:53,762
comes along and registers\nfor soccer. Register.
22782
21:45:53,762 --> 21:45:58,667
All of this information is being\nstored in this dictionary, now.
22783
21:45:58,667 --> 21:45:59,792
All right, so that's great.
22784
21:45:59,792 --> 21:46:03,991
Now we have a database, albeit in\n
22785
21:46:03,991 --> 21:46:09,039
But why is this, maybe, not\nthe best implementation?
22786
21:46:10,716 --> 21:46:13,619
AUDIENCE: You are storing [INAUDIBLE].
22787
21:46:18,851 --> 21:46:21,476
So we're only storing this\ndictionary in the computer's memory
22788
21:46:21,476 --> 21:46:24,732
and that's great until I hit\nControl C and kill Flask
22789
21:46:26,172 --> 21:46:29,412
Or the server reboots, or maybe\nI close my laptop or whatever.
22790
21:46:29,411 --> 21:46:32,711
If the server stops running,\nmemory is going to be lost.
22791
21:46:34,182 --> 21:46:37,312
It's thrown away when you lose\npower or stop the program.
22792
21:46:37,312 --> 21:46:39,101
So maybe this isn't the best approach.
22793
21:46:39,101 --> 21:46:41,361
Maybe it would be better\nto use a CSV file.
22794
21:46:41,361 --> 21:46:43,991
And in fact, some 20 years ago,\nthat's literally what I did.
22795
21:46:43,991 --> 21:46:45,891
I stored everything in a CSV file.
22796
21:46:45,892 --> 21:46:48,402
But let's skip that step,\nbecause we already saw last week
22797
21:46:48,402 --> 21:46:51,551
or a couple of weeks ago\nnow, how we can use SQLite.
22798
21:46:51,551 --> 21:46:54,131
Let's see if we can't\nmarry in some SQL here
22799
21:46:54,131 --> 21:46:58,131
to store an actual\ndatabase for the program.
22800
21:46:58,131 --> 21:47:00,072
Let me go back here and\nlet me open up, say
22801
21:47:00,072 --> 21:47:03,611
version four of this, which\nis almost the same but it
22802
21:47:03,611 --> 21:47:05,291
adds a bit more functionality.
22803
21:47:05,292 --> 21:47:10,542
Let me close these tabs and let me\n
22804
21:47:10,542 --> 21:47:13,752
So notice it's almost\nthe same, but at the top
22805
21:47:13,751 --> 21:47:18,251
I'm creating a database connection\n
22806
21:47:18,251 --> 21:47:20,044
So that's a database\nI created in advance.
22807
21:47:20,044 --> 21:47:21,461
So let's go down that rabbit hole.
22808
21:47:22,482 --> 21:47:24,622
Let me make my terminal window bigger.
22809
21:47:24,622 --> 21:47:28,362
Let me run sqlite3 of FroshIMS.db.
22810
21:47:30,342 --> 21:47:33,432
and let's just infer what\nI designed this to be.
22811
21:47:33,432 --> 21:47:37,721
I have a table called registrants,\n
22812
21:47:37,721 --> 21:47:42,101
An ID column that's an integer, a name\n
22813
21:47:42,101 --> 21:47:44,711
and a sport column that's\nalso text, cannot be null
22814
21:47:44,711 --> 21:47:46,361
and the primary key is just ID.
22815
21:47:46,361 --> 21:47:49,391
So that I have a unique\nID for every registration.
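A minimal sketch of the table just described, using Python's built-in sqlite3 module. The in-memory database and the sample rows are assumptions for illustration; the lecture's database lives in a file created in advance.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE registrants (
        id INTEGER,
        name TEXT NOT NULL,
        sport TEXT NOT NULL,
        PRIMARY KEY(id)
    )
    """
)

# When id is omitted, SQLite assigns the integer primary key itself.
first = conn.execute(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    ("David", "Ultimate Frisbee"),
)
second = conn.execute(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    ("Carter", "Basketball"),
)
first_id, second_id = first.lastrowid, second.lastrowid
```

Each registration gets its own id even if two people share a name, which is what makes the id usable as a unique handle later.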
22816
21:47:49,392 --> 21:47:51,912
Let's see if there's\nanyone in there yet.
22817
21:47:51,911 --> 21:47:55,604
Select star from registrants.
22818
21:47:55,604 --> 21:47:56,771
OK, there's no one in there.
22819
21:47:56,771 --> 21:47:58,361
No one is yet registered for sports.
22820
21:47:58,361 --> 21:48:00,941
So let's go back to the\ncode and continue on.
22821
21:48:00,941 --> 21:48:03,792
In my code now, I've got\nthe same global variable
22822
21:48:03,792 --> 21:48:07,032
for validation and\ngeneration of my HTML.
22823
21:48:07,032 --> 21:48:09,522
Looks like my index route is the same.
22824
21:48:09,521 --> 21:48:13,241
It's dynamically generating\nthe menu of sports.
22825
21:48:13,241 --> 21:48:14,899
Interestingly, we'll come back to this.
22826
21:48:14,899 --> 21:48:17,232
There's a deregister route\nthat's going to allow someone
22827
21:48:17,232 --> 21:48:22,092
to deregister themselves if\nthey want to exit the sport
22828
21:48:24,702 --> 21:48:27,342
Here's my new and\nimproved register route.
22829
21:48:27,342 --> 21:48:30,702
Still works on POST, so\nsome mild privacy there.
22830
21:48:30,702 --> 21:48:33,532
I'm validating the\nsubmission as follows.
22831
21:48:33,532 --> 21:48:36,942
I'm getting the user's inputted\nname, the user's inputted sport
22832
21:48:36,941 --> 21:48:41,141
and if it is not a name or\nthe sport is not in sports
22833
21:48:41,142 --> 21:48:42,959
I'm going to render failure.html.
22834
21:48:43,792 --> 21:48:45,142
There's no cat in this version.
22835
21:48:46,542 --> 21:48:50,262
Otherwise, recall how we\nco-mingled SQL and Python before.
22836
21:48:50,262 --> 21:48:53,562
We're using CS50's SQL\nlibrary, but that just
22837
21:48:53,562 --> 21:48:56,892
makes it a little easier to execute\n
22838
21:48:56,892 --> 21:49:00,522
Insert into registrants\nname comma sport.
22839
21:49:00,521 --> 21:49:05,152
What two values, the name and the\n
22840
21:49:05,152 --> 21:49:08,021
And then lastly, and this is a new\n
22841
21:49:08,021 --> 21:49:10,781
explicitly now, Flask\nalso gives you access
22842
21:49:10,782 --> 21:49:17,407
to a redirect function, which is how\n
22843
21:49:17,407 --> 21:49:19,782
and all these other sites we\nplayed around with last week
22844
21:49:19,782 --> 21:49:23,262
were all implemented redirecting\n
22845
21:49:23,262 --> 21:49:26,741
This Flask function\nredirect comes from my just
22846
21:49:26,741 --> 21:49:30,221
having imported it at the\nvery top of this file.
22847
21:49:30,221 --> 21:49:35,141
It handles the HTTP 301 or 302 or 307\n
22848
21:49:37,211 --> 21:49:42,282
All right, so that's it for\nregistering via this route.
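The validate-then-insert logic of the register route can be sketched as follows. This is not the lecture's code: it swaps in the standard sqlite3 module for CS50's SQL library, returns a boolean instead of rendering failure.html or redirecting, and the SPORTS list and the `register` helper are hypothetical names.

```python
import sqlite3

# Assumed menu of sports; the lecture keeps a similar global for validation.
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]


def register(conn, name, sport):
    """Store the registration if it is valid; return whether it was stored."""
    # Reject a missing name or a sport that is not on the server-side menu.
    if not name or sport not in SPORTS:
        return False
    # Placeholders keep user input out of the SQL text itself.
    conn.execute(
        "INSERT INTO registrants (name, sport) VALUES (?, ?)", (name, sport)
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrants (id INTEGER, name TEXT NOT NULL, "
    "sport TEXT NOT NULL, PRIMARY KEY(id))"
)
accepted = register(conn, "David", "Ultimate Frisbee")
rejected = register(conn, "Mallory", "Quidditch")
```

Checking the sport against the server's own list matters because a user can edit the HTML form and submit any value.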
22849
21:49:42,282 --> 21:49:45,912
Let's look at what the\nregistrants route is.
22850
21:49:45,911 --> 21:49:49,031
Here, we have a new\nroute for /registrants.
22851
21:49:49,032 --> 21:49:52,042
And instead of just iterating\nover a dictionary like before
22852
21:49:52,042 --> 21:49:56,471
we're getting back, let's\nsee, db.execute of select star
22853
21:49:57,312 --> 21:50:00,402
So that's literally the programmatic\n
22854
21:50:00,402 --> 21:50:02,411
That gives me back a\nlist of dictionaries
22855
21:50:02,411 --> 21:50:05,652
each of which represents\none row in the table.
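To illustrate the "list of dictionaries" shape, here is a sketch with the standard sqlite3 module rather than CS50's library. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sqlite3.Row lets each row be read by column name, so dict(row) works.
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE registrants (id INTEGER, name TEXT NOT NULL, "
    "sport TEXT NOT NULL, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    [("David", "Ultimate Frisbee"), ("Carter", "Basketball")],
)

# One dictionary per row, keyed by column name.
registrants = [
    dict(row) for row in conn.execute("SELECT * FROM registrants ORDER BY id")
]
```

A template can then loop over `registrants` and read each row's name and sport by key.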
22856
21:50:05,652 --> 21:50:08,831
Then, I'm going to render\nregistrants.html
22857
21:50:08,831 --> 21:50:12,281
passing in literally\nthat list of dictionaries
22858
21:50:12,282 --> 21:50:15,562
just like using CS50's\nlibrary in the past.
22859
21:50:15,562 --> 21:50:18,522
So let's go and look\nat these-- that form.
22860
21:50:18,521 --> 21:50:24,372
If I go into templates and\nopen up registrants.html
22861
21:50:24,372 --> 21:50:27,372
oh, OK, it's just a table like before.
22862
21:50:27,372 --> 21:50:30,672
And actually, let me change this\nsyntactically for consistency.
22863
21:50:30,672 --> 21:50:36,312
We have a Jinja for loop that\niterates over each registrant
22864
21:50:36,312 --> 21:50:39,672
and for each of them,\noutputs a table row.
22865
21:50:39,672 --> 21:50:40,902
Oh, but this is interesting.
22866
21:50:40,902 --> 21:50:44,622
Instead of just having two columns\n
22867
21:50:44,622 --> 21:50:47,654
notice that I'm also\noutputting a full-fledged form.
22868
21:50:47,653 --> 21:50:49,361
All right, this is\nstarting to get juicy.
22869
21:50:49,361 --> 21:50:52,661
So let's actually go back\nto my terminal window
22870
21:50:52,661 --> 21:50:56,531
run Flask, and actually see what\nthis example looks like now.
22871
21:50:58,812 --> 21:51:00,592
In the home page, it\nlooks exactly the same.
22872
21:51:00,592 --> 21:51:02,175
But let me now register for something.
22873
21:51:02,175 --> 21:51:04,792
David for ultimate frisbee, register.
22874
21:51:08,831 --> 21:51:12,561
David registering for\nultimate frisbee, register.
22875
21:51:13,062 --> 21:51:15,132
So good thing I have deregister.
22876
21:51:15,131 --> 21:51:16,932
So this is what it should now look like.
22877
21:51:16,932 --> 21:51:22,119
I have a page at the route called\n
22878
21:51:22,119 --> 21:51:24,161
columns, name and sport,\nDavid, ultimate frisbee.
22879
21:51:24,161 --> 21:51:25,521
But oh, wait, a third column.
22880
21:51:26,021 --> 21:51:30,461
Because if I view the page source,\n
22881
21:51:30,461 --> 21:51:34,572
For every row in this table, I'm also\n
22882
21:51:36,532 --> 21:51:40,092
But before we see how that works,\n
22883
21:51:40,682 --> 21:51:42,282
So Carter, we'll give you basketball.
22884
21:51:44,961 --> 21:51:48,072
Now, let me go back and let's\nregister Emma for soccer.
22885
21:51:50,051 --> 21:51:55,481
Before we look at that HTML, let's\n
22886
21:51:55,482 --> 21:51:59,741
Let's go into SQLite FroshIMS.
22887
21:51:59,741 --> 21:52:10,152
Let me go into FroshIMS, and let me\n
22888
21:52:10,152 --> 21:52:12,461
And now do select star from registrants.
22889
21:52:12,461 --> 21:52:16,391
And whereas, previously, when I executed\n
22890
21:52:17,471 --> 21:52:21,111
So now we see exactly what's\ngoing on underneath the hood.
22891
21:52:21,111 --> 21:52:24,461
So let's look at this\nform now-- this page now.
22892
21:52:24,461 --> 21:52:29,141
If I want to unregister, deregister\n
22893
21:52:30,941 --> 21:52:33,042
Clicking one of those\nbuttons will indeed
22894
21:52:33,042 --> 21:52:36,021
delete the row from the database.
22895
21:52:36,021 --> 21:52:41,411
But how do we go about linking a web\n
22896
21:52:41,411 --> 21:52:43,451
This is the last piece of the puzzle.
22897
21:52:43,452 --> 21:52:47,232
Up until now, everything's been\nwith forms and also with URLs.
22898
21:52:47,232 --> 21:52:49,362
But what if the user is\nnot typing anything in
22899
21:52:49,361 --> 21:52:51,551
they're just clicking a button?
22900
21:52:53,422 --> 21:52:55,897
Let me go ahead and\nsniff the traffic, which
22901
21:52:55,896 --> 21:52:57,521
you could be in the habit of doing now.
22902
21:52:57,521 --> 21:53:01,331
Any time you're curious how a website\n
22903
21:53:01,331 --> 21:53:05,771
And Carter, shall we\nderegister you from basketball?
22904
21:53:05,771 --> 21:53:09,311
Let's deregister Carter and\nlet's see what just happened.
22905
21:53:09,312 --> 21:53:13,392
If I look at the deregister\nrequest, notice that it's a POST.
22906
21:53:13,392 --> 21:53:16,212
The status code that\neventually came back was 302
22907
21:53:16,211 --> 21:53:18,641
but let's look at the request itself.
22908
21:53:18,642 --> 21:53:21,372
All the headers there we'll ignore.
22909
21:53:21,372 --> 21:53:25,872
The only thing that\nbutton submits, cleverly
22910
21:53:25,872 --> 21:53:29,892
is an ID parameter, a key equaling two.
22911
21:53:29,892 --> 21:53:34,152
What does two presumably\nrepresent or map to?
22912
21:53:34,152 --> 21:53:37,062
Where did this two come from?
22913
21:53:37,062 --> 21:53:40,782
It doesn't say Carter, it\ndoesn't say basketball.
22914
21:53:41,330 --> 21:53:43,122
AUDIENCE: The second\nperson who registered.
22915
21:53:43,122 --> 21:53:44,532
DAVID: The second\nperson that registered.
22916
21:53:44,532 --> 21:53:47,682
So those primary keys that we started\n
22917
21:53:47,682 --> 21:53:50,982
ago, why it's useful to be able to\n
22918
21:53:50,982 --> 21:53:53,232
here is just one of the reasons why.
22919
21:53:53,232 --> 21:53:58,271
If it suffices for me just to\nsend the ID number of the person
22920
21:53:58,271 --> 21:54:03,141
I want to delete from the database,\n
22921
21:54:03,142 --> 21:54:09,322
If I go into app.py and I look at my\n
22922
21:54:11,172 --> 21:54:15,642
I first go into the form, and I get\n
22923
21:54:15,642 --> 21:54:19,872
If there was, in fact, an ID, and\nthe form wasn't somehow empty
22924
21:54:19,872 --> 21:54:21,642
I execute this line of code.
22925
21:54:21,642 --> 21:54:24,642
Delete from registrants where\nID equals question mark
22926
21:54:24,642 --> 21:54:29,202
and then I plug in that number,\ndeleting Carter and only Carter.
22927
21:54:29,202 --> 21:54:32,472
And I'm not using his name, because\n
22928
21:54:32,471 --> 21:54:34,001
two people named Emma or David?
22929
21:54:34,001 --> 21:54:35,601
You don't want to delete both of them.
22930
21:54:35,601 --> 21:54:39,732
That's why these unique\nIDs are so, so important.
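A short sketch of why deleting by id is safer than deleting by name, again using the standard sqlite3 module as a stand-in for CS50's library. The `deregister` helper and the two rows named David are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrants (id INTEGER, name TEXT NOT NULL, "
    "sport TEXT NOT NULL, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    [("David", "Ultimate Frisbee"), ("Carter", "Basketball"), ("David", "Soccer")],
)


def deregister(conn, registrant_id):
    """Delete exactly one registration, identified by its primary key."""
    if registrant_id:
        conn.execute("DELETE FROM registrants WHERE id = ?", (registrant_id,))
        conn.commit()


# Removes only the third row, even though another row shares the same name.
deregister(conn, 3)
remaining = conn.execute(
    "SELECT id, name, sport FROM registrants ORDER BY id"
).fetchall()
```

Deleting `WHERE name = 'David'` would have removed both rows; the id removes one.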
22931
21:54:39,732 --> 21:54:41,622
And here's another reason why.
22932
21:54:41,622 --> 21:54:45,012
You don't want to store\nsome things in URLs.
22933
21:54:45,012 --> 21:54:49,752
Suppose we went to this\nURL, deregister?ID=3.
22934
21:54:53,262 --> 21:54:58,131
Suppose I, maliciously,\nemailed this URL to Emma.
22935
21:54:58,131 --> 21:55:00,131
It doesn't matter so much\nwhat the beginning is
22936
21:55:00,131 --> 21:55:06,251
but suppose I emailed her this URL,\n
22937
21:55:07,691 --> 21:55:10,331
And it uses GET instead of POST.
22938
21:55:10,331 --> 21:55:12,326
What did I just trick her into doing?
22939
21:55:15,369 --> 21:55:17,161
What's going to happen\nif Emma clicks this?
22940
21:55:19,021 --> 21:55:21,521
DAVID: You would trick her\ninto deregistering herself.
22941
21:55:22,021 --> 21:55:24,811
Because if she's logged\ninto this FroshIMS website
22942
21:55:24,812 --> 21:55:28,862
and the URL contains her ID just\nbecause I'm being malicious
22943
21:55:28,861 --> 21:55:31,921
and she clicked on it and\nthe website is using GET
22944
21:55:31,922 --> 21:55:34,711
unfortunately, GET URLs\nare, again, stateful.
22945
21:55:34,711 --> 21:55:36,782
They have state information in the URLs.
22946
21:55:36,782 --> 21:55:39,532
And in this case, it's enough\nto delete the user and boom
22947
21:55:39,532 --> 21:55:42,144
she would have accidentally\nderegistered herself.
22948
21:55:42,144 --> 21:55:43,351
And this is pretty innocuous.
22949
21:55:43,351 --> 21:55:45,872
Suppose that this was\nher bank account trying
22950
21:55:45,872 --> 21:55:47,672
to make a withdrawal or a deposit.
22951
21:55:47,672 --> 21:55:50,822
Suppose that this were some\nother website, a Facebook URL
22952
21:55:50,822 --> 21:55:53,339
trying to trick her into\nposting something automatically.
22953
21:55:53,339 --> 21:55:55,172
Here, too, is another\nconsideration when you
22954
21:55:55,172 --> 21:55:59,402
should use POST versus GET,\nbecause GET requests can
22955
21:55:59,402 --> 21:56:04,331
be plugged into emails sent via Slack\n
22956
21:56:04,331 --> 21:56:06,601
And unless there's a\nprompt saying, are you sure
22957
21:56:06,601 --> 21:56:09,751
you want to deregister\nyourself, you might blindly
22958
21:56:09,751 --> 21:56:11,851
trick the user into being\nvulnerable to what's
22959
21:56:11,851 --> 21:56:14,221
called a cross-site request forgery.
22960
21:56:14,221 --> 21:56:17,401
A fancy way of saying you\ntrick them into clicking a link
22961
21:56:17,402 --> 21:56:21,241
that they shouldn't have, because\n
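The GET versus POST point can be sketched with a plain function standing in for a route. The `handle_deregister` name, the dictionary of registrants, and the returned status codes are assumptions for illustration. Refusing GET only covers the clicked-link case described here; real sites usually add further defenses, such as per-form tokens.

```python
def handle_deregister(method, params, registrants):
    """Hypothetical handler: only a POST is allowed to change state."""
    if method != "POST":
        # A link in an email or chat message triggers a GET, so refuse it.
        return 405  # Method Not Allowed
    registrant_id = params.get("id")
    if registrant_id in registrants:
        del registrants[registrant_id]
    return 302  # redirect back to the list of registrants


registrants = {"2": "Carter", "3": "Emma"}

# Clicking a malicious link such as /deregister?id=3 sends a GET.
status_from_link = handle_deregister("GET", {"id": "3"}, registrants)
still_registered = "3" in registrants

# Pressing the button on the site's own form sends a POST.
status_from_form = handle_deregister("POST", {"id": "3"}, registrants)
```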
22962
21:56:21,241 --> 21:56:25,111
All right, any question, then,\non these building blocks?
22963
21:56:26,929 --> 21:56:32,275
AUDIENCE: What does the first\nthing in the instance of the SQL
22964
21:56:32,275 --> 21:56:34,232
[INAUDIBLE] where they\nhave three slashes?
22965
21:56:35,501 --> 21:56:37,456
DAVID: When three columns, you mean?
22966
21:56:37,456 --> 21:56:42,311
AUDIENCE: No, three forward slashes.
22967
21:56:42,312 --> 21:56:44,882
DAVID: The three forward slashes.
22968
21:56:46,407 --> 21:56:49,377
AUDIENCE: Yeah, so I\nthink it's in [INAUDIBLE].
22969
21:56:52,851 --> 21:56:56,031
DAVID: Sorry, it's in where?
22970
21:56:57,235 --> 21:57:02,065
AUDIENCE: It's in [INAUDIBLE] scroll up.
22971
21:57:06,304 --> 21:57:07,721
DAVID: Sorry, the other direction?
22972
21:57:12,846 --> 21:57:16,407
AUDIENCE: So please scroll a little bit more.
22973
21:57:16,407 --> 21:57:17,532
DAVID: Keep scrolling more?
22974
21:57:20,081 --> 21:57:27,761
This is a URI, it's typical syntax\n
22975
21:57:27,762 --> 21:57:32,351
protocol, so to speak, which means\n
22976
21:57:32,351 --> 21:57:35,322
:// is just like you and I see in URLs.
22977
21:57:35,322 --> 21:57:37,961
The third slash, essentially,\nmeans current folder.
22978
21:57:38,622 --> 21:57:41,461
So it's a weird curiosity,\nbut it's typical
22979
21:57:41,461 --> 21:57:43,961
whenever you're referring to a\nlocal file and not one that's
22980
21:57:45,251 --> 21:57:48,311
That's a bit of an oversimplification,\n
22981
21:57:48,312 --> 21:57:50,472
Sorry for not clicking earlier.
22982
21:57:50,471 --> 21:57:53,292
All right, let's do one\nother iteration of FroshIMS
22983
21:57:53,292 --> 21:57:56,622
here just to show what I was\n
22984
21:57:56,622 --> 21:57:59,562
was not only storing these\nthings in CSV files, as I recall.
22985
21:57:59,562 --> 21:58:02,047
I was also automatically\ngenerating an email
22986
21:58:02,047 --> 21:58:04,422
to the proctor in charge of\nthe intramural sports program
22987
21:58:04,422 --> 21:58:07,512
so that they would have sort of a\n
22988
21:58:07,512 --> 21:58:09,652
and they could easily\nreply to them as well.
22989
21:58:09,652 --> 21:58:13,282
Let me go into FroshIMS version\nfive, which I precreated here
22990
21:58:13,282 --> 21:58:18,642
and let me go ahead and open\nup, say, app.py this time.
22991
21:58:18,642 --> 21:58:21,132
And this is some code\nthat I wrote in advance.
22992
21:58:21,131 --> 21:58:24,921
And it looks a little scary at first\n
22993
21:58:24,922 --> 21:58:30,312
I have now added the\nFlask-Mail library to the picture
22994
21:58:30,312 --> 21:58:33,461
by adding Flask-Mail to\nrequirements.txt and running a command
22995
21:58:33,461 --> 21:58:37,101
to automatically install email\nsupport for Flask as well.
22996
21:58:37,101 --> 21:58:39,851
And this is a little bit\ncryptic, but it's honestly mostly
22997
21:58:39,851 --> 21:58:41,622
copy paste from the documentation.
22998
21:58:41,622 --> 21:58:44,351
What I'm doing here is\nI'm configuring my Flask
22999
21:58:44,351 --> 21:58:47,721
application with a few configuration\nvariables, if you will.
23000
21:58:47,721 --> 21:58:51,221
This is the syntax for that.\n
23001
21:58:51,221 --> 21:58:55,601
comes with Flask that is automatically\n
23002
21:58:55,601 --> 21:58:58,961
on line nine, and I just had to fill\n
23003
21:58:58,961 --> 21:59:01,482
values for the default\nsender address that I
23004
21:59:01,482 --> 21:59:05,472
want to send email as, the default\n
23005
21:59:05,471 --> 21:59:08,651
the port number, the TCP port,\nthat we talked about last week.
23006
21:59:08,652 --> 21:59:12,491
The mail server, I'm going to use\nGmail's smtp.gmail.com server.
23007
21:59:12,491 --> 21:59:14,531
Use TLS, this means use encryption.
23008
21:59:15,672 --> 21:59:18,562
Mail username, this is going\nto grab it from my environment.
23009
21:59:18,562 --> 21:59:21,972
So for security purposes, I didn't\n
23010
21:59:21,971 --> 21:59:24,023
and password into the code.
23011
21:59:24,024 --> 21:59:26,982
So I'm actually storing those in what\n
23012
21:59:26,982 --> 21:59:29,112
You'll see more of these\nin problem set nine
23013
21:59:29,111 --> 21:59:31,121
and it's a very common\nconvention on a server
23014
21:59:31,122 --> 21:59:36,771
in the real world to store sensitive\n
23015
21:59:36,771 --> 21:59:39,641
so that it can be accessed\nwhen your website is running
23016
21:59:39,642 --> 21:59:41,202
but not in your source code.
23017
21:59:41,202 --> 21:59:44,232
It's way too easy if\nyou put credentials
23018
21:59:44,232 --> 21:59:47,532
sensitive stuff in your source\ncode, to post it to GitHub
23019
21:59:47,532 --> 21:59:50,832
or to screenshot it accidentally,\n
23020
21:59:50,831 --> 21:59:55,961
So for today's purposes, know that the\n
23021
21:59:55,961 --> 21:59:57,581
are called environment variables.
23022
21:59:57,581 --> 22:00:01,091
And this is like an out-of-band, a\n
23023
22:00:01,092 --> 22:00:04,122
pairs in the computer's memory\nby running a certain command
23024
22:00:04,122 --> 22:00:06,491
but that never show up\nin your actual code.
23025
22:00:06,491 --> 22:00:09,131
Otherwise, there would be so\nmany usernames and passwords
23026
22:00:09,131 --> 22:00:11,921
accidentally visible on the internet.
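A sketch of reading credentials from environment variables instead of hardcoding them. The key names follow Flask-Mail's configuration keys; the `load_mail_config` helper is hypothetical, and port 587 is an assumption (the usual port for SMTP with TLS), since the exact value is not shown here.

```python
import os


def load_mail_config(env=os.environ):
    """Build mail settings, pulling the secrets from the environment."""
    return {
        "MAIL_DEFAULT_SENDER": env.get("MAIL_DEFAULT_SENDER"),
        "MAIL_PASSWORD": env.get("MAIL_PASSWORD"),  # never written in source
        "MAIL_PORT": 587,  # assumed value
        "MAIL_SERVER": "smtp.gmail.com",
        "MAIL_USE_TLS": True,  # use encryption
        "MAIL_USERNAME": env.get("MAIL_USERNAME"),
    }


# Passing a plain dictionary here stands in for the real environment.
config = load_mail_config({"MAIL_USERNAME": "jharvard", "MAIL_PASSWORD": "secret"})
```

In a Flask app, each of these values would be assigned to `app.config` under the same key.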
23027
22:00:11,922 --> 22:00:14,322
So I've installed this in advance.
23028
22:00:14,322 --> 22:00:16,152
Let me see if I can do this correctly.
23029
22:00:16,152 --> 22:00:19,581
Let me go over to another\ntab in just a moment.
23030
22:00:19,581 --> 22:00:23,441
And here, I have on my second\nscreen here, John Harvard's inbox.
23031
22:00:23,441 --> 22:00:26,322
It's currently empty, and I'm\ngoing to go ahead and register
23032
22:00:26,322 --> 22:00:28,542
for some sport as John\nHarvard here, hopefully.
23033
22:00:28,542 --> 22:00:32,831
So let me go ahead and run\nflask run on this version five.
23034
22:00:32,831 --> 22:00:36,311
Let me go ahead and\nreload the main screen.
23035
22:00:37,001 --> 22:00:38,681
Let me reload the main screen here.
23036
22:00:38,682 --> 22:00:41,411
This time, clearly, I'm\nasking for name and email.
23037
22:00:41,411 --> 22:00:43,497
So name will be John Harvard.
23038
22:00:45,771 --> 22:00:51,342
He'll register for, how about soccer.
23039
22:00:52,902 --> 22:00:57,612
And if I did this correctly, not\n
23040
22:00:57,611 --> 22:01:06,701
seeing you are registered, but when he\n
23041
22:01:06,702 --> 22:01:13,002
crossing his fingers that this\n
23042
22:01:13,001 --> 22:01:15,086
and I promise it did right before class.
23043
22:01:22,851 --> 22:01:25,241
I don't think there's\na mistake this time.
23044
22:01:28,402 --> 22:01:31,822
Let me try something\nover here real quick
23045
22:01:31,822 --> 22:01:35,661
but I don't think this is broken.
23046
22:01:35,661 --> 22:01:39,171
It wouldn't have said\nsuccess if it were.
23047
22:01:39,172 --> 22:01:44,992
I just tried submitting again, so I\n
23048
22:01:44,991 --> 22:01:47,156
Oh, I'm really sad right now.
23049
22:01:53,682 --> 22:01:57,521
DAVID: I could check\nspam, but then it's--
23050
22:01:57,521 --> 22:02:01,991
not sure we want to show spam here on\n
23051
22:02:13,751 --> 22:02:15,521
Wow, that was a risky click, I worried.
23052
22:02:15,521 --> 22:02:19,250
All right, so you are registered\nis the email that I sent out
23053
22:02:19,250 --> 22:02:21,292
and it doesn't have any\nactual information in it.
23054
22:02:21,292 --> 22:02:23,262
But back in the day it\nwould have, because I
23055
22:02:23,262 --> 22:02:25,092
included the student's\nname and their dorm
23056
22:02:25,092 --> 22:02:27,634
and all of the other fields of\ninformation that we asked for.
23057
22:02:27,634 --> 22:02:30,312
So let's just take a quick look\nat how that code might work.
23058
22:02:30,312 --> 22:02:33,522
I did have to configure Gmail\nin a certain way to allow
23059
22:02:33,521 --> 22:02:36,191
what they call less\nsecure apps using SMTP
23060
22:02:36,191 --> 22:02:38,591
which is the protocol\nused for outbound email.
23061
22:02:38,592 --> 22:02:42,960
But besides setting these things, let's\n
23062
22:02:42,960 --> 22:02:44,502
It's actually pretty straightforward.
23063
22:02:44,501 --> 22:02:47,631
In my register route, I validated\n
23064
22:02:48,861 --> 22:02:52,761
I then confirmed the registration\ndown here, nothing new there.
23065
22:02:52,762 --> 22:02:55,482
All I did was use two new lines of code.
23066
22:02:55,482 --> 22:02:57,983
And it's this easy to automate\nthe sending of emails.
23067
22:02:57,983 --> 22:02:59,691
I apparently have done\nit too many times
23068
22:02:59,691 --> 22:03:01,601
which is why it ended up in spam.
23069
22:03:01,601 --> 22:03:03,581
I created a variable called message.
23070
22:03:03,581 --> 22:03:06,981
I used a message function that\nI must have imported higher up
23071
22:03:08,092 --> 22:03:10,872
Here's, apparently, the subject\nline as the first argument.
23072
22:03:10,872 --> 22:03:15,192
And the second argument is the\nnamed parameter recipients
23073
22:03:15,191 --> 22:03:18,682
which takes a list of emails that\n
23074
22:03:18,682 --> 22:03:21,042
So in brackets, I just\nput the one user's email
23075
22:03:21,042 --> 22:03:24,081
and then mail.send that message.
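The two steps, building a message with a subject and recipient and then sending it, can be sketched with Python's standard email package. This is a stand-in for Flask-Mail's Message, not its API. The `build_confirmation` helper, the example addresses, and the body text are assumptions, and the sending step is left out so that the sketch runs without a mail server.

```python
from email.message import EmailMessage


def build_confirmation(recipient, sender="froshims@example.com"):
    """Create the confirmation email for one registrant."""
    msg = EmailMessage()
    msg["Subject"] = "You are registered!"
    msg["From"] = sender  # assumed sender address
    msg["To"] = recipient
    msg.set_content("You are registered!")  # assumed body text
    return msg


message = build_confirmation("jharvard@example.com")
```

A real application would hand this message to an SMTP client, or let a library such as Flask-Mail do both steps.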
23076
22:03:24,081 --> 22:03:29,021
So let's scroll back up to see what\n
23077
22:03:30,562 --> 22:03:33,792
Yep, mail is this, which\nI have as a variable
23078
22:03:33,792 --> 22:03:36,221
because I followed the\ndocumentation for this library.
23079
22:03:36,221 --> 22:03:41,809
You simply configure your current app\n
23080
22:03:41,809 --> 22:03:43,601
And if you look up here\nnow, on line seven
23081
22:03:43,601 --> 22:03:46,751
here's the new library:\nfrom flask_mail, I imported
23082
22:03:46,751 --> 22:03:50,951
Capital Mail, capital Message, so that\n
23083
22:03:52,452 --> 22:03:55,302
So such a simple thing whether you\n
23084
22:03:55,301 --> 22:03:56,831
you want to do password resets.
23085
22:03:56,831 --> 22:04:00,701
It can be this easy to\nactually generate emails
23086
22:04:00,702 --> 22:04:03,754
provided you have the requisite\naccess and software installed.
23087
22:04:03,754 --> 22:04:05,961
And just to make clear that\nI did add something here
23088
22:04:05,961 --> 22:04:09,101
let me open up my requirements.txt\nfile, and indeed, I
23089
22:04:09,101 --> 22:04:13,812
have both Flask and\nFlask-Mail ready to go.
23090
22:04:13,812 --> 22:04:16,812
But I ran the command in\nadvance to actually do that.
23091
22:04:16,812 --> 22:04:22,082
All right, any questions,\nthen, on these examples here?
23092
22:04:23,441 --> 22:04:29,572
So what other pieces might actually\n
23093
22:04:29,572 --> 22:04:32,675
It turns out that a key\ncomponent of most any web
23094
22:04:32,675 --> 22:04:34,842
application nowadays that\nwe haven't touched on yet
23095
22:04:34,842 --> 22:04:38,396
but it'll be one of our final flourishes\n
23096
22:04:38,396 --> 22:04:41,201
And a session is actually\na feature that derives
23097
22:04:41,202 --> 22:04:43,842
from all of the basics we talked\nabout today and last week
23098
22:04:43,842 --> 22:04:47,351
and a session is the technical term for\n
23099
22:04:47,351 --> 22:04:50,922
When you go to amazon.com and you start\n
23100
22:04:50,922 --> 22:04:53,605
they follow you from\npage to page to page.
23101
22:04:53,604 --> 22:04:56,021
Heck, if you close your browser,\ncome back the next day
23102
22:04:56,021 --> 22:04:59,668
there's typically still your shopping\n
23103
22:04:59,668 --> 22:05:01,001
because they want your business.
23104
22:05:01,001 --> 22:05:03,711
They don't want you to have to\nstart from scratch the next day.
23105
22:05:03,711 --> 22:05:07,482
Similarly, when you log\ninto any website these days
23106
22:05:07,482 --> 22:05:11,292
even if it's not an e-commerce thing\n
23107
22:05:11,292 --> 22:05:13,002
you and I are not in\nthe habit of logging
23108
22:05:13,001 --> 22:05:15,191
into every darn page\nwe visit on a website.
23109
22:05:15,191 --> 22:05:19,182
Typically, you log in once, and then\n
23110
22:05:19,182 --> 22:05:21,161
you stay logged into that website.
23111
22:05:21,161 --> 22:05:25,301
So somehow, the website is\nremembering that you have logged in.
23112
22:05:25,301 --> 22:05:27,889
And that is being implemented\nby way of this thing called
23113
22:05:27,889 --> 22:05:29,682
a session, and perhaps\na more familiar term
23114
22:05:29,682 --> 22:05:32,789
that you might know of, and\nworry about, called cookies.
23115
22:05:32,789 --> 22:05:35,122
Let's go ahead and take one\nmore five minute break here.
23116
22:05:35,122 --> 22:05:37,414
And when we come back, we'll\nlook at cookies, sessions
23117
22:05:40,952 --> 22:05:43,961
So the promise now is that\nwe're going to implement
23118
22:05:43,961 --> 22:05:47,381
this notion of a session, which is going\n
23119
22:05:47,381 --> 22:05:50,262
them logged in and even implement\nthings like a shopping cart.
23120
22:05:50,262 --> 22:05:54,642
And the overarching goal here\nis to build an application that
23121
22:05:54,642 --> 22:05:56,142
is, quote unquote, "stateful."
23122
22:05:56,142 --> 22:05:59,322
Again, state refers to information,\n
23123
22:06:01,001 --> 22:06:05,711
And in this context, the curiosity\nis that HTTP is technically
23124
22:06:07,271 --> 22:06:11,381
Once you visit a URL,\nhttp://something, hit Enter
23125
22:06:11,381 --> 22:06:14,471
web page is downloaded to\nyour browser, like that's it.
23126
22:06:14,471 --> 22:06:17,471
You can unplug from the internet,\nyou can turn off your Wi-Fi
23127
22:06:17,471 --> 22:06:19,991
but you still have the web page locally.
23128
22:06:19,991 --> 22:06:23,171
And yet we somehow want to make\nsure that the next time you
23129
22:06:23,172 --> 22:06:25,822
click on a link on that website,\nit doesn't forget who you are.
23130
22:06:25,822 --> 22:06:27,822
Or the next thing you add\nto your shopping cart
23131
22:06:27,822 --> 22:06:29,631
it doesn't forget what\nwas already there.
23132
22:06:29,631 --> 22:06:32,861
So we somehow want to\nmake HTTP stateful
23133
22:06:32,861 --> 22:06:36,191
and we can actually do this using the\n
23134
22:06:36,191 --> 22:06:40,902
So concretely, here's a form you might\n
23135
22:06:42,491 --> 22:06:46,421
And I say rarely because most of\n
23136
22:06:46,422 --> 22:06:49,452
you just stay logged in, pretty\nmuch endlessly, in your browser.
23137
22:06:49,452 --> 22:06:52,152
And that's because Google\nhas made the conscious choice
23138
22:06:52,152 --> 22:06:55,862
to give you a very long session\ntime, maybe a day, a week
23139
22:06:55,861 --> 22:06:57,611
a month, a year, because\nthey don't really
23140
22:06:57,611 --> 22:07:01,271
want to add friction to using their tool\n
23141
22:07:01,271 --> 22:07:04,191
By contrast, there's other\napplications on campus
23142
22:07:04,191 --> 22:07:07,031
including some of the CS50 zone,\n
23143
22:07:07,032 --> 22:07:09,102
Because we want to make\nsure that it's indeed you
23144
22:07:09,101 --> 22:07:13,042
accessing the site, and not a roommate\n
23145
22:07:13,042 --> 22:07:16,822
So once you do fill out this\nform, how does Google subsequently
23146
22:07:16,822 --> 22:07:20,152
know that you are you, and\nwhen you reload the page even
23147
22:07:20,152 --> 22:07:22,911
or open a second tab for\nyour same Gmail account
23148
22:07:22,911 --> 22:07:26,668
how do they know that you're still\n
23149
22:07:26,668 --> 22:07:29,001
Well, let's look underneath\nthe hood of what's going on.
23150
22:07:29,001 --> 22:07:32,451
When you log into Gmail,\nessentially, you initially
23151
22:07:32,452 --> 22:07:35,182
see a form like this\nusing a GET request.
23152
22:07:35,182 --> 22:07:37,851
And the website responds\nlike we saw last week
23153
22:07:37,851 --> 22:07:39,271
with some kind of HTTP response.
23154
22:07:39,271 --> 22:07:41,391
Hopefully 200 OK with the form.
23155
22:07:41,392 --> 22:07:46,222
Meanwhile, the website might\nalso respond with an HTTP header
23156
22:07:46,221 --> 22:07:49,432
that, last week we didn't care\nabout, this week, we now do.
23157
22:07:49,432 --> 22:07:53,002
Whenever you visit a website,\nit is very commonly the case
23158
22:07:53,001 --> 22:07:56,072
that the website is putting\na cookie on your computer.
23159
22:07:56,072 --> 22:07:58,672
And you may generally know\nthat cookies can be bad
23160
22:07:58,672 --> 22:08:03,412
and they track you in some way, and\n
23161
22:08:03,411 --> 22:08:08,811
Without cookies, you could not implement\n
23162
22:08:10,342 --> 22:08:12,741
Unfortunately, they can also\nbe used for ill purposes
23163
22:08:12,741 --> 22:08:16,042
like tracking you on every website and\n
23164
22:08:16,542 --> 22:08:18,471
So with good comes some bad.
23165
22:08:18,471 --> 22:08:21,651
But the basic primitive for\nus, the computer scientist
23166
22:08:21,652 --> 22:08:24,622
boils down to just HTTP headers.
23167
22:08:24,622 --> 22:08:29,512
A cookie is typically a big number,\n
23168
22:08:29,512 --> 22:08:33,741
that a server tells your browser\nto store in memory, or even
23169
22:08:35,402 --> 22:08:38,941
So you can think of it like a file that\n
23170
22:08:38,941 --> 22:08:43,012
And the promise that HTTP\nmakes is that if a server sets
23171
22:08:43,012 --> 22:08:45,982
a cookie on your computer,\nyou will re-present
23172
22:08:45,982 --> 22:08:49,982
that same cookie or that same\nvalue on every subsequent request.
23173
22:08:49,982 --> 22:08:51,862
So when you visit the\nwebsite like Gmail
23174
22:08:51,861 --> 22:08:55,611
they plop a cookie on your computer\nlike this with some session
23175
22:08:55,611 --> 22:08:57,741
equals value, some long random value.
23176
22:08:57,741 --> 22:09:00,072
One, two, three, A, B,\nC, something like that.
23177
22:09:00,072 --> 22:09:04,491
And when you then visit another page\n
23178
22:09:04,491 --> 22:09:09,351
you send the opposite header, not\n
23179
22:09:09,351 --> 22:09:11,372
and you send the exact same value.
23180
22:09:11,372 --> 22:09:13,779
It's similar to going to a\nclub or an amusement park
23181
22:09:13,778 --> 22:09:15,861
where you pay once, you\ngo through the gates once
23182
22:09:15,861 --> 22:09:17,781
you get checked by\nsecurity once, and then
23183
22:09:17,782 --> 22:09:22,672
they very often take like a little stamp\n
23184
22:09:22,672 --> 22:09:25,505
And then for you, efficiency-wise,\n
23185
22:09:25,505 --> 22:09:27,839
or later in the evening, you\ncan just present your hand.
23186
22:09:27,839 --> 22:09:29,452
You've been stamped, presumably.
23187
22:09:29,452 --> 22:09:32,122
They've already-- you've\nalready paid, you've
23188
22:09:32,122 --> 22:09:33,661
already been searched or whatnot.
23189
22:09:33,661 --> 22:09:35,631
And so it's this sort\nof fast track ticket
23190
22:09:35,631 --> 22:09:37,461
back into the club, back into the park.
23191
22:09:37,461 --> 22:09:41,241
That's essentially what a\ncookie is doing for you, whereby
23192
22:09:41,241 --> 22:09:43,822
it's a way of reminding the\nwebsite we've already done this
23193
22:09:43,822 --> 22:09:46,051
you already asked me for\nmy username and password.
23194
22:09:46,051 --> 22:09:48,531
This is my pass to now come and go.
23195
22:09:48,532 --> 22:09:52,522
Now, unlike this hand stamp, which\n
23196
22:09:52,521 --> 22:09:55,131
or duplicated or kept\non over multiple days
23197
22:09:55,131 --> 22:09:59,941
these cookies are really big, seemingly\n
23198
22:09:59,941 --> 22:10:02,661
So statistically, there's\nno way someone else
23199
22:10:02,661 --> 22:10:05,872
is just going to guess your cookie\n
23200
22:10:05,872 --> 22:10:08,872
very low probability, statistically.
23201
22:10:08,872 --> 22:10:12,711
But this is all it boils down to is this\n
23202
22:10:12,711 --> 22:10:16,771
to send these values back\nand forth in this way.
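The back-and-forth of the two headers can be sketched with Python's standard http.cookies module. The cookie name `session` follows the example above; generating the value with `secrets.token_hex` is an assumption about how a server might pick a hard-to-guess number.

```python
import secrets
from http.cookies import SimpleCookie

# Server side: choose a long random value and set it as a cookie.
session_id = secrets.token_hex(16)  # 32 hexadecimal characters
outgoing = SimpleCookie()
outgoing["session"] = session_id
set_cookie_header = outgoing.output()  # "Set-Cookie: session=<value>"

# Browser side: every later request presents the same value back.
cookie_header_value = f"session={session_id}"

# Server side again: parse the Cookie header and recover the value.
incoming = SimpleCookie()
incoming.load(cookie_header_value)
returned = incoming["session"].value
```

The server learns nothing new from the value itself; it only recognizes that this browser is the one it handed that value to earlier.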
23203
22:10:16,771 --> 22:10:19,131
So when we actually\ntranslate this, now, to code
23204
22:10:19,131 --> 22:10:21,232
let's do something like\na simple login app.
23205
22:10:21,232 --> 22:10:24,351
Let me go into a folder I made\nin advance today called login.
23206
22:10:24,351 --> 22:10:28,922
And let me code up app.py and\nlet's take a look in here.
23207
22:10:30,471 --> 22:10:31,911
A couple of new things up top.
23208
22:10:31,911 --> 22:10:36,801
If I want to have the ability to\n
23209
22:10:36,801 --> 22:10:40,611
and implement sessions, I'm going\n
23210
22:10:41,551 --> 22:10:44,572
So this is another feature you\nget for free by using a framework
23211
22:10:44,572 --> 22:10:46,801
and not having to implement\nall this yourself.
23212
22:10:46,801 --> 22:10:48,951
And from the Flask-Session\nlibrary, I'm going
23213
22:10:48,952 --> 22:10:51,502
to import Session, capital S. Why?
23214
22:10:51,501 --> 22:10:53,751
I'm going to configure\nthe session as follows.
23215
22:10:53,751 --> 22:10:56,751
Long story short, there's different\nways to implement sessions.
23216
22:10:56,751 --> 22:11:00,711
The server can store these\ncookies in a database, in a file
23217
22:11:00,711 --> 22:11:03,152
in memory, in RAM, in other places too.
23218
22:11:03,152 --> 22:11:07,982
We are telling it to store these\n
23219
22:11:07,982 --> 22:11:11,452
So in fact, whenever you use sessions\n
23220
22:11:11,452 --> 22:11:15,292
you'll actually see a folder\n
23221
22:11:15,292 --> 22:11:17,842
inside of which are the\ncookies, essentially
23222
22:11:17,842 --> 22:11:20,211
for any users or friends\nor yourself who've been
23223
22:11:20,211 --> 22:11:22,801
visiting your particular application.
23224
22:11:22,801 --> 22:11:24,561
So I'm setting it to\nuse the file system
23225
22:11:24,562 --> 22:11:26,770
and I don't want them to be\npermanent because I want
23226
22:11:26,770 --> 22:11:29,482
when you close your browser,\nthe session to go away.
23227
22:11:29,482 --> 22:11:32,032
They could be made to be\npermanent and last much longer.
23228
22:11:32,032 --> 22:11:34,222
Then I tell my app to support sessions.
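A configuration fragment along these lines would match that description. It assumes the Flask and Flask-Session packages are installed; the exact lines in the lecture's app.py are not shown in this transcript, so treat this as a sketch rather than a copy.

```python
from flask import Flask
from flask_session import Session

app = Flask(__name__)

# Don't make sessions permanent: they go away when the browser is closed.
app.config["SESSION_PERMANENT"] = False
# Keep session data in files on the server.
app.config["SESSION_TYPE"] = "filesystem"
# Tell the app to support sessions.
Session(app)
```

Other values of `SESSION_TYPE` would move the same data into a database or into memory instead of files.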
23229
22:11:35,902 --> 22:11:39,572
Let's see what this application actually\n
23230
22:11:39,572 --> 22:11:44,902
Let me go over to my terminal window,\n
23231
22:11:49,202 --> 22:11:51,502
Give it a second to kick back in.
23232
22:11:51,501 --> 22:11:53,361
Let me go ahead and open my URL.
23233
22:11:59,792 --> 22:12:02,226
So this website simply has a login form.
23234
22:12:02,226 --> 22:12:04,101
There's no password,\nthough I could certainly
23235
22:12:04,101 --> 22:12:05,902
add that and check for that too.
23236
22:12:07,411 --> 22:12:10,072
So I'm going to log in as\nmyself, David, and click Login.
23237
22:12:10,072 --> 22:12:12,892
And now notice I'm currently\nat the /login route.
23238
22:12:13,881 --> 22:12:16,206
If I try to go to the\ndefault route, just
23239
22:12:16,206 --> 22:12:18,831
slash, which is where most\nwebsites live by default
23240
22:12:18,831 --> 22:12:21,771
notice that I magically\nget redirected to log in.
23241
22:12:21,771 --> 22:12:24,921
So somehow, my code knows, hey, if\n
23242
22:12:26,691 --> 22:12:29,301
Let me type in my name,\nDavid, and click Login.
23243
22:12:29,301 --> 22:12:32,106
And now notice I am back at slash.
23244
22:12:32,107 --> 22:12:35,332
Chrome is sort of annoyingly hiding\n
23245
22:12:36,482 --> 22:12:39,232
And now notice it says you\nare logged in as David.
23246
22:12:39,892 --> 22:12:43,162
What's cool is notice if I reload\nthe page, it still knows that.
23247
22:12:43,161 --> 22:12:46,581
If I create a second tab and go to\n
23248
22:12:48,293 --> 22:12:50,001
I could keep doing\nthis in multiple tabs
23249
22:12:50,001 --> 22:12:53,721
it's still going to remember me on both\n
23250
22:12:55,051 --> 22:13:00,141
Especially when I click Log Out,\n
23251
22:13:00,142 --> 22:13:01,792
All right, so let's see how this works.
23252
22:13:01,792 --> 22:13:03,741
And it's some basic building blocks.
23253
22:13:03,741 --> 22:13:07,551
Under my / route, notice I have this.
23254
22:13:07,551 --> 22:13:12,421
If there is no name in the session,\nredirect the user to /login.
23255
22:13:12,422 --> 22:13:14,992
So these two lines\ntogether are what implement
23256
22:13:14,991 --> 22:13:19,611
that automatic redirection using\nHTTP 301 or 302 automatically.
23257
22:13:19,611 --> 22:13:21,691
It's handled for me\nwith these two lines.
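The check itself can be sketched without any framework. Here `session` is just a dictionary standing in for the per-user session, and the function returns a status code plus either a redirect target or a message. In Flask the redirect would typically be a call to `redirect("/login")`, which answers with 302 by default.

```python
def index(session):
    """Return (status, location_or_body) for a request to /."""
    # No name stored for this visitor yet: send them to the login form.
    if not session.get("name"):
        return 302, "/login"
    # Otherwise greet the logged-in user.
    return 200, f"You are logged in as {session['name']}."
```

Calling `index({})` yields the redirect, while `index({"name": "David"})` yields the greeting.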
23258
22:13:23,351 --> 22:13:25,101
All right, let's go\ndown that rabbit hole.
23259
22:13:29,991 --> 22:13:36,652
let me look in my templates\nfolder for my login demo and look
23260
22:13:40,042 --> 22:13:41,631
All right, so what's going on here?
23261
22:13:41,631 --> 22:13:46,221
I extend layout.html,\nI have a block body
23262
22:13:46,221 --> 22:13:47,961
and then I've got some other syntax.
23263
22:13:47,961 --> 22:13:50,504
So we haven't seen this yet,\nbut it's more Jinja stuff, which
23264
22:13:50,504 --> 22:13:52,221
again, is almost identical to Python.
23265
22:13:52,221 --> 22:13:55,311
If there's a name in\nthe session variable
23266
22:13:55,312 --> 22:14:00,082
then literally say you are logged in\n
23267
22:14:00,081 --> 22:14:04,641
And then notice this, I've got a simple\n
23268
22:14:04,642 --> 22:14:07,312
Else, if there is no\nname in the session
23269
22:14:07,312 --> 22:14:11,122
then it apparently says you are not\n
23270
22:14:11,122 --> 22:14:13,612
link to /login and then end if.
23271
22:14:13,611 --> 22:14:16,221
So again, Jinja does\nnot rely on indentation.
23272
22:14:16,221 --> 22:14:19,221
Recall that HTML and CSS don't\nreally care about indentation
23273
22:14:20,452 --> 22:14:24,022
But in code with Jinja,\nyou need these end tags
23274
22:14:24,021 --> 22:14:26,872
end block, end for, end\nif, to make super obvious
23275
22:14:26,872 --> 22:14:29,092
that you're done with that thought.
23276
22:14:29,092 --> 22:14:32,601
So session is just this\nmagic variable that we now
23277
22:14:32,601 --> 22:14:36,891
have access to because we've\nincluded these two lines of code
23278
22:14:36,892 --> 22:14:41,902
and these that handle that whole\n
23279
22:14:41,902 --> 22:14:43,851
with a different, unique identifier.
23280
22:14:43,851 --> 22:14:46,521
If I made my code space\npublic and I let all of you
23281
22:14:46,521 --> 22:14:50,031
visit the exact same URL, all of\n
23282
22:14:50,032 --> 22:14:52,102
You could all type your\nown names individually
23283
22:14:52,101 --> 22:14:56,092
all log in at the same URL\nusing different sessions.
23284
22:14:56,092 --> 22:14:59,332
And in fact, I would then see,\nif I go into my terminal window
23285
22:14:59,331 --> 22:15:03,991
here and my login directory, notice the\n
23286
22:15:03,991 --> 22:15:07,881
And if I CD into that and type ls,\n
23287
22:15:07,881 --> 22:15:10,551
or actually, I think I\nstarted the server twice.
23288
22:15:11,941 --> 22:15:14,691
I would ultimately have one\nfile for every one of you.
23289
22:15:14,691 --> 22:15:16,701
And that's what's\nbeautiful about sessions
23290
22:15:16,702 --> 22:15:21,292
is it creates the illusion\nof per user storage.
23291
22:15:21,292 --> 22:15:25,622
Inside of my session is my name,\n
23292
22:15:26,482 --> 22:15:30,512
And the same is going to apply to\n
23293
22:15:30,512 --> 22:15:32,211
Let's see how login works here.
23294
22:15:32,211 --> 22:15:36,622
My login route supports both GET and\n
23295
22:15:36,622 --> 22:15:41,192
And notice this, this login route\n
23296
22:15:41,191 --> 22:15:45,394
If the user got to this\nroute via POST, my inference
23297
22:15:45,395 --> 22:15:47,062
is that they must have submitted a form.
23298
22:15:47,562 --> 22:15:50,991
Because that's how I'm going to\n
23299
22:15:50,991 --> 22:15:53,631
And if they did submit\nthe form via POST
23300
22:15:53,631 --> 22:15:56,691
I'm going to store, in\nthe session, at the name
23301
22:15:56,691 --> 22:15:59,091
key, whatever the human's name is.
23302
22:15:59,092 --> 22:16:01,267
And then, I'm going to\nredirect them back to slash.
23303
22:16:01,267 --> 22:16:04,352
Otherwise, I'm going to\nshow them the login form.
23304
22:16:05,631 --> 22:16:08,961
If I go to this login\nform, which lives at
23305
22:16:08,961 --> 22:16:12,861
literally, slash login, by default,\n
23306
22:16:14,721 --> 22:16:16,881
And so that's why I see the form.
23307
22:16:18,411 --> 22:16:22,311
The form, very cleverly,\nsubmits to itself
23308
22:16:22,312 --> 22:16:26,991
like the one route, /login, submits\nto its same self, /login
23309
22:16:26,991 --> 22:16:29,581
but it uses POST when\nyou submit the form.
23310
22:16:29,581 --> 22:16:33,711
And this is a nice way of having one\n
23311
22:16:36,142 --> 22:16:40,652
When I'm just there visiting /login\n
23312
22:16:40,652 --> 22:16:45,741
But if I submit the form, then this\n
23313
22:16:45,741 --> 22:16:49,732
and this just avoids my having to have\n
23314
22:16:50,301 --> 22:16:55,402
I can just have one route that\nhandles both GET and POST.
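The same branching can be sketched as a plain function. The form field name `name` and the template name `login.html` are assumptions; `method`, `form`, and `session` stand in for what a framework such as Flask would provide.

```python
def login(method, form, session):
    """Handle both GET and POST for a hypothetical /login route."""
    if method == "POST":
        # The form was submitted: remember the name, then go back to /.
        session["name"] = form.get("name")
        return 302, "/"
    # A plain visit (GET): just show the login form.
    return 200, "login.html"
```

One function, two behaviors: a GET leaves the session untouched and shows the form, while a POST stores the name and redirects.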
23315
22:16:57,182 --> 22:16:58,792
Well, it's as simple as this.
23316
22:16:58,792 --> 22:17:01,221
Change whatever name\nis in the session to be
23317
22:17:01,221 --> 22:17:05,451
None, which is Python's version of null,\n
23318
22:17:06,081 --> 22:17:12,152
Because now, in index.html, I will\n
23319
22:17:13,471 --> 22:17:16,921
And so I'll tell the user\ninstead, you are not logged in.
23320
22:17:18,375 --> 22:17:20,542
I want to say as simple as\nthis is, though I realize
23321
22:17:20,542 --> 22:17:22,402
this is a bunch of steps involved.
23322
22:17:22,402 --> 22:17:25,581
This is the essence of every\nwebsite on the internet that
23323
22:17:25,581 --> 22:17:26,951
has usernames and passwords.
23324
22:17:26,952 --> 22:17:30,202
And we skip the password name step for\n
23325
22:17:30,202 --> 22:17:34,612
but this is how every website out\n
23326
22:17:34,611 --> 22:17:37,221
And how this works,\nultimately, is that as soon
23327
22:17:37,221 --> 22:17:41,001
as you use in Python lines\nlike this and lines like this
23328
22:17:41,001 --> 22:17:45,141
Flask takes care of stamping the\n
23329
22:17:45,142 --> 22:17:50,392
and whenever Flask sees the same\ncookie coming back from a user
23330
22:17:50,392 --> 22:17:53,632
it grabs the appropriate\nfile from that folder
23331
22:17:53,631 --> 22:17:56,042
loads it into the\nsession global variable
23332
22:17:56,042 --> 22:18:01,012
so that your code is now unique\nto that user and their name.
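To make that mechanism concrete, here is a toy version of a file-backed session store using only the standard library. It is not Flask's implementation: the JSON file format, the temporary folder, and the function names are all assumptions made for illustration.

```python
import json
import secrets
import tempfile
from pathlib import Path

# Hypothetical stand-in for the server's session folder.
SESSION_DIR = Path(tempfile.mkdtemp())

def save_session(session_id, data):
    """Write one visitor's data to their own file."""
    (SESSION_DIR / f"{session_id}.json").write_text(json.dumps(data))

def load_session(session_id):
    """Load the data belonging to whoever presents this cookie value."""
    # Real code must validate session_id before using it in a file path.
    path = SESSION_DIR / f"{session_id}.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text())

def new_session():
    """Create an empty session file and return its random identifier."""
    session_id = secrets.token_hex(32)
    save_session(session_id, {})
    return session_id

# Two visitors get two different stamps and two separate files.
david = new_session()
carter = new_session()
save_session(david, {"name": "David"})
save_session(carter, {"name": "Carter"})
```

Each request only has to carry the identifier; the server looks up the matching file, so the same code sees a different `name` depending on who is asking.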
23333
22:18:01,012 --> 22:18:03,922
Let's do one other\nexample with sessions here
23334
22:18:03,922 --> 22:18:06,652
that'll show how we might use\nthese, now, for shopping carts.
23335
22:18:06,652 --> 22:18:09,142
Let me go into the store example here.
23336
22:18:09,142 --> 22:18:10,972
Let me go ahead and\nrun this thing first.
23337
22:18:10,971 --> 22:18:16,851
If I run store in my same\ntab and go back over here
23338
22:18:16,851 --> 22:18:19,881
we'll see a very ugly\ne-commerce site that
23339
22:18:19,881 --> 22:18:21,771
just sells seven different books here.
23340
22:18:21,771 --> 22:18:26,258
But each of these books has a button\n
23341
22:18:26,259 --> 22:18:28,342
All right, well where are\nthese books coming from?
23342
22:18:29,361 --> 22:18:32,751
Let me go into my terminal window again.
23343
22:18:32,751 --> 22:18:35,991
Let me go into this example,\nwhich is called store
23344
22:18:35,991 --> 22:18:40,911
and let me open up, how about,\nindex dot ht-- whoops.
23345
22:18:40,911 --> 22:18:48,381
Let's open up index,\nhow about, books.html
23346
22:18:48,381 --> 22:18:50,671
is the default one, not index this time.
23347
22:18:50,672 --> 22:18:55,461
So if I look here, notice that\nthat route that we just saw
23348
22:18:55,461 --> 22:19:00,172
uses a for loop in Jinja to iterate\n
23349
22:19:00,172 --> 22:19:03,532
and it outputs, in an H2\ntag, the title of the book
23350
22:19:03,532 --> 22:19:05,722
and then another one of these forms.
23351
22:19:08,072 --> 22:19:11,991
Let's go ahead and open up app.py,\n
23352
22:19:11,991 --> 22:19:13,432
what's kicking all of this off.
23353
22:19:13,432 --> 22:19:16,851
Notice that this file is\nimporting session support.
23354
22:19:16,851 --> 22:19:19,851
It's configuring sessions\ndown here, but it's also
23355
22:19:19,851 --> 22:19:22,292
connecting to a store.db file.
23356
22:19:23,842 --> 22:19:29,362
And notice this, in my / route,\nI'm selecting star from books
23357
22:19:29,361 --> 22:19:31,641
which is going to give me\na list of dictionaries
23358
22:19:31,642 --> 22:19:33,652
each of which represents a row of books.
23359
22:19:33,652 --> 22:19:37,881
And I'm going to pass that list of\n
23360
22:19:37,881 --> 22:19:41,601
is why this for loop\nworks the way it does.
23361
22:19:41,601 --> 22:19:43,281
Let's look at this actual database.
23362
22:19:43,282 --> 22:19:48,472
Let me increase my terminal window\n
23363
22:19:50,881 --> 22:19:55,531
It's a book-- it's a table called\n
23364
22:19:55,532 --> 22:19:58,552
Let's do select star\nfrom books semicolon.
23365
22:19:58,551 --> 22:20:01,311
There are the seven books,\neach of which has a unique ID.
23366
22:20:01,312 --> 22:20:03,241
And you might see where this is going.
23367
22:20:03,241 --> 22:20:06,801
If I go to the UI and I look\nat each of these buttons
23368
22:20:06,801 --> 22:20:11,961
for add to cart, just like Amazon might\n
23369
22:20:13,312 --> 22:20:15,264
And what's magical here,\njust like deregister
23370
22:20:15,264 --> 22:20:17,182
even though I didn't\nhighlight it at the time
23371
22:20:17,182 --> 22:20:19,342
there's another type\nof input that allows
23372
22:20:19,342 --> 22:20:23,241
you to specify a value without the\n
23373
22:20:23,241 --> 22:20:26,031
Instead of type equals\ntext or type equals submit
23374
22:20:26,032 --> 22:20:30,992
type equals hidden will put the value in\n
23375
22:20:30,991 --> 22:20:34,101
So that's how I'm saying that\nthe ID of this book is one
23376
22:20:34,101 --> 22:20:38,312
the ID of this book is two, the ID\n
23377
22:20:38,312 --> 22:20:40,882
And each of these forms,\nthen, will submit, apparently
23378
22:20:40,881 --> 22:20:45,842
to /cart using POST and that would\n
23379
22:20:46,592 --> 22:20:48,622
Let me click on one or two of these.
23380
22:20:48,622 --> 22:20:51,592
Let's add the first book, add to cart.
23381
22:20:52,642 --> 22:20:54,922
Notice my route changed to /cart.
23382
22:20:54,922 --> 22:20:58,882
All right, let's go back and\nlet's add the book number two.
23383
22:20:59,872 --> 22:21:02,872
And let's skip ahead to the\nseventh book, Deathly Hallows
23384
22:21:02,872 --> 22:21:05,612
and now we have all three books here.
23385
22:21:05,611 --> 22:21:09,291
So what does the cart route do at /cart?
23386
22:21:10,042 --> 22:21:15,351
If I go back to my terminal window,\n
23387
22:21:15,351 --> 22:21:17,721
there's a lot going on\nhere, but let's see.
23388
22:21:17,721 --> 22:21:21,292
So the /cart route\nsupports both GET and POST
23389
22:21:21,292 --> 22:21:24,892
which is a nice way to\nconsolidate things into one URL.
23390
22:21:24,892 --> 22:21:26,792
All right, this is interesting.
23391
22:21:26,792 --> 22:21:30,945
If there is not a, quote\nunquote, "cart" key in session
23392
22:21:30,945 --> 22:21:32,612
we haven't technically seen this syntax.
23393
22:21:32,611 --> 22:21:36,771
But long story short, these lines\n
23394
22:21:37,982 --> 22:21:43,161
It makes sure that there's a cart\n
23395
22:21:43,161 --> 22:21:45,741
and it's by default going\nto be an empty list.
23396
22:21:46,282 --> 22:21:48,282
That just means you have\nan empty shopping cart.
23397
22:21:48,282 --> 22:21:55,612
But if the user visits this route via\n
23398
22:21:55,611 --> 22:21:59,241
they didn't muck with the form in any\n
23399
22:21:59,241 --> 22:22:02,691
they gave me a valid ID, then\nI'm going to use this syntax.
23400
22:22:02,691 --> 22:22:05,542
If session bracket cart is a list--
23401
22:22:05,542 --> 22:22:07,846
recall from a couple of weeks\nago that dot append just
23402
22:22:07,846 --> 22:22:08,971
adds something to the list.
23403
22:22:08,971 --> 22:22:13,201
So I'm going to add the ID to the\n
23404
22:22:13,202 --> 22:22:18,612
Otherwise, if the user is at /cart\n
23405
22:22:18,611 --> 22:22:22,091
Select star from books where ID is in.
23406
22:22:22,092 --> 22:22:24,932
And this might be syntax\nyou recall from Pset six.
23407
22:22:24,932 --> 22:22:27,542
It lets you look for\nmultiple IDs all at once
23408
22:22:27,542 --> 22:22:29,762
because if I have a list of session--
23409
22:22:29,762 --> 22:22:34,232
list of IDs in my cart, I can\nget all of those books at once.
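A sketch of the cart logic with Python's built-in sqlite3 module. The in-memory database and the placeholder titles stand in for store.db, and the function names are hypothetical. Note that plain sqlite3 needs one `?` per value in the `IN (...)` clause, so the placeholders are generated from the length of the list.

```python
import sqlite3

# In-memory stand-in for store.db, with placeholder titles.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO books (id, title) VALUES (?, ?)",
    [(i, f"Placeholder Book {i}") for i in range(1, 8)],
)

session = {}  # stands in for one visitor's session

def add_to_cart(session, book_id):
    # Make sure the cart exists; by default it is an empty list.
    if "cart" not in session:
        session["cart"] = []
    session["cart"].append(book_id)

def books_in_cart(session):
    ids = session.get("cart", [])
    if not ids:
        return []
    # One placeholder per ID, so values are never pasted into the SQL text.
    placeholders = ", ".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT * FROM books WHERE id IN ({placeholders}) ORDER BY id", ids
    ).fetchall()
    return [dict(row) for row in rows]

# Add the first, second, and seventh books, as in the demo.
for book_id in (1, 2, 7):
    add_to_cart(session, book_id)
cart = books_in_cart(session)
```

Only the list of IDs lives in the session; the titles are looked up from the database each time the cart is shown.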
23410
22:22:34,232 --> 22:22:36,932
So long story short,\nwhat has happened here?
23411
22:22:36,932 --> 22:22:42,932
I am storing, in the cart, the books\n
23412
22:22:42,932 --> 22:22:45,792
My browser is sending the same\nhand stamp again and again
23413
22:22:45,792 --> 22:22:48,911
which is how this website knows that\n
23414
22:22:48,911 --> 22:22:50,701
and not you or not Carter or not Emma.
23415
22:22:50,702 --> 22:22:54,002
Indeed, if all of us visited the\n
23416
22:22:54,001 --> 22:22:56,791
and allowed that, then we would\nall have our own illusions
23417
22:22:58,411 --> 22:23:00,902
And each of those carts,\nin practice, would just
23418
22:23:00,902 --> 22:23:04,682
be stored in this Flask\nsession directory on the server
23419
22:23:04,682 --> 22:23:06,482
so that the server\ncan keep track of each
23420
22:23:06,482 --> 22:23:09,452
of us using, again, these\ncookie values that are being
23421
22:23:09,452 --> 22:23:13,182
sent back and forth via these headers.
23422
22:23:13,682 --> 22:23:15,991
I know that's a lot,\nbut again, it's just
23423
22:23:15,991 --> 22:23:19,471
the new Python way of\njust leveraging those HTTP
23424
22:23:19,471 --> 22:23:22,121
headers from last week in a clever way.
23425
22:23:22,122 --> 22:23:26,101
Any questions before we look\nat one final set of examples?
23426
22:23:27,410 --> 22:23:31,322
AUDIENCE: [INAUDIBLE] understand how\n
23427
22:23:31,322 --> 22:23:34,744
How does it use [INAUDIBLE],\nhow do you change [INAUDIBLE]?
23428
22:23:34,744 --> 22:23:39,145
Because in order to use a GET\nrequest dot [INAUDIBLE] equals
23429
22:23:39,145 --> 22:23:43,081
there has to be an\nexchange in [INAUDIBLE].
23430
22:23:43,081 --> 22:23:46,292
DAVID: So I think you're\nasking about using the GET
23431
22:23:46,292 --> 22:23:48,461
and POST in the same function.
23432
22:23:48,461 --> 22:23:52,982
So this is just a nice\naesthetic, if you will.
23433
22:23:52,982 --> 22:23:56,432
If I had to have separate\nroutes for GET and POST, I mean
23434
22:23:56,432 --> 22:23:59,191
it literally might mean I need\ntwice as many routes in my file.
23435
22:23:59,191 --> 22:24:01,441
And it just starts to\nget a little annoying.
23436
22:24:01,441 --> 22:24:04,182
And these days, too, in\nterms of user experience
23437
22:24:04,182 --> 22:24:08,161
this maybe only appeals to the\n
23438
22:24:09,452 --> 22:24:11,461
You don't want to have\nlots of words in the URL
23439
22:24:11,461 --> 22:24:14,922
it's nice if the URLs are nice and\n
23440
22:24:14,922 --> 22:24:21,032
So it's nice if I can centralize all of\n
23441
22:24:21,032 --> 22:24:25,652
only, and not in multiple routes,\none for GET, one for POST.
23442
22:24:25,652 --> 22:24:30,012
It's a little nitpicky of me,\nbut this is commonly done here.
23443
22:24:30,012 --> 22:24:34,622
So what this code here means is\nthat this route, this function
23444
22:24:34,622 --> 22:24:38,222
henceforth will support both\nGET requests and POST requests.
23445
22:24:38,221 --> 22:24:42,211
But then I need to distinguish between\n
23446
22:24:42,211 --> 22:24:44,881
Because if it's a GET request,\nI want to show the cart.
23447
22:24:44,881 --> 22:24:47,342
If it's a POST request, I\nwant to update the cart.
23448
22:24:47,342 --> 22:24:51,542
And the simplest way to do that\n
23449
22:24:51,542 --> 22:24:54,991
In the request variable that we\nimported from Flask up above
23450
22:24:54,991 --> 22:24:57,842
you can check what is the\ncurrent type of request.
23451
22:24:57,842 --> 22:25:00,932
Is it a GET, is it a POST, or\nis it something else altogether?
23452
22:25:02,581 --> 22:25:07,652
If it's a POST, that must mean, because\n
23453
22:25:07,652 --> 22:25:10,922
that the user clicked\nthe Add to Cart button.
23454
22:25:10,922 --> 22:25:16,112
Otherwise, if it's not POST, it's\n
23455
22:25:16,111 --> 22:25:20,371
Then, I just want to show the\nuser the contents of the cart
23456
22:25:20,372 --> 22:25:21,881
and I use these lines instead.
23457
22:25:21,881 --> 22:25:25,682
So it's just one way of avoiding having\n
23458
22:25:26,191 --> 22:25:29,101
You can combine them so long\nas you have a check like this.
23459
22:25:29,101 --> 22:25:35,432
If I really wanted to be pedantic, I\n
23460
22:25:38,012 --> 22:25:40,592
This would be more symmetric,\nbut it's not really necessary
23461
22:25:40,592 --> 22:25:43,512
because I know there's\nonly two possibilities.
23462
22:25:46,092 --> 22:25:48,902
All right, let's do one\nfinal set of examples here
23463
22:25:48,902 --> 22:25:51,092
that's going to tie the\nlast of these features
23464
22:25:51,092 --> 22:25:53,521
together to something\nthat you probably see
23465
22:25:53,521 --> 22:25:56,373
quite often in real-world applications.
23466
22:25:56,373 --> 22:25:58,081
And that, for better\nor for worse, is now
23467
22:25:58,081 --> 22:26:01,421
going to involve tying back in\nsome JavaScript from last week.
23468
22:26:01,422 --> 22:26:03,422
The goal at hand of\nthese examples is not
23469
22:26:03,422 --> 22:26:06,805
to necessarily master how you yourself\n
23470
22:26:06,804 --> 22:26:09,721
code, the JavaScript code, but just\n
23471
22:26:09,721 --> 22:26:11,051
these different languages work.
23472
22:26:11,051 --> 22:26:13,441
So that for final\nprojects, especially if you
23473
22:26:13,441 --> 22:26:16,921
do want to add JavaScript functionality,\n
23474
22:26:16,922 --> 22:26:20,101
you at least have the bare\nbones of a mental model for how
23475
22:26:20,101 --> 22:26:22,542
you can tie these languages together.
23476
22:26:22,542 --> 22:26:26,131
Even though our focus, generally,\n
23477
22:26:26,131 --> 22:26:28,322
than on JavaScript from last week.
23478
22:26:28,322 --> 22:26:33,422
Let me go ahead and open up an example\n
23479
22:26:35,021 --> 22:26:39,092
And let me go into my URL here and\n
23480
22:26:39,092 --> 22:26:44,432
like by default. This has just a simple\n
23481
22:26:44,432 --> 22:26:47,461
Let's take a look at the HTML\nthat just got sent to my browser.
23482
22:26:47,461 --> 22:26:49,572
All right, there's not\nmuch going on here at all.
23483
22:26:49,572 --> 22:26:52,682
So there's a form whose\naction is /search.
23484
22:26:52,682 --> 22:26:54,482
It's going to submit via GET.
23485
22:26:54,482 --> 22:26:58,902
It's going to use a q parameter, just\n
23486
22:26:58,902 --> 22:27:01,872
So this actually looks like the\nGoogle form we did last week.
23487
22:27:01,872 --> 22:27:03,752
So let's see what goes on here.
23488
22:27:03,751 --> 22:27:06,151
Let me search for something like cat.
23489
22:27:09,581 --> 22:27:12,042
all right, so this is actually\na somewhat familiar file.
23490
22:27:12,042 --> 22:27:16,232
What I've gone ahead and done is I've\n
23491
22:27:16,232 --> 22:27:19,052
from a couple of weeks ago\nwhen we first introduced SQL
23492
22:27:19,051 --> 22:27:21,601
and I loaded them into this\ndemo so that you can search
23493
22:27:21,601 --> 22:27:23,411
by keyword for any word you want.
23494
22:27:24,512 --> 22:27:27,661
If we were to do this again, we\n
23495
22:27:27,661 --> 22:27:32,861
that contain D-O-G, dog, as a\nsubstring somewhere and so forth.
23496
22:27:32,861 --> 22:27:34,891
So this is a traditional\nway of doing this.
23497
22:27:34,892 --> 22:27:40,502
Just like in Google, it uses\n/search?q=cat, q=dog, and so forth.
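For reference, this is how such a URL is put together and taken apart with Python's standard library. The parameter name `q` comes from the form; everything else is plain `urllib.parse`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Build the URL a GET form would produce for the input named q.
url = "/search?" + urlencode({"q": "cat"})

# On the server side, pull the parameter back out of the URL.
params = parse_qs(urlsplit(url).query)
q = params["q"][0]
```

Because the form uses GET, the search term travels in the URL itself, which is why it shows up in the address bar.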
23498
22:27:41,381 --> 22:27:45,432
Well, let's just take a\nquick look at app.py here.
23499
22:27:45,432 --> 22:27:50,702
Let me go into my zero example\nhere, shows zero, and open up
23500
22:27:50,702 --> 22:27:53,222
app.py and see what's going on.
23501
22:27:54,881 --> 22:27:57,542
Here's the form, that's\nhow we started today.
23502
22:27:57,542 --> 22:28:00,017
And here is the /search route.
23503
22:28:00,017 --> 22:28:01,142
Well, what's going on here?
23504
22:28:01,142 --> 22:28:02,832
This gets a little interesting.
23505
22:28:02,831 --> 22:28:06,031
So I first select a whole\nbunch of shows by doing this.
23506
22:28:06,032 --> 22:28:10,652
Select star from shows, where\ntitle like question mark.
23507
22:28:10,652 --> 22:28:15,331
And then I'm using some\npercent signs from SQL
23508
22:28:15,331 --> 22:28:17,792
on both the left and the\nright, and I'm plugging
23509
22:28:17,792 --> 22:28:20,211
in whatever the user's input was for q.
23510
22:28:20,211 --> 22:28:22,801
If I didn't use like and\nI used equal instead
23511
22:28:22,801 --> 22:28:25,891
I could get rid of these curly\nbrace, these percent signs
23512
22:28:25,892 --> 22:28:29,522
but then it would have to be a show\n
23513
22:28:29,521 --> 22:28:32,221
to it being like cat or like dog.
23514
22:28:32,221 --> 22:28:35,792
This whole line returns to me a\n
23515
22:28:35,792 --> 22:28:38,682
represents a show in the database.
23516
22:28:38,682 --> 22:28:41,971
And then, I'm passing all of those\n
23517
22:28:41,971 --> 22:28:45,771
So let's just follow that\nbreadcrumb, let's open up shows dot--
23518
22:28:48,152 --> 22:28:50,331
All right, so this is\nwhere templating gets cool.
23519
22:28:50,331 --> 22:28:53,031
So I just passed back hundreds\nof results, potentially
23520
22:28:53,032 --> 22:28:56,512
but the only thing I'm\noutputting is an unordered list
23521
22:28:56,512 --> 22:29:00,592
and, using a Jinja for\nloop, an li tag containing
23522
22:29:00,592 --> 22:29:02,506
the titles of each of those shows.
23523
22:29:02,506 --> 22:29:04,881
And just to prove that this\nis indeed a familiar data set
23524
22:29:04,881 --> 22:29:09,801
and I actually simplified it a bit,\n
23525
22:29:09,801 --> 22:29:14,091
I threw away all the other stuff like\n
23526
22:29:14,092 --> 22:29:18,832
and I just have, for instance,\nselect star from shows
23527
22:29:18,831 --> 22:29:21,201
limit 10, just so we can see 10 of them.
23528
22:29:21,202 --> 22:29:23,572
There's 10 of the shows\nfrom that database.
23529
22:29:23,572 --> 22:29:25,982
So that's all that's\nin the database itself.
23530
22:29:25,982 --> 22:29:29,572
So it would look like this is a\npretty vanilla web application.
23531
22:29:29,572 --> 22:29:31,607
It uses GET, it submits\nit to the server
23532
22:29:31,607 --> 22:29:33,982
the server spits out a response,\nand that response, then
23533
22:29:33,982 --> 22:29:38,752
looks like this, which is a huge\n
23534
22:29:40,461 --> 22:29:44,131
But everything else\ncomes from a layout.html.
23535
22:29:44,131 --> 22:29:46,771
All the stuff at the\ntop and at the bottom.
23536
22:29:46,771 --> 22:29:50,781
All right, so these days, though, we're\n
23537
22:29:50,782 --> 22:29:53,482
And you start typing something\nand you don't have to hit Submit
23538
22:29:53,482 --> 22:29:56,357
you don't have to click a button,\n
23539
22:29:56,357 --> 22:29:58,711
Web applications, nowadays,\nare much more dynamic.
23540
22:29:58,711 --> 22:30:01,891
So let's take a look at this\nversion one of this thing.
23541
22:30:01,892 --> 22:30:07,792
Let me go into shows one\nand close my previous tabs
23542
22:30:09,902 --> 22:30:14,601
And it's almost the same thing, but\n
23543
22:30:14,601 --> 22:30:16,625
I'm reloading the form,\nthere's no button now.
23544
22:30:16,625 --> 22:30:18,292
So gone is the need for a submit button.
23545
22:30:18,292 --> 22:30:20,312
I want to implement autocomplete now.
23546
22:30:20,312 --> 22:30:22,912
So let's go ahead and\ntype in C. OK, there's
23547
22:30:22,911 --> 22:30:25,881
every show that starts\nwith C. A, there's
23548
22:30:25,881 --> 22:30:27,861
every show that has C-A in it, rather.
23549
22:30:27,861 --> 22:30:31,101
T, there's every show with C-A-T in it.
23550
22:30:31,101 --> 22:30:34,911
I can start it again and do dog,\n
23551
22:30:34,911 --> 22:30:38,811
And notice my URL never changed,\nthere's no /search route
23552
22:30:40,012 --> 22:30:43,652
With every keystroke, it is\nsearching again and again and again.
23553
22:30:43,652 --> 22:30:46,282
That's a nice UX, user experience,\nbecause it's immediate.
23554
22:30:46,282 --> 22:30:48,532
This is what users are\nused to these days.
23555
22:30:48,532 --> 22:30:53,392
But if I look at the source code\n
23556
22:30:53,392 --> 22:30:58,615
there's just an empty UL by default but\n
23557
22:30:58,615 --> 22:31:00,032
So let's see what's going on here.
23558
22:31:00,032 --> 22:31:03,512
This JavaScript code\nis doing the following.
23559
22:31:03,512 --> 22:31:06,472
Let me zoom in a little bit more.
23560
22:31:06,471 --> 22:31:10,853
This JavaScript code is first\nselecting, with query selector
23561
22:31:10,854 --> 22:31:13,562
which you used this past week,\n
23562
22:31:13,562 --> 22:31:15,502
so that's just getting the text box.
23563
22:31:15,501 --> 22:31:19,136
Then it's adding an event listener\n
23564
22:31:19,137 --> 22:31:21,262
We didn't talk about this\nlast week, but literally
23565
22:31:21,262 --> 22:31:23,362
when you provide any\nkind of input by typing
23566
22:31:23,361 --> 22:31:27,861
by pasting, by any other\nuser interface mechanism
23567
22:31:27,861 --> 22:31:29,671
it triggers an event called input.
23568
22:31:29,672 --> 22:31:31,732
So similar to key press or key up.
23569
22:31:31,732 --> 22:31:35,542
I then have a function, no worries\n
23570
22:31:35,542 --> 22:31:37,244
Then what do I do inside of this?
23571
22:31:37,244 --> 22:31:39,202
All right, so this is\nnew, and this is the part
23572
22:31:39,202 --> 22:31:41,872
that let's just focus on the\nideas and not the syntax.
23573
22:31:41,872 --> 22:31:43,762
JavaScript, nowadays,\ncomes with a function
23574
22:31:43,762 --> 22:31:48,051
called fetch that allows you to\n
23575
22:31:48,051 --> 22:31:49,861
without reloading the whole page.
23576
22:31:49,861 --> 22:31:52,371
You can sort of secretly\ndo it inside of the page.
23577
22:31:53,691 --> 22:31:59,031
slash search question mark q equals\n
23578
22:31:59,032 --> 22:32:03,082
When I get back a response, I want\n
23579
22:32:03,081 --> 22:32:05,331
and store it in a variable called shows.
23580
22:32:05,331 --> 22:32:08,001
And I'm deliberately bouncing\naround, ignoring special words
23581
22:32:08,001 --> 22:32:11,361
like await and await here, but for\n
23582
22:32:11,361 --> 22:32:14,601
A response came back from the\n
23583
22:32:14,601 --> 22:32:16,432
storing it in a variable called shows.
23584
22:32:17,691 --> 22:32:22,461
I'm using query selector to select\n
23585
22:32:22,461 --> 22:32:26,721
and I'm changing its inner HTML\nto be equal to the shows that
23586
22:32:29,572 --> 22:32:32,402
Here's where, again, developer\ntools are quite powerful.
23587
22:32:32,402 --> 22:32:36,652
Let me go ahead and reload this\npage to get rid of everything.
23588
22:32:36,652 --> 22:32:39,952
And let me now open up inspect.
23589
22:32:39,952 --> 22:32:43,292
Let me go to the Network tab and\n
23590
22:32:43,292 --> 22:32:44,542
between my browser and server.
23591
22:32:44,542 --> 22:32:49,221
I'm going to search for C. Notice that\n
23592
22:32:54,202 --> 22:32:57,812
So I didn't even finish my cat\n
23593
22:32:57,812 --> 22:33:02,662
A bunch of response headers, but let's\n
23594
22:33:02,661 --> 22:33:07,311
This is literally the response from the\n
23595
22:33:07,312 --> 22:33:10,372
No UL, no HTML, no\ntitle, no body, nothing.
23596
22:33:11,452 --> 22:33:12,961
And we can actually simulate this.
23597
22:33:12,961 --> 22:33:17,122
Let me manually go to\nthat same URL, q=c, Enter.
23598
22:33:17,122 --> 22:33:19,131
We are just going to get back--
23599
22:33:20,572 --> 22:33:25,432
slash search q equals c, we are\n
23600
22:33:25,432 --> 22:33:27,892
which if I view source, it's\nnot even a complete web page.
23601
22:33:27,892 --> 22:33:31,312
The browser is trying to show it to me\n
23602
22:33:31,312 --> 22:33:33,862
but it's really just partial HTML.
23603
22:33:33,861 --> 22:33:36,471
But that's perfect,\nbecause this is literally
23604
22:33:36,471 --> 22:33:39,381
what I essentially want my\nPython code to copy paste
23605
22:33:39,381 --> 22:33:42,471
into the otherwise empty UL tag.
23606
22:33:42,471 --> 22:33:46,822
And that's what this JavaScript\ncode then, here, is doing.
23607
22:33:46,822 --> 22:33:51,111
Once it gets back that response from the\n
23608
22:33:51,111 --> 22:33:56,131
to plug all of those li's\ninto the UL after the fact.
23609
22:33:56,131 --> 22:33:58,491
Again, changing the so-called DOM.
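The first version described above might look roughly like the following sketch. The names searchUrl and updateList and the call to encodeURIComponent are assumptions added for illustration, not necessarily what is on screen in the lecture; doc and fetchFn are parameters only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: builds the URL described in the lecture (/search?q=...).
// encodeURIComponent is an added precaution so spaces and symbols survive the trip.
function searchUrl(query) {
    return '/search?q=' + encodeURIComponent(query);
}

// Sketch of the handler: fetch the partial HTML and plug it into the <ul>.
// In a browser the defaults (document, fetch) are used; both can be swapped out.
async function updateList(query, doc = document, fetchFn = fetch) {
    const response = await fetchFn(searchUrl(query));
    // The server is assumed to answer with bare <li> elements, as in this version.
    const shows = await response.text();
    doc.querySelector('ul').innerHTML = shows;
}
```

In a page, this could be wired to the text box with something like input.addEventListener('input', () => updateList(input.value)), assuming input refers to that element.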
23610
22:33:58,491 --> 22:34:01,461
But there's a slightly better way\n
23611
22:34:02,932 --> 22:34:07,252
Because if you've got a hundred shows or\n
23612
22:34:08,211 --> 22:34:10,851
Why do I need to send all\nof these stupid HTML tags?
23613
22:34:10,851 --> 22:34:13,952
Why don't I just create those\nwhen I'm ready to create them?
23614
22:34:13,952 --> 22:34:15,682
Well, here's the final flourish.
23615
22:34:15,682 --> 22:34:18,232
Whenever making a web\napplication nowadays
23616
22:34:18,232 --> 22:34:22,402
where client and server keep talking\n
23617
22:34:22,402 --> 22:34:25,191
Gmail does this, literally\nevery cool application
23618
22:34:25,191 --> 22:34:27,531
nowadays you load the\npage once and then it
23619
22:34:27,532 --> 22:34:29,302
keeps on interacting\nwith you without you
23620
22:34:29,301 --> 22:34:31,971
reloading or having to change the URL.
23621
22:34:31,971 --> 22:34:36,471
Let's actually use a format called\n
23622
22:34:36,471 --> 22:34:39,561
is to say there's just a\nbetter, more efficient, better
23623
22:34:39,562 --> 22:34:42,152
designed way to send that same data.
23624
22:34:42,152 --> 22:34:46,432
I'm going to go into shows2\nnow and do flask run.
23625
22:34:46,432 --> 22:34:48,172
And I'm going to go\nback to my page here.
23626
22:34:48,172 --> 22:34:52,522
The user interface is exactly the same,\n
23627
22:34:52,521 --> 22:34:56,551
Here's C, C-A, C-A-T, and so forth.
23628
22:34:56,551 --> 22:34:59,031
But let's see what's coming back now.
23629
22:34:59,032 --> 22:35:09,152
If I go to /search?q=cat, Enter, notice\n
23630
22:35:09,152 --> 22:35:12,622
But the fact that it's so\ncompact is actually a good thing.
23631
22:35:12,622 --> 22:35:16,372
This is actually going to-- let\n
23632
22:35:17,452 --> 22:35:20,752
This is what's called\nJavaScript Object Notation.
23633
22:35:20,751 --> 22:35:26,541
In JavaScript, an angle-- a square\n
23634
22:35:26,542 --> 22:35:32,542
In JavaScript, a curly bracket says\n
23635
22:35:36,782 --> 22:35:42,772
Yes, sort of recall that you can now\n
23636
22:35:42,771 --> 22:35:45,122
notation using colons like this.
23637
22:35:45,122 --> 22:35:48,711
So long story short, cryptic\nas this is to you and me
23638
22:35:48,711 --> 22:35:52,072
and not very human friendly,\nit's very machine friendly.
23639
22:35:52,072 --> 22:35:55,792
Because for every\ntitle in that database
23640
22:35:55,792 --> 22:36:01,562
I get back its ID and its title, its\n
23641
22:36:01,562 --> 22:36:05,632
And this is a very generic format that\n
23642
22:36:05,631 --> 22:36:07,101
interface, might return to you.
23643
22:36:07,101 --> 22:36:09,141
And this is how APIs, nowadays, work.
23644
22:36:09,142 --> 22:36:13,882
You get back very raw textual\ndata in this format, JSON format
23645
22:36:13,881 --> 22:36:17,601
and then you can write code that\nactually programmatically turns
23646
22:36:17,601 --> 22:36:22,292
that JSON data into any language\nyou want, for instance, HTML.
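As a small illustration of that idea, here is a sketch of parsing JSON text into JavaScript values. The sample text is made up for this example (the real response depends on the database), and it assumes only the id and title keys mentioned above.

```javascript
// Made-up sample of the kind of text a /search?q=cat request might return.
const body = '[{"id": 1, "title": "Cat Show"}, {"id": 2, "title": "Catalog"}]';

// Square brackets become an array; curly braces become objects with key: value pairs.
const shows = JSON.parse(body);

// Collect just the titles from the parsed objects.
const titles = shows.map((show) => show.title);
console.log(titles.join(', '));
```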
23647
22:36:22,292 --> 22:36:25,252
So here's the third and final\nversion of this program.
23648
22:36:29,631 --> 22:36:32,452
I then, when I get input,\ncall this function.
23649
22:36:32,452 --> 22:36:38,991
I fetch slash search q equals whatever\n
23650
22:36:38,991 --> 22:36:41,955
I then wait for the response,\nbut instead of getting text
23651
22:36:41,955 --> 22:36:44,872
I'm calling this other function that\n
23652
22:36:44,872 --> 22:36:47,092
called JSON, that just parses that.
23653
22:36:47,092 --> 22:36:51,232
It turns it into a dictionary for me,\n
23654
22:36:51,232 --> 22:36:53,572
and stores it in a\nvariable called shows.
23655
22:36:53,572 --> 22:36:57,831
And this is where you start to see the\n
23656
22:36:57,831 --> 22:37:01,161
Let me initialize a variable called\n
23657
22:37:01,161 --> 22:37:03,961
using single quotes, but I\ncould also use double quotes.
23658
22:37:03,961 --> 22:37:06,232
This is JavaScript syntax for a loop.
23659
22:37:06,232 --> 22:37:10,101
Let me iterate over every\nID in the shows list
23660
22:37:10,101 --> 22:37:13,881
that I just got back from the server,\nthat big chunk of JSON data.
23661
22:37:13,881 --> 22:37:19,191
Let me create a variable called\n
23662
22:37:19,191 --> 22:37:21,652
the title of the show at that ID.
23663
22:37:21,652 --> 22:37:23,661
But for reasons we'll\ncome back to, let me
23664
22:37:23,661 --> 22:37:25,581
replace a couple of scary characters.
23665
22:37:25,581 --> 22:37:31,671
Then let me dynamically add to this\n
23666
22:37:33,232 --> 22:37:35,991
And then very lastly,\nafter this for loop
23667
22:37:35,991 --> 22:37:42,081
let me update the UL's inner HTML to\n
23668
22:37:42,081 --> 22:37:44,661
So in short, don't worry\ntoo much about the syntax
23669
22:37:44,661 --> 22:37:47,119
because you won't need to use\nthis unless you start playing
23670
22:37:47,119 --> 22:37:49,231
with more advanced features quite soon.
23671
22:37:49,232 --> 22:37:51,202
But what we're doing is,\nwith JavaScript, we're
23672
22:37:51,202 --> 22:37:54,122
creating a bigger and bigger\nand bigger string of HTML
23673
22:37:54,122 --> 22:37:57,562
containing all of the open brackets,\n
23674
22:37:57,562 --> 22:38:00,922
but we're just grabbing the\nraw data from the server.
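The string-building step described above could be sketched as follows. The function names escapeHtml and buildListHtml are hypothetical, and the escaping shown (every "&" first, then every "<") is one possible variant, not necessarily the exact replacements used in the lecture.

```javascript
// Hypothetical helper: escape the characters that would otherwise be read as HTML.
// "&" is handled first so the "&" in "&lt;" is not escaped a second time.
function escapeHtml(text) {
    return text.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

// Build one <li> per show from the parsed JSON (an array of {id, title} objects).
function buildListHtml(shows) {
    let html = '';
    for (const id in shows) {
        html += '<li>' + escapeHtml(shows[id].title) + '</li>';
    }
    return html;
}
```

In the page, the result would then be assigned with something like document.querySelector('ul').innerHTML = buildListHtml(shows).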
23675
22:38:00,922 --> 22:38:02,932
And so in fact in\nproblem set nine, you're
23676
22:38:02,932 --> 22:38:06,652
going to use a real world third\n
23677
22:38:06,652 --> 22:38:08,092
interface, for which you sign up.
23678
22:38:08,092 --> 22:38:10,851
The data you're going to get\nback from that API is not
23679
22:38:10,851 --> 22:38:14,301
going to be show titles, but\nactually stock quotes and stocks
23680
22:38:14,301 --> 22:38:17,042
ticker symbols and the prices of last--
23681
22:38:17,042 --> 22:38:19,072
at which stocks were\nlast bought or sold
23682
22:38:19,072 --> 22:38:21,880
and you're going to get that\ndata back in JSON format.
23683
22:38:21,880 --> 22:38:24,922
And you're going to write a bit of\n
23684
22:38:24,922 --> 22:38:27,992
to the requisite HTML on the page.
23685
22:38:27,991 --> 22:38:31,281
So the final result here is\nliterally the kind of autocomplete
23686
22:38:31,282 --> 22:38:33,652
that you and I see and\ntake for granted every day
23687
22:38:33,652 --> 22:38:35,812
and that's ultimately how it works.
23688
22:38:35,812 --> 22:38:39,442
HTML and CSS are used to present\nthe data, your so-called view.
23689
22:38:39,441 --> 22:38:43,861
Python might be used to send or\n
23690
22:38:43,861 --> 22:38:46,131
And then lastly, JavaScript\nis going to be used
23691
22:38:46,131 --> 22:38:48,519
to make things dynamic and interactive.
23692
22:38:48,519 --> 22:38:50,601
So I know that's a whole\nbunch of building blocks
23693
22:38:50,601 --> 22:38:53,394
but the whole point of problem set\n
23694
22:38:53,394 --> 22:38:55,911
set the stage for hopefully a\nvery successful final project.
23695
22:38:55,911 --> 22:38:58,161
Why don't we go ahead and\nwrap up there, and we'll see
23696
22:38:58,161 --> 22:39:01,372
you one last time next week for emoji.