0
00:00:00,000 --> 00:00:02,982
1
00:00:02,982 --> 00:00:06,461
[MUSIC PLAYING]
2
00:00:06,461 --> 00:01:12,065
3
00:01:12,065 --> 00:01:13,210
DAVID MALAN: All right.
4
00:01:13,210 --> 00:01:18,700
This is CS50, and this is week six, wherein we finally transition
5
00:01:18,700 --> 00:01:20,935
from Scratch to C to, now, Python.
6
00:01:20,935 --> 00:01:22,735
And, indeed, this is going to be somewhat
7
00:01:22,735 --> 00:01:27,370
of a unique experience in that, just like a few weeks past--
8
00:01:27,370 --> 00:01:30,605
perhaps, for the first time-- and now, today, you're
9
00:01:30,605 --> 00:01:31,855
going to learn a new language.
10
00:01:31,855 --> 00:01:35,935
But the goal isn't just to throw another fire hose of content and syntax
11
00:01:35,935 --> 00:01:39,568
and whatnot at you, but rather, to really equip you all to actually teach
12
00:01:39,568 --> 00:01:41,110
yourself new languages in the future.
13
00:01:41,110 --> 00:01:43,902
And so, indeed, what we'll do today, what we'll do this coming week
14
00:01:43,902 --> 00:01:46,580
is prepare you to stand on your own.
15
00:01:46,580 --> 00:01:48,527
And once Python is passe and the world has
16
00:01:48,527 --> 00:01:50,860
moved on to some other language in some number of years,
17
00:01:50,860 --> 00:01:52,568
you'll be well equipped to figure out how
18
00:01:52,568 --> 00:01:55,027
to wrap your mind around some new syntax, some new language
19
00:01:55,027 --> 00:01:56,280
and solve problems, as well.
20
00:01:56,280 --> 00:01:59,320
Now, you recall, in week zero, this is where we started--
21
00:01:59,320 --> 00:02:01,390
just saying hello to the world.
22
00:02:01,390 --> 00:02:03,850
And that quickly escalated just a week later in C
23
00:02:03,850 --> 00:02:06,250
to be something much, much more cryptic.
24
00:02:06,250 --> 00:02:09,234
And if you've still struggled with some of the syntax,
25
00:02:09,234 --> 00:02:11,723
find yourself checking your notes or your previous code,
26
00:02:11,723 --> 00:02:12,640
that's totally normal.
27
00:02:12,640 --> 00:02:16,675
And that's one of the reasons why there are languages besides C
28
00:02:16,675 --> 00:02:18,970
out there-- among them, this language called Python.
29
00:02:18,970 --> 00:02:21,520
Humans over the decades have realized, gee,
30
00:02:21,520 --> 00:02:25,167
that wasn't necessarily the best design decision, or humans have realized, wow,
31
00:02:25,167 --> 00:02:25,750
you know what?
32
00:02:25,750 --> 00:02:30,160
Now that computers have gotten faster with more memory and faster CPUs,
33
00:02:30,160 --> 00:02:33,070
we can actually do more with our programming languages.
34
00:02:33,070 --> 00:02:36,985
So just as human languages evolve, so do actual programming languages.
35
00:02:36,985 --> 00:02:40,810
And even within a programming language, there's typically different versions.
36
00:02:40,810 --> 00:02:43,870
We, for instance, have been using version C11
37
00:02:43,870 --> 00:02:46,720
of C, which was updated in 2011.
38
00:02:46,720 --> 00:02:50,800
But Python itself continues to evolve, and it's now up to version 3-plus.
39
00:02:50,800 --> 00:02:53,680
And so there, too, these things will evolve in the coming days.
40
00:02:53,680 --> 00:02:56,560
Thankfully, what you're about to see is "Hello, World!"
41
00:02:56,560 --> 00:02:59,440
for the third time, but it's going to be literally this.
42
00:02:59,440 --> 00:03:04,930
None of the crazy syntax above or below, fewer semicolons, if any, fewer
43
00:03:04,930 --> 00:03:05,770
curly braces.
44
00:03:05,770 --> 00:03:08,630
And, really, a lot of the distractions get out of the way.
45
00:03:08,630 --> 00:03:11,200
So to get there, let's consider exactly how
46
00:03:11,200 --> 00:03:13,000
we've been programming up until now.
47
00:03:13,000 --> 00:03:16,300
So you write a program in C and you've got, hopefully,
48
00:03:16,300 --> 00:03:19,135
no syntax error, so you're ready to build it-- that is, compile it.
49
00:03:19,135 --> 00:03:22,135
And so, you've run make, and then, you've run the program, like ./hello.
50
00:03:22,135 --> 00:03:24,850
Or if you think back to week two, where we
51
00:03:24,850 --> 00:03:27,100
took a peek underneath the hood of what make is doing,
52
00:03:27,100 --> 00:03:29,710
it's really running the actual compiler--
53
00:03:29,710 --> 00:03:32,800
something called clang-- maybe with some command-line arguments creating
54
00:03:32,800 --> 00:03:34,090
a program called hello.
55
00:03:34,090 --> 00:03:36,128
And then, you could do ./hello.
56
00:03:36,128 --> 00:03:38,920
So, today, you're going to start doing something similar in spirit,
57
00:03:38,920 --> 00:03:40,270
but fewer steps.
58
00:03:40,270 --> 00:03:42,270
No longer will you have to compile your code
59
00:03:42,270 --> 00:03:45,520
and then run it, and then, maybe, fix or change it, and then compile your code
60
00:03:45,520 --> 00:03:47,470
and run it, and then repeat, repeat.
61
00:03:47,470 --> 00:03:50,200
The process of running your code is going
62
00:03:50,200 --> 00:03:52,542
to be distilled into just a single step.
63
00:03:52,542 --> 00:03:54,250
And the way to think of this, for now, is
64
00:03:54,250 --> 00:03:58,420
that, whereas C is frequently used as, indeed, a compiled language whereby
65
00:03:58,420 --> 00:04:01,045
you convert it first to 0s and 1s, Python's
66
00:04:01,045 --> 00:04:04,400
going to let you speed things up whereby you, the human programmer,
67
00:04:04,400 --> 00:04:05,740
don't have to compile it.
68
00:04:05,740 --> 00:04:09,400
You're just going to run what's called an interpreter-- which, by design,
69
00:04:09,400 --> 00:04:12,190
is named the exact same thing as the language itself--
70
00:04:12,190 --> 00:04:14,860
and by running this program installed in VS Code
71
00:04:14,860 --> 00:04:17,230
or, eventually, on your own Macs or PCs.
72
00:04:17,230 --> 00:04:20,320
This is just going to tell your computer to interpret this code
73
00:04:20,320 --> 00:04:23,800
and figure out how to get down to that lower level of 0s and 1s.
74
00:04:23,800 --> 00:04:26,626
But you don't have to compile the code yourself anymore.
75
00:04:26,626 --> 00:04:31,000
So with that said, let's consider what the code is going to look like,
76
00:04:31,000 --> 00:04:31,690
side by side.
77
00:04:31,690 --> 00:04:33,850
In fact, let's look back at some Scratch blocks,
78
00:04:33,850 --> 00:04:36,582
just like we did with C in week one, and do some side by sides.
79
00:04:36,582 --> 00:04:39,040
Because even though some of the syntax this week and beyond
80
00:04:39,040 --> 00:04:42,705
is going to be different, the ideas are truly going to be the same.
81
00:04:42,705 --> 00:04:45,565
There's not all that much intellectually new just yet.
82
00:04:45,565 --> 00:04:48,190
So whereas, in week zero, we might have said hello to the world
83
00:04:48,190 --> 00:04:51,220
with this purple puzzle piece, today, of course--
84
00:04:51,220 --> 00:04:56,080
or, rather, in week one, it looked like this in C. But today, moving forward,
85
00:04:56,080 --> 00:04:58,665
it's going to, quite simply, look like this instead.
86
00:04:58,665 --> 00:05:00,610
And if we go back and forth for just a moment,
87
00:05:00,610 --> 00:05:03,580
here, again, is the version in C, noticing
88
00:05:03,580 --> 00:05:05,500
the very C-like characteristics.
89
00:05:05,500 --> 00:05:09,200
And just at a glance here, in Python, I claim it's now this.
90
00:05:09,200 --> 00:05:13,190
What do you apparently need not worry about anymore?
91
00:05:13,190 --> 00:05:14,940
What's gone?
92
00:05:14,940 --> 00:05:15,990
So the semicolon is gone.
93
00:05:15,990 --> 00:05:19,073
And, indeed, you don't need those to finish most of your thoughts anymore.
94
00:05:19,073 --> 00:05:19,830
Anything else?
95
00:05:19,830 --> 00:05:20,860
AUDIENCE: Backslash n.
96
00:05:20,860 --> 00:05:22,690
DAVID MALAN: So the backslash n is absent.
97
00:05:22,690 --> 00:05:25,140
And that's curious because we're still going to get a new line,
98
00:05:25,140 --> 00:05:26,985
but we'll see that it's become the default.
99
00:05:26,985 --> 00:05:29,402
And this one's a little more subtle, but now, the function
100
00:05:29,402 --> 00:05:31,185
is called print instead of printf.
101
00:05:31,185 --> 00:05:33,610
So it's a little more familiar in that sense.
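A minimal sketch of the differences just listed, assuming nothing beyond Python's built-in print (the variable name message is illustrative, not from the lecture). There is no \n, no semicolon, and no f in the function name:

```python
# print appends a newline by default, so no "\n" is needed in the string.
message = "hello, world"
print(message)

# The default line ending can be changed with print's end keyword argument.
print(message, end="")
print()  # finish the line explicitly
```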
102
00:05:33,610 --> 00:05:34,110
All right.
103
00:05:34,110 --> 00:05:37,050
So when it comes to using libraries-- that
104
00:05:37,050 --> 00:05:39,300
is, code that other people have written-- in the past,
105
00:05:39,300 --> 00:05:43,350
we've done things like #include cs50.h to use CS50's own header
106
00:05:43,350 --> 00:05:47,730
file or standard I/O or standard lib or string or any number of other header
107
00:05:47,730 --> 00:05:49,440
files you have all used.
108
00:05:49,440 --> 00:05:52,635
Moving forward, we're going to give you, for this first week, a similar CS50
109
00:05:52,635 --> 00:05:53,280
library--
110
00:05:53,280 --> 00:05:55,920
just very short-term training wheels that we'll quickly
111
00:05:55,920 --> 00:05:59,370
take off because, in reality, it's a lot easier to do things in Python,
112
00:05:59,370 --> 00:06:00,267
as we'll see.
113
00:06:00,267 --> 00:06:02,100
But the syntax for this, now, is going to be
114
00:06:02,100 --> 00:06:05,165
to import the CS50 library in this way.
115
00:06:05,165 --> 00:06:08,452
And when we have, now, this ability, we can actually
116
00:06:08,452 --> 00:06:09,910
start writing some code right away.
117
00:06:09,910 --> 00:06:12,420
In fact, let me switch over to VS Code here.
118
00:06:12,420 --> 00:06:14,760
And just as in the past, I'll create a new file.
119
00:06:14,760 --> 00:06:17,230
But instead of creating something called .c,
120
00:06:17,230 --> 00:06:19,980
I'm going to go ahead and create my first program called hello.py,
121
00:06:19,980 --> 00:06:22,260
using code space hello dot py.
122
00:06:22,260 --> 00:06:24,000
That, of course, gives me this new tab.
123
00:06:24,000 --> 00:06:28,185
And let me actually, quite simply, do what I proposed-- print, quote unquote,
124
00:06:28,185 --> 00:06:33,780
"Hello, world" without the \n, without the semicolon, without the f in print.
125
00:06:33,780 --> 00:06:36,270
And now, let me go down to my terminal window.
126
00:06:36,270 --> 00:06:37,792
And I don't have to compile it.
127
00:06:37,792 --> 00:06:39,000
I don't have to do dot slash.
128
00:06:39,000 --> 00:06:43,140
I, instead, run a program called python, whose purpose in life
129
00:06:43,140 --> 00:06:46,180
is, now, to interpret my code top to bottom, left to right.
130
00:06:46,180 --> 00:06:50,130
And if I run python of hello.py, crossing my fingers, as always--
131
00:06:50,130 --> 00:06:51,000
voila.
132
00:06:51,000 --> 00:06:53,190
Now I have printed out "hello, world."
133
00:06:53,190 --> 00:06:56,460
So we seem to have gotten the new line for free, in the sense where
134
00:06:56,460 --> 00:06:57,735
it's automatically happening.
135
00:06:57,735 --> 00:06:59,880
The dollar sign isn't weirdly on the same line,
136
00:06:59,880 --> 00:07:02,220
like it once was in week one.
137
00:07:02,220 --> 00:07:04,493
But that's just a minor detail here.
138
00:07:04,493 --> 00:07:06,660
If we switch back to, now, some other capabilities--
139
00:07:06,660 --> 00:07:09,780
well, indeed, with the CS50 library, you can also not
140
00:07:09,780 --> 00:07:12,795
just import the library itself, but specific functions.
141
00:07:12,795 --> 00:07:14,850
And you'll see that, temporarily, we're going
142
00:07:14,850 --> 00:07:19,080
to give you a helper function called get_string, just like in C, that just
143
00:07:19,080 --> 00:07:20,872
makes it work exactly the same way as in C.
144
00:07:20,872 --> 00:07:22,580
And we'll see a couple of other functions
145
00:07:22,580 --> 00:07:24,660
that will just make life easier, initially.
146
00:07:24,660 --> 00:07:26,910
But, quickly, we will take those training wheels off
147
00:07:26,910 --> 00:07:29,295
so that nothing is, indeed, CS50-specific.
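The two import forms mentioned here can be sketched as follows. This is only an illustration: it uses the standard library's math module as a stand-in so that it runs without the CS50 library installed. The same two forms, import cs50 and from cs50 import get_string, are the ones the lecture describes.

```python
# Form 1: import the whole library, then prefix each function with its name.
import math
print(math.sqrt(16))  # 4.0

# Form 2: import one specific function and call it without the prefix.
from math import sqrt
print(sqrt(16))  # 4.0
```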
148
00:07:29,295 --> 00:07:29,970
All right.
149
00:07:29,970 --> 00:07:32,640
Well, how about functions, more generally, in Python?
150
00:07:32,640 --> 00:07:34,710
Let's do a whirlwind tour, if you will, much
151
00:07:34,710 --> 00:07:38,940
like we did in that first week of C, comparing one to the other.
152
00:07:38,940 --> 00:07:42,270
So back in our world of Scratch, one of the first programs we wrote
153
00:07:42,270 --> 00:07:45,360
was this one here, whereby we ask the human their name.
154
00:07:45,360 --> 00:07:49,110
We then used the return value that was automatically stored
155
00:07:49,110 --> 00:07:53,130
in this answer variable as a second argument
156
00:07:53,130 --> 00:07:56,265
to join so that we could say "Hello, David" or "Hello, Carter."
157
00:07:56,265 --> 00:07:59,340
So this was back in week zero.
158
00:07:59,340 --> 00:08:01,143
In week one, we converted it to this.
159
00:08:01,143 --> 00:08:03,810
And here is a perfect example of things escalating quickly.
160
00:08:03,810 --> 00:08:05,910
And, again, this is why we start in Scratch.
161
00:08:05,910 --> 00:08:09,060
There's just so much distraction here to achieve the same idea.
162
00:08:09,060 --> 00:08:12,010
But even today, we're going to chip away at some of that syntax.
163
00:08:12,010 --> 00:08:17,940
So, in C, we had to declare the variable as a string, here.
164
00:08:17,940 --> 00:08:19,935
We, of course, had the semicolon and more.
165
00:08:19,935 --> 00:08:22,650
Well, in Python, the comparable code, now,
166
00:08:22,650 --> 00:08:26,100
is going to look, more simply, like this.
167
00:08:26,100 --> 00:08:29,250
So semicolon is, again, gone on both lines, for that matter.
168
00:08:29,250 --> 00:08:30,450
So that's good.
169
00:08:30,450 --> 00:08:33,100
What else appears to have changed or disappeared?
170
00:08:33,100 --> 00:08:33,600
Yeah.
171
00:08:33,600 --> 00:08:35,340
AUDIENCE: [? Do you have ?] the same type of variable?
172
00:08:35,340 --> 00:08:36,090
DAVID MALAN: Yeah.
173
00:08:36,090 --> 00:08:39,419
So I didn't have to specifically say that answer is now a string.
174
00:08:39,419 --> 00:08:41,820
And, indeed, Python is dynamically typed.
175
00:08:41,820 --> 00:08:45,270
And, in fact, it will infer from context exactly what
176
00:08:45,270 --> 00:08:48,000
it is you are storing in that variable.
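A small sketch of what dynamic typing means in practice. The value "David" is a stand-in for whatever get_string would return, and first_type and second_type are illustrative names:

```python
# No type is declared; Python determines it from the value being stored.
answer = "David"
first_type = type(answer)

# The same name can later refer to a value of a different type.
answer = 50
second_type = type(answer)

print(first_type, second_type)  # <class 'str'> <class 'int'>
```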
177
00:08:48,000 --> 00:08:50,775
Other details that seem a little bit different?
178
00:08:50,775 --> 00:08:53,640
179
00:08:53,640 --> 00:08:54,607
A little bit different.
180
00:08:54,607 --> 00:08:55,940
What else jumps out at you here?
181
00:08:55,940 --> 00:08:56,482
I'll go back.
182
00:08:56,482 --> 00:08:58,690
This was the C version.
183
00:08:58,690 --> 00:09:01,570
And maybe focus, now, on the second line because we've rather
184
00:09:01,570 --> 00:09:02,740
exhausted the first.
185
00:09:02,740 --> 00:09:04,690
Here's, now, the Python version.
186
00:09:04,690 --> 00:09:05,720
What's different here?
187
00:09:05,720 --> 00:09:06,220
Yeah?
188
00:09:06,220 --> 00:09:08,845
AUDIENCE: You don't need to worry about %s or percent anything.
189
00:09:08,845 --> 00:09:10,930
You just have the variable after [? them. ?]
190
00:09:10,930 --> 00:09:11,680
DAVID MALAN: Yeah.
191
00:09:11,680 --> 00:09:12,820
There's no %s anymore.
192
00:09:12,820 --> 00:09:16,480
There's no second argument, at the moment, per se, to print.
193
00:09:16,480 --> 00:09:17,818
Now, it is still a little weird.
194
00:09:17,818 --> 00:09:20,485
It's as though I've deployed some addition here, arithmetically.
195
00:09:20,485 --> 00:09:21,860
But that's not the case.
196
00:09:21,860 --> 00:09:23,230
Some of you have programmed before.
197
00:09:23,230 --> 00:09:27,377
And plus, some of you might know, means what in this context?
198
00:09:27,377 --> 00:09:29,960
So to combine or, more technically-- anyone know the buzzword?
199
00:09:29,960 --> 00:09:30,390
Yeah.
200
00:09:30,390 --> 00:09:31,040
AUDIENCE: Concatenate.
201
00:09:31,040 --> 00:09:32,460
DAVID MALAN: To concatenate.
202
00:09:32,460 --> 00:09:35,753
So to concatenate is the fancy way of what Scratch calls joining,
203
00:09:35,753 --> 00:09:38,420
which is to take one string on the left, one string on the right
204
00:09:38,420 --> 00:09:40,100
and to join them together.
205
00:09:40,100 --> 00:09:41,880
To glue them together, if you will.
206
00:09:41,880 --> 00:09:43,080
So this is not addition.
207
00:09:43,080 --> 00:09:45,080
It would be if it were numbers involved instead.
208
00:09:45,080 --> 00:09:46,413
But because we've got a string--
209
00:09:46,413 --> 00:09:49,430
Hello comma-- and another string implicitly in this variable
210
00:09:49,430 --> 00:09:53,540
based on what the human typed in in response to this get_string function.
211
00:09:53,540 --> 00:09:58,130
That's going to concatenate Hello comma space, and then, David or Carter
212
00:09:58,130 --> 00:09:59,637
or whatever the human has typed in.
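The concatenation described above can be sketched like this, with a hard-coded name standing in for the get_string result (an assumption made so the example runs without input):

```python
answer = "David"  # stand-in for what the human typed in

# With two strings, + concatenates (joins) them.
greeting = "hello, " + answer
print(greeting)  # hello, David

# With two numbers, the same operator performs ordinary addition.
print(1 + 2)  # 3
```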
213
00:09:59,637 --> 00:10:02,720
But it turns out, there's going to be different ways to do this in Python.
214
00:10:02,720 --> 00:10:04,387
And we'll show you a few different ones.
215
00:10:04,387 --> 00:10:06,380
And here, too, try not to get too hung up
216
00:10:06,380 --> 00:10:09,255
on or frustrated by all of the different ways you can solve problems.
217
00:10:09,255 --> 00:10:12,130
Odds are, you're going to be picking up tips and techniques for years
218
00:10:12,130 --> 00:10:14,100
to come if you continue programming.
219
00:10:14,100 --> 00:10:16,710
So let's just give you a few of the possible ways.
220
00:10:16,710 --> 00:10:20,900
So here's a second way you could print out hello comma David or hello comma
221
00:10:20,900 --> 00:10:21,680
Carter.
222
00:10:21,680 --> 00:10:22,655
But what has changed?
223
00:10:22,655 --> 00:10:26,030
In the previous version, I used concatenation explicitly.
224
00:10:26,030 --> 00:10:28,445
And the space here is important, grammatically,
225
00:10:28,445 --> 00:10:30,485
just so we get that in the final phrase.
226
00:10:30,485 --> 00:10:33,410
Now, I'm proposing to get rid of that space
227
00:10:33,410 --> 00:10:36,985
to add a comma outside of the double quotes, as well.
228
00:10:36,985 --> 00:10:39,020
But if you think back to C, this probably
229
00:10:39,020 --> 00:10:42,620
just means that print, similar in spirit to printf,
230
00:10:42,620 --> 00:10:45,200
can take not just one argument, but even two.
231
00:10:45,200 --> 00:10:47,510
And in fact, because of this comma in the middle that's
232
00:10:47,510 --> 00:10:50,390
outside of the double quotes, it's hello comma,
233
00:10:50,390 --> 00:10:52,655
and then, it will be automatically concatenated
234
00:10:52,655 --> 00:10:56,420
with-- even without using the plus, to whatever the value of answer is.
235
00:10:56,420 --> 00:10:59,630
And by default, just for grammatical prettiness,
236
00:10:59,630 --> 00:11:01,850
the print function always gives you a space
237
00:11:01,850 --> 00:11:05,120
for free in between each of the multiple arguments you pass in.
238
00:11:05,120 --> 00:11:07,290
We'll see how you can override that down the line.
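A brief sketch of print with more than one argument, again with a hard-coded name in place of get_string. The sep keyword argument shown on the last line is one documented way to override the default space:

```python
answer = "David"  # stand-in for what the human typed in

# print separates its arguments with a single space by default.
print("hello,", answer)  # hello, David

# Passing sep replaces that default separator, here with nothing at all.
print("hello,", answer, sep="")  # hello,David
```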
239
00:11:07,290 --> 00:11:09,248
But, for now, that's just another way to do it.
240
00:11:09,248 --> 00:11:12,680
Now, perhaps the better, if slightly cryptic way to do this--
241
00:11:12,680 --> 00:11:14,420
or just the increasingly common way--
242
00:11:14,420 --> 00:11:18,290
is, probably, the third version, which looks a little weird, too.
243
00:11:18,290 --> 00:11:20,555
And, probably, the weirdness jumps out.
244
00:11:20,555 --> 00:11:24,060
We've suddenly introduced these curly braces,
245
00:11:24,060 --> 00:11:25,518
which I promised were mostly gone.
246
00:11:25,518 --> 00:11:26,060
And they are.
247
00:11:26,060 --> 00:11:29,270
But inside of this string here, I've done
248
00:11:29,270 --> 00:11:31,520
a curly brace, which might mean what?
249
00:11:31,520 --> 00:11:32,918
Just intuitively.
250
00:11:32,918 --> 00:11:35,210
And here is an example of how you learn a new language.
251
00:11:35,210 --> 00:11:39,945
Just infer, from context, how Python probably works.
252
00:11:39,945 --> 00:11:40,820
What might this mean?
253
00:11:40,820 --> 00:11:41,320
Yeah?
254
00:11:41,320 --> 00:11:45,160
AUDIENCE: [INAUDIBLE]
255
00:11:45,160 --> 00:11:45,910
DAVID MALAN: Yeah.
256
00:11:45,910 --> 00:11:48,610
So this is an indication, because the curly braces--
257
00:11:48,610 --> 00:11:50,740
because this was the way Python was designed--
258
00:11:50,740 --> 00:11:55,340
that we want to plug in the value of answer, not literally A-N-S-W-E-R.
259
00:11:55,340 --> 00:11:59,688
And the fancy word here is that the answer variable will be interpolated--
260
00:11:59,688 --> 00:12:01,480
that is, substituted with its actual value.
261
00:12:01,480 --> 00:12:04,435
But, but, but-- and this is actually weird-looking;
262
00:12:04,435 --> 00:12:06,820
this was introduced a few years ago to Python.
263
00:12:06,820 --> 00:12:11,230
What else did I have to change to make these curly braces work, apparently?
264
00:12:11,230 --> 00:12:11,935
Yeah?
265
00:12:11,935 --> 00:12:13,510
AUDIENCE: Drop the f before the--
266
00:12:13,510 --> 00:12:14,260
DAVID MALAN: Yeah.
267
00:12:14,260 --> 00:12:15,160
There's this weird f.
268
00:12:15,160 --> 00:12:17,245
And so, it's like part of printf.
269
00:12:17,245 --> 00:12:20,950
But now, it's inside the parentheses there.
270
00:12:20,950 --> 00:12:22,945
This is just the way Python designed this.
271
00:12:22,945 --> 00:12:24,820
So a few years ago, when they introduced what
272
00:12:24,820 --> 00:12:30,070
are called format strings or f-strings, you literally prefix your quoted string
273
00:12:30,070 --> 00:12:32,080
with the letter f.
274
00:12:32,080 --> 00:12:34,570
And then, you can use trickery like this,
275
00:12:34,570 --> 00:12:36,640
like putting curly braces so that the value will
276
00:12:36,640 --> 00:12:38,170
be substituted automatically.
277
00:12:38,170 --> 00:12:41,530
If you forget the f, you're going to literally see hello comma curly
278
00:12:41,530 --> 00:12:43,330
brace answer closed curly brace.
279
00:12:43,330 --> 00:12:45,355
If you add the f, it's, indeed, interpolated.
280
00:12:45,355 --> 00:12:47,360
The value is plugged in.
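The difference the f makes can be sketched as follows. The names without_f and with_f are illustrative, and "David" stands in for the user's input:

```python
answer = "David"  # stand-in for what the human typed in

# Without the f prefix, the curly braces are ordinary characters.
without_f = "hello, {answer}"
print(without_f)  # hello, {answer}

# With the f prefix, the value of answer is interpolated.
with_f = f"hello, {answer}"
print(with_f)  # hello, David
```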
281
00:12:47,360 --> 00:12:47,860
All right.
282
00:12:47,860 --> 00:12:52,510
Questions on how we can just say hello to the world via Python, in this case.
283
00:12:52,510 --> 00:12:53,350
Yeah?
284
00:12:53,350 --> 00:12:55,280
AUDIENCE: If you do this without the f, what would happen?
285
00:12:55,280 --> 00:12:56,300
DAVID MALAN: If you do this without the--
286
00:12:56,300 --> 00:12:57,260
AUDIENCE: [? The f. ?]
287
00:12:57,260 --> 00:12:58,385
DAVID MALAN: Without the f?
288
00:12:58,385 --> 00:13:02,450
If you omit the f, you will literally see H-E-L-L-O comma curly brace
289
00:13:02,450 --> 00:13:04,730
A-N-S-W-E-R closed curly brace.
290
00:13:04,730 --> 00:13:05,930
So, in fact, let's do this.
291
00:13:05,930 --> 00:13:08,300
Let me go back to VS Code here, quickly.
292
00:13:08,300 --> 00:13:11,540
I've still got my file called hello.py open.
293
00:13:11,540 --> 00:13:14,210
And let me go ahead and change this ever so slightly.
294
00:13:14,210 --> 00:13:16,700
So I'm going to go ahead and--
295
00:13:16,700 --> 00:13:20,930
let's say from cs50 import get_string.
296
00:13:20,930 --> 00:13:23,615
And that's just the new syntax I propose using to import
297
00:13:23,615 --> 00:13:26,150
a function from someone else's library.
298
00:13:26,150 --> 00:13:30,593
I'm going to now go ahead and ask the question--
299
00:13:30,593 --> 00:13:33,260
let's go ahead and use get_string, storing the result in answer.
300
00:13:33,260 --> 00:13:37,480
So get_string, quote unquote, "What's your name?"
301
00:13:37,480 --> 00:13:41,090
And then, on this line, I'm going to deliberately make a mistake here,
302
00:13:41,090 --> 00:13:42,450
exactly to your question.
303
00:13:42,450 --> 00:13:46,820
Let me just say hello comma answer, and just this.
304
00:13:46,820 --> 00:13:48,980
Now, even though answer is a variable, Python's
305
00:13:48,980 --> 00:13:53,150
not going to be so presumptuous as to just plug in the value of a variable
306
00:13:53,150 --> 00:13:53,810
called answer.
307
00:13:53,810 --> 00:13:56,000
What it's going to do, of course, is--
308
00:13:56,000 --> 00:13:56,985
if I type in my name--
309
00:13:56,985 --> 00:13:57,485
whoops.
310
00:13:57,485 --> 00:13:58,880
I typed too fast.
311
00:13:58,880 --> 00:14:00,470
Let me go ahead and rerun that again.
312
00:14:00,470 --> 00:14:04,550
If I run python with hello.py, type in my name and hit Enter,
313
00:14:04,550 --> 00:14:06,035
I get hello comma answer.
314
00:14:06,035 --> 00:14:07,160
Well, let me do one better.
315
00:14:07,160 --> 00:14:10,680
Let me apply these curly braces as before.
316
00:14:10,680 --> 00:14:13,340
Let me rerun python of hello.py.
317
00:14:13,340 --> 00:14:14,060
What's your name?
318
00:14:14,060 --> 00:14:14,405
D-A-V-I-D.
319
00:14:14,405 --> 00:14:16,363
And here's, again, the answer to your question.
320
00:14:16,363 --> 00:14:18,780
Now, we get, literally, the curly braces.
321
00:14:18,780 --> 00:14:20,780
So the fix here, ultimately, is just going
322
00:14:20,780 --> 00:14:24,640
to be to add the f there, rerun my program again with David.
323
00:14:24,640 --> 00:14:26,482
And now, hello comma David.
324
00:14:26,482 --> 00:14:28,940
So this is, admittedly, a little more cryptic than the ones
325
00:14:28,940 --> 00:14:31,858
with the plus or the comma, but this is just increasingly common.
326
00:14:31,858 --> 00:14:33,650
Why? Because you can read it left to right.
327
00:14:33,650 --> 00:14:34,720
It's nice and convenient.
328
00:14:34,720 --> 00:14:36,125
It's less cryptic than the %s's.
329
00:14:36,125 --> 00:14:40,130
So it's a new and improved version, if you will, of printf in C,
330
00:14:40,130 --> 00:14:44,780
based on decades of experience of programmers doing things like this.
331
00:14:44,780 --> 00:14:49,540
Questions on printing in this way?
332
00:14:49,540 --> 00:14:52,780
We're now on our way to programming in Python.
333
00:14:52,780 --> 00:14:53,280
Anything?
334
00:14:53,280 --> 00:14:53,780
All right.
335
00:14:53,780 --> 00:14:56,825
Well, what more can we do with this language, here?
336
00:14:56,825 --> 00:15:00,000
Well, let me propose that we consider that we
337
00:15:00,000 --> 00:15:07,200
have, for instance, a few other features that we can add to the mix, as well--
338
00:15:07,200 --> 00:15:12,640
namely, let's say some data types, as well.
339
00:15:12,640 --> 00:15:15,600
So let me flip over here, back to the slides.
340
00:15:15,600 --> 00:15:18,318
And there's different data types in Python, as we'll soon see.
341
00:15:18,318 --> 00:15:19,485
But they're not as explicit.
342
00:15:19,485 --> 00:15:23,070
As we already saw, by using a string from get_string,
343
00:15:23,070 --> 00:15:25,050
you don't have to explicitly state what it is.
344
00:15:25,050 --> 00:15:29,130
But you saw-- recall, in C-- all of these various data types.
345
00:15:29,130 --> 00:15:33,720
And then, in Python, nicely enough, this list is about to get shorter.
346
00:15:33,720 --> 00:15:37,740
And so, here is our list in C. Here is an abbreviated list in Python.
347
00:15:37,740 --> 00:15:41,220
So we're still going to have strings, but they're going to be more succinctly
348
00:15:41,220 --> 00:15:45,032
called strs now, S-T-R. We're still going to have ints for integers.
349
00:15:45,032 --> 00:15:47,490
We're still going to have floats for floating point values.
350
00:15:47,490 --> 00:15:49,900
We're even going to have bools for true and false.
351
00:15:49,900 --> 00:15:53,550
But what's missing, now, from the list is long and floats.
352
00:15:53,550 --> 00:15:54,420
And why is that?
353
00:15:54,420 --> 00:15:56,220
Or rather, long and double.
354
00:15:56,220 --> 00:15:58,650
Well, recall that, in C, those used more bits.
355
00:15:58,650 --> 00:16:02,550
Well, in Python, the smaller data types, previously-- int and float,
356
00:16:02,550 --> 00:16:04,950
themselves-- just use more bits for you.
357
00:16:04,950 --> 00:16:08,010
And so, you don't need to distinguish between small and large.
358
00:16:08,010 --> 00:16:10,290
You just use one data type, and the language
359
00:16:10,290 --> 00:16:12,345
gives you a bigger range than before.
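As a rough illustration of that bigger range: Python's int grows as needed, so a value that would overflow any 64-bit integer type in C is fine. The float check at the end assumes the usual case where Python's float is implemented as a C double.

```python
import sys

# 2 ** 64 is one more than the largest 64-bit unsigned value.
big = 2 ** 64
print(big)      # 18446744073709551616
print(big + 1)  # 18446744073709551617

# Number of bits in a float's significand; 53 where float is an IEEE 754 double.
print(sys.float_info.mant_dig)
```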
360
00:16:12,345 --> 00:16:15,510
It turns out, though, there's going to be some other features, as well,
361
00:16:15,510 --> 00:16:17,610
of Python, and these data types-- one of which
362
00:16:17,610 --> 00:16:20,010
will be called range, another of which will be list.
363
00:16:20,010 --> 00:16:21,402
So gone will be arrays.
364
00:16:21,402 --> 00:16:23,610
We'll actually use something literally called a list.
365
00:16:23,610 --> 00:16:28,110
Tuples-- sort of x, y pairs for coordinates and things like that.
366
00:16:28,110 --> 00:16:31,260
Dicts for dictionaries-- so we'll have built-in capabilities
367
00:16:31,260 --> 00:16:34,270
for storing keys and values we'll see, and even a set.
368
00:16:34,270 --> 00:16:36,270
Mathematically, a set is a collection of values,
369
00:16:36,270 --> 00:16:38,790
but it automatically gets rid of duplicates for you.
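A quick sketch of the types just listed. All names and values here are made up for illustration:

```python
counts = range(3)                      # range: the values 0, 1, 2
names = ["Carter", "David", "Carter"]  # list: ordered, duplicates allowed
point = (3, 4)                         # tuple: a fixed x, y pair
lengths = {"Carter": 6, "David": 5}    # dict: keys mapped to values
unique = set(names)                    # set: duplicates removed

print(list(counts))             # [0, 1, 2]
print(point[0], point[1])       # 3 4
print(lengths["David"])         # 5
print(len(names), len(unique))  # 3 2
print(sorted(unique))           # ['Carter', 'David']
```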
370
00:16:38,790 --> 00:16:43,470
So all of these things, we could absolutely implement in C if we wanted.
371
00:16:43,470 --> 00:16:47,940
And, indeed, in problem set five, you've been implementing your very own spell
372
00:16:47,940 --> 00:16:50,400
checker using some form of hash table.
373
00:16:50,400 --> 00:16:54,060
Well, it turns out that, in Python, you can solve those same problems,
374
00:16:54,060 --> 00:16:56,070
but perhaps a little more readily.
375
00:16:56,070 --> 00:16:58,980
In fact, let me go back over here to VS Code,
376
00:16:58,980 --> 00:17:01,895
and let me propose that I do the following.
377
00:17:01,895 --> 00:17:06,210
Let me go ahead and create a file called dictionary.py.
378
00:17:06,210 --> 00:17:09,510
Let me propose that I try to implement, say-- problem set five--
379
00:17:09,510 --> 00:17:14,220
our spell checker in Python instead of C and achieve, ultimately,
380
00:17:14,220 --> 00:17:17,443
the same kind of behavior whereby I'll be
381
00:17:17,443 --> 00:17:19,235
able to spell check a whole bunch of words.
382
00:17:19,235 --> 00:17:21,480
So this is jumping the gun a little bit because you're
383
00:17:21,480 --> 00:17:23,897
about to see syntax we'll revisit over the course of today.
384
00:17:23,897 --> 00:17:26,580
But, for now, I've got a new file called dictionary.py.
385
00:17:26,580 --> 00:17:30,810
And let me begin to create some placeholders for functions.
386
00:17:30,810 --> 00:17:34,710
We'll see in just a bit that, in Python, you can define a function called check,
387
00:17:34,710 --> 00:17:38,000
and that check function can take a word as its input.
388
00:17:38,000 --> 00:17:40,292
And I'll come back to this in just a moment.
389
00:17:40,292 --> 00:17:42,000
In Python, I can define a second function
390
00:17:42,000 --> 00:17:44,865
like load, which itself will take a whole dictionary,
391
00:17:44,865 --> 00:17:47,010
just like in problem set five.
392
00:17:47,010 --> 00:17:51,010
And I'll go ahead and come back to the implementation of this.
393
00:17:51,010 --> 00:17:53,130
Meanwhile, we might similarly implement a function
394
00:17:53,130 --> 00:17:57,090
called size, which takes no arguments but, ultimately, is going to return
395
00:17:57,090 --> 00:17:59,100
the size of my dictionary of words.
396
00:17:59,100 --> 00:18:02,370
And then, lastly, for consistency with problem set five,
397
00:18:02,370 --> 00:18:05,130
we might define an unload function, whose purpose in life
398
00:18:05,130 --> 00:18:07,770
is to free any memory that you've been using, just
399
00:18:07,770 --> 00:18:09,390
to give it back to the computer.
400
00:18:09,390 --> 00:18:11,790
Now, odds are, whether you're still working on speller
401
00:18:11,790 --> 00:18:15,660
or have finished speller, you wrote a decent amount of lines of code.
402
00:18:15,660 --> 00:18:18,550
And indeed, it's been, by design, a challenge.
403
00:18:18,550 --> 00:18:22,620
But one of the reasons for these higher-level languages like Python
404
00:18:22,620 --> 00:18:25,680
is that you can stand on the shoulders of programmers before you
405
00:18:25,680 --> 00:18:28,703
and solve very common problems much more quickly.
406
00:18:28,703 --> 00:18:31,620
So that you can focus on building your new app or your web application
407
00:18:31,620 --> 00:18:34,690
or your own project to solve problems of interest to you.
408
00:18:34,690 --> 00:18:38,490
So at the risk of crushing some spirits, let
409
00:18:38,490 --> 00:18:42,540
me propose that, in Python, if you want a dictionary for something like a spell
410
00:18:42,540 --> 00:18:44,070
checker, well, that's fine.
411
00:18:44,070 --> 00:18:48,030
Go ahead and give yourself a variable, like words, to store all of those words
412
00:18:48,030 --> 00:18:52,410
and just assign it equal to a dictionary-- or dict, for short,
413
00:18:52,410 --> 00:18:53,220
in Python.
414
00:18:53,220 --> 00:18:55,140
That will give you a hash table.
415
00:18:55,140 --> 00:18:57,690
Now, it turns out, in speller, recall, you
416
00:18:57,690 --> 00:18:59,720
don't need to worry about words and definitions.
417
00:18:59,720 --> 00:19:01,763
It's just about spell-checking the words.
418
00:19:01,763 --> 00:19:03,930
So strictly speaking, we don't need keys and values.
419
00:19:03,930 --> 00:19:05,610
We just need keys.
420
00:19:05,610 --> 00:19:07,980
So I'm going to save myself a few more keystrokes
421
00:19:07,980 --> 00:19:11,055
by just saying that, technically, in Python, using a set suffices.
422
00:19:11,055 --> 00:19:13,770
Again, a set is just a collection of values with no duplicates.
423
00:19:13,770 --> 00:19:16,400
But they don't necessarily have keys and values.
424
00:19:16,400 --> 00:19:18,250
It's just one or the other.
425
00:19:18,250 --> 00:19:21,420
But now that I have-- on line one, I claim the equivalent, in Python,
426
00:19:21,420 --> 00:19:25,720
of a hash table, I can actually do something like this.
427
00:19:25,720 --> 00:19:28,890
Here's how I might implement the check function in Python.
428
00:19:28,890 --> 00:19:33,840
If the word passed into this function is in my variable called words,
429
00:19:33,840 --> 00:19:35,390
well, return True.
430
00:19:35,390 --> 00:19:39,360
Else, go ahead and return False.
431
00:19:39,360 --> 00:19:40,030
Done.
432
00:19:40,030 --> 00:19:40,530
No, wait.
433
00:19:40,530 --> 00:19:42,990
You're thinking, if anything at all, maybe
434
00:19:42,990 --> 00:19:46,507
we want to handle lowercase instead of just uppercase and lowercase.
435
00:19:46,507 --> 00:19:47,340
Well, you know what?
436
00:19:47,340 --> 00:19:49,725
In Python, if you want to force a whole word to lowercase,
437
00:19:49,725 --> 00:19:51,360
you don't have to iterate over it with a loop.
438
00:19:51,360 --> 00:19:54,190
You don't have to use any of those ctype functions or anything.
439
00:19:54,190 --> 00:19:56,947
Just say word.lower, and that will convert the whole thing
440
00:19:56,947 --> 00:19:58,780
to lowercase for parity with the dictionary.
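A minimal sketch of the check function as just described. The three sample words are placeholders standing in for the real dictionary, which would normally be filled in by load.

```python
# Placeholder contents; the real set would be filled from a dictionary file.
words = {"cat", "dog", "python"}


def check(word):
    """Return True if the lowercased word is in the set, else False."""
    if word.lower() in words:
        return True
    else:
        return False


print(check("Cat"))   # the lowercased form "cat" is in the set
print(check("caat"))  # not in the set
```

Because the expression `word.lower() in words` is already True or False, the body could also be the single line `return word.lower() in words`.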
441
00:19:58,780 --> 00:19:59,440
All right.
442
00:19:59,440 --> 00:20:02,185
How about something like the load function in Python?
443
00:20:02,185 --> 00:20:06,130
Well, in Python, you can open files just like in C. For instance, in Python, I
444
00:20:06,130 --> 00:20:09,940
might do open, the dictionary argument in read mode,
445
00:20:09,940 --> 00:20:11,798
just like fopen in C.
446
00:20:11,798 --> 00:20:13,090
I might do something like this.
447
00:20:13,090 --> 00:20:20,230
For each line in that file, let me go ahead and add, to my words variable,
448
00:20:20,230 --> 00:20:21,430
that line.
449
00:20:21,430 --> 00:20:24,790
And then, let me go ahead and close that file.
450
00:20:24,790 --> 00:20:26,320
And I think I'm done.
451
00:20:26,320 --> 00:20:28,457
I'm just going to go ahead and return True,
452
00:20:28,457 --> 00:20:30,040
just because I think I'm already done.
453
00:20:30,040 --> 00:20:32,350
Now, here, too, I could nitpick a little bit.
454
00:20:32,350 --> 00:20:35,680
Technically, if I'm reading in every line from the file,
455
00:20:35,680 --> 00:20:38,620
every line in the dictionary ends with, technically, a backslash n.
456
00:20:38,620 --> 00:20:41,140
But there's an easy way to get rid of that,
457
00:20:41,140 --> 00:20:43,360
just like you might see with an alternative syntax.
458
00:20:43,360 --> 00:20:45,060
What I'm actually going to do is this.
459
00:20:45,060 --> 00:20:49,060
Let me grab from the current line, the current word,
460
00:20:49,060 --> 00:20:51,940
by stripping off with right strip--
461
00:20:51,940 --> 00:20:53,935
rstrip; a function we'll, again, see--
462
00:20:53,935 --> 00:20:55,810
that just gets rid of the trailing new line--
463
00:20:55,810 --> 00:20:58,000
the backslash n at the end of that line.
464
00:20:58,000 --> 00:21:01,900
And what I really want to do, then, is add this word to that dictionary.
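A minimal sketch of the load function as narrated, assuming the dictionary is a plain-text file with one word per line. The file name passed in is up to the caller; nothing here is specific to problem set five's distribution code.

```python
words = set()


def load(dictionary):
    """Read one word per line from the named file into the words set."""
    file = open(dictionary, "r")
    for line in file:
        # rstrip removes trailing whitespace, including the newline.
        word = line.rstrip()
        words.add(word)
    file.close()
    return True
```

Calling `load("some_file.txt")` on a file containing `cat` and `dog` on separate lines would leave both words in the set, without their newlines.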
465
00:21:01,900 --> 00:21:05,780
Meanwhile, if I want to figure out what the size is of my dictionary, well--
466
00:21:05,780 --> 00:21:08,890
in C, you're probably writing code to iterate over all of those lines,
467
00:21:08,890 --> 00:21:12,040
and you're just going to count them up using a variable.
468
00:21:12,040 --> 00:21:13,060
Not so in Python.
469
00:21:13,060 --> 00:21:15,460
You can just return the length of those words.
470
00:21:15,460 --> 00:21:19,360
And better still, in Python, you don't have to manage your own memory.
471
00:21:19,360 --> 00:21:20,500
No more malloc.
472
00:21:20,500 --> 00:21:21,700
No more free.
473
00:21:21,700 --> 00:21:24,370
No more manual thinking about memory.
474
00:21:24,370 --> 00:21:27,310
The language just deals with all of that for you.
475
00:21:27,310 --> 00:21:28,030
So you know what?
476
00:21:28,030 --> 00:21:30,760
It suffices for me to just return True and claim
477
00:21:30,760 --> 00:21:33,640
that unloading is done for me.
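A minimal sketch of size and unload as described. The two words in the set are placeholders so the example runs on its own.

```python
# Placeholder contents; load would normally fill this set.
words = {"cat", "dog"}


def size():
    """Return how many words are currently in the set."""
    return len(words)


def unload():
    """There is nothing to free by hand, so just report success."""
    return True


print(size())
print(unload())
```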
478
00:21:33,640 --> 00:21:35,170
And that's it.
479
00:21:35,170 --> 00:21:37,840
Again, whether you're in the middle of or already finished,
480
00:21:37,840 --> 00:21:39,960
this might, perhaps, elicit some frustration,
481
00:21:39,960 --> 00:21:45,700
but also, enlightenment in that this is why higher-level languages exist.
482
00:21:45,700 --> 00:21:47,605
You can build on top of the same principles,
483
00:21:47,605 --> 00:21:50,170
the same ideas, with which you've been dealing,
484
00:21:50,170 --> 00:21:51,820
struggling even this past week.
485
00:21:51,820 --> 00:21:55,090
But you can now express yourself all the more succinctly.
486
00:21:55,090 --> 00:21:59,590
This one line implements a hash table for you, and all of this, now,
487
00:21:59,590 --> 00:22:03,250
just uses that hash table in a simpler way.
488
00:22:03,250 --> 00:22:05,980
Any questions, now, on this, keeping in mind
489
00:22:05,980 --> 00:22:08,830
that the point, nonetheless, of speller in p-set 5
490
00:22:08,830 --> 00:22:12,160
is to understand what's really going on underneath the hood
491
00:22:12,160 --> 00:22:14,860
and, better still, to notice this.
492
00:22:14,860 --> 00:22:18,010
This might seem all rather amazing, but let me go ahead and do this.
493
00:22:18,010 --> 00:22:21,100
I've actually got a couple of versions of speller written here,
494
00:22:21,100 --> 00:22:24,800
and I've got a version written in C that I won't show the source code for.
495
00:22:24,800 --> 00:22:28,990
But I'm going to go ahead and make that version of speller in C.
496
00:22:28,990 --> 00:22:32,470
And I'm going to go ahead here and, let's say, split
497
00:22:32,470 --> 00:22:34,270
my window here for just a moment.
498
00:22:34,270 --> 00:22:37,030
And I'm going to go into a Python version of speller,
499
00:22:37,030 --> 00:22:38,470
really, that I just wrote.
500
00:22:38,470 --> 00:22:42,820
And on the left-hand side here, let me go ahead and run speller--
501
00:22:42,820 --> 00:22:44,740
the version I compiled in C--
502
00:22:44,740 --> 00:22:47,890
using a big text like the Sherlock Holmes text,
503
00:22:47,890 --> 00:22:50,030
which has a whole lot of words in it.
504
00:22:50,030 --> 00:22:52,180
And on the right-hand side, let me run python
505
00:22:52,180 --> 00:22:55,510
of speller.py, which is a separate file I wrote in advance,
506
00:22:55,510 --> 00:22:57,430
just like we give you speller.c.
507
00:22:57,430 --> 00:23:00,790
And I'll, similarly, run this on the Sherlock Holmes text.
508
00:23:00,790 --> 00:23:05,020
And I'm going to do my best to hit Enter on the left and the right of my screen
509
00:23:05,020 --> 00:23:06,100
at the same time.
510
00:23:06,100 --> 00:23:08,770
But we should see, hopefully, the same list of misspelled words
511
00:23:08,770 --> 00:23:10,390
and the timings thereof.
512
00:23:10,390 --> 00:23:12,380
So here we go on the right.
513
00:23:12,380 --> 00:23:15,136
Here we go on the left.
514
00:23:15,136 --> 00:23:16,730
All right.
515
00:23:16,730 --> 00:23:18,680
A race to see which one wins here.
516
00:23:18,680 --> 00:23:19,820
C is on the left.
517
00:23:19,820 --> 00:23:21,680
Python is on the right.
518
00:23:21,680 --> 00:23:23,270
OK.
519
00:23:23,270 --> 00:23:25,530
Interesting.
520
00:23:25,530 --> 00:23:28,200
Hopefully, Python's close behind.
521
00:23:28,200 --> 00:23:30,330
Note that some of this is internet delay.
522
00:23:30,330 --> 00:23:33,360
And so, it might not necessarily be a crazy number of seconds.
523
00:23:33,360 --> 00:23:37,050
But if we measure, at a low level,
524
00:23:37,050 --> 00:23:39,630
how much time the CPU spent executing my code,
525
00:23:39,630 --> 00:23:41,653
C took a total of 1.64 seconds.
526
00:23:41,653 --> 00:23:44,820
That was pretty fast, even though it took a moment more for all of the bytes
527
00:23:44,820 --> 00:23:46,590
to come over the internet.
528
00:23:46,590 --> 00:23:49,050
The Python version, though, took what?
529
00:23:49,050 --> 00:23:50,605
2.44 seconds.
530
00:23:50,605 --> 00:23:53,100
So what might the inference be?
531
00:23:53,100 --> 00:23:55,590
One, maybe I'm just better at programming in C
532
00:23:55,590 --> 00:23:59,400
than I am in Python, which is probably not true.
533
00:23:59,400 --> 00:24:03,210
But what else might you infer from this example?
534
00:24:03,210 --> 00:24:07,541
535
00:24:07,541 --> 00:24:11,176
Should we, maybe, give up on Python, stick with C?
536
00:24:11,176 --> 00:24:12,070
No?
537
00:24:12,070 --> 00:24:14,410
So what might be going on here?
538
00:24:14,410 --> 00:24:16,870
Why is the Python version, that I claim is correct--
539
00:24:16,870 --> 00:24:20,620
and I think the numbers all line up, just not the times.
540
00:24:20,620 --> 00:24:21,820
Where is the trade-off here?
541
00:24:21,820 --> 00:24:23,915
Well, here, again, is this design trade-off.
542
00:24:23,915 --> 00:24:24,415
Yeah?
543
00:24:24,415 --> 00:24:29,310
AUDIENCE: In order to save the programmer time, [INAUDIBLE]..
544
00:24:29,310 --> 00:24:30,690
DAVID MALAN: Yeah, exactly.
545
00:24:30,690 --> 00:24:32,910
In order to save the human programmer time,
546
00:24:32,910 --> 00:24:35,700
there's a lot more features built into Python-- more functions,
547
00:24:35,700 --> 00:24:38,920
more automatic management of memory and so forth--
548
00:24:38,920 --> 00:24:40,530
and you have to pay a price.
549
00:24:40,530 --> 00:24:43,193
Someone else's code is doing all of that work for you.
550
00:24:43,193 --> 00:24:45,360
But if they've written some number of lines of code,
551
00:24:45,360 --> 00:24:47,152
those are just more lines of code that need
552
00:24:47,152 --> 00:24:50,730
to be executed for you, whereas here, the computer is,
553
00:24:50,730 --> 00:24:54,615
at the risk of oversimplifying, only running my lines of code.
554
00:24:54,615 --> 00:24:55,865
So there's just less overhead.
555
00:24:55,865 --> 00:24:57,448
And so, this is a perpetual trade-off.
556
00:24:57,448 --> 00:25:00,990
Typically, when using a more user-friendly and more modern language,
557
00:25:00,990 --> 00:25:02,983
one of the prices you might pay is performance.
558
00:25:02,983 --> 00:25:06,150
Now, there's a lot of smart computer scientists in the world, though, trying
559
00:25:06,150 --> 00:25:08,440
to push back on those same trade-offs.
560
00:25:08,440 --> 00:25:11,220
And so, these interpreters, like the command I ran,
561
00:25:11,220 --> 00:25:15,390
python, technically can-- especially if you run a program again and again--
562
00:25:15,390 --> 00:25:19,350
actually, secretly, behind the scenes, compile your code for you,
563
00:25:19,350 --> 00:25:20,610
down to 0s and 1s.
564
00:25:20,610 --> 00:25:23,640
And then, the second, the third, the fourth time you run that program,
565
00:25:23,640 --> 00:25:25,010
it might very well be faster.
566
00:25:25,010 --> 00:25:27,150
So this is a bit of a head fake here, in that
567
00:25:27,150 --> 00:25:29,490
I'm running them once and only once.
568
00:25:29,490 --> 00:25:32,070
But we could get benefit over time if we kept
569
00:25:32,070 --> 00:25:34,183
running the Python version again and again
570
00:25:34,183 --> 00:25:35,850
and, perhaps, fine-tune the performance.
571
00:25:35,850 --> 00:25:38,017
But, in general, there's going to be this trade-off.
572
00:25:38,017 --> 00:25:40,560
Now, would you rather spend the 60 seconds
573
00:25:40,560 --> 00:25:43,620
I spent implementing a spell checker or the 6 hours,
574
00:25:43,620 --> 00:25:47,910
16 hours you might be or have spent implementing the same in C?
575
00:25:47,910 --> 00:25:48,720
Probably not.
576
00:25:48,720 --> 00:25:52,650
For productivity's sake, this is why we have these additional languages.
577
00:25:52,650 --> 00:25:57,300
Just for fun, let me flip over to another screen here and open up
578
00:25:57,300 --> 00:26:00,540
a version of Python that's actually-- in just a second--
579
00:26:00,540 --> 00:26:04,230
on my own Mac instead of the cloud so that
580
00:26:04,230 --> 00:26:06,490
I can actually do something with graphics.
581
00:26:06,490 --> 00:26:09,930
So, here, I just have a black and white terminal window on my very own Mac.
582
00:26:09,930 --> 00:26:12,450
And I've pre-installed Python, just like we've done so
583
00:26:12,450 --> 00:26:14,370
for VS Code in the cloud for you.
584
00:26:14,370 --> 00:26:19,320
Notice that I've got this photo of, perhaps, one of your favorite TV
585
00:26:19,320 --> 00:26:21,090
shows here, with the cast of The Office.
586
00:26:21,090 --> 00:26:24,630
Notice all of the faces in this image here.
587
00:26:24,630 --> 00:26:30,210
And let me propose that we try to find one face in the crowd, CSI-style,
588
00:26:30,210 --> 00:26:33,660
whereby we want to find, perhaps, the Scranton Strangler, so to speak.
589
00:26:33,660 --> 00:26:37,080
And so, here is an example of this guy's face.
590
00:26:37,080 --> 00:26:40,385
Now, how do we go about finding this specific face in the crowd?
591
00:26:40,385 --> 00:26:42,510
Well, our human eyes, obviously, can pluck him out,
592
00:26:42,510 --> 00:26:44,370
especially if you're familiar with the show.
593
00:26:44,370 --> 00:26:46,605
But let me go ahead and do this instead.
594
00:26:46,605 --> 00:26:50,730
Let me go ahead and propose that we run code
595
00:26:50,730 --> 00:26:52,800
that I already wrote in advance here.
596
00:26:52,800 --> 00:26:55,085
This is a Python program with more lines of code
597
00:26:55,085 --> 00:26:56,460
that we won't dwell on for today.
598
00:26:56,460 --> 00:26:58,800
But it's meant to motivate what we can do.
599
00:26:58,800 --> 00:27:03,150
From the Pillow library, implying a Python imaging library,
600
00:27:03,150 --> 00:27:07,033
I want to import some type of information,
601
00:27:07,033 --> 00:27:09,450
some feature called Image so that I can manipulate images,
602
00:27:09,450 --> 00:27:12,150
not unlike our own problem set four.
603
00:27:12,150 --> 00:27:13,330
And this is powerful.
604
00:27:13,330 --> 00:27:13,830
In
605
00:27:13,830 --> 00:27:14,330
Python,
606
00:27:14,330 --> 00:27:18,450
you can just [MIMICS EXPLOSION] import face recognition as a library
607
00:27:18,450 --> 00:27:19,950
that someone else wrote.
608
00:27:19,950 --> 00:27:22,770
From there, I'm going to create a variable called image.
609
00:27:22,770 --> 00:27:25,050
I'm going to use this face_recognition library's
610
00:27:25,050 --> 00:27:27,330
load_image_file function.
611
00:27:27,330 --> 00:27:30,030
It's a little verbose, but it's similar in spirit to fopen.
612
00:27:30,030 --> 00:27:32,100
And I'm going to open office.jpeg.
613
00:27:32,100 --> 00:27:36,990
I'm going to, then, declare a second variable called face_locations, plural,
614
00:27:36,990 --> 00:27:40,620
because what I'm expecting to get back, per the documentation for this library,
615
00:27:40,620 --> 00:27:44,650
is a list of all of the faces' locations that are detected.
616
00:27:44,650 --> 00:27:45,150
All right.
617
00:27:45,150 --> 00:27:48,660
Then, I'm going to iterate over each of those faces using a for loop,
618
00:27:48,660 --> 00:27:50,460
that we'll see in more detail.
619
00:27:50,460 --> 00:27:53,475
I'm going to, then, infer what the top, right, bottom, and left corners
620
00:27:53,475 --> 00:27:55,050
are of that face.
621
00:27:55,050 --> 00:28:00,300
And then, what I'm going to do here is show that face alone,
622
00:28:00,300 --> 00:28:03,040
if I've detected the face in question.
623
00:28:03,040 --> 00:28:08,760
So let me go ahead, here, and run detect.py.
624
00:28:08,760 --> 00:28:12,370
And we'll see not just the one face we're looking for.
625
00:28:12,370 --> 00:28:16,430
But if I run Python of detect.py, it's going to do all of the analysis.
626
00:28:16,430 --> 00:28:22,380
I'll see a big opening here, now, of all of the faces that
627
00:28:22,380 --> 00:28:24,870
were detected in this here program.
628
00:28:24,870 --> 00:28:26,870
[CHUCKLES] OK, some better than others, I guess,
629
00:28:26,870 --> 00:28:28,560
if you zoom in on catching someone.
630
00:28:28,560 --> 00:28:29,970
Typical Angela.
631
00:28:29,970 --> 00:28:32,700
If you want to, now, find that one face, I
632
00:28:32,700 --> 00:28:34,920
think we need to train the software a bit more.
633
00:28:34,920 --> 00:28:37,080
So let me actually open up a second program called
634
00:28:37,080 --> 00:28:39,270
recognize that's got more going on.
635
00:28:39,270 --> 00:28:41,370
But let me, with a wave of a hand, point out
636
00:28:41,370 --> 00:28:45,870
that I'm now loading not only the office.jpeg, but also toby.jpeg
637
00:28:45,870 --> 00:28:49,840
to train the algorithm to find that specific face.
638
00:28:49,840 --> 00:28:53,580
And so, now, if I run this second version-- recognize.py--
639
00:28:53,580 --> 00:28:56,310
with Python of recognize.py--
640
00:28:56,310 --> 00:28:59,160
hold my breath for just a moment; it's analyzing, presumably,
641
00:28:59,160 --> 00:29:00,420
all of the faces--
642
00:29:00,420 --> 00:29:02,070
you see the same, original photo.
643
00:29:02,070 --> 00:29:05,610
But do you see one such face highlighted here?
644
00:29:05,610 --> 00:29:09,420
This version of the code found Toby, highlighted him
645
00:29:09,420 --> 00:29:12,110
with this green box and, voila, we have face recognition.
646
00:29:12,110 --> 00:29:14,318
So for better or for worse, this is what's happening,
647
00:29:14,318 --> 00:29:15,967
increasingly societally, nowadays.
648
00:29:15,967 --> 00:29:18,300
And honestly, even though I didn't write the code live--
649
00:29:18,300 --> 00:29:21,420
because it's a good dozen or more lines of code-- it's not terribly many.
650
00:29:21,420 --> 00:29:24,450
And literally, all the authorities-- all we have to do
651
00:29:24,450 --> 00:29:27,960
is import face recognition and, voila, you have access.
652
00:29:27,960 --> 00:29:29,890
These technologies are here already.
653
00:29:29,890 --> 00:29:31,690
But let's consider, for just a moment--
654
00:29:31,690 --> 00:29:33,820
how did we find Toby?
655
00:29:33,820 --> 00:29:35,150
How might that library--
656
00:29:35,150 --> 00:29:37,900
even though we're not going to look at its implementation details,
657
00:29:37,900 --> 00:29:40,000
how does it find Toby and distinguish him
658
00:29:40,000 --> 00:29:43,960
from all of these other faces in the crowd?
659
00:29:43,960 --> 00:29:47,180
What might it be doing, intuitively?
660
00:29:47,180 --> 00:29:50,570
Think back even to p-set four, what you, yourselves, have access to, data-wise.
661
00:29:50,570 --> 00:29:51,083
Yeah?
662
00:29:51,083 --> 00:29:53,750
AUDIENCE: [? Since ?] we gave it an image of Toby's face before,
663
00:29:53,750 --> 00:29:59,010
it probably looks at are the pixels in one area the same as in another area
664
00:29:59,010 --> 00:30:00,720
and allots it to the same--
665
00:30:00,720 --> 00:30:02,998
from that reference image to this image.
666
00:30:02,998 --> 00:30:06,870
And then, it's going to say, hey, a lot of the similar consult ranges
667
00:30:06,870 --> 00:30:09,292
are here and here, so we can safely guess
668
00:30:09,292 --> 00:30:10,750
that this is the same [? person. ?]
669
00:30:10,750 --> 00:30:11,875
DAVID MALAN: Yeah, exactly.
670
00:30:11,875 --> 00:30:15,610
And to summarize for the camera here, we have trained the software, if you will,
671
00:30:15,610 --> 00:30:17,560
by giving it a photo of Toby's face.
672
00:30:17,560 --> 00:30:20,218
So, by looking for the same or, really, similar pixels--
673
00:30:20,218 --> 00:30:22,510
especially if it's a slightly different image of Toby--
674
00:30:22,510 --> 00:30:24,970
we can, perhaps, identify him in the crowd.
675
00:30:24,970 --> 00:30:26,412
And what really is a human face?
676
00:30:26,412 --> 00:30:28,120
Well, at the end of the day, the computer
677
00:30:28,120 --> 00:30:30,340
only knows it as a pattern of bits or, really,
678
00:30:30,340 --> 00:30:32,110
at a higher level, a pattern of pixels.
679
00:30:32,110 --> 00:30:35,170
So maybe a human face is, perhaps, best defined, in general,
680
00:30:35,170 --> 00:30:39,295
as two eyes and a nose and a mouth that, even though all of us look similar,
681
00:30:39,295 --> 00:30:43,268
structurally, odds are, the measurement between the eyes and the nose
682
00:30:43,268 --> 00:30:45,310
and the width of the mouth, the skin tone and all
683
00:30:45,310 --> 00:30:47,920
of these other physical characteristics are patterns
684
00:30:47,920 --> 00:30:51,280
that software could, perhaps, detect and then look, statistically,
685
00:30:51,280 --> 00:30:53,920
through the image, looking for the closest possible match
686
00:30:53,920 --> 00:30:57,422
to these various measurements, shapes, colors, sizes, and the like.
687
00:30:57,422 --> 00:30:59,130
And, indeed, that might be the intuition.
688
00:30:59,130 --> 00:31:03,520
But what's powerful here, again, is just how easy and readily available
689
00:31:03,520 --> 00:31:06,280
this technology now is.
690
00:31:06,280 --> 00:31:06,820
All right.
691
00:31:06,820 --> 00:31:10,605
So with that said, let's propose to consider what more we
692
00:31:10,605 --> 00:31:13,480
can do with Python itself, get back to the fundamentals, so that you,
693
00:31:13,480 --> 00:31:16,990
yourselves can start to implement something along those same lines.
694
00:31:16,990 --> 00:31:21,820
So besides having access to things like a get_string function,
695
00:31:21,820 --> 00:31:26,080
the CS50 library provides a few other things, as well-- namely, in C,
696
00:31:26,080 --> 00:31:27,040
we had these.
697
00:31:27,040 --> 00:31:29,052
But in Python, we're going to have fewer.
698
00:31:29,052 --> 00:31:32,260
In Python, our library, short-term, is going to give you not only get_string,
699
00:31:32,260 --> 00:31:33,740
but also get_int and get_float.
700
00:31:33,740 --> 00:31:34,240
Why?
701
00:31:34,240 --> 00:31:36,310
It's actually just annoying, as we'll soon
702
00:31:36,310 --> 00:31:39,190
see, to get back an integer or a float from a user
703
00:31:39,190 --> 00:31:44,890
and just make sure that it's an int or a float and not a word like cat or dog,
704
00:31:44,890 --> 00:31:47,170
or some string that's not actually a number.
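To see why this is annoying to do by hand, here is a rough sketch of what a get_int-style helper has to handle. This is not the CS50 library's actual implementation; the `read` parameter is a hypothetical addition so the function can be exercised without a keyboard.

```python
def get_int(prompt, read=input):
    """Keep prompting until the reply can be converted to an int."""
    while True:
        text = read(prompt)
        try:
            return int(text)
        except ValueError:
            # A reply such as "cat" is not a number, so ask again.
            continue
```

With the default `read=input`, calling `get_int("Number: ")` would re-prompt after `cat` or `dog` and return only once the user types something like `42`.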
705
00:31:47,170 --> 00:31:50,810
Well, we can import not just the specific function, get_string,
706
00:31:50,810 --> 00:31:53,920
but we can actually import all of these functions one at a time,
707
00:31:53,920 --> 00:31:55,840
like this, as we'll soon see.
708
00:31:55,840 --> 00:31:59,410
Or you can even, in Python, import specific functions from a file.
709
00:31:59,410 --> 00:32:04,300
One of you asked a while back, when you include something like cs50.h
710
00:32:04,300 --> 00:32:08,780
or stdio.h, you're actually getting all of the code in that file,
711
00:32:08,780 --> 00:32:12,010
which, potentially, can add bulk to your own program or time.
712
00:32:12,010 --> 00:32:15,040
In this case, when you import specific functions from Python,
713
00:32:15,040 --> 00:32:17,875
you can be a little more narrowly precise
714
00:32:17,875 --> 00:32:21,230
as to what it is you want to have access to.
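The two import styles can be sketched side by side. Since the cs50 library may not be installed everywhere, this example uses the standard math module as a stand-in. The same pattern would apply to something like `from cs50 import get_string`.

```python
# Import the whole module, then reach into it by name.
import math

print(math.sqrt(16))

# Import only the one function you want to have access to.
from math import sqrt

print(sqrt(16))
```

The second form narrows which names your program sees, so you can call `sqrt` directly without the `math.` prefix.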
715
00:32:21,230 --> 00:32:21,730
All right.
716
00:32:21,730 --> 00:32:23,890
So, with that said, let's go ahead and see
717
00:32:23,890 --> 00:32:25,900
what conditionals look like in Python.
718
00:32:25,900 --> 00:32:29,470
So on the left-hand side again, here, we'll see Scratch.
719
00:32:29,470 --> 00:32:33,190
So it's just a contrived example asking if x is less than y, then,
720
00:32:33,190 --> 00:32:35,350
say, x is less than y.
721
00:32:35,350 --> 00:32:37,540
In C, it looked like this.
722
00:32:37,540 --> 00:32:41,050
In Python, now, it's going to look like this instead.
723
00:32:41,050 --> 00:32:44,815
And here's before in C, and here's after.
724
00:32:44,815 --> 00:32:47,320
And just to call out a few of the obvious differences, what
725
00:32:47,320 --> 00:32:49,810
has changed, in Python, for conditionals, it would seem?
726
00:32:49,810 --> 00:32:53,013
727
00:32:53,013 --> 00:32:53,930
What's the difference?
728
00:32:53,930 --> 00:32:54,230
Yeah.
729
00:32:54,230 --> 00:32:55,920
AUDIENCE: There's a lack of curly braces.
730
00:32:55,920 --> 00:32:56,380
DAVID MALAN: Yeah.
731
00:32:56,380 --> 00:32:57,760
So there's no more curly braces.
732
00:32:57,760 --> 00:32:59,170
And, indeed, you don't use those.
733
00:32:59,170 --> 00:33:04,138
What appears to be taking their place, if you might infer?
734
00:33:04,138 --> 00:33:05,680
What seems to have taken their place?
735
00:33:05,680 --> 00:33:05,890
What do you think?
736
00:33:05,890 --> 00:33:06,765
AUDIENCE: [INAUDIBLE]
737
00:33:06,765 --> 00:33:09,560
DAVID MALAN: So the colon at the start of this line, here.
738
00:33:09,560 --> 00:33:13,760
But also even more important, now, is this indentation below it.
739
00:33:13,760 --> 00:33:16,160
So some of you, and we know this from office hours,
740
00:33:16,160 --> 00:33:19,380
have a habit of indenting everything on the left, right?
741
00:33:19,380 --> 00:33:21,200
And it's just this crazy mess to look at.
742
00:33:21,200 --> 00:33:23,000
Frustrating for you, surely.
743
00:33:23,000 --> 00:33:25,670
But C and Clang are pretty tolerant when it
744
00:33:25,670 --> 00:33:27,860
comes to things like white space in a program.
745
00:33:27,860 --> 00:33:29,030
Python, uh-uh.
746
00:33:29,030 --> 00:33:32,240
They realized, years ago, that-- let's help humans help themselves and just
747
00:33:32,240 --> 00:33:34,610
require standard indentation.
748
00:33:34,610 --> 00:33:36,620
So four spaces would be the norm here.
749
00:33:36,620 --> 00:33:38,870
But because it's indented below that colon, that,
750
00:33:38,870 --> 00:33:42,110
indeed, indicates that this, now, is part of that condition.
751
00:33:42,110 --> 00:33:46,340
Something else has gone missing, versus C, in this conditional.
752
00:33:46,340 --> 00:33:47,855
What else is a little simplified?
753
00:33:47,855 --> 00:33:49,660
AUDIENCE: [INAUDIBLE]
754
00:33:49,660 --> 00:33:50,410
DAVID MALAN: Yeah.
755
00:33:50,410 --> 00:33:51,368
So no more parentheses.
756
00:33:51,368 --> 00:33:53,650
You can still use them, especially when you need to,
757
00:33:53,650 --> 00:33:56,112
logically, to do order of operations, like in math.
758
00:33:56,112 --> 00:33:57,820
But in this case, if you just want to ask
759
00:33:57,820 --> 00:34:01,162
a simple question, like if x less than y, you can just do it like that.
760
00:34:01,162 --> 00:34:02,620
How about when you have an if else?
761
00:34:02,620 --> 00:34:05,170
Well, this is almost the same, here, with these same changes.
762
00:34:05,170 --> 00:34:06,800
In C, this looked like this.
763
00:34:06,800 --> 00:34:08,800
And it's starting to get a bit bulky-- at least,
764
00:34:08,800 --> 00:34:10,659
if we use our curly braces in this way.
765
00:34:10,659 --> 00:34:13,060
In Python, we can tighten things up further, even though,
766
00:34:13,060 --> 00:34:15,727
strictly speaking, in C, you don't always need the curly braces.
767
00:34:15,727 --> 00:34:18,280
But here, gone are the parentheses, again.
768
00:34:18,280 --> 00:34:20,020
Gone are the curly braces.
769
00:34:20,020 --> 00:34:23,380
Indentation is consistent, and we've just added another keyword,
770
00:34:23,380 --> 00:34:24,580
else, with the colon.
771
00:34:24,580 --> 00:34:26,325
But no more semicolons, as well.
772
00:34:26,325 --> 00:34:30,010
How about something larger, like this, an if, else if, else?
773
00:34:30,010 --> 00:34:31,960
This one's a little curious.
774
00:34:31,960 --> 00:34:35,290
But in C, it looked like this-- if, else if, else.
775
00:34:35,290 --> 00:34:38,143
In Python, it now looks like this.
776
00:34:38,143 --> 00:34:40,060
And there's, perhaps, one curiosity here that,
777
00:34:40,060 --> 00:34:41,977
honestly, all these years later, I still can't
778
00:34:41,977 --> 00:34:43,630
remember how to spell it half the time.
779
00:34:43,630 --> 00:34:46,900
What's weird about this?
780
00:34:46,900 --> 00:34:50,415
What do you spot as different?
781
00:34:50,415 --> 00:34:51,230
Yeah, over here.
782
00:34:51,230 --> 00:34:53,520
AUDIENCE: [INAUDIBLE]
783
00:34:53,520 --> 00:34:54,270
DAVID MALAN: Yeah.
784
00:34:54,270 --> 00:34:56,260
Instead of else if, it's elif.
785
00:34:56,260 --> 00:34:56,760
Why?
786
00:34:56,760 --> 00:34:59,340
[SIGHS] Apparently, else space if was just too many
787
00:34:59,340 --> 00:35:02,250
keystrokes for humans to type, so they condensed it into this way.
788
00:35:02,250 --> 00:35:05,100
Probably means it's a little more distinguishable, too,
789
00:35:05,100 --> 00:35:07,200
for the computer between the if and the else, too.
790
00:35:07,200 --> 00:35:08,700
But just something to remember, now.
791
00:35:08,700 --> 00:35:10,620
It's, indeed, elif and not else if.
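The conditional forms discussed so far can be sketched in one place. The function name compare is only for illustration; the slides use bare if statements with print.

```python
def compare(x, y):
    """Describe how x relates to y using if, elif, and else."""
    # No parentheses or curly braces: the colon and indentation mark each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```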
792
00:35:10,620 --> 00:35:11,123
All right.
793
00:35:11,123 --> 00:35:12,540
So what about variables in Python?
794
00:35:12,540 --> 00:35:16,590
I've used a couple of them already, but let's
795
00:35:16,590 --> 00:35:19,533
distill exactly how you define and declare these things, as well.
796
00:35:19,533 --> 00:35:22,200
So, in Scratch, if we wanted to create a variable called counter
797
00:35:22,200 --> 00:35:25,185
and set it equal, initially, to 0, we would do something
798
00:35:25,185 --> 00:35:28,680
like this. In C, we'd specify that it's an int, use the assignment operator,
799
00:35:28,680 --> 00:35:30,060
end the thought with a semicolon.
800
00:35:30,060 --> 00:35:32,310
In Python, it's just simpler.
801
00:35:32,310 --> 00:35:34,680
You name the variable, use the assignment operator,
802
00:35:34,680 --> 00:35:37,755
as before, you set it equal to some value, and that's it.
803
00:35:37,755 --> 00:35:38,880
You don't mention the type.
804
00:35:38,880 --> 00:35:41,250
You don't mention the semicolon or anything more.
805
00:35:41,250 --> 00:35:44,250
What if you want to change a variable, like counter,
806
00:35:44,250 --> 00:35:46,320
by 1-- that is, increment it by 1?
807
00:35:46,320 --> 00:35:47,800
You have a few different ways here.
808
00:35:47,800 --> 00:35:51,990
In C, we saw syntax like this, where you can say counter equals counter plus 1,
809
00:35:51,990 --> 00:35:54,900
which, again, feels illogical.
810
00:35:54,900 --> 00:35:56,610
How can counter equal counter plus 1?
811
00:35:56,610 --> 00:36:01,890
But, again, we read this code, really, right to left, updating its value by 1.
812
00:36:01,890 --> 00:36:03,550
In Python, it's almost the same.
813
00:36:03,550 --> 00:36:04,535
You just get rid of the semicolon.
814
00:36:04,535 --> 00:36:05,580
So that logic is there.
815
00:36:05,580 --> 00:36:08,070
But recall, in C, we could do something slightly different
816
00:36:08,070 --> 00:36:09,840
that we can also do in Python.
817
00:36:09,840 --> 00:36:12,060
In Python, you can also, more succinctly,
818
00:36:12,060 --> 00:36:15,420
do this-- plus equals, and then, whatever number you want to add.
819
00:36:15,420 --> 00:36:17,790
Or you can even change it to subtract, if you prefer.
820
00:36:17,790 --> 00:36:21,495
Sadly, gone is something you've probably typed a whole lot.
821
00:36:21,495 --> 00:36:23,940
What was the other way you can add 1?
822
00:36:23,940 --> 00:36:24,773
AUDIENCE: Plus plus?
823
00:36:24,773 --> 00:36:26,940
DAVID MALAN: Plus plus is no more, sadly, in Python.
824
00:36:26,940 --> 00:36:29,600
Just too many ways to do the same thing, so they got rid of it
825
00:36:29,600 --> 00:36:31,705
in favor of just this syntax, here.
826
00:36:31,705 --> 00:36:33,140
So keep that in mind, as well.
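A minimal sketch of the variable syntax described above, using the same variable name, counter, as the lecture. The sequence of updates is chosen only for illustration.

```python
counter = 0            # no type and no semicolon
counter = counter + 1  # read right to left: compute counter + 1, then store it
counter += 1           # shorthand for the line above
counter -= 1           # the same shorthand works for subtraction
print(counter)

# counter++ does not exist in Python; writing it is a syntax error.
```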
827
00:36:33,140 --> 00:36:36,500
What about loops, when you want to do something in Python again and again?
828
00:36:36,500 --> 00:36:39,380
Well, in Scratch, in week zero, here's how we meowed three times,
829
00:36:39,380 --> 00:36:40,700
specifically.
830
00:36:40,700 --> 00:36:42,650
In C, we had a couple of ways of doing this.
831
00:36:42,650 --> 00:36:46,460
This was the more mechanical approach, where you create a variable called i.
832
00:36:46,460 --> 00:36:47,780
You set it equal to 0.
833
00:36:47,780 --> 00:36:51,230
You then do while i is less than 3, the following.
834
00:36:51,230 --> 00:36:54,530
And then, you, yourself increment i again and again.
835
00:36:54,530 --> 00:36:57,920
Mechanical in the sense that you have to implement all of these gears
836
00:36:57,920 --> 00:37:01,130
and make them turn yourself, but this was a correct way to do that.
837
00:37:01,130 --> 00:37:03,740
In Python, we can still achieve the same idea,
838
00:37:03,740 --> 00:37:05,945
but we don't need the int keyword.
839
00:37:05,945 --> 00:37:07,445
We don't need any of the semicolons.
840
00:37:07,445 --> 00:37:08,695
We don't need the parentheses.
841
00:37:08,695 --> 00:37:10,310
We don't need the curly braces.
842
00:37:10,310 --> 00:37:12,200
We can't use the plus plus, so maybe that's
843
00:37:12,200 --> 00:37:14,300
a minor step backwards if you're a fan.
844
00:37:14,300 --> 00:37:17,930
But otherwise, the code, the logic is exactly the same.
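The Python version of that while loop might look like the following sketch. The string "meow" follows the Scratch example mentioned above. The exact code on the slide is not reproduced in this transcript, so treat this as an approximation.

```python
i = 0
while i < 3:        # no parentheses or curly braces; the colon and indentation mark the body
    print("meow")
    i += 1          # i++ is not available, so use += 1
```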
845
00:37:17,930 --> 00:37:20,390
But there's other ways to achieve this same idea.
846
00:37:20,390 --> 00:37:22,950
Recall that, in C, we could also do this.
847
00:37:22,950 --> 00:37:25,880
You could use a for loop, which does exactly the same thing.
848
00:37:25,880 --> 00:37:26,893
Both are correct.
849
00:37:26,893 --> 00:37:28,310
Both are, arguably, well-designed.
850
00:37:28,310 --> 00:37:32,000
It's to each their own when it comes to choosing between these.
851
00:37:32,000 --> 00:37:35,930
In Python, though, we're going to have to think through how to do this.
852
00:37:35,930 --> 00:37:41,300
So you don't do the same for loop as in C. The closest I could come up with
853
00:37:41,300 --> 00:37:44,270
is this, where you say for i--
854
00:37:44,270 --> 00:37:47,555
or whatever variable you want to do the counting-- in-- literally
855
00:37:47,555 --> 00:37:50,522
the preposition-- and then, you use square brackets here.
856
00:37:50,522 --> 00:37:52,730
And we've used square brackets before, in the context
857
00:37:52,730 --> 00:37:55,370
of arrays and things like that.
858
00:37:55,370 --> 00:38:00,080
And the 0, 1, 2 looks like an array, in some sense, even though we've also seen
859
00:38:00,080 --> 00:38:01,470
arrays with curly braces.
860
00:38:01,470 --> 00:38:03,950
But these square brackets, for now, denote a list.
861
00:38:03,950 --> 00:38:05,420
Python does not have arrays.
862
00:38:05,420 --> 00:38:08,600
An array is that contiguous chunk of memory, back to back to back,
863
00:38:08,600 --> 00:38:13,160
that you have to resize somehow by moving things around in memory,
864
00:38:13,160 --> 00:38:14,450
as per two weeks ago.
865
00:38:14,450 --> 00:38:19,175
In Python, though, you can just create a list like this using square brackets.
866
00:38:19,175 --> 00:38:22,700
And better still, as we'll see, you can add or even remove things
867
00:38:22,700 --> 00:38:24,920
from that list down the road.
868
00:38:24,920 --> 00:38:27,140
This, though, is not going to be very well-designed.
869
00:38:27,140 --> 00:38:28,610
This will work.
870
00:38:28,610 --> 00:38:32,030
This will iterate in Python three times.
871
00:38:32,030 --> 00:38:34,700
But what might rub you the wrong way about this design,
872
00:38:34,700 --> 00:38:36,860
even if you've never seen Python before?
873
00:38:36,860 --> 00:38:38,460
How does this example not end well?
874
00:38:38,460 --> 00:38:38,960
Yeah?
875
00:38:38,960 --> 00:38:41,810
AUDIENCE: Making a large list [INAUDIBLE].
876
00:38:41,810 --> 00:38:42,560
DAVID MALAN: Yeah.
877
00:38:42,560 --> 00:38:45,830
If you're making a large list, you have to type out each one of these numbers,
878
00:38:45,830 --> 00:38:50,178
like comma 3, comma 4, comma 5, comma, dot, dot, dot, 50 comma, dot, dot, dot,
879
00:38:50,178 --> 00:38:50,678
500.
880
00:38:50,678 --> 00:38:52,640
Like, surely, that's not the best solution,
881
00:38:52,640 --> 00:38:55,760
to have all of these numbers on the screen,
882
00:38:55,760 --> 00:38:57,140
wrapping endlessly on the screen.
883
00:38:57,140 --> 00:39:01,100
So, in Python, another way to do this would be to use a function
884
00:39:01,100 --> 00:39:04,160
called range, which, technically, is a data type unto itself.
885
00:39:04,160 --> 00:39:08,080
And this returns to you as many values as you ask for it.
886
00:39:08,080 --> 00:39:09,830
range takes some other arguments, as well.
887
00:39:09,830 --> 00:39:14,540
But the simplest use case here is, if you want back the numbers 0, 1, and 2--
888
00:39:14,540 --> 00:39:15,890
a total of three values--
889
00:39:15,890 --> 00:39:19,070
you say, hey, Python, please give me a range of three values.
890
00:39:19,070 --> 00:39:21,260
And by default, they start at 0 on up.
891
00:39:21,260 --> 00:39:24,320
But this is more efficient than it would be
892
00:39:24,320 --> 00:39:26,390
to hard code the entire list at once.
893
00:39:26,390 --> 00:39:29,150
And the best metaphor I could come up with is something like this.
894
00:39:29,150 --> 00:39:30,775
Here, for instance, is a deck of cards.
895
00:39:30,775 --> 00:39:34,430
This is normal, human size, and there's presumably 52 cards here.
896
00:39:34,430 --> 00:39:38,728
So writing out 0 through 51 in code would be a little ridiculous
897
00:39:38,728 --> 00:39:39,770
for the reasons you know.
898
00:39:39,770 --> 00:39:44,510
And it would just be very unwieldy and ugly and wrapping in all of that.
899
00:39:44,510 --> 00:39:48,500
It would be the virtual equivalent of me handing you all of these cards at once
900
00:39:48,500 --> 00:39:49,430
to just deal with.
901
00:39:49,430 --> 00:39:52,760
And, right, they're not that big, but it's a lot of cards to hold on to.
902
00:39:52,760 --> 00:39:55,760
It requires a lot of memory or physical storage, if you will.
903
00:39:55,760 --> 00:39:59,840
What range does, metaphorically, is, if you ask me for three cards,
904
00:39:59,840 --> 00:40:04,910
I hand you them one at a time, like this, so that, at any point in time,
905
00:40:04,910 --> 00:40:08,150
you only have one number in the computer's memory
906
00:40:08,150 --> 00:40:09,760
until you're handed the next.
907
00:40:09,760 --> 00:40:11,840
The alternative-- the previous version would
908
00:40:11,840 --> 00:40:15,360
be to hand you all three cards at once, or all 52 cards at once.
909
00:40:15,360 --> 00:40:17,840
But in this case, range is just way more efficient.
910
00:40:17,840 --> 00:40:19,700
You can do range of 1,000.
911
00:40:19,700 --> 00:40:22,940
That's not going to give you a list of 1,000 values all at once.
912
00:40:22,940 --> 00:40:25,910
It's going to give you 1,000 values one at a time,
913
00:40:25,910 --> 00:40:30,800
reducing memory significantly in the computer itself.
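To make the comparison concrete, here is a sketch of both approaches: looping over a hard-coded list and looping over range. The variable name values is an assumption added for illustration.

```python
# Hard-coded list: works, but every value must be typed out and stored at once.
for i in [0, 1, 2]:
    print("meow")

# range(3) hands back 0, 1, and 2 one at a time.
for i in range(3):
    print("meow")

# A range is its own type, not a list of 1,000 numbers held in memory.
values = range(1000)
print(type(values))
print(len(values), values[0], values[-1])
```

If a real list is ever needed, list(range(3)) builds one explicitly.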
914
00:40:30,800 --> 00:40:31,310
All right.
915
00:40:31,310 --> 00:40:34,745
So, besides this, what about doing something forever in Scratch?
916
00:40:34,745 --> 00:40:38,060
Well, we could do this, literally, with a forever block, which didn't quite
917
00:40:38,060 --> 00:40:42,590
exist in C. In C, we had to hack it together by saying while true--
918
00:40:42,590 --> 00:40:46,000
because true is, by definition, T-R-U-E, always true.
919
00:40:46,000 --> 00:40:50,420
So this just deliberately induces an infinite loop for us.
920
00:40:50,420 --> 00:40:53,375
In Python, the logic's going to be almost the same.
921
00:40:53,375 --> 00:40:55,250
And infinite loops in Python tend to actually
922
00:40:55,250 --> 00:40:58,760
be even more common because you can always break out of them, as you could
923
00:40:58,760 --> 00:41:02,280
in C. In Python, it looks like this.
924
00:41:02,280 --> 00:41:05,960
And this is slightly more subtle, but gone are the curly braces.
925
00:41:05,960 --> 00:41:07,370
Gone are the parentheses.
926
00:41:07,370 --> 00:41:10,400
But ever so slight difference, too?
927
00:41:10,400 --> 00:41:13,187
A capital T for True and it's going to be a capital F for False.
928
00:41:13,187 --> 00:41:14,270
Stupid little differences.
929
00:41:14,270 --> 00:41:16,440
Eventually, you're going to mistype one or the other.
930
00:41:16,440 --> 00:41:18,607
But these are the kinds of things to keep an eye out
931
00:41:18,607 --> 00:41:21,770
and to start recognizing in your mind's eye when you read code.
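A sketch of the infinite-loop pattern with a capital-T True. The counter and the break condition are assumptions added only so that the example terminates. A literal forever loop would omit them.

```python
count = 0
while True:             # capital T; lowercase true would raise a NameError
    print("meow")
    count += 1
    if count == 3:      # hypothetical stopping condition, not from the lecture
        break           # break exits the loop, just as in C
```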
932
00:41:21,770 --> 00:41:25,310
Questions, now, on any of these building blocks?
933
00:41:25,310 --> 00:41:26,075
Yeah?
934
00:41:26,075 --> 00:41:31,360
AUDIENCE: In the for loop, was i set to 0 once for [? every loop? ?]
935
00:41:31,360 --> 00:41:33,970
DAVID MALAN: In the for loop, was i--
936
00:41:33,970 --> 00:41:37,090
it was set to 0 on the first iteration, then 1 on the next,
937
00:41:37,090 --> 00:41:38,530
then 2 on the third.
938
00:41:38,530 --> 00:41:39,985
And the same thing for range.
939
00:41:39,985 --> 00:41:44,050
It just doesn't use up as much memory all at once.
940
00:41:44,050 --> 00:41:49,860
Other questions, now, on any of these building blocks of Python?
941
00:41:49,860 --> 00:41:50,400
All right.
942
00:41:50,400 --> 00:41:53,250
Well, let's go ahead and build something a little more than hello.
943
00:41:53,250 --> 00:41:56,415
Let me propose that, over here, we implement, maybe,
944
00:41:56,415 --> 00:41:58,200
the simplest of calculators here.
945
00:41:58,200 --> 00:42:02,145
So let me go back to VS Code here, open my terminal window
946
00:42:02,145 --> 00:42:06,885
and open up, say, a file called calculator.py.
947
00:42:06,885 --> 00:42:09,000
And in calculator.py, we'll have an opportunity
948
00:42:09,000 --> 00:42:11,340
to explore some of these building blocks,
949
00:42:11,340 --> 00:42:13,890
but we'll allow things to escalate pretty quickly
950
00:42:13,890 --> 00:42:17,225
to more interesting examples so that we can do the same thing, ultimately,
951
00:42:17,225 --> 00:42:17,760
as well.
952
00:42:17,760 --> 00:42:19,510
And, in fact, let me go ahead and do this.
953
00:42:19,510 --> 00:42:22,950
Moreover, I've brought some code with me in advance.
954
00:42:22,950 --> 00:42:25,725
For instance, something called calculator0.c,
955
00:42:25,725 --> 00:42:28,860
from the first week of C. And let me go ahead
956
00:42:28,860 --> 00:42:34,420
and split my window here, in fact, so that I can now do something like this.
957
00:42:34,420 --> 00:42:37,170
Let me move this over here, here.
958
00:42:37,170 --> 00:42:38,105
Calculator.py.
959
00:42:38,105 --> 00:42:40,920
So now, I have, on the left of my screen, calculator.c--
960
00:42:40,920 --> 00:42:43,620
or calculator0.c because that's the first version I
961
00:42:43,620 --> 00:42:45,690
made-- and calculator.py on the right.
962
00:42:45,690 --> 00:42:48,290
Let me go ahead and implement, really, the same idea here.
963
00:42:48,290 --> 00:42:51,675
So on the right-hand side, the analog of including cs50.h
964
00:42:51,675 --> 00:42:56,390
would be from cs50 import get_int if I want to, indeed, use this function.
965
00:42:56,390 --> 00:42:58,140
Now, I'm going to go ahead and give myself
966
00:42:58,140 --> 00:43:00,453
a variable x without defining its type.
967
00:43:00,453 --> 00:43:02,370
I'm going to use this get_int function and I'm
968
00:43:02,370 --> 00:43:05,302
going to prompt the user for x, just like in C.
969
00:43:05,302 --> 00:43:08,010
I'm, then, going to go ahead and prompt the user for another int,
970
00:43:08,010 --> 00:43:12,300
like y, here, just like in C. And at the very end, I'm going to go ahead
971
00:43:12,300 --> 00:43:14,640
and do print x plus y.
972
00:43:14,640 --> 00:43:15,690
And that's it.
973
00:43:15,690 --> 00:43:19,020
Now, granted, I have some comments in my C version of the code,
974
00:43:19,020 --> 00:43:21,090
just to remind you of what each line is doing.
975
00:43:21,090 --> 00:43:23,878
But I've still distilled this into six lines-- or, really, four
976
00:43:23,878 --> 00:43:25,170
if I get rid of the blank line.
977
00:43:25,170 --> 00:43:29,580
So it's already, perhaps, a bit tighter here.
978
00:43:29,580 --> 00:43:33,600
It's tighter because something really important, historically, is missing.
979
00:43:33,600 --> 00:43:38,240
What did I seem to omit altogether that we haven't really highlighted yet?
980
00:43:38,240 --> 00:43:39,136
Yeah?
981
00:43:39,136 --> 00:43:40,530
AUDIENCE: [INAUDIBLE]
982
00:43:40,530 --> 00:43:41,280
DAVID MALAN: Yeah.
983
00:43:41,280 --> 00:43:42,910
The main function is gone.
984
00:43:42,910 --> 00:43:45,330
And in fact, maybe you took for granted that it just
985
00:43:45,330 --> 00:43:47,580
worked a moment ago when I wrote hello, but I didn't
986
00:43:47,580 --> 00:43:49,273
have a main function in hello, either.
987
00:43:49,273 --> 00:43:52,440
And this, too, is a feature of Python and a lot of other languages, as well.
988
00:43:52,440 --> 00:43:55,320
Instead of having to adhere to these long-standing traditions,
989
00:43:55,320 --> 00:43:57,400
if you just want to write code and get something done, fine.
990
00:43:57,400 --> 00:43:59,925
Just write code and get something done without, necessarily,
991
00:43:59,925 --> 00:44:01,185
all of this same boilerplate.
992
00:44:01,185 --> 00:44:04,380
So whatever is in your Python file--
993
00:44:04,380 --> 00:44:06,510
left indented, if you will, by default--
994
00:44:06,510 --> 00:44:10,180
is just going to be the code that the interpreter runs, top to bottom,
995
00:44:10,180 --> 00:44:10,850
left to right.
996
00:44:10,850 --> 00:44:14,300
Well, let me go ahead, now, and run code like this.
997
00:44:14,300 --> 00:44:17,470
Let me go ahead and open back up my terminal window,
998
00:44:17,470 --> 00:44:19,140
run python of calculator.py.
999
00:44:19,140 --> 00:44:21,570
And I'll do x is 1, y is 2.
1000
00:44:21,570 --> 00:44:23,460
And as you might expect, it gives me 3.
1001
00:44:23,460 --> 00:44:24,570
Slight aesthetic bug.
1002
00:44:24,570 --> 00:44:26,590
I put my space in the wrong place here.
1003
00:44:26,590 --> 00:44:27,810
So that's a newbie mistake.
1004
00:44:27,810 --> 00:44:29,220
Let me fix that, aesthetically.
1005
00:44:29,220 --> 00:44:31,050
Let me rerun python of calculator.py.
1006
00:44:31,050 --> 00:44:31,680
Type in 1.
1007
00:44:31,680 --> 00:44:32,250
Type in 2.
1008
00:44:32,250 --> 00:44:36,280
And, voila, there is now my same version again.
1009
00:44:36,280 --> 00:44:39,585
But let me propose, now, that we get rid of this training wheel.
1010
00:44:39,585 --> 00:44:41,460
We don't want to keep taking one step forward
1011
00:44:41,460 --> 00:44:43,793
and then two steps back by adding these training wheels,
1012
00:44:43,793 --> 00:44:45,330
so let me instead do this.
1013
00:44:45,330 --> 00:44:49,590
In my version of calculator.py, suppose that we take away, already,
1014
00:44:49,590 --> 00:44:53,610
the training wheel that is the CS50 library here and let me,
1015
00:44:53,610 --> 00:44:56,910
instead, then, use just Python's built-in function called
1016
00:44:56,910 --> 00:44:59,020
input, which literally does just that.
1017
00:44:59,020 --> 00:45:03,600
It gets input from the user and it stores it, as before, in x and y.
1018
00:45:03,600 --> 00:45:04,950
So this is not CS50-specific.
1019
00:45:04,950 --> 00:45:07,155
This is real-world Python programming.
1020
00:45:07,155 --> 00:45:10,740
Well, let me go ahead and run, again, python of calculator.py.
1021
00:45:10,740 --> 00:45:16,530
And, of course, if x is 1 and y is 2, x plus y should, of course, still be 3.
1022
00:45:16,530 --> 00:45:19,306
1023
00:45:19,306 --> 00:45:24,285
It's apparently 12, according to Python, until CS50's library gets involved.
1024
00:45:24,285 --> 00:45:28,620
But does anyone want to infer what just went wrong?
1025
00:45:28,620 --> 00:45:29,160
Yeah?
1026
00:45:29,160 --> 00:45:32,925
AUDIENCE: We're always [INAUDIBLE].
1027
00:45:32,925 --> 00:45:33,800
DAVID MALAN: Exactly.
1028
00:45:33,800 --> 00:45:37,660
The input function, by design, always returns a string of text.
1029
00:45:37,660 --> 00:45:39,410
After all, that's what the human typed in.
1030
00:45:39,410 --> 00:45:42,620
And even though, yes, I typed the number keys on the keyboard,
1031
00:45:42,620 --> 00:45:44,600
it's still coming back as all text.
1032
00:45:44,600 --> 00:45:47,090
Now, maybe we should use like a get_int function.
1033
00:45:47,090 --> 00:45:48,575
Well, that doesn't exist in Python.
1034
00:45:48,575 --> 00:45:52,340
All you can do is get textual input-- a string from the user.
1035
00:45:52,340 --> 00:45:54,415
But we can convert one to the other.
1036
00:45:54,415 --> 00:45:58,610
And so, a fix for this so that we don't accidentally concatenate--
1037
00:45:58,610 --> 00:46:02,760
that is, join x plus y together-- would be to do something like this.
1038
00:46:02,760 --> 00:46:04,595
Let me go back to my Python code, here.
1039
00:46:04,595 --> 00:46:08,870
And whereas, in C, we could previously do typecasting--
1040
00:46:08,870 --> 00:46:11,060
we can convert one type to another--
1041
00:46:11,060 --> 00:46:14,420
that generally wasn't the case when you were doing something complex,
1042
00:46:14,420 --> 00:46:15,470
like a string to an int.
1043
00:46:15,470 --> 00:46:18,450
You could do a char to an int and vice versa.
1044
00:46:18,450 --> 00:46:22,370
But for a string, recall, there was a special function in C's standard library
1045
00:46:22,370 --> 00:46:25,100
called atoi, like ASCII to integer.
1046
00:46:25,100 --> 00:46:27,880
That's the closest analog, here.
1047
00:46:27,880 --> 00:46:29,630
And, in fact, the way to do this in Python
1048
00:46:29,630 --> 00:46:32,740
would be to use a function called int, which,
1049
00:46:32,740 --> 00:46:34,490
indeed, is the name of the data type, too,
1050
00:46:34,490 --> 00:46:36,380
even though I have not yet had to type it.
1051
00:46:36,380 --> 00:46:40,340
And I can convert the output of the input function
1052
00:46:40,340 --> 00:46:44,600
automatically from a string immediately to an int.
1053
00:46:44,600 --> 00:46:48,620
And now, if I go back to my terminal window, rerun python of calculator.py
1054
00:46:48,620 --> 00:46:52,770
with 1 and 2 for x and y, now, I'm back in business.
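The bug and its fix can be sketched without a keyboard by standing in for what input would return. The string literals "1" and "2" below are assumptions that play the role of the user's typing. In the real calculator.py, the lines would presumably read x = int(input("x: ")) and y = int(input("y: ")).

```python
# Stand-ins for what input() returns when the user types 1 and then 2:
# input() always returns a str.
x = "1"
y = "2"

print(x + y)            # + on two strings concatenates them
print(int(x) + int(y))  # converting each to int first gives arithmetic addition
```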
1055
00:46:52,770 --> 00:46:55,400
So that, then, is, for instance, what the CS50 library
1056
00:46:55,400 --> 00:46:59,420
does, if only temporarily this week: it just deals with the conversion for you.
1057
00:46:59,420 --> 00:47:03,500
And, in fact, bad things could happen if I type the wrong thing,
1058
00:47:03,500 --> 00:47:05,615
like dog or cat instead of a number.
1059
00:47:05,615 --> 00:47:08,400
But we'll cross that bridge in just a moment, as well.
1060
00:47:08,400 --> 00:47:08,900
All right.
1061
00:47:08,900 --> 00:47:11,990
What if we do something slightly different, now, with our calculator?
1062
00:47:11,990 --> 00:47:16,400
1063
00:47:16,400 --> 00:47:18,790
Instead of addition, let's do division instead.
1064
00:47:18,790 --> 00:47:23,990
So z equals x divided by y, thereby giving me a third variable z.
1065
00:47:23,990 --> 00:47:27,320
Let me go ahead and run python of calculator.py again.
1066
00:47:27,320 --> 00:47:29,120
I'll type in 1.
1067
00:47:29,120 --> 00:47:31,790
I'll type in 3 this time.
1068
00:47:31,790 --> 00:47:37,470
And what problem do you think we're about to see?
1069
00:47:37,470 --> 00:47:38,400
Or is it gone?
1070
00:47:38,400 --> 00:47:41,670
What happened when I did this in C, albeit with some slightly more
1071
00:47:41,670 --> 00:47:47,680
cryptic syntax, when I divided one number, like 1 divided by 3?
1072
00:47:47,680 --> 00:47:48,600
Anyone recall?
1073
00:47:48,600 --> 00:47:49,100
Yeah?
1074
00:47:49,100 --> 00:47:51,310
AUDIENCE: You would round to the nearest integer.
1075
00:47:51,310 --> 00:47:52,060
DAVID MALAN: Yeah.
1076
00:47:52,060 --> 00:47:55,030
So it would round down to the nearest integer,
1077
00:47:55,030 --> 00:47:57,560
whereby you experience truncation.
1078
00:47:57,560 --> 00:48:00,340
So if you take an integer like 1, you divide it
1079
00:48:00,340 --> 00:48:02,530
by another integer like 3, that technically
1080
00:48:02,530 --> 00:48:06,310
should be 0.33333, infinitely long.
1081
00:48:06,310 --> 00:48:10,297
But in C, recall, you truncate the value.
1082
00:48:10,297 --> 00:48:12,130
If you divide an int by an int, you get back
1083
00:48:12,130 --> 00:48:14,965
an int, which means you get only the integer part, which was the 0.
1084
00:48:14,965 --> 00:48:18,805
Now, Python actually handles this for us and avoids the truncation.
1085
00:48:18,805 --> 00:48:23,650
But it leaves us, still, with one other problem here, which is going to be,
1086
00:48:23,650 --> 00:48:27,453
for instance, not necessarily visible at a glance.
1087
00:48:27,453 --> 00:48:28,245
This looks correct.
1088
00:48:28,245 --> 00:48:31,780
This has solved the problem in C. So truncation does not happen.
1089
00:48:31,780 --> 00:48:36,010
The integers are automatically converted to a float-- a floating point value.
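A short sketch of the division behavior. The // operator on the last line is not mentioned in the lecture at this point. It is included only as an assumption-labeled aside showing how to get the whole-number result that C's int division gave for these values.

```python
x = 1
y = 3

z = x / y
print(z)         # a float; the fractional part is kept
print(type(z))

# Not from the lecture: floor division, which yields 0 for these values,
# matching what C's int / int produced.
print(x // y)
```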
1090
00:48:36,010 --> 00:48:41,970
But what other problem did we trip over, back in week one?
1091
00:48:41,970 --> 00:48:44,480
1092
00:48:44,480 --> 00:48:49,700
What else got a little dicey when dealing with simple arithmetic?
1093
00:48:49,700 --> 00:48:51,238
Anyone recall?
1094
00:48:51,238 --> 00:48:53,280
Well, the syntax in Python is a little different,
1095
00:48:53,280 --> 00:48:54,780
but let me go ahead and do this.
1096
00:48:54,780 --> 00:48:58,700
It turns out, in Python, if you want to see more significant digits than what
1097
00:48:58,700 --> 00:49:02,360
I'm seeing here by default, which is a dozen or so, let me go ahead
1098
00:49:02,360 --> 00:49:03,715
and print out z as follows.
1099
00:49:03,715 --> 00:49:07,310
Let me first print out a format string because I want to format z
1100
00:49:07,310 --> 00:49:08,780
in an interesting way.
1101
00:49:08,780 --> 00:49:11,330
And notice, this, by itself, makes no difference to the output.
1102
00:49:11,330 --> 00:49:14,630
This is just a format string that, for no compelling reason at the moment,
1103
00:49:14,630 --> 00:49:19,280
is interpolating z in those curly braces using an f-string or format string.
1104
00:49:19,280 --> 00:49:23,390
If I run this again with 1 and 3, we'll see, indeed, the exact same thing.
1105
00:49:23,390 --> 00:49:25,700
But when you use an f-string, you, indeed,
1106
00:49:25,700 --> 00:49:28,460
have the ability to format that string more precisely.
1107
00:49:28,460 --> 00:49:32,930
Just like with %f in Python, you could start to fine-tune how many significant
1108
00:49:32,930 --> 00:49:35,720
digits you see--
1109
00:49:35,720 --> 00:49:37,070
in C, rather.
1110
00:49:37,070 --> 00:49:40,190
In Python, you can do the same, but the syntax is a little different.
1111
00:49:40,190 --> 00:49:43,925
If you want the computer to interpolate z and show you
1112
00:49:43,925 --> 00:49:47,570
50 significant digits-- that is, 50 numbers
1113
00:49:47,570 --> 00:49:50,033
after the decimal point-- syntax is similar to C,
1114
00:49:50,033 --> 00:49:51,200
but it's a little different.
1115
00:49:51,200 --> 00:49:54,110
You literally put a colon after the variable's name.
1116
00:49:54,110 --> 00:49:59,090
dot 50 means show me the decimal point and, then, 50 digits to the right,
1117
00:49:59,090 --> 00:50:02,760
and the f just indicates please treat this as a floating point value.
1118
00:50:02,760 --> 00:50:05,540
So now, if I rerun python of calculator.py,
1119
00:50:05,540 --> 00:50:11,495
divide 1 by 3, unfortunately, Python has not solved all of the world's problems
1120
00:50:11,495 --> 00:50:12,710
for us.
1121
00:50:12,710 --> 00:50:15,545
This, again, was an example of floating point imprecision.
1122
00:50:15,545 --> 00:50:17,692
So that problem is still latent.
1123
00:50:17,692 --> 00:50:20,150
So just because the world has advanced, doesn't necessarily
1124
00:50:20,150 --> 00:50:22,317
mean that all of our problems from C have gone away.
1125
00:50:22,317 --> 00:50:26,418
There are solutions using third-party libraries for scientific calculations
1126
00:50:26,418 --> 00:50:26,960
and the like.
1127
00:50:26,960 --> 00:50:31,445
But out of the box, floating point imprecision is still an issue.
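The formatting described above can be sketched as follows. The variable name formatted is an assumption added so the result can be inspected.

```python
z = 1 / 3

print(f"{z}")        # default formatting
print(f"{z:.50f}")   # colon, then .50f: 50 digits after the decimal point, as a float

formatted = f"{z:.50f}"
# The digits stop being 3s partway through: floating point imprecision.
print(formatted == "0." + "3" * 50)
```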
1128
00:50:31,445 --> 00:50:35,780
Meanwhile, there was one other problem in C
1129
00:50:35,780 --> 00:50:39,890
that we ran into involving numbers, and that was this-- integer overflow.
1130
00:50:39,890 --> 00:50:41,930
Recall that an integer in C only took up,
1131
00:50:41,930 --> 00:50:45,140
what, 32 bits typically, which meant you could count as high as 4 billion
1132
00:50:45,140 --> 00:50:48,140
or, maybe, if you're doing positive and negatives, as high as 2 billion,
1133
00:50:48,140 --> 00:50:50,030
after which, weird things would happen.
1134
00:50:50,030 --> 00:50:54,798
The number would go to 0 or negative or it would overflow or wrap back around.
1135
00:50:54,798 --> 00:50:56,840
Well, wonderfully, in Python, they did, at least,
1136
00:50:56,840 --> 00:51:00,800
address this, whereby you can count as high as you want.
1137
00:51:00,800 --> 00:51:03,830
And Python will just use more and more and more and more
1138
00:51:03,830 --> 00:51:08,000
bits and bytes to store really big numbers so integer overflow is not
1139
00:51:08,000 --> 00:51:09,020
a thing.
1140
00:51:09,020 --> 00:51:13,820
With that said, Python is limited in how many digits it will show you
1141
00:51:13,820 --> 00:51:15,410
on the screen at once as a string.
1142
00:51:15,410 --> 00:51:18,560
But, mathematically, your math will be correct now.
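A quick sketch of counting past the 32-bit limit. The names big and huge are illustrative assumptions.

```python
big = 2 ** 31 - 1   # the largest value a signed 32-bit int can hold in C
print(big)
print(big + 1)      # keeps counting upward; no wraparound to a negative number

huge = 2 ** 100     # far beyond 32 or 64 bits
print(huge)
```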
1143
00:51:18,560 --> 00:51:21,860
So we've taken a couple of steps forward, one step sideways.
1144
00:51:21,860 --> 00:51:25,530
But, indeed, we have solved some of our problems here.
1145
00:51:25,530 --> 00:51:26,030
All right.
1146
00:51:26,030 --> 00:51:32,230
Questions, now, on any of these examples thus far?
1147
00:51:32,230 --> 00:51:34,400
Question?
1148
00:51:34,400 --> 00:51:35,000
All right.
1149
00:51:35,000 --> 00:51:40,250
Well, how about another problem that we encountered in C? Let's
1150
00:51:40,250 --> 00:51:41,720
revisit it here in Python, as well.
1151
00:51:41,720 --> 00:51:43,595
So let me go ahead and, on the left-hand side
1152
00:51:43,595 --> 00:51:54,020
here, let me open up a file called, say, compare3.c on the left,
1153
00:51:54,020 --> 00:51:57,640
and let me go ahead and create a new file on the right called compare.py.
1154
00:51:57,640 --> 00:52:00,070
Because recall that bad things happened when
1155
00:52:00,070 --> 00:52:03,580
we needed to compare two values in C. So on the left,
1156
00:52:03,580 --> 00:52:06,550
here, is a reminder of what we once did in C,
1157
00:52:06,550 --> 00:52:11,230
whereby, if we want to compare values, we can get an int in C, store it in x.
1158
00:52:11,230 --> 00:52:13,450
A get_int in C, store it in y.
1159
00:52:13,450 --> 00:52:16,180
We then have our familiar, conditional logic here,
1160
00:52:16,180 --> 00:52:19,210
just printing out if x is less than y or not.
1161
00:52:19,210 --> 00:52:23,080
Well, we can certainly do the same thing, ultimately, in Python
1162
00:52:23,080 --> 00:52:25,720
by using some fairly familiar syntax.
1163
00:52:25,720 --> 00:52:27,640
And let's just demonstrate this one quickly.
1164
00:52:27,640 --> 00:52:29,500
Let me go over here, too.
1165
00:52:29,500 --> 00:52:34,690
I'll do from cs50 import get_int, even though I could do this, instead,
1166
00:52:34,690 --> 00:52:36,700
with the input function itself.
1167
00:52:36,700 --> 00:52:39,700
x equals get_int, and I'll prompt the user for that.
1168
00:52:39,700 --> 00:52:42,880
y equals get_int, and I'll prompt the user for that.
1169
00:52:42,880 --> 00:52:45,910
After that, recall that I can say, without parentheses,
1170
00:52:45,910 --> 00:52:52,010
if x is less than y, then print out, without the f, "x is less than y."
1171
00:52:52,010 --> 00:52:58,570
Then, I can go ahead and say else if x is greater than y, I can print out,
1172
00:52:58,570 --> 00:53:01,270
quote unquote, "x is greater than y."
1173
00:53:01,270 --> 00:53:05,320
If you'd like to interject now, what did I screw up?
1174
00:53:05,320 --> 00:53:05,820
Anyone?
1175
00:53:05,820 --> 00:53:06,150
Yeah?
1176
00:53:06,150 --> 00:53:06,915
AUDIENCE: Elif.
1177
00:53:06,915 --> 00:53:07,957
DAVID MALAN: Elif, right?
1178
00:53:07,957 --> 00:53:13,965
So elif x is greater than y, else-- this part's the same-- print
1179
00:53:13,965 --> 00:53:18,000
"x is equal to y."
1180
00:53:18,000 --> 00:53:19,805
There's no new logic going on here.
1181
00:53:19,805 --> 00:53:21,960
But, at least syntactically, it's a little cleaner.
1182
00:53:21,960 --> 00:53:25,500
Indeed, this program is only 11 lines long, albeit without any comments.
1183
00:53:25,500 --> 00:53:27,765
Let me go ahead and run python of compare.py.
1184
00:53:27,765 --> 00:53:28,350
Let's see.
1185
00:53:28,350 --> 00:53:30,235
Is 1 less than 2?
1186
00:53:30,235 --> 00:53:30,735
Indeed.
1187
00:53:30,735 --> 00:53:32,070
Let's run it again.
1188
00:53:32,070 --> 00:53:33,330
Is 2 less than 1?
1189
00:53:33,330 --> 00:53:34,890
No, it's greater than.
1190
00:53:34,890 --> 00:53:37,740
And let's, lastly, type in 1 and 1 twice.
1191
00:53:37,740 --> 00:53:38,910
x is equal to y.
1192
00:53:38,910 --> 00:53:42,030
So we've got a pretty side-by-side, one-to-one conversion here.
1193
00:53:42,030 --> 00:53:44,190
Let's do something a little more interesting, then.
1194
00:53:44,190 --> 00:53:48,270
In C, how about I open, instead, something where we actually
1195
00:53:48,270 --> 00:53:49,310
compared for a purpose?
1196
00:53:49,310 --> 00:53:54,150
So if I open up, from earlier in the course--
1197
00:53:54,150 --> 00:54:00,320
how about agree.c, which prompts the user to agree to something or not?
1198
00:54:00,320 --> 00:54:03,860
And let me code up a new version here, called agree.py.
1199
00:54:03,860 --> 00:54:06,720
And I'll do this on the right-hand side, with agree.py.
1200
00:54:06,720 --> 00:54:08,830
But on agree.c on the left--
1201
00:54:08,830 --> 00:54:12,210
notice that this is how we did this yes-no thing in C--
1202
00:54:12,210 --> 00:54:16,590
we compared c, a character, equal to single quotes 'Y'
1203
00:54:16,590 --> 00:54:18,840
or equal to single quotes little 'y.'
1204
00:54:18,840 --> 00:54:20,430
And then, the same thing for n.
1205
00:54:20,430 --> 00:54:22,470
Now, in Python, this one is actually going
1206
00:54:22,470 --> 00:54:23,960
to be a little bit different, here.
1207
00:54:23,960 --> 00:54:27,310
Let me go ahead and, in the Python version of this,
1208
00:54:27,310 --> 00:54:29,640
let me do something like this.
1209
00:54:29,640 --> 00:54:31,258
We'll use get_string.
1210
00:54:31,258 --> 00:54:31,800
Actually, no.
1211
00:54:31,800 --> 00:54:33,217
We'll just use input in this case.
1212
00:54:33,217 --> 00:54:36,780
So let's do s equals input.
1213
00:54:36,780 --> 00:54:38,940
And we'll ask the user the same thing--
1214
00:54:38,940 --> 00:54:40,875
Do you agree, question mark.
1215
00:54:40,875 --> 00:54:46,110
Then, let's go ahead and say, if s equals equals--
1216
00:54:46,110 --> 00:54:48,940
how about Y?
1217
00:54:48,940 --> 00:54:49,740
Huh.
1218
00:54:49,740 --> 00:54:50,758
How do I do this?
1219
00:54:50,758 --> 00:54:51,550
Well, a few things.
1220
00:54:51,550 --> 00:54:54,660
Turns out, I'm going to do this-- or s equals equals little y.
1221
00:54:54,660 --> 00:54:57,210
Then, I'm going to go ahead and print out "Agreed."
1222
00:54:57,210 --> 00:55:03,390
And elif s equals equals capital N or s equals equals lowercase n,
1223
00:55:03,390 --> 00:55:05,520
I'm going to go ahead and print out "Not agreed."
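A sketch of roughly what agree.py looks like at this point. The function name respond is hypothetical, added only so the logic can be called with test values; the lecture version reads s with input and prints directly.

```python
def respond(s):
    """Return the reply for a one-character answer, or None if unrecognized."""
    # Python spells logical OR as the word "or" rather than ||
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    return None


# Interactive use, as in lecture, would look something like:
# s = input("Do you agree? ")
# print(respond(s))
```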
1224
00:55:05,520 --> 00:55:08,820
And I claim, for the moment, that this is identical, now,
1225
00:55:08,820 --> 00:55:13,760
to the program on the left in C. But what's different?
1226
00:55:13,760 --> 00:55:17,280
So we're still doing the same kind of logic, these equal equals
1227
00:55:17,280 --> 00:55:18,780
for comparing for equality.
1228
00:55:18,780 --> 00:55:21,922
But notice that, nicely enough, Python got rid of the two vertical bars,
1229
00:55:21,922 --> 00:55:23,505
and it's just literally the word "or."
1230
00:55:23,505 --> 00:55:27,933
If you recall seeing ampersand ampersand to express a logical and in C, [GRUNTS]
1231
00:55:27,933 --> 00:55:29,850
you can just write, literally, the word "and."
1232
00:55:29,850 --> 00:55:33,390
And so, here's a hint of why Python tends to be pretty popular.
1233
00:55:33,390 --> 00:55:35,640
People just like that it's a little closer to English.
1234
00:55:35,640 --> 00:55:38,520
There's a little less of the cryptic syntax here.
1235
00:55:38,520 --> 00:55:41,850
Now, this is correct, as this code will now work.
1236
00:55:41,850 --> 00:55:45,750
But I've also used double quotes instead of single quotes,
1237
00:55:45,750 --> 00:55:48,780
and I also omitted, a few minutes ago, from my list of data
1238
00:55:48,780 --> 00:55:51,180
types in Python the word "char."
1239
00:55:51,180 --> 00:55:53,430
In Python, there are no chars.
1240
00:55:53,430 --> 00:55:55,320
There are no individual characters.
1241
00:55:55,320 --> 00:55:58,830
If you want to manipulate an individual character, you use a string--
1242
00:55:58,830 --> 00:56:00,510
that is to say, a str--
1243
00:56:00,510 --> 00:56:01,680
of size 1.
1244
00:56:01,680 --> 00:56:04,930
Now, in Python, you can use single quotes or double quotes.
1245
00:56:04,930 --> 00:56:06,930
I'm deliberately using double quotes everywhere,
1246
00:56:06,930 --> 00:56:09,715
just for consistency with how we treat strings in C.
1247
00:56:09,715 --> 00:56:12,090
It's pretty common, though, to use single quotes instead,
1248
00:56:12,090 --> 00:56:14,190
if only because, on most keyboards, you don't
1249
00:56:14,190 --> 00:56:16,320
have to hold the Shift key anymore.
1250
00:56:16,320 --> 00:56:18,288
Humans have really started to optimize just how
1251
00:56:18,288 --> 00:56:19,830
quickly they want to be able to code.
1252
00:56:19,830 --> 00:56:22,110
So using a single quote tends to be pretty popular
1253
00:56:22,110 --> 00:56:24,270
in Python and other languages, as well.
1254
00:56:24,270 --> 00:56:29,520
They are fundamentally the same, single or double, unlike in C,
1255
00:56:29,520 --> 00:56:30,570
where they have meaning.
1256
00:56:30,570 --> 00:56:33,120
So this is correct, I claim.
1257
00:56:33,120 --> 00:56:34,830
And, in fact, let me run this real quick.
1258
00:56:34,830 --> 00:56:37,090
I'll open up my terminal window here.
1259
00:56:37,090 --> 00:56:40,230
Let me get rid of the version in C and run python of agree.py.
1260
00:56:40,230 --> 00:56:42,420
And I'll type in Y. OK.
1261
00:56:42,420 --> 00:56:44,220
I'll run it again and type in little y.
1262
00:56:44,220 --> 00:56:46,780
And I'll stipulate it's going to work for no, as well.
1263
00:56:46,780 --> 00:56:49,840
But this isn't necessarily the only way we can do this.
1264
00:56:49,840 --> 00:56:52,350
There are other ways to implement the same idea.
1265
00:56:52,350 --> 00:56:57,630
And in fact, I can go about doing this instead.
1266
00:56:57,630 --> 00:56:59,910
Let me go back up to my code here.
1267
00:56:59,910 --> 00:57:03,240
And we saw a hint of this earlier.
1268
00:57:03,240 --> 00:57:06,240
We know that lists exist in Python, and you can create them
1269
00:57:06,240 --> 00:57:08,040
just by using square brackets.
1270
00:57:08,040 --> 00:57:10,380
So what if I simplify the code a little bit and just
1271
00:57:10,380 --> 00:57:14,940
say if s is in the following list of values--
1272
00:57:14,940 --> 00:57:17,850
capital Y or lowercase y.
1273
00:57:17,850 --> 00:57:21,090
It's not all that different, logically, but it's a little tighter.
1274
00:57:21,090 --> 00:57:22,440
It's a little more compact.
1275
00:57:22,440 --> 00:57:29,040
So elif s is in capital N or lowercase n, I can express that same idea, too.
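The same idea using the in keyword and a list, sketched under the same assumption as before (respond is a hypothetical wrapper, not something from the lecture).

```python
def respond(s):
    """Return the reply for a one-character answer, or None if unrecognized."""
    # "in" checks whether s equals any element of the list
    if s in ["Y", "y"]:
        return "Agreed."
    elif s in ["N", "n"]:
        return "Not agreed."
    return None
```

The expression s in ["Y", "y"] evaluates to True or False, so it can sit directly in the if condition.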
1276
00:57:29,040 --> 00:57:32,220
So here, again, it's just getting a little more pleasant to write code.
1277
00:57:32,220 --> 00:57:33,960
There's less hitting of the keyboard.
1278
00:57:33,960 --> 00:57:36,090
You can express yourself a little more succinctly.
1279
00:57:36,090 --> 00:57:40,020
And using the keyword in, Python will figure out
1280
00:57:40,020 --> 00:57:44,370
how to search the entire list for whatever the value of s is.
1281
00:57:44,370 --> 00:57:47,010
And if it finds it, it will return True automatically.
1282
00:57:47,010 --> 00:57:48,230
Else, it will return False.
1283
00:57:48,230 --> 00:57:54,960
So if I run agree.py again and type in capital Y or lowercase y, that still,
1284
00:57:54,960 --> 00:57:55,695
now, works.
1285
00:57:55,695 --> 00:58:00,330
Well, I can tighten this up further if I want to add more features.
1286
00:58:00,330 --> 00:58:04,710
Well, what if I want to support not just big Y and little y,
1287
00:58:04,710 --> 00:58:10,050
but how about "Yes" or "yes" or, in case the user
1288
00:58:10,050 --> 00:58:14,357
is yelling or someone who isn't good with CapsLock types in "YES?"
1289
00:58:14,357 --> 00:58:14,940
Wait a minute.
1290
00:58:14,940 --> 00:58:16,020
But it could be weird.
1291
00:58:16,020 --> 00:58:20,850
Do we want to support this or this?
1292
00:58:20,850 --> 00:58:23,480
This just gets really tedious, quickly, combinatorially,
1293
00:58:23,480 --> 00:58:25,710
if you consider all of these possible permutations.
1294
00:58:25,710 --> 00:58:27,990
What would be smarter than doing something
1295
00:58:27,990 --> 00:58:30,120
like this, if you want to just be able to tolerate
1296
00:58:30,120 --> 00:58:33,570
"yes" in any form of capitalization?
1297
00:58:33,570 --> 00:58:35,370
Logically, what would be nice?
1298
00:58:35,370 --> 00:58:38,232
AUDIENCE: Maybe, whatever the input is, you just transfer it over
1299
00:58:38,232 --> 00:58:40,357
to all lowercase or all uppercase, and then redo it?
1300
00:58:40,357 --> 00:58:41,125
DAVID MALAN: Exactly.
1301
00:58:41,125 --> 00:58:42,042
Super common paradigm.
1302
00:58:42,042 --> 00:58:46,510
Why don't we just force the user's input to all lowercase or all uppercase--
1303
00:58:46,510 --> 00:58:49,570
doesn't matter, so long as we're self-consistent-- and just compare
1304
00:58:49,570 --> 00:58:52,030
against all uppercase or all lowercase.
1305
00:58:52,030 --> 00:58:55,760
And that will get rid of all of the possible permutations, otherwise.
1306
00:58:55,760 --> 00:58:58,510
Now, in C, we might have done something like this.
1307
00:58:58,510 --> 00:59:01,820
We might have simplified this whole list and just said--
1308
00:59:01,820 --> 00:59:04,940
let's say we'll do--
1309
00:59:04,940 --> 00:59:06,220
how about lowercase?
1310
00:59:06,220 --> 00:59:10,490
So y or yes, and we'll just leave it at that.
1311
00:59:10,490 --> 00:59:12,370
But we need to force, now, s to lowercase.
1312
00:59:12,370 --> 00:59:15,970
Well, in C, we would have used the ctype library.
1313
00:59:15,970 --> 00:59:19,660
We would have done tolower and call that function, passing it in.
1314
00:59:19,660 --> 00:59:22,330
Although, not really because, in ctype, those
1315
00:59:22,330 --> 00:59:25,870
operate on individual characters or chars, not whole strings.
1316
00:59:25,870 --> 00:59:29,920
We actually didn't see a function that could convert a whole string in C
1317
00:59:29,920 --> 00:59:31,030
to lowercase.
1318
00:59:31,030 --> 00:59:34,910
But in Python, we're going to benefit from some other feature, as well.
1319
00:59:34,910 --> 00:59:39,330
It turns out that Python supports what's called object-oriented programming.
1320
00:59:39,330 --> 00:59:41,830
And we're only going to scratch the surface of this in CS50.
1321
00:59:41,830 --> 00:59:44,740
But if you take a higher-level course in programming or CS,
1322
00:59:44,740 --> 00:59:46,750
you explore this as a different paradigm.
1323
00:59:46,750 --> 00:59:49,930
Up until now, in C, we've been focusing on what's called, really,
1324
00:59:49,930 --> 00:59:51,025
procedural programming.
1325
00:59:51,025 --> 00:59:52,210
You write procedures.
1326
00:59:52,210 --> 00:59:55,250
You write functions, top to bottom, left to right.
1327
00:59:55,250 --> 00:59:57,790
And when you want to change some value, we
1328
00:59:57,790 --> 01:00:00,550
were in the habit of using a procedure-- that is, a function.
1329
01:00:00,550 --> 01:00:03,670
You would pass something, like a variable, into a function,
1330
01:00:03,670 --> 01:00:07,600
like toupper or tolower, and it would do its thing and hand you back a value.
1331
01:00:07,600 --> 01:00:12,610
Well, it turns out that it would be nicer, programming-wise, if some data
1332
01:00:12,610 --> 01:00:15,250
types just had built-in functionality.
1333
01:00:15,250 --> 01:00:18,220
Why do we have our variables over here and all of our helper functions,
1334
01:00:18,220 --> 01:00:21,010
like toupper and tolower over here, such that we constantly
1335
01:00:21,010 --> 01:00:22,660
have to pass one into the other.
1336
01:00:22,660 --> 01:00:27,590
It would be nice to bake into our data type some built-in functionality
1337
01:00:27,590 --> 01:00:33,267
so that you can change variables using their own, default built-in
1338
01:00:33,267 --> 01:00:33,850
functionality.
1339
01:00:33,850 --> 01:00:37,510
And so, Object-Oriented Programming, otherwise known as OOP,
1340
01:00:37,510 --> 01:00:41,635
is a technique whereby certain types of values, like a string--
1341
01:00:41,635 --> 01:00:47,230
AKA str-- not only have properties inside of them--
1342
01:00:47,230 --> 01:00:49,900
attributes, just like a struct in C--
1343
01:00:49,900 --> 01:00:54,480
your data can also have functions built into them, as well.
1344
01:00:54,480 --> 01:00:57,955
So, whereas in C, which is not object-oriented, you have structs.
1345
01:00:57,955 --> 01:01:01,150
And structs can only store data, like a name and a number
1346
01:01:01,150 --> 01:01:02,620
when implementing a person.
1347
01:01:02,620 --> 01:01:07,210
In Python, you can, for instance, have not just a structure--
1348
01:01:07,210 --> 01:01:09,010
otherwise known as a class--
1349
01:01:09,010 --> 01:01:10,930
storing a name and a number.
1350
01:01:10,930 --> 01:01:15,460
You can have a function call that person or email that person
1351
01:01:15,460 --> 01:01:19,510
or actual verbs or actions associated with that piece of data.
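To make the struct-versus-class contrast concrete, here is a small sketch. Everything in it is hypothetical: the class name Person, the method names call and email, and the sample values are illustrations only and were not shown in lecture.

```python
class Person:
    """Bundles data (name, number) together with actions on that data."""

    def __init__(self, name, number):
        # Attributes: the same kind of data a C struct could hold
        self.name = name
        self.number = number

    def call(self):
        # A method: a function that lives inside the data type
        return f"Calling {self.name} at {self.number}"

    def email(self, address):
        return f"Emailing {self.name} at {address}"


person = Person("Alice", "555-0100")
print(person.call())
print(person.email("alice@example.com"))
```

The dot syntax is the same one used to reach a field of a struct in C; here it also reaches functions.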
1352
01:01:19,510 --> 01:01:21,910
Now, in the context of strings, it turns out
1353
01:01:21,910 --> 01:01:24,565
that strings come with a lot of useful functionality.
1354
01:01:24,565 --> 01:01:28,900
And in fact, at this URL here, which is in docs.python.org,
1355
01:01:28,900 --> 01:01:31,720
which is the official documentation for Python,
1356
01:01:31,720 --> 01:01:34,300
you'll see a whole list of methods--
1357
01:01:34,300 --> 01:01:37,705
that is, functions-- that come with strings that you can actually
1358
01:01:37,705 --> 01:01:40,150
use to modify their values.
1359
01:01:40,150 --> 01:01:42,440
And what I mean by this is the following.
1360
01:01:42,440 --> 01:01:44,900
If we go through the documentation, poke around,
1361
01:01:44,900 --> 01:01:48,163
it turns out that strings come with a function called lower.
1362
01:01:48,163 --> 01:01:50,080
And if you want to use that function, you just
1363
01:01:50,080 --> 01:01:54,850
have to use slightly different syntax than in C. You do not do tolower,
1364
01:01:54,850 --> 01:01:59,140
and you do not say, as I just did, lower because this function is
1365
01:01:59,140 --> 01:02:01,150
built into s itself.
1366
01:02:01,150 --> 01:02:05,770
And just like in C, when you want to go inside of a variable, like a structure,
1367
01:02:05,770 --> 01:02:09,790
and access a piece of data inside of it, like name or number,
1368
01:02:09,790 --> 01:02:12,370
when you also have functions built into data types--
1369
01:02:12,370 --> 01:02:17,530
AKA methods; a method is just a function that is built into a piece of data--
1370
01:02:17,530 --> 01:02:23,480
you can do s dot lower open paren, closed paren in this case.
1371
01:02:23,480 --> 01:02:25,480
And I can do this down here, as well.
1372
01:02:25,480 --> 01:02:33,280
If s.lower in, quote unquote, "n" or "no", the whole thing,
1373
01:02:33,280 --> 01:02:35,455
I can force this whole thing to lowercase.
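A sketch of the version that forces the input to lowercase before comparing. As before, respond is a hypothetical wrapper used so the logic can be exercised without typing at a prompt.

```python
def respond(s):
    """Return the reply for y/yes/n/no in any capitalization, else None."""
    # lower is a method of str, so it is called on s itself: s.lower()
    if s.lower() in ["y", "yes"]:
        return "Agreed."
    elif s.lower() in ["n", "no"]:
        return "Not agreed."
    return None
```

With this, "YES", "Yes", and "yEs" are all accepted without listing each spelling.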
1374
01:02:35,455 --> 01:02:38,620
So the only difference here, now, in object-oriented programming,
1375
01:02:38,620 --> 01:02:41,840
instead of constantly passing a value into a function,
1376
01:02:41,840 --> 01:02:45,910
you just access a function that's inside of the value.
1377
01:02:45,910 --> 01:02:48,928
It just works because of how the language itself is defined.
1378
01:02:48,928 --> 01:02:51,220
And the only way you know whether these functions exist
1379
01:02:51,220 --> 01:02:55,495
is the documentation-- a class, a book, a website or the like.
1380
01:02:55,495 --> 01:03:00,490
Questions, now, on this technique?
1381
01:03:00,490 --> 01:03:01,070
All right.
1382
01:03:01,070 --> 01:03:02,513
I claim this is correct.
1383
01:03:02,513 --> 01:03:05,180
Now, even though you've never programmed, most of you, in Python
1384
01:03:05,180 --> 01:03:07,655
before, not super well-designed.
1385
01:03:07,655 --> 01:03:12,140
There's a subtle inefficiency, now, on lines 3 and 5 together.
1386
01:03:12,140 --> 01:03:18,150
What's dumb about how I've used lower, might you think?
1387
01:03:18,150 --> 01:03:18,720
Yeah?
1388
01:03:18,720 --> 01:03:21,975
AUDIENCE: I feel like, using it twice, you'd just want another [? variable. ?]
1389
01:03:21,975 --> 01:03:22,440
DAVID MALAN: Yeah.
1390
01:03:22,440 --> 01:03:25,482
If you're going to use the same function twice and ask the same question,
1391
01:03:25,482 --> 01:03:29,248
expecting the same answer, why are you calling the function itself twice?
1392
01:03:29,248 --> 01:03:31,415
Maybe we should just store the result in a variable.
1393
01:03:31,415 --> 01:03:33,030
So we could do this in a couple of different ways.
1394
01:03:33,030 --> 01:03:36,360
We, for instance, could go up here and create another variable called t
1395
01:03:36,360 --> 01:03:38,040
and set that equal to s.lower.
1396
01:03:38,040 --> 01:03:41,330
And then, we could just change this to be t, here.
1397
01:03:41,330 --> 01:03:43,080
But honestly, I don't think we technically
1398
01:03:43,080 --> 01:03:45,480
need another variable altogether, here.
1399
01:03:45,480 --> 01:03:47,410
I could just do something like this.
1400
01:03:47,410 --> 01:03:52,360
Let's change the value of s to be the lowercase version thereof.
1401
01:03:52,360 --> 01:03:55,920
And so, now, I can quite simply refer to s again and again like this,
1402
01:03:55,920 --> 01:03:57,550
reusing that same value.
1403
01:03:57,550 --> 01:04:01,380
Now, to be sure, I have now just lost the user's original input.
1404
01:04:01,380 --> 01:04:05,430
And if I care about that-- if they typed in all caps, I have no idea anymore.
1405
01:04:05,430 --> 01:04:08,070
So maybe I do want to use a separate variable, altogether.
1406
01:04:08,070 --> 01:04:10,830
But a takeaway here, too, is that strings in Python
1407
01:04:10,830 --> 01:04:13,590
are technically what we'll call immutable--
1408
01:04:13,590 --> 01:04:15,640
that is, they cannot be changed.
1409
01:04:15,640 --> 01:04:19,830
This was not true in C. Once we gave you arrays in week two
1410
01:04:19,830 --> 01:04:22,800
or memory in week four, you could go to town on a string
1411
01:04:22,800 --> 01:04:25,780
and change any of the characters you want-- uppercasing, lowercasing,
1412
01:04:25,780 --> 01:04:27,560
changing it, shortening it and so forth.
1413
01:04:27,560 --> 01:04:33,690
But in this case, this returns a copy of s, forced to lowercase.
1414
01:04:33,690 --> 01:04:35,790
It doesn't change the original string--
1415
01:04:35,790 --> 01:04:38,700
that is, the bytes in the computer's memory.
1416
01:04:38,700 --> 01:04:41,580
When you assign it back to s, you're essentially
1417
01:04:41,580 --> 01:04:43,703
forgetting about the old version of s.
1418
01:04:43,703 --> 01:04:46,620
But because Python does memory management for you-- there's no malloc,
1419
01:04:46,620 --> 01:04:47,820
there's no free--
1420
01:04:47,820 --> 01:04:52,200
Python automatically frees up the original bytes, like Y-E-S,
1421
01:04:52,200 --> 01:04:54,750
and hands them back to the operating system for you.
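A short sketch of what immutability means in practice. The variable names follow the lecture (s and t); the try/except is an addition, included only to show that changing a character in place is refused.

```python
s = "YES"

# lower() does not touch s; it returns a new, lowercased string
t = s.lower()
print(s)  # YES
print(t)  # yes

# Trying to change a character in place fails
try:
    s[0] = "y"
except TypeError:
    print("strings cannot be modified in place")

# Reassigning s makes the name refer to the new string;
# the original "YES" is no longer reachable through s
s = s.lower()
print(s)  # yes
```

Keeping the separate variable t is the better choice whenever the user's original capitalization still matters later in the program.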
1422
01:04:54,750 --> 01:04:55,340
All right.
1423
01:04:55,340 --> 01:04:59,640
Questions, now, on this technique?
1424
01:04:59,640 --> 01:05:02,310
Questions on this?
1425
01:05:02,310 --> 01:05:05,145
In general, I'll call out-- the Python documentation
1426
01:05:05,145 --> 01:05:07,927
will start to be your friend because, in class, we'll only scratch
1427
01:05:07,927 --> 01:05:09,510
the surface with some of these things.
1428
01:05:09,510 --> 01:05:12,210
But in docs.python.org, for instance, there's
1429
01:05:12,210 --> 01:05:15,630
a whole reference of all of the built-in functions that come with the language,
1430
01:05:15,630 --> 01:05:18,135
as well as, for instance, those with a string.
1431
01:05:18,135 --> 01:05:19,620
All right.
1432
01:05:19,620 --> 01:05:23,205
Before we take a break, let's go ahead and create something a little familiar
1433
01:05:23,205 --> 01:05:27,030
too based on our weeks here, in C. Let me
1434
01:05:27,030 --> 01:05:30,690
propose that we revisit those examples involving some meows.
1435
01:05:30,690 --> 01:05:34,260
So, for instance, when we had our cat meow back in the first week
1436
01:05:34,260 --> 01:05:37,650
and, then, second in C, we did something that was a little stupid at first
1437
01:05:37,650 --> 01:05:41,960
whereby we created a file, as I'll do here-- this time, called meow.py.
1438
01:05:41,960 --> 01:05:44,550
And if I want a cat to meow three times, I
1439
01:05:44,550 --> 01:05:47,190
could run it once, like this, a little copy-paste.
1440
01:05:47,190 --> 01:05:50,580
And now, python of meow.py, and I'm done.
1441
01:05:50,580 --> 01:05:53,100
Now, we've visited this example two times, at least,
1442
01:05:53,100 --> 01:05:54,690
now in Scratch and in C.
1443
01:05:54,690 --> 01:06:00,080
It's correct, I'll stipulate, but what's, obviously, poorly designed?
1444
01:06:00,080 --> 01:06:01,655
What's the fault here?
1445
01:06:01,655 --> 01:06:02,212
Yeah?
1446
01:06:02,212 --> 01:06:03,670
AUDIENCE: It should just be a loop.
1447
01:06:03,670 --> 01:06:04,990
DAVID MALAN: It should just be a loop, right?
1448
01:06:04,990 --> 01:06:05,990
Why type it three times?
1449
01:06:05,990 --> 01:06:08,560
Literally, copying and pasting is almost always a bad thing--
1450
01:06:08,560 --> 01:06:11,440
except in C, when you have the function prototypes that you need to borrow.
1451
01:06:11,440 --> 01:06:13,232
But in this case, this is just inefficient.
1452
01:06:13,232 --> 01:06:15,652
So what could we do better here, in Python?
1453
01:06:15,652 --> 01:06:18,610
Well, in Python, we could probably change this in a few different ways.
1454
01:06:18,610 --> 01:06:21,280
We could borrow some of the syntax we proposed in slide form
1455
01:06:21,280 --> 01:06:23,710
earlier, like give me a variable called i.
1456
01:06:23,710 --> 01:06:26,080
Set it to 0, no semicolon.
1457
01:06:26,080 --> 01:06:29,510
While i is less than 3-- if I want to do this three times--
1458
01:06:29,510 --> 01:06:31,280
I can go ahead and print out "meow."
1459
01:06:31,280 --> 01:06:33,580
And then, I can do i plus equals 1.
1460
01:06:33,580 --> 01:06:35,080
And I think this would do the trick.
1461
01:06:35,080 --> 01:06:38,650
Python of meow.py, and we're back in business already.
1462
01:06:38,650 --> 01:06:41,463
Well, if I wanted to change this to a for loop, well, in Python,
1463
01:06:41,463 --> 01:06:44,380
it would be a little tighter, but this would not be the best approach.
1464
01:06:44,380 --> 01:06:52,510
So for i in 0, 1, 2, I could just do print "meow", like this.
1465
01:06:52,510 --> 01:06:54,250
And that, too, would get the job done.
1466
01:06:54,250 --> 01:06:58,390
But, to our discussion earlier, this would get stupid pretty quickly
1467
01:06:58,390 --> 01:07:00,970
if you had to keep enumerating all of these values.
1468
01:07:00,970 --> 01:07:03,880
What did we introduce instead?
1469
01:07:03,880 --> 01:07:04,940
The range function.
1470
01:07:04,940 --> 01:07:05,440
Exactly.
1471
01:07:05,440 --> 01:07:09,040
So that hands me back, way more efficiently, just the values I want,
1472
01:07:09,040 --> 01:07:10,635
indeed, one at a time.
1473
01:07:10,635 --> 01:07:14,745
So even this, if I run it a third or fourth time, we've got the same result.
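The three loop styles just described, side by side as a sketch. In lecture each one replaced the previous one in meow.py; they are stacked here only for comparison, so running this prints nine meows.

```python
# Version 1: a while loop with an explicit counter
i = 0
while i < 3:
    print("meow")
    i += 1

# Version 2: a for loop over a hand-written list of values
for i in [0, 1, 2]:
    print("meow")

# Version 3: range(3) supplies 0, 1, 2 one at a time
for i in range(3):
    print("meow")
```

The range version scales to any count by changing a single number, whereas the list version would need every value written out.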
1474
01:07:14,745 --> 01:07:18,220
But now, let's transition to where we went with this back in the day.
1475
01:07:18,220 --> 01:07:20,650
How can we start to modularize this?
1476
01:07:20,650 --> 01:07:24,100
It would be nice, I claimed, if MIT had given us a meow function.
1477
01:07:24,100 --> 01:07:27,370
Wouldn't it be nice if Python had given us a meow function?
1478
01:07:27,370 --> 01:07:30,580
Maybe less compelling in Python, but how can I build my own function?
1479
01:07:30,580 --> 01:07:33,618
Well, I did this briefly with the spell checker earlier,
1480
01:07:33,618 --> 01:07:36,160
but let me go ahead and propose that we could implement, now,
1481
01:07:36,160 --> 01:07:40,280
our own version of this in Python as follows.
1482
01:07:40,280 --> 01:07:44,050
Let me go ahead and start fresh here and use the keyword def.
1483
01:07:44,050 --> 01:07:47,860
So this did not exist in C. You had the return value, the function
1484
01:07:47,860 --> 01:07:48,850
name, the arguments.
1485
01:07:48,850 --> 01:07:52,120
In Python, you literally say def to define a function.
1486
01:07:52,120 --> 01:07:54,757
You give it a name, like meow.
1487
01:07:54,757 --> 01:07:57,840
And now, I'm going to go ahead and, in this function, just print out meow.
1488
01:07:57,840 --> 01:08:01,460
And this lets me change it to anything else I want in the future.
1489
01:08:01,460 --> 01:08:03,400
But for now, it's an abstraction.
1490
01:08:03,400 --> 01:08:07,773
And in fact, I can move it out of sight, out of mind--
1491
01:08:07,773 --> 01:08:09,940
just going to hit Enter a bunch of times to pretend,
1492
01:08:09,940 --> 01:08:13,382
now, it exists, but I don't care how it is implemented.
1493
01:08:13,382 --> 01:08:15,340
And up here, now, I can do something like this.
1494
01:08:15,340 --> 01:08:20,590
For i in range of 3, let me go ahead and not print "meow" anymore.
1495
01:08:20,590 --> 01:08:25,359
Let me just call meow, tightening up my code further.
1496
01:08:25,359 --> 01:08:25,960
Let's see.
1497
01:08:25,960 --> 01:08:26,859
Python of meow.py.
1498
01:08:26,859 --> 01:08:31,240
This is, I think, going to be the first time it does not work correctly.
1499
01:08:31,240 --> 01:08:32,680
OK.
1500
01:08:32,680 --> 01:08:36,310
So here, we have, sadly, our first Python error.
1501
01:08:36,310 --> 01:08:37,569
And let's see.
1502
01:08:37,569 --> 01:08:40,300
The syntax is going to be different from C or Clang's output.
1503
01:08:40,300 --> 01:08:41,920
Traceback is the term of art here.
1504
01:08:41,920 --> 01:08:44,859
This is like a trace back of all of the lines of code
1505
01:08:44,859 --> 01:08:47,560
that were just executed or, really, functions you've called.
1506
01:08:47,560 --> 01:08:49,090
The file name is uninteresting.
1507
01:08:49,090 --> 01:08:52,149
This is my codespace, specifically, but the file name
1508
01:08:52,149 --> 01:08:53,890
is important here-- meow.py.
1509
01:08:53,890 --> 01:08:55,675
Our line 2 is the issue--
1510
01:08:55,675 --> 01:08:58,060
OK, I didn't get very far before I screwed up--
1511
01:08:58,060 --> 01:08:59,470
and then, there's a name error.
1512
01:08:59,470 --> 01:09:03,430
And you'll see, in Python, there's typically these capitalized keywords
1513
01:09:03,430 --> 01:09:05,350
that hint at what the issue is.
1514
01:09:05,350 --> 01:09:09,260
It's something related to names of variables. "meow" is not defined.
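A sketch that reproduces the problem. The try/except is an assumption added here so the script keeps running and the message can be inspected; in lecture the error simply ended the program with a traceback.

```python
try:
    for i in range(3):
        meow()  # at this point, no def meow has been executed yet
except NameError as error:
    message = str(error)
    print(message)


def meow():
    print("meow")


# The def above has now run, so this call succeeds
meow()
```

Python executes a file top to bottom, so a name only exists once the line that defines it has actually run.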
1515
01:09:09,260 --> 01:09:09,760
All right.
1516
01:09:09,760 --> 01:09:11,635
You're programming Python for the first time.
1517
01:09:11,635 --> 01:09:12,399
You've screwed up.
1518
01:09:12,399 --> 01:09:14,560
You're following some online tutorial.
1519
01:09:14,560 --> 01:09:16,149
You're seeing this.
1520
01:09:16,149 --> 01:09:18,010
Reason through it.
1521
01:09:18,010 --> 01:09:20,680
Why might "meow" not be defined?
1522
01:09:20,680 --> 01:09:24,779
What can we infer about Python?
1523
01:09:24,779 --> 01:09:27,240
How to troubleshoot, logically?
1524
01:09:27,240 --> 01:09:29,147
AUDIENCE: [INAUDIBLE]
1525
01:09:29,147 --> 01:09:29,939
DAVID MALAN: Maybe.
1526
01:09:29,939 --> 01:09:32,520
Is it because "meow" is defined after?
1527
01:09:32,520 --> 01:09:34,890
As smart as Python seems to be, vis-a-vis C,
1528
01:09:34,890 --> 01:09:37,055
they have some similar design characteristics.
1529
01:09:37,055 --> 01:09:37,920
So let's try that.
1530
01:09:37,920 --> 01:09:41,729
So let me scroll all the way back down to where I moved this earlier.
1531
01:09:41,729 --> 01:09:43,649
Let me get rid of it--
1532
01:09:43,649 --> 01:09:44,279
way down there.
1533
01:09:44,279 --> 01:09:46,410
I'll copy it to my clipboard.
1534
01:09:46,410 --> 01:09:48,180
And let me just hack something together.
1535
01:09:48,180 --> 01:09:49,963
Let me just put it up here.
1536
01:09:49,963 --> 01:09:51,130
And let's see if this works.
1537
01:09:51,130 --> 01:09:54,120
So now, let me clear my terminal, run python of meow.py.
1538
01:09:54,120 --> 01:09:55,110
OK.
1539
01:09:55,110 --> 01:09:56,198
We're back in business.
1540
01:09:56,198 --> 01:09:57,990
So that was actually really good intuition.
1541
01:09:57,990 --> 01:10:00,180
Good debugging technique, just reason through it.
1542
01:10:00,180 --> 01:10:02,430
Now, this is contradicting what I claimed back
1543
01:10:02,430 --> 01:10:05,325
in week one, which was that the main part of your program,
1544
01:10:05,325 --> 01:10:07,470
ideally, should just be at the top of the file.
1545
01:10:07,470 --> 01:10:08,580
Don't make me look for it.
1546
01:10:08,580 --> 01:10:10,497
It's not a huge deal with a four-line program,
1547
01:10:10,497 --> 01:10:13,290
but if you've got 40 lines or 400 lines, you
1548
01:10:13,290 --> 01:10:15,480
don't want the juicy part of your program
1549
01:10:15,480 --> 01:10:18,455
to be way down here, and all of these functions way up here.
1550
01:10:18,455 --> 01:10:22,085
So it would be nice, maybe, if we actually have a main function.
1551
01:10:22,085 --> 01:10:25,260
And so, it actually turns out to be a convention in Python
1552
01:10:25,260 --> 01:10:27,460
to define a main function.
1553
01:10:27,460 --> 01:10:30,720
It's not a special function that's automatically called, like in C.
1554
01:10:30,720 --> 01:10:32,340
But humans realized, you know what?
1555
01:10:32,340 --> 01:10:34,120
That was a pretty useful feature.
1556
01:10:34,120 --> 01:10:36,540
Let me define a function called main.
1557
01:10:36,540 --> 01:10:39,000
Let me indent these lines underneath it.
1558
01:10:39,000 --> 01:10:41,070
Let me practice what I'm preaching, which is put
1559
01:10:41,070 --> 01:10:43,290
the main code at the top of the file.
1560
01:10:43,290 --> 01:10:47,730
And, wonderfully, in Python, now, you do not need prototypes.
1561
01:10:47,730 --> 01:10:49,920
There's none of that hackish copying and pasting
1562
01:10:49,920 --> 01:10:52,462
of the return type, the name and the arguments to a function,
1563
01:10:52,462 --> 01:10:58,485
like we needed in C. This is now OK instead, except for one, minor detail.
1564
01:10:58,485 --> 01:11:01,290
Let me go ahead and run python of meow.py.
1565
01:11:01,290 --> 01:11:05,940
Hopefully, now, I've solved this problem by having [GROANS] a main function.
1566
01:11:05,940 --> 01:11:08,170
But now, nothing has happened.
1567
01:11:08,170 --> 01:11:08,670
All right.
1568
01:11:08,670 --> 01:11:12,200
Even if you've never programmed in Python before,
1569
01:11:12,200 --> 01:11:17,855
what might explain this behavior, and how do I fix?
1570
01:11:17,855 --> 01:11:20,730
Again, when you're off in the real world, learning some new language,
1571
01:11:20,730 --> 01:11:23,790
all you have is deductive logic to debug.
1572
01:11:23,790 --> 01:11:24,300
Yeah?
1573
01:11:24,300 --> 01:11:28,656
AUDIENCE: I remember in C, even though we [INAUDIBLE].
1574
01:11:28,656 --> 01:11:31,708
1575
01:11:31,708 --> 01:11:32,500
DAVID MALAN: Right.
1576
01:11:32,500 --> 01:11:34,510
So the solution, to be clear, in C was that we
1577
01:11:34,510 --> 01:11:35,650
had to put the prototype up here.
1578
01:11:35,650 --> 01:11:36,790
Otherwise, we'd get an error message.
1579
01:11:36,790 --> 01:11:39,123
In this case, I'm actually not getting an error message.
1580
01:11:39,123 --> 01:11:42,610
And, indeed, I'll claim that you don't need the prototypes in Python.
1581
01:11:42,610 --> 01:11:46,910
Just not necessary because that was annoying, if nothing else.
1582
01:11:46,910 --> 01:11:48,820
But what else might explain?
1583
01:11:48,820 --> 01:11:49,570
Yeah, in the back?
1584
01:11:49,570 --> 01:11:51,030
AUDIENCE: [INAUDIBLE]
1585
01:11:51,030 --> 01:11:51,780
DAVID MALAN: Yeah.
1586
01:11:51,780 --> 01:11:53,880
Maybe you have to call main itself.
1587
01:11:53,880 --> 01:11:58,410
If main doesn't have some special status in Python, maybe just because it exists
1588
01:11:58,410 --> 01:11:59,040
isn't enough.
1589
01:11:59,040 --> 01:12:02,580
And, indeed, if you want to call main, the new convention
1590
01:12:02,580 --> 01:12:05,460
is actually going to be-- as the very last line of your program,
1591
01:12:05,460 --> 01:12:07,350
typically-- to literally call main.
1592
01:12:07,350 --> 01:12:10,950
It's a little stupid-looking, but they made a design decision.
1593
01:12:10,950 --> 01:12:13,200
And this is how, now, we work around it.
1594
01:12:13,200 --> 01:12:14,610
Python of meow.py.
1595
01:12:14,610 --> 01:12:16,890
Now we're back in business.
1596
01:12:16,890 --> 01:12:19,560
But now, logically, why does this work the way it does?
1597
01:12:19,560 --> 01:12:22,320
Well, in this case-- top to bottom--
1598
01:12:22,320 --> 01:12:25,350
line 1 is telling Python to define a function called main
1599
01:12:25,350 --> 01:12:27,660
and, then, define it as follows, lines 2 and 3.
1600
01:12:27,660 --> 01:12:29,610
But it's not calling main yet.
1601
01:12:29,610 --> 01:12:33,210
Line 6 is telling Python how to define a function called meow,
1602
01:12:33,210 --> 01:12:35,580
but it's not calling these lines yet.
1603
01:12:35,580 --> 01:12:38,730
Now, on line 10, you're telling Python, call main.
1604
01:12:38,730 --> 01:12:41,310
And at that point, Python has been trained, if you will,
1605
01:12:41,310 --> 01:12:45,390
to know what main is on line 1, to know what meow is on line 6.
1606
01:12:45,390 --> 01:12:49,650
And so, it's now perfectly OK for main to be above meow
1607
01:12:49,650 --> 01:12:51,150
because you never called them yet.
1608
01:12:51,150 --> 01:12:54,340
You defined, defined, and then, you called.
1609
01:12:54,340 --> 01:12:56,380
And that's the logic behind this.
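As a minimal sketch of the file being described, assuming main still loops three times as in the earlier version of meow.py, the line numbers mentioned above line up like this:

```python
def main():                # line 1: defines main; nothing runs yet
    for i in range(3):     # lines 2-3: the body of main
        meow()


def meow():                # line 6: defines meow; still nothing runs
    print("meow")


main()                     # line 10: both names now exist, so the call works
```

Deleting the last line reproduces the "nothing happened" behavior: both functions get defined, but neither is ever called.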
1610
01:12:56,380 --> 01:13:01,250
Any questions, now, on the structure of this technique, here?
1611
01:13:01,250 --> 01:13:03,000
Now, let's do one more, then.
1612
01:13:03,000 --> 01:13:07,740
Recall that the last thing we did in Scratch and in C was to,
1613
01:13:07,740 --> 01:13:10,940
actually, parameterize these same functions.
1614
01:13:10,940 --> 01:13:14,070
So suppose that you don't want main to be responsible for the loop here.
1615
01:13:14,070 --> 01:13:17,580
You instead want to, very simply, do something like "meow" three times
1616
01:13:17,580 --> 01:13:18,660
and be done with it.
1617
01:13:18,660 --> 01:13:21,427
Well, in Python, it's going to be similar in spirit to C.
1618
01:13:21,427 --> 01:13:23,760
But, again, we don't need to keep mentioning data types.
1619
01:13:23,760 --> 01:13:26,310
If you want "meow" to take some argument--
1620
01:13:26,310 --> 01:13:27,930
like a number n--
1621
01:13:27,930 --> 01:13:30,792
you can just specify n as the name of that argument.
1622
01:13:30,792 --> 01:13:33,250
Or you can call it anything else, of course, that you want.
1623
01:13:33,250 --> 01:13:35,700
You don't have to specify int or anything else.
1624
01:13:35,700 --> 01:13:40,890
In your code, now, inside of meow, you can do something like for i in,
1625
01:13:40,890 --> 01:13:41,670
let's say--
1626
01:13:41,670 --> 01:13:45,690
I definitely, now, can't do this because that would be weird, to start the list
1627
01:13:45,690 --> 01:13:46,590
and end it with n.
1628
01:13:46,590 --> 01:13:49,360
So, if I can come back over here, what's the solution?
1629
01:13:49,360 --> 01:13:51,270
How can I do something n times?
1630
01:13:51,270 --> 01:13:52,410
AUDIENCE: [INAUDIBLE]
1631
01:13:52,410 --> 01:13:53,160
DAVID MALAN: Yeah.
1632
01:13:53,160 --> 01:13:54,340
Using range.
1633
01:13:54,340 --> 01:13:58,140
So range is nice because I can pass in, now, this variable n.
1634
01:13:58,140 --> 01:13:59,940
And now, I can meow-- whoops.
1635
01:13:59,940 --> 01:14:03,195
Now I can print out, quote unquote, "meow."
1636
01:14:03,195 --> 01:14:05,820
So it's almost the same as in Scratch, almost the same as in C.
1637
01:14:05,820 --> 01:14:06,903
But it's a little simpler.
1638
01:14:06,903 --> 01:14:12,210
And if, now, I run meow.py, I'll have the ability, now, to do this here,
1639
01:14:12,210 --> 01:14:13,110
as well.
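A sketch of the parameterized version just described might look like the following. The parameter name n follows the lecture; any other name would work equally well.

```python
def main():
    meow(3)                # main no longer owns the loop


def meow(n):               # no data type such as int is declared for n
    for i in range(n):     # range(n) yields n values, so the body runs n times
        print("meow")


main()
```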
1640
01:14:13,110 --> 01:14:13,770
All right.
1641
01:14:13,770 --> 01:14:16,590
Questions on any of this?
1642
01:14:16,590 --> 01:14:19,800
Right now, we're taking this stroll through week one.
1643
01:14:19,800 --> 01:14:22,050
We're going to, momentarily, escalate things
1644
01:14:22,050 --> 01:14:24,840
to look not only at some of these basics,
1645
01:14:24,840 --> 01:14:27,390
but also, other features, like we saw with face recognition
1646
01:14:27,390 --> 01:14:28,920
with the speller or the like.
1647
01:14:28,920 --> 01:14:31,962
Because of how many of us are here, we have a huge amount of candy
1648
01:14:31,962 --> 01:14:32,670
out in the lobby.
1649
01:14:32,670 --> 01:14:34,440
So why don't we go ahead and take a 10-minute break?
1650
01:14:34,440 --> 01:14:37,230
And when we come back, we'll do even fancier, more powerful things
1651
01:14:37,230 --> 01:14:38,595
with Python in 10.
1652
01:14:38,595 --> 01:14:40,020
All right.
1653
01:14:40,020 --> 01:14:41,730
So we are back.
1654
01:14:41,730 --> 01:14:44,280
Among our goals, now, are to introduce a few more building
1655
01:14:44,280 --> 01:14:47,880
blocks so that we can solve more interesting problems at the end,
1656
01:14:47,880 --> 01:14:49,560
much like those that we began with.
1657
01:14:49,560 --> 01:14:52,830
You'll recall, from a few weeks ago, we played with this two-dimensional Super
1658
01:14:52,830 --> 01:14:53,670
Mario world.
1659
01:14:53,670 --> 01:14:57,380
And we tried to print a vertical column of three or more bricks.
1660
01:14:57,380 --> 01:15:00,210
Well, let me propose that we use this as an opportunity to, now,
1661
01:15:00,210 --> 01:15:02,880
tinker with some of Python's more useful, more
1662
01:15:02,880 --> 01:15:04,470
user-friendly functionality, as well.
1663
01:15:04,470 --> 01:15:09,265
So let me code a file called mario.py, and let's just print out
1664
01:15:09,265 --> 01:15:10,890
the equivalent of that vertical column.
1665
01:15:10,890 --> 01:15:12,690
So it's of height 3.
1666
01:15:12,690 --> 01:15:16,740
Each one is a hash, so let's do for i in range of 3 initially,
1667
01:15:16,740 --> 01:15:18,600
and let's just print out a single hash.
1668
01:15:18,600 --> 01:15:21,790
And I think, now, python of mario.py--
1669
01:15:21,790 --> 01:15:22,290
voila.
1670
01:15:22,290 --> 01:15:27,480
We're in business, printing out just that same column there.
1671
01:15:27,480 --> 01:15:31,110
What if, though, we want to print a column of some variable height
1672
01:15:31,110 --> 01:15:33,510
where the user tells us how tall they want it to be?
1673
01:15:33,510 --> 01:15:39,600
Well, let me go up here, for instance and, instead, how about--
1674
01:15:39,600 --> 01:15:40,920
let's do this.
1675
01:15:40,920 --> 01:15:45,210
How about from cs50 import?
1676
01:15:45,210 --> 01:15:47,620
How about the get_int function, as before?
1677
01:15:47,620 --> 01:15:50,430
So it will deal with making sure the user gives us an integer.
1678
01:15:50,430 --> 01:15:54,750
And now, in the past, whenever we wanted to get a number from a user,
1679
01:15:54,750 --> 01:15:56,780
we've actually followed a certain paradigm.
1680
01:15:56,780 --> 01:16:02,895
In fact, if I open up here, for instance,
1681
01:16:02,895 --> 01:16:06,630
how about mario1.c from a while back, you
1682
01:16:06,630 --> 01:16:11,430
might recall that we had code like this.
1683
01:16:11,430 --> 01:16:13,800
And we specifically use the do while loop in C
1684
01:16:13,800 --> 01:16:16,410
whenever we want to get something from the user,
1685
01:16:16,410 --> 01:16:18,858
maybe, again and again and again, until they cooperate.
1686
01:16:18,858 --> 01:16:20,900
At which point, we finally break out of the loop.
1687
01:16:20,900 --> 01:16:22,830
So it turns out, Python does have while loops,
1688
01:16:22,830 --> 01:16:25,698
does have for loops, does not have do while loops.
1689
01:16:25,698 --> 01:16:27,990
And yet, pretty much any time you've gotten user input,
1690
01:16:27,990 --> 01:16:30,100
you've probably used this paradigm.
1691
01:16:30,100 --> 01:16:33,930
So it turns out that the Python equivalent of this is to do,
1692
01:16:33,930 --> 01:16:36,450
similar in spirit, but using only a while loop.
1693
01:16:36,450 --> 01:16:39,300
And a common paradigm in Python, as I alluded to earlier,
1694
01:16:39,300 --> 01:16:43,440
is to actually deliberately induce an infinite loop while True--
1695
01:16:43,440 --> 01:16:48,240
capital T-- and then, do what you want to do, like get an int from the user
1696
01:16:48,240 --> 01:16:51,690
and prompt them for the height, for instance, in question.
1697
01:16:51,690 --> 01:16:56,070
And then, if you're sure that the user has given you what you want--
1698
01:16:56,070 --> 01:16:59,220
like n is greater than 0, which is what I want, in this case,
1699
01:16:59,220 --> 01:17:02,610
because I want a positive integer; otherwise, there's nothing to print--
1700
01:17:02,610 --> 01:17:04,505
you literally just break out of the loop.
1701
01:17:04,505 --> 01:17:08,070
And so, we could actually use this technique in C. It's just not
1702
01:17:08,070 --> 01:17:10,260
really done in C. You could absolutely, in C,
1703
01:17:10,260 --> 01:17:13,590
have done a while True loop with the parentheses, lowercase true.
1704
01:17:13,590 --> 01:17:15,670
You could break out of it, and so forth.
1705
01:17:15,670 --> 01:17:18,312
But in Python, this is the Python way.
1706
01:17:18,312 --> 01:17:19,770
And this is actually a term of art.
1707
01:17:19,770 --> 01:17:24,017
This way in Python is pythonic. This is "the way everyone does it,"
1708
01:17:24,017 --> 01:17:24,600
quote unquote.
1709
01:17:24,600 --> 01:17:28,830
Doesn't mean you have to, but that's the way the cool Python programmers would
1710
01:17:28,830 --> 01:17:31,980
implement an idea like this-- trying to do something again and again
1711
01:17:31,980 --> 01:17:34,607
and again until the user actually cooperates.
1712
01:17:34,607 --> 01:17:36,690
But all we've done is take away the do while loop.
1713
01:17:36,690 --> 01:17:39,790
But still, logically, we can implement the same idea.
1714
01:17:39,790 --> 01:17:44,580
Now, below this, let me go ahead and just print out, for i in range of n
1715
01:17:44,580 --> 01:17:47,370
this time-- because I want it to be variable and not 3.
1716
01:17:47,370 --> 01:17:49,920
I can go ahead and print out the hash--
1717
01:17:49,920 --> 01:17:52,260
let me go ahead and get rid of the C version here--
1718
01:17:52,260 --> 01:17:55,920
open my terminal window and I'll run, again, Python of mario.py.
1719
01:17:55,920 --> 01:17:58,530
I'll type in 3 and I get back those three hashes.
1720
01:17:58,530 --> 01:18:02,635
But if I, instead, type in 4, I now get four hashes instead.
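The while True pattern above can be sketched as follows. To keep the sketch runnable without a keyboard, a hypothetical stand-in named fake_get_int replays canned answers; in the lecture's file that role is played by get_int imported from the cs50 library.

```python
# Stand-in for cs50's get_int so the sketch runs without a keyboard:
# it replays canned answers in order (an assumption for illustration).
canned_answers = iter([-1, 0, 4])


def fake_get_int(prompt):
    return next(canned_answers)


while True:                      # deliberately infinite loop
    n = fake_get_int("Height: ")
    if n > 0:                    # the user finally cooperated
        break                    # the only way out of the loop

for i in range(n):               # n is variable now, not a fixed 3
    print("#")
```

The first two canned answers are rejected, so the loop repeats until it sees 4, which is the same effect a do while loop would have had in C.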
1721
01:18:02,635 --> 01:18:04,640
So the takeaway here is, quite simply, that this
1722
01:18:04,640 --> 01:18:08,030
would be the way, for instance, to actually get back
1723
01:18:08,030 --> 01:18:11,615
a value in Python that is consistent with some parameter,
1724
01:18:11,615 --> 01:18:13,160
like greater than 0.
1725
01:18:13,160 --> 01:18:13,950
How about this?
1726
01:18:13,950 --> 01:18:17,810
Let's actually practice what we preached a moment ago with our meowing examples
1727
01:18:17,810 --> 01:18:19,830
and factoring all this out.
1728
01:18:19,830 --> 01:18:23,220
Let me go ahead and define a main function, as before.
1729
01:18:23,220 --> 01:18:25,190
Let me go ahead and assume, for the moment,
1730
01:18:25,190 --> 01:18:28,673
that a get_height function exists, which is not a thing in Python.
1731
01:18:28,673 --> 01:18:30,340
I'm going to invent it in just a moment.
1732
01:18:30,340 --> 01:18:33,620
And now, I'm going to go ahead and do something like this. for i
1733
01:18:33,620 --> 01:18:39,470
in the range of that height, well, let's go ahead and print out those hashes.
1734
01:18:39,470 --> 01:18:41,760
So I'm assuming that get_height exists.
1735
01:18:41,760 --> 01:18:44,725
Let me go ahead and implement that abstraction, so define a function,
1736
01:18:44,725 --> 01:18:46,100
now, called get_height.
1737
01:18:46,100 --> 01:18:48,830
It's not going to take any arguments in this design.
1738
01:18:48,830 --> 01:18:52,820
While True, I can go ahead and do the same thing as before--
1739
01:18:52,820 --> 01:18:55,880
assign a variable n, the return value of get_int
1740
01:18:55,880 --> 01:18:58,140
prompting the user for that height.
1741
01:18:58,140 --> 01:19:03,980
And then, if n is greater than 0, I can go ahead and break.
1742
01:19:03,980 --> 01:19:08,390
But if I break here, I, logically-- just like in C--
1743
01:19:08,390 --> 01:19:11,360
end up executing below the loop in question.
1744
01:19:11,360 --> 01:19:12,690
But there's nothing there.
1745
01:19:12,690 --> 01:19:16,820
But if I want get_height to return the height, what should
1746
01:19:16,820 --> 01:19:18,650
I type here on line 14, logically?
1747
01:19:18,650 --> 01:19:21,580
1748
01:19:21,580 --> 01:19:23,380
What do I want to return, to be clear?
1749
01:19:23,380 --> 01:19:23,995
AUDIENCE: [INAUDIBLE]
1750
01:19:23,995 --> 01:19:24,745
DAVID MALAN: Yeah.
1751
01:19:24,745 --> 01:19:26,890
So I actually want to return n.
1752
01:19:26,890 --> 01:19:30,880
And here's another curiosity of Python, vis-a-vis C.
1753
01:19:30,880 --> 01:19:33,670
There doesn't seem to be an issue of scope anymore, right?
1754
01:19:33,670 --> 01:19:37,180
In C, it was super important to not only declare your variables with the data
1755
01:19:37,180 --> 01:19:39,550
types, you also had to be mindful of where they exist--
1756
01:19:39,550 --> 01:19:41,200
inside of those curly braces.
1757
01:19:41,200 --> 01:19:45,238
In Python, it turns out you can be a little looser with things, for better
1758
01:19:45,238 --> 01:19:45,780
or for worse.
1759
01:19:45,780 --> 01:19:50,020
And so, on line 11, if I create a variable called n,
1760
01:19:50,020 --> 01:19:57,170
it exists on line 11, 12 and even 13, outside of the while loop.
1761
01:19:57,170 --> 01:19:59,710
So to be clear, in C, with a while loop, we
1762
01:19:59,710 --> 01:20:03,040
would have ordinarily had not a colon.
1763
01:20:03,040 --> 01:20:05,920
We would have had the curly brace, like here and over here.
1764
01:20:05,920 --> 01:20:08,770
And a week ago, I would have claimed that, in C, n
1765
01:20:08,770 --> 01:20:12,130
does not exist outside of the while loop, by nature of those curly braces.
1766
01:20:12,130 --> 01:20:15,250
Even though the curly braces are gone, Python actually
1767
01:20:15,250 --> 01:20:20,685
allows you to use a variable any time after you have assigned it a value.
1768
01:20:20,685 --> 01:20:23,625
So slightly more powerful, as such.
1769
01:20:23,625 --> 01:20:26,830
However, I can tighten this up a little bit, logically.
1770
01:20:26,830 --> 01:20:30,700
And this is true in C. I don't really need to break out of the loop
1771
01:20:30,700 --> 01:20:32,020
by using break.
1772
01:20:32,020 --> 01:20:36,070
Recall that or know that I can actually-- once I'm ready to go,
1773
01:20:36,070 --> 01:20:40,030
I can just return the value I care about, even inside of the loop.
1774
01:20:40,030 --> 01:20:43,000
And that will have the side effect of breaking me out of the loop
1775
01:20:43,000 --> 01:20:46,590
and, also, breaking me out of and returning from the entire function.
1776
01:20:46,590 --> 01:20:50,470
So nothing too new here, in terms of C versus Python, except for this issue
1777
01:20:50,470 --> 01:20:51,490
with scope.
1778
01:20:51,490 --> 01:20:53,770
And I, indeed, returned n at the bottom there,
1779
01:20:53,770 --> 01:20:56,360
just to make clear that n would still exist.
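Both variants just discussed can be sketched side by side. The function names and the ask parameter are hypothetical, as is the replay helper, which stands in for get_int by feeding canned answers.

```python
def replay(values):
    """Hypothetical helper: returns a function that hands back canned answers."""
    remaining = iter(values)
    return lambda prompt: next(remaining)


def get_height_with_break(ask):
    while True:
        n = ask("Height: ")
        if n > 0:
            break
    return n               # n was assigned inside the loop, yet is usable here


def get_height_with_return(ask):
    while True:
        n = ask("Height: ")
        if n > 0:
            return n       # leaves the loop and the function at once


print(get_height_with_break(replay([-2, 0, 3])))
print(get_height_with_return(replay([-2, 0, 3])))
```

Both calls skip the non-positive answers and hand back the first positive one.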
1780
01:20:56,360 --> 01:20:58,170
So either of those are correct.
1781
01:20:58,170 --> 01:21:02,350
Now, I just have a Python program that I think
1782
01:21:02,350 --> 01:21:05,590
is going to allow me to implement this same Mario idea.
1783
01:21:05,590 --> 01:21:07,450
So let's run python of mario.py.
1784
01:21:07,450 --> 01:21:09,820
And-- OK, so nothing happened.
1785
01:21:09,820 --> 01:21:13,390
Python of mario.py.
1786
01:21:13,390 --> 01:21:14,260
What did I do wrong?
1787
01:21:14,260 --> 01:21:14,965
AUDIENCE: [INAUDIBLE]
1788
01:21:14,965 --> 01:21:16,590
DAVID MALAN: Yeah, I have to call main.
1789
01:21:16,590 --> 01:21:19,720
So, at the bottom of my code, I have to call main here.
1790
01:21:19,720 --> 01:21:22,720
And this is a stylistic detail that's been subtle.
1791
01:21:22,720 --> 01:21:26,050
Generally speaking, when you are writing in Python,
1792
01:21:26,050 --> 01:21:28,360
there's not a CS50 style guide, per se.
1793
01:21:28,360 --> 01:21:33,700
There's actually a Python style guide that most people adhere to.
1794
01:21:33,700 --> 01:21:37,480
And in this case, double blank lines between functions is the norm.
1795
01:21:37,480 --> 01:21:41,890
I'm doing that deliberately, although it might, otherwise, not be obvious.
1796
01:21:41,890 --> 01:21:45,130
But now that I've called main on line 16, let's run mario.py once more.
1797
01:21:45,130 --> 01:21:46,690
Aha.
1798
01:21:46,690 --> 01:21:47,560
Now we see it.
1799
01:21:47,560 --> 01:21:51,730
Type in 3, and I'm back in business, printing out the values there.
1800
01:21:51,730 --> 01:21:52,330
Yeah?
1801
01:21:52,330 --> 01:21:54,146
AUDIENCE: Why do you [INAUDIBLE]?
1802
01:21:54,146 --> 01:21:56,120
Why can't [INAUDIBLE]?
1803
01:21:56,120 --> 01:21:56,870
DAVID MALAN: Sure.
1804
01:21:56,870 --> 01:21:58,453
Why do I need the if condition at all?
1805
01:21:58,453 --> 01:22:02,390
Why can't I just return n here, as by doing return n?
1806
01:22:02,390 --> 01:22:06,890
Or if I really want to be succinct, I could technically just do this.
1807
01:22:06,890 --> 01:22:09,512
The only reason I added the if condition is
1808
01:22:09,512 --> 01:22:11,720
because, if the user types in negative 1, negative 2,
1809
01:22:11,720 --> 01:22:13,850
I wanted to prompt them again and again.
1810
01:22:13,850 --> 01:22:14,390
That's all.
1811
01:22:14,390 --> 01:22:17,660
But that would be totally acceptable, too, if you were OK with that result
1812
01:22:17,660 --> 01:22:18,630
instead.
1813
01:22:18,630 --> 01:22:21,170
Well, let me do one other thing here to point out
1814
01:22:21,170 --> 01:22:23,870
why we are using get_int so frequently.
1815
01:22:23,870 --> 01:22:26,030
This new training wheel, albeit temporarily.
1816
01:22:26,030 --> 01:22:28,490
So let me go back to the way it was a moment ago
1817
01:22:28,490 --> 01:22:32,510
and let me propose, now, to take away get_int.
1818
01:22:32,510 --> 01:22:35,840
I claimed earlier that, if you're not using get_int,
1819
01:22:35,840 --> 01:22:40,400
you can just use the input function itself from Python.
1820
01:22:40,400 --> 01:22:43,250
But that always returns a string, or a str.
1821
01:22:43,250 --> 01:22:48,110
And so, recall that you have to pass the output of the input function to an int,
1822
01:22:48,110 --> 01:22:51,930
either on the same line or, if you prefer, on another line, instead.
1823
01:22:51,930 --> 01:22:54,110
But it turns out what I didn't do was show
1824
01:22:54,110 --> 01:22:59,250
you what happens if you don't cooperate with the program.
1825
01:22:59,250 --> 01:23:02,540
So if I run python of mario.py now, works great, even
1826
01:23:02,540 --> 01:23:04,252
without the get_int function.
1827
01:23:04,252 --> 01:23:05,210
And I can do it with 4.
1828
01:23:05,210 --> 01:23:06,575
Still works great.
1829
01:23:06,575 --> 01:23:09,122
But let me clear my terminal and be difficult, now,
1830
01:23:09,122 --> 01:23:11,330
as the user and type in "cat" for the height instead.
1831
01:23:11,330 --> 01:23:12,560
Enter.
1832
01:23:12,560 --> 01:23:14,540
Now, we see one of those tracebacks again.
1833
01:23:14,540 --> 01:23:15,900
This one is different.
1834
01:23:15,900 --> 01:23:18,780
This isn't a name error, but, apparently, a value error.
1835
01:23:18,780 --> 01:23:20,870
And if I ignore the stuff I don't understand,
1836
01:23:20,870 --> 01:23:24,440
I can see "invalid literal for int() with base 10: 'cat'."
1837
01:23:24,440 --> 01:23:27,800
That's a super cryptic way of saying that C-A-T is not
1838
01:23:27,800 --> 01:23:29,640
a number in decimal notation.
1839
01:23:29,640 --> 01:23:32,600
And so, I would seem to have to, somehow, handle this case.
1840
01:23:32,600 --> 01:23:34,490
And if you want to be more curious, you'll
1841
01:23:34,490 --> 01:23:36,350
see that this is, indeed, a traceback.
1842
01:23:36,350 --> 01:23:40,100
And C tends to do this, too, or the debugger would do this for you, too.
1843
01:23:40,100 --> 01:23:41,960
You can see all of the functions that have
1844
01:23:41,960 --> 01:23:43,502
been called to get you to this point.
1845
01:23:43,502 --> 01:23:48,170
So apparently, my problem is, initially, in line 14.
1846
01:23:48,170 --> 01:23:50,375
But line 14, if I keep scrolling, is uninteresting.
1847
01:23:50,375 --> 01:23:51,410
It's main.
1848
01:23:51,410 --> 01:23:55,820
But line 14 leads me to execute line 2, which is, indeed, in main.
1849
01:23:55,820 --> 01:23:59,225
That leads me to execute line 9, which is in get_height.
1850
01:23:59,225 --> 01:24:00,880
And so, OK, here is the issue.
1851
01:24:00,880 --> 01:24:02,960
So the closest line number to the error message
1852
01:24:02,960 --> 01:24:05,360
is the one that probably reveals the most.
1853
01:24:05,360 --> 01:24:06,950
Line 9 is where my issue is.
1854
01:24:06,950 --> 01:24:10,940
So I can't just blindly ask the user for input and, then, convert it to an int
1855
01:24:10,940 --> 01:24:12,620
if they're not going to give me an int.
1856
01:24:12,620 --> 01:24:13,870
Now, how do we deal with this?
1857
01:24:13,870 --> 01:24:16,010
Well, back in problem set two, you might recall
1858
01:24:16,010 --> 01:24:18,380
validating that the user typed in a number
1859
01:24:18,380 --> 01:24:19,862
and using a for loop and the like.
1860
01:24:19,862 --> 01:24:22,445
Well, it turns out, there's a better way to do this in Python,
1861
01:24:22,445 --> 01:24:24,830
and the semantics are there.
1862
01:24:24,830 --> 01:24:29,600
If you want to try to convert something to a number that might not actually
1863
01:24:29,600 --> 01:24:32,780
be a number, turns out, Python and certain other languages
1864
01:24:32,780 --> 01:24:35,060
literally have a keyword called try.
1865
01:24:35,060 --> 01:24:37,820
And if only this existed for the past few weeks, I know.
1866
01:24:37,820 --> 01:24:40,583
But you can try to do the following with your code.
1867
01:24:40,583 --> 01:24:41,750
What do I want to try to do?
1868
01:24:41,750 --> 01:24:46,980
Well, I want to try to execute those few lines, except if there's an error.
1869
01:24:46,980 --> 01:24:50,225
So I can say except if there's a value error-- specifically,
1870
01:24:50,225 --> 01:24:53,065
the one I screwed up and created a moment ago.
1871
01:24:53,065 --> 01:24:56,480
And if there is a value error, I can print out an informative message
1872
01:24:56,480 --> 01:25:00,920
to the user, like "not an integer" or anything else.
1873
01:25:00,920 --> 01:25:05,270
And what's happening here, now, is literally this operative word, try.
1874
01:25:05,270 --> 01:25:09,920
Python is going to try to get input and try to convert it to an int,
1875
01:25:09,920 --> 01:25:12,470
and it's going to try to check if it's greater than 0
1876
01:25:12,470 --> 01:25:14,750
and then try to return it.
1877
01:25:14,750 --> 01:25:15,467
Why?
1878
01:25:15,467 --> 01:25:17,300
Three of those lines are inside of, indented
1879
01:25:17,300 --> 01:25:20,780
underneath the try block, except if something goes wrong--
1880
01:25:20,780 --> 01:25:23,540
specifically, a value error happens.
1881
01:25:23,540 --> 01:25:24,560
Then, it prints this.
1882
01:25:24,560 --> 01:25:26,110
But it doesn't return anything.
1883
01:25:26,110 --> 01:25:30,335
And because I'm in a loop, that means it's going to do it again and again
1884
01:25:30,335 --> 01:25:33,980
and again until the human actually cooperates and gives me
1885
01:25:33,980 --> 01:25:35,360
an actual number.
1886
01:25:35,360 --> 01:25:38,210
And so, this, too, is what the world would call pythonic.
1887
01:25:38,210 --> 01:25:41,420
In Python, you don't, necessarily, rigorously try to validate
1888
01:25:41,420 --> 01:25:43,940
the user's input, make sure they haven't screwed up.
1889
01:25:43,940 --> 01:25:46,160
You honestly take a more lackadaisical approach
1890
01:25:46,160 --> 01:25:50,300
and just try to do something, but catch an error if it happens.
1891
01:25:50,300 --> 01:25:53,720
So catch is also a term of art, even though it's not a keyword here.
1892
01:25:53,720 --> 01:25:55,760
Except if something happens, you handle it.
1893
01:25:55,760 --> 01:25:57,470
So you try and you handle it.
1894
01:25:57,470 --> 01:25:59,480
It's best-effort programming, if you will.
1895
01:25:59,480 --> 01:26:04,200
But this is baked into the mindset of the Python programming community.
1896
01:26:04,200 --> 01:26:08,630
So now, if I do python of mario.py and I cooperate, works great as before.
1897
01:26:08,630 --> 01:26:09,830
Try and succeed.
1898
01:26:09,830 --> 01:26:10,670
3 works.
1899
01:26:10,670 --> 01:26:11,345
4 works.
1900
01:26:11,345 --> 01:26:17,243
If, though, I try and fail by typing in "cat," it doesn't crash, per se.
1901
01:26:17,243 --> 01:26:18,410
It doesn't show me an error.
1902
01:26:18,410 --> 01:26:20,695
It shows me something more user-friendly, like "not an integer."
1903
01:26:20,695 --> 01:26:22,610
And then, I can try again with "dog."
1904
01:26:22,610 --> 01:26:23,390
"Not an integer."
1905
01:26:23,390 --> 01:26:24,980
I can try again with 5.
1906
01:26:24,980 --> 01:26:26,240
And now, it works.
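A sketch of get_height with the try and except blocks described above follows. The read parameter and the replay helper are assumptions added so the example can run with canned input; passing nothing for read falls back to Python's built-in input.

```python
def replay(values):
    """Hypothetical helper: returns a function that hands back canned answers."""
    remaining = iter(values)
    return lambda prompt: next(remaining)


def get_height(read=input):
    while True:
        try:
            n = int(read("Height: "))   # int raises ValueError on "cat"
            if n > 0:
                return n
        except ValueError:
            print("Not an integer")     # nothing is returned, so the loop repeats


# Simulates the user typing cat, then dog, then 5.
height = get_height(replay(["cat", "dog", "5"]))
for i in range(height):
    print("#")
```

With those three answers, the sketch prints "Not an integer" twice and then a column of five hashes.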
1907
01:26:26,240 --> 01:26:28,160
So we won't, generally, have you write much
1908
01:26:28,160 --> 01:26:30,500
in the way of these try-except blocks, only because they
1909
01:26:30,500 --> 01:26:33,080
get a little sophisticated quickly.
1910
01:26:33,080 --> 01:26:35,777
But that is to reveal what the get_int function is doing.
1911
01:26:35,777 --> 01:26:37,610
This is why we give you the training wheels,
1912
01:26:37,610 --> 01:26:39,420
so that, when you want to get an int, you
1913
01:26:39,420 --> 01:26:41,990
don't have to jump through all these annoying hoops to do so.
1914
01:26:41,990 --> 01:26:45,965
But that's all the library's really doing for you, is just try and except.
1915
01:26:45,965 --> 01:26:48,980
You won't be left with any training wheels, ultimately.
1916
01:26:48,980 --> 01:26:52,760
Questions, now, on getting input and trying in this way?
1917
01:26:52,760 --> 01:26:55,433
1918
01:26:55,433 --> 01:26:56,100
Anything at all?
1919
01:26:56,100 --> 01:26:56,610
Yeah?
1920
01:26:56,610 --> 01:27:03,643
AUDIENCE: I'm still [INAUDIBLE] try block.
1921
01:27:03,643 --> 01:27:06,560
DAVID MALAN: Oh, could you put the condition outside of the try block?
1922
01:27:06,560 --> 01:27:07,310
Short answer, yes.
1923
01:27:07,310 --> 01:27:09,227
And, in fact, I struggled with this last night
1924
01:27:09,227 --> 01:27:11,750
when tweaking this example to show the simplest version.
1925
01:27:11,750 --> 01:27:17,180
I will disclaim that, really, I should only be trying, literally,
1926
01:27:17,180 --> 01:27:18,470
to do the fragile part.
1927
01:27:18,470 --> 01:27:21,710
And then, down here, I should be really doing
1928
01:27:21,710 --> 01:27:24,380
what you're proposing, which is do the condition out here.
1929
01:27:24,380 --> 01:27:27,380
The problem is, though, that, logically, this gets messy quickly, right?
1930
01:27:27,380 --> 01:27:31,205
Because except if there's a value error, I want to print out "not an integer."
1931
01:27:31,205 --> 01:27:33,920
I can't compare n against 0, then, because n doesn't
1932
01:27:33,920 --> 01:27:35,752
exist because there was an error.
1933
01:27:35,752 --> 01:27:37,460
So it turns out-- and I'll show you this;
1934
01:27:37,460 --> 01:27:39,350
this is now the advanced version of Python--
1935
01:27:39,350 --> 01:27:42,620
there's actually an else keyword you can use in Python
1936
01:27:42,620 --> 01:27:44,570
that does not accompany if or elif.
1937
01:27:44,570 --> 01:27:48,680
It accompanies try and except, which I think is weirdly confusing.
1938
01:27:48,680 --> 01:27:50,640
A different word would have been better.
1939
01:27:50,640 --> 01:27:53,692
But if you'd really prefer, I could have done this, instead.
1940
01:27:53,692 --> 01:27:56,900
And this is one of these design things where reasonable people will disagree.
1941
01:27:56,900 --> 01:27:58,775
Generally speaking, you should only try to do
1942
01:27:58,775 --> 01:28:00,980
the one line that might very well fail.
1943
01:28:00,980 --> 01:28:02,420
But honestly, this looks stupid.
1944
01:28:02,420 --> 01:28:04,850
No, it's just unnecessarily complicated.
1945
01:28:04,850 --> 01:28:08,560
And so, my own preference was actually the original, which was-- yeah,
1946
01:28:08,560 --> 01:28:10,310
I'm trying a few extra lines that, really,
1947
01:28:10,310 --> 01:28:11,973
aren't going to fail, mathematically.
1948
01:28:11,973 --> 01:28:12,890
But it's just tighter.
1949
01:28:12,890 --> 01:28:14,030
It's cleaner this way.
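For comparison, the else variant mentioned above can be sketched like this, again with a hypothetical read parameter and replay helper standing in for real keyboard input. Only the conversion sits inside try; the comparison moves to else, which runs only when no exception was raised.

```python
def replay(values):
    """Hypothetical helper: returns a function that hands back canned answers."""
    remaining = iter(values)
    return lambda prompt: next(remaining)


def get_height(read=input):
    while True:
        try:
            n = int(read("Height: "))   # the one line that might fail
        except ValueError:
            print("Not an integer")
        else:
            if n > 0:                   # reached only if int succeeded
                return n


print(get_height(replay(["cat", "-1", "2"])))
```

It behaves the same as the shorter version; the trade-off is purely one of style, as the lecture notes.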
1950
01:28:14,030 --> 01:28:16,580
And here's, again, the sort of arguments you'll
1951
01:28:16,580 --> 01:28:18,530
start to make yourself as you get more comfortable with programming.
1952
01:28:18,530 --> 01:28:19,280
You'll have an opinion.
1953
01:28:19,280 --> 01:28:20,488
You'll disagree with someone.
1954
01:28:20,488 --> 01:28:25,200
And so long as you can back your argument up, it's pretty reasonable, probably.
1955
01:28:25,200 --> 01:28:25,700
All right.
1956
01:28:25,700 --> 01:28:30,222
So how about we, now, take away some piece of magic
1957
01:28:30,222 --> 01:28:31,430
that's been here for a while.
1958
01:28:31,430 --> 01:28:33,950
Let me go ahead and delete all of this here.
1959
01:28:33,950 --> 01:28:38,855
And let me propose that we revisit not that vertical column and the exceptions
1960
01:28:38,855 --> 01:28:42,110
that might result from getting input, but these horizontal question marks
1961
01:28:42,110 --> 01:28:43,130
that we saw a while ago.
1962
01:28:43,130 --> 01:28:45,980
So I want all of those question marks on the same line.
1963
01:28:45,980 --> 01:28:48,860
And yet, I worry we're about to see a challenge here because print,
1964
01:28:48,860 --> 01:28:51,830
up until now, has been putting new lines everywhere automatically,
1965
01:28:51,830 --> 01:28:53,570
even without those backslash n's.
1966
01:28:53,570 --> 01:28:56,360
Well, let me propose that we do this.
1967
01:28:56,360 --> 01:28:58,130
for i in the range of 4.
1968
01:28:58,130 --> 01:29:02,165
If I want four question marks, let me just print four question marks.
1969
01:29:02,165 --> 01:29:04,370
Unfortunately, I don't think this is correct yet.
1970
01:29:04,370 --> 01:29:06,530
Let me run python of mario.py.
1971
01:29:06,530 --> 01:29:11,510
And, of course, this gives me a column instead of the row of question marks
1972
01:29:11,510 --> 01:29:12,630
that I want.
1973
01:29:12,630 --> 01:29:13,550
So how do we do this?
1974
01:29:13,550 --> 01:29:17,785
Well, it turns out, if you read the documentation for the print function,
1975
01:29:17,785 --> 01:29:19,910
it turns out that print, not surprisingly, perhaps,
1976
01:29:19,910 --> 01:29:22,000
takes a lot of different arguments, as well.
1977
01:29:22,000 --> 01:29:24,590
And in fact, if you go to the documentation for it,
1978
01:29:24,590 --> 01:29:27,650
you'll see that it takes not just positional
1979
01:29:27,650 --> 01:29:30,685
arguments-- that is, from left to right, separated by commas.
1980
01:29:30,685 --> 01:29:32,810
It turns out, Python supports a fancier feature
1981
01:29:32,810 --> 01:29:36,860
with arguments where you can pass the names of arguments to functions, too.
1982
01:29:36,860 --> 01:29:38,470
So what do I mean by this?
1983
01:29:38,470 --> 01:29:43,430
If I go back to VS Code here and I've read the documentation,
1984
01:29:43,430 --> 01:29:48,995
it turns out that, yes, as before, you can pass multiple arguments to print,
1985
01:29:48,995 --> 01:29:49,700
like this.
1986
01:29:49,700 --> 01:29:53,030
Hello comma David comma Malan, that will just automatically
1987
01:29:53,030 --> 01:29:56,553
concatenate all three of those positional arguments together.
1988
01:29:56,553 --> 01:29:59,720
They're positional in the sense that they literally flow from left to right,
1989
01:29:59,720 --> 01:30:01,238
separated by commas.
1990
01:30:01,238 --> 01:30:03,530
But if you don't want to just pass in values like that,
1991
01:30:03,530 --> 01:30:07,370
you want to actually print out, as I did before, a question mark.
1992
01:30:07,370 --> 01:30:11,240
But you want to override the default behavior of print
1993
01:30:11,240 --> 01:30:14,610
by changing the line ending, you can actually do this.
1994
01:30:14,610 --> 01:30:18,890
You can use the name of an argument that you know exists from the documentation
1995
01:30:18,890 --> 01:30:22,130
and set it equal to some alternative value.
1996
01:30:22,130 --> 01:30:24,770
And in fact, even though this looks cryptic,
1997
01:30:24,770 --> 01:30:30,380
this is how I would override the end of each line, to be quote, unquote.
1998
01:30:30,380 --> 01:30:32,900
That is nothing because, if you read the documentation,
1999
01:30:32,900 --> 01:30:37,190
the default value for this end argument-- does someone want to guess--
2000
01:30:37,190 --> 01:30:38,750
is--
2001
01:30:38,750 --> 01:30:39,800
is backslash n.
2002
01:30:39,800 --> 01:30:41,690
So if you read the documentation, you'll see
2003
01:30:41,690 --> 01:30:46,550
that backslash n is the implied default for this end argument.
2004
01:30:46,550 --> 01:30:49,810
And so, if you want to change it, you just say end equals something else.
2005
01:30:49,810 --> 01:30:57,057
And so, here, I can change it to nothing and, now, rerun python of mario.py.
2006
01:30:57,057 --> 01:30:58,640
And now, they're all in the same line.
2007
01:30:58,640 --> 01:31:01,190
Now, it looks a little stupid because I made that week
2008
01:31:01,190 --> 01:31:04,190
one mistake where I still need to move the cursor to the next line.
2009
01:31:04,190 --> 01:31:05,570
That's just a different problem.
2010
01:31:05,570 --> 01:31:07,612
I'm just going to go over here and print nothing.
2011
01:31:07,612 --> 01:31:10,550
I don't even need to print backslash n because, if print automatically
2012
01:31:10,550 --> 01:31:13,970
gives you a backslash n, just call print with nothing,
2013
01:31:13,970 --> 01:31:15,420
and you'll get that for free.
2014
01:31:15,420 --> 01:31:16,940
So let me rerun python of mario.py.
2015
01:31:16,940 --> 01:31:19,895
And now, it looks a little prettier at the prompt.
2016
01:31:19,895 --> 01:31:21,770
And to be super clear as to what's going on--
2017
01:31:21,770 --> 01:31:24,300
suppose I want to make an exclamation here.
2018
01:31:24,300 --> 01:31:27,320
I could change the backslash n default to an exclamation point,
2019
01:31:27,320 --> 01:31:28,680
just for kicks.
2020
01:31:28,680 --> 01:31:31,550
And if I run python of mario.py again, now, I
2021
01:31:31,550 --> 01:31:36,662
get this exclamation with question marks and exclamation points, as well.
2022
01:31:36,662 --> 01:31:38,120
So that's all that's going on here.
2023
01:31:38,120 --> 01:31:40,670
And this is what's called a named argument.
2024
01:31:40,670 --> 01:31:43,670
It literally has a name that you can specify when calling it in.
2025
01:31:43,670 --> 01:31:47,787
And it's different from positional in that you're literally using the name.
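As a rough sketch of what was just typed into mario.py: the function name `row` is made up here for illustration, but `end` is the real named argument of print, and its default is "\n".

```python
def row(n, end=""):
    """Print n question marks on one line, then move the cursor down."""
    for i in range(n):
        # The named argument end replaces the default "\n" line ending.
        print("?", end=end)
    # Calling print with nothing still emits the default newline.
    print()


row(4)       # prints ????
row(4, "!")  # prints ?!?!?!?!
```

Positional arguments are matched left to right, whereas end is matched by its name, so it can go after any number of values.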
2026
01:31:47,787 --> 01:31:49,370
Let me propose something else, though.
2027
01:31:49,370 --> 01:31:50,828
And this is why people like Python.
2028
01:31:50,828 --> 01:31:52,550
There's just cool ways to do things.
2029
01:31:52,550 --> 01:31:55,724
2030
01:31:55,724 --> 01:32:00,740
That's a three-line, verbose way of printing out four question marks.
2031
01:32:00,740 --> 01:32:04,002
I could certainly take the shortcut and just do this.
2032
01:32:04,002 --> 01:32:06,085
But that's not really that interesting for anyone,
2033
01:32:06,085 --> 01:32:08,720
especially if I want to do it a variable number of times.
2034
01:32:08,720 --> 01:32:10,390
But Python does let you do this.
2035
01:32:10,390 --> 01:32:15,110
If you want to multiply a character some number of times,
2036
01:32:15,110 --> 01:32:18,020
not only can you use plus for concatenation,
2037
01:32:18,020 --> 01:32:23,930
you can use star or an asterisk for multiplication, if you will-- that is,
2038
01:32:23,930 --> 01:32:26,250
concatenation again and again and again.
2039
01:32:26,250 --> 01:32:29,030
So if I just print out, quote unquote, "?"
2040
01:32:29,030 --> 01:32:34,190
times 4, that's actually going to be the tightest way, the most succinct way
2041
01:32:34,190 --> 01:32:36,020
I can print four question marks instead.
2042
01:32:36,020 --> 01:32:39,095
And if I don't use 4, I use n, where I get n from the user.
2043
01:32:39,095 --> 01:32:39,830
Bang.
2044
01:32:39,830 --> 01:32:42,320
Now, I've gotten rid of the for loop entirely,
2045
01:32:42,320 --> 01:32:48,000
and I'm using the star operator to manipulate it instead.
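A minimal sketch of that star operator on strings. The helper name `question_marks` is hypothetical, and n would normally come from the user.

```python
def question_marks(n):
    """Return a string of n question marks using string repetition."""
    # "?" * n concatenates "?" with itself n times; no loop is needed.
    return "?" * n


print(question_marks(4))  # prints ????
```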
2046
01:32:48,000 --> 01:32:50,120
And, to be super clear here, insofar as Python
2047
01:32:50,120 --> 01:32:54,440
does not have malloc or free or memory management that you have to do,
2048
01:32:54,440 --> 01:32:56,060
guess what Python also doesn't have.
2049
01:32:56,060 --> 01:32:59,760
2050
01:32:59,760 --> 01:33:03,110
Anything on your minds in the past couple of weeks?
2051
01:33:03,110 --> 01:33:03,875
Doesn't have--
2052
01:33:03,875 --> 01:33:04,853
AUDIENCE: Pointers.
2053
01:33:04,853 --> 01:33:06,020
DAVID MALAN: Pointers, yeah.
2054
01:33:06,020 --> 01:33:09,295
So Python does not have pointers, which just means that all of that
2055
01:33:09,295 --> 01:33:11,420
happens for you automatically, underneath the hood,
2056
01:33:11,420 --> 01:33:14,150
again, by way of code that someone else wrote.
2057
01:33:14,150 --> 01:33:15,950
How about one more throwback with Mario?
2058
01:33:15,950 --> 01:33:20,450
We've talked about, in week one, this two-dimensional structure where
2059
01:33:20,450 --> 01:33:24,302
it's like I claim 3 by 3-- a grid of bricks, if you will.
2060
01:33:24,302 --> 01:33:25,760
Well, how can we do this in Python?
2061
01:33:25,760 --> 01:33:27,590
We can do this in a couple of ways, now.
2062
01:33:27,590 --> 01:33:32,810
Let me go back to my mario.py, and let me do something like for i in range
2063
01:33:32,810 --> 01:33:36,200
of-- we'll just do 3, even though I know, now, I could use get_int
2064
01:33:36,200 --> 01:33:38,453
or I could use input and int.
2065
01:33:38,453 --> 01:33:41,120
And if I want to do something two-dimensionally, just like in C,
2066
01:33:41,120 --> 01:33:42,590
you can nest your for loops.
2067
01:33:42,590 --> 01:33:45,980
So maybe I could do for j in range of 3.
2068
01:33:45,980 --> 01:33:50,690
And then, in here, I could print out a hash symbol.
2069
01:33:50,690 --> 01:33:53,210
And then, let's see if that gives me 9 total.
2070
01:33:53,210 --> 01:33:56,870
So if I've got a nested loop like this, python of mario.py
2071
01:33:56,870 --> 01:33:58,625
hopefully gives me a grid.
2072
01:33:58,625 --> 01:34:01,710
No, it gave me a column of 9.
2073
01:34:01,710 --> 01:34:09,280
Why, logically, even though I've got my row and my columns?
2074
01:34:09,280 --> 01:34:10,210
Yeah.
2075
01:34:10,210 --> 01:34:11,542
AUDIENCE: [INAUDIBLE]
2076
01:34:11,542 --> 01:34:13,000
DAVID MALAN: Yeah, the line ending.
2077
01:34:13,000 --> 01:34:17,380
So in my row, I can't let print just keep adding new line, adding new line.
2078
01:34:17,380 --> 01:34:20,740
So I just have to override this here and let me not screw up like before.
2079
01:34:20,740 --> 01:34:24,250
Let me print one at the end of the whole row, just to move the cursor down.
2080
01:34:24,250 --> 01:34:28,090
And I think, now, together, we've got our 3 by 3.
2081
01:34:28,090 --> 01:34:29,950
Of course, we could tighten this up further.
2082
01:34:29,950 --> 01:34:33,730
If I don't like the nested loop, I probably could go in here
2083
01:34:33,730 --> 01:34:37,975
and just print out, for instance, a brick times 3.
2084
01:34:37,975 --> 01:34:41,055
Or I could change the 3 to a variable if I've gotten it from the user.
2085
01:34:41,055 --> 01:34:42,582
So I can tighten this up further.
2086
01:34:42,582 --> 01:34:45,790
So, again, just different ways to solve the same problem and, again, evidence
2087
01:34:45,790 --> 01:34:47,575
of why a lot of people like Python.
2088
01:34:47,575 --> 01:34:49,825
There's just some more pleasant ways to solve problems
2089
01:34:49,825 --> 01:34:52,330
without getting into the weeds, constantly, of doing things,
2090
01:34:52,330 --> 01:34:56,845
like with for loops and while loops endlessly.
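Both versions of the grid can be sketched roughly like this. The function names are made up for illustration, and the size is passed in rather than read from the user.

```python
def grid_nested(n):
    """Print an n-by-n grid of bricks with nested loops."""
    for i in range(n):
        for j in range(n):
            # Suppress the newline so the bricks in a row stay together.
            print("#", end="")
        # One newline at the end of each row moves the cursor down.
        print()


def grid_short(n):
    """Print the same grid, replacing the inner loop with repetition."""
    for i in range(n):
        print("#" * n)


grid_nested(3)
grid_short(3)
```

Both calls should print the same 3 by 3 grid; the second just lets the star operator build each row.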
2091
01:34:56,845 --> 01:34:57,430
All right.
2092
01:34:57,430 --> 01:34:59,222
Well, how about some other building blocks?
2093
01:34:59,222 --> 01:35:02,983
Lists are going to be so incredibly useful in Python, just as arrays
2094
01:35:02,983 --> 01:35:04,900
were in C. But arrays are annoying because you
2095
01:35:04,900 --> 01:35:06,410
have to manage the memory yourself.
2096
01:35:06,410 --> 01:35:08,327
You have to know in advance how big they are or you
2097
01:35:08,327 --> 01:35:11,440
have to use pointers and malloc or realloc to resize them.
2098
01:35:11,440 --> 01:35:12,100
Oh my god.
2099
01:35:12,100 --> 01:35:14,267
The past two weeks have been painful, in that sense.
2100
01:35:14,267 --> 01:35:17,298
But Python does this all for free for you.
2101
01:35:17,298 --> 01:35:19,090
In fact, there's a whole bunch of functions
2102
01:35:19,090 --> 01:35:22,030
that come with Python that involve lists,
2103
01:35:22,030 --> 01:35:29,678
and they'll allow us, ultimately, to do things again and again and again
2104
01:35:29,678 --> 01:35:30,970
within the same data structure.
2105
01:35:30,970 --> 01:35:33,220
And, for instance, we'll be able to get the length of a list.
2106
01:35:33,220 --> 01:35:35,560
You don't have to remember it yourself in a variable.
2107
01:35:35,560 --> 01:35:39,085
You can just ask Python how many elements are in this list.
2108
01:35:39,085 --> 01:35:42,850
And with this, I think we can solve some old problems, too.
2109
01:35:42,850 --> 01:35:45,250
So let me go back here, to VS Code.
2110
01:35:45,250 --> 01:35:50,890
Let me close mario and give us a new program called scores.py.
2111
01:35:50,890 --> 01:35:54,535
And rather than show the C and the Python now, let's just focus on Python.
2112
01:35:54,535 --> 01:35:59,390
And in scores.c way back when, we just averaged three test scores or something
2113
01:35:59,390 --> 01:35:59,890
like that--
2114
01:35:59,890 --> 01:36:01,900
72, 73, and 33--
2115
01:36:01,900 --> 01:36:03,230
a few weeks ago.
2116
01:36:03,230 --> 01:36:07,450
So if I want to create a list in this Python version of 72, 73, 33,
2117
01:36:07,450 --> 01:36:09,220
I just use my square bracket notation.
2118
01:36:09,220 --> 01:36:12,640
C let you use curly braces if you know the values in advance,
2119
01:36:12,640 --> 01:36:14,170
but Python's just this.
2120
01:36:14,170 --> 01:36:16,855
And now, if I want to compute the average--
2121
01:36:16,855 --> 01:36:19,360
in C, recall, I did something with a loop.
2122
01:36:19,360 --> 01:36:21,140
I added all the values together.
2123
01:36:21,140 --> 01:36:23,230
I, then, divided it by the total number of values
2124
01:36:23,230 --> 01:36:26,110
just like you would in grade school, and that gave me the average.
2125
01:36:26,110 --> 01:36:29,085
Well, Python comes with a lot of super handy functions--
2126
01:36:29,085 --> 01:36:31,395
not just length, but others, as well.
2127
01:36:31,395 --> 01:36:34,150
And so, in fact, if you want to compute the average,
2128
01:36:34,150 --> 01:36:36,970
you can take the sum of all of those scores
2129
01:36:36,970 --> 01:36:40,010
and divide it by the length of all of those scores.
2130
01:36:40,010 --> 01:36:42,490
So Python comes with length, comes with sum.
2131
01:36:42,490 --> 01:36:45,310
You can just pass in a whole list of any size
2132
01:36:45,310 --> 01:36:47,590
and let it deal with that problem for you.
2133
01:36:47,590 --> 01:36:49,900
So if I want to, now, print out this average,
2134
01:36:49,900 --> 01:36:51,760
I can print out Average colon--
2135
01:36:51,760 --> 01:36:55,570
and then, I'll plug in my average variable for interpolation.
2136
01:36:55,570 --> 01:36:58,900
Let me make this an f-string so that it gets formatted,
2137
01:36:58,900 --> 01:37:01,530
and let me just run python of scores.py.
2138
01:37:01,530 --> 01:37:02,800
And there is my average.
2139
01:37:02,800 --> 01:37:05,890
It's rounding weird because we're still vulnerable to some floating point
2140
01:37:05,890 --> 01:37:09,340
imprecision, but at least I didn't need loops
2141
01:37:09,340 --> 01:37:11,575
and I didn't have to write all this darn code just
2142
01:37:11,575 --> 01:37:15,130
to do something that Excel and Google Spreadsheets can just do like that.
2143
01:37:15,130 --> 01:37:17,950
Well, Python is closer to those kinds of tools,
2144
01:37:17,950 --> 01:37:21,790
but more powerful in that you can manipulate the data yourself.
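The scores.py described here is, in sketch form, about three lines. The variable names follow the narration, but the exact file is on the course's website, so treat this as an approximation.

```python
# A list literal uses square brackets, with the values known in advance.
scores = [72, 73, 33]

# sum adds up the elements; len reports how many elements there are.
average = sum(scores) / len(scores)

# The f-string interpolates the variable inside the curly braces.
print(f"Average: {average}")
```

The division produces a float, which is why the printed value shows the floating-point imprecision mentioned above.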
2145
01:37:21,790 --> 01:37:25,510
How about, though, if I want to get a bunch of scores manually from the user
2146
01:37:25,510 --> 01:37:27,280
and, then, sum them together.
2147
01:37:27,280 --> 01:37:28,920
Well, let's combine a few ideas here.
2148
01:37:28,920 --> 01:37:29,830
How about this?
2149
01:37:29,830 --> 01:37:36,070
First, let me go ahead and import the get_int function from the CS50 library,
2150
01:37:36,070 --> 01:37:39,340
just so we don't have to deal with try and except or all of that.
2151
01:37:39,340 --> 01:37:42,340
And let me go ahead and give myself an empty list.
2152
01:37:42,340 --> 01:37:44,410
And this is powerful.
2153
01:37:44,410 --> 01:37:48,068
In C, [SIGHS] there's no point to an empty array
2154
01:37:48,068 --> 01:37:50,860
because, if you create an empty array with square bracket notation,
2155
01:37:50,860 --> 01:37:52,600
it's not useful for anything.
2156
01:37:52,600 --> 01:37:55,780
But in Python, you can create it empty because Python
2157
01:37:55,780 --> 01:37:59,590
will grow and shrink the list for you automatically, as you add things to it.
2158
01:37:59,590 --> 01:38:01,600
So if I want to get three scores from the user,
2159
01:38:01,600 --> 01:38:04,840
I could do something like this-- for i in range of 3.
2160
01:38:04,840 --> 01:38:08,680
And then, I can grab a variable called "score" or anything.
2161
01:38:08,680 --> 01:38:11,467
I could call get_int, prompt the human for the score
2162
01:38:11,467 --> 01:38:12,550
that they want to type in.
2163
01:38:12,550 --> 01:38:15,060
And then, once they do, I can do this.
2164
01:38:15,060 --> 01:38:19,450
Thinking back to our object-oriented programming capability now,
2165
01:38:19,450 --> 01:38:24,358
I could do scores.append, and I can append that score to it.
2166
01:38:24,358 --> 01:38:27,400
And you would only know this from having read the documentation, heard it
2167
01:38:27,400 --> 01:38:30,040
in class, in a book or whatnot, but it turns out
2168
01:38:30,040 --> 01:38:33,880
that, just like strings have functions like lower built into them,
2169
01:38:33,880 --> 01:38:37,735
lists have functions like append built into them that just literally appends
2170
01:38:37,735 --> 01:38:40,165
to the end of the list for you, and Python
2171
01:38:40,165 --> 01:38:42,250
will grow or shrink it as needed.
2172
01:38:42,250 --> 01:38:44,760
No more malloc or realloc or the like.
2173
01:38:44,760 --> 01:38:49,120
So this just appends to the scores list
2174
01:38:49,120 --> 01:38:51,740
that score, and then again and again and again.
2175
01:38:51,740 --> 01:38:52,990
So the array starts at--
2176
01:38:52,990 --> 01:38:57,640
sorry, the list starts at size 0, then grows to 1 then 2 then 3
2177
01:38:57,640 --> 01:38:59,320
without you having to do anything else.
2178
01:38:59,320 --> 01:39:02,845
And so, now, down here, I can compute an average
2179
01:39:02,845 --> 01:39:05,620
with the sum of those scores divided by the length
2180
01:39:05,620 --> 01:39:07,455
of the total number of scores.
2181
01:39:07,455 --> 01:39:11,830
And to be clear, length is the total number of elements in the list.
2182
01:39:11,830 --> 01:39:14,200
Doesn't matter how big the values themselves are.
2183
01:39:14,200 --> 01:39:18,160
Now I can go ahead and print out an f-string with something
2184
01:39:18,160 --> 01:39:22,100
like Average colon average in curly braces.
2185
01:39:22,100 --> 01:39:24,680
And if I run python of scores.py--
2186
01:39:24,680 --> 01:39:27,505
I'll type in, just for the sake of discussion, the three values,
2187
01:39:27,505 --> 01:39:29,440
I still get the same answer.
2188
01:39:29,440 --> 01:39:31,390
But that would have been painful to do in C
2189
01:39:31,390 --> 01:39:35,770
unless you committed, in advance, to a fixed size array-- which we already
2190
01:39:35,770 --> 01:39:41,830
decided, weeks ago, was annoying-- or you grew it dynamically
2191
01:39:41,830 --> 01:39:44,740
using malloc or realloc or the like.
2192
01:39:44,740 --> 01:39:45,400
All right.
2193
01:39:45,400 --> 01:39:46,240
What else can I do?
2194
01:39:46,240 --> 01:39:49,990
Well, there's some nice things you might as well know exist.
2195
01:39:49,990 --> 01:39:54,340
Instead of scores.append, you can do slight fanciness like this.
2196
01:39:54,340 --> 01:39:57,290
If you want to append something to a list,
2197
01:39:57,290 --> 01:40:00,100
you can actually do plus equals, and then
2198
01:40:00,100 --> 01:40:03,620
put that thing in a temporary list of its own
2199
01:40:03,620 --> 01:40:05,740
and just use what is essentially concatenation--
2200
01:40:05,740 --> 01:40:09,410
but not concatenation of strings, but concatenation of lists.
2201
01:40:09,410 --> 01:40:13,480
So this new line 6 appends to the scores list--
2202
01:40:13,480 --> 01:40:15,640
this tiny, little list I'm temporarily creating
2203
01:40:15,640 --> 01:40:17,670
with just the current new score.
2204
01:40:17,670 --> 01:40:20,260
So just another piece of syntax that's worth seeing that
2205
01:40:20,260 --> 01:40:23,290
allows you to do something like that, as well.
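A sketch of the two ways of growing a list. To keep it runnable without the CS50 library, the typed-in values are passed as a list instead of coming from get_int, and the function names are hypothetical.

```python
def collect_with_append(values):
    """Grow an empty list one score at a time with append."""
    scores = []
    for score in values:
        scores.append(score)
    return scores


def collect_with_plus_equals(values):
    """Do the same by concatenating a temporary one-element list."""
    scores = []
    for score in values:
        # [score] is a tiny list holding just the current score.
        scores += [score]
    return scores


typed = [72, 73, 33]  # stand-in for what the user would type
scores = collect_with_append(typed)
print(f"Average: {sum(scores) / len(scores)}")
```

Either way, the list starts at length 0 and ends at length 3, with no size declared up front.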
2206
01:40:23,290 --> 01:40:23,890
All right.
2207
01:40:23,890 --> 01:40:26,093
Well, how about we go back to strings for a moment?
2208
01:40:26,093 --> 01:40:29,260
And all of these examples, as always, are on the course's website afterward.
2209
01:40:29,260 --> 01:40:32,860
Suppose we want to do something like converting characters to uppercase.
2210
01:40:32,860 --> 01:40:35,170
Well, to be clear, I could do something like this.
2211
01:40:35,170 --> 01:40:38,080
Let me create a program called uppercase.py.
2212
01:40:38,080 --> 01:40:42,280
Let me prompt the user for a before string as by using the input function
2213
01:40:42,280 --> 01:40:44,510
or get_string, which is almost the same.
2214
01:40:44,510 --> 01:40:47,110
And I'll prompt the user for a string beforehand.
2215
01:40:47,110 --> 01:40:52,750
Then, let me go ahead and print out, how about, the keyword "After,"
2216
01:40:52,750 --> 01:40:56,650
and then end the new line with nothing, just so
2217
01:40:56,650 --> 01:41:00,010
that I can see "Before" on one line and "After" on the next line.
2218
01:41:00,010 --> 01:41:01,240
And then, let me do this--
2219
01:41:01,240 --> 01:41:04,450
and here's where Python gets pleasant, too, with loops--
2220
01:41:04,450 --> 01:41:07,270
for c in before--
2221
01:41:07,270 --> 01:41:11,110
print c.upper end equals quote, unquote.
2222
01:41:11,110 --> 01:41:12,580
And then, I'll print this here.
2223
01:41:12,580 --> 01:41:13,120
All right.
2224
01:41:13,120 --> 01:41:15,950
That was fast, but let's try to infer what's going on.
2225
01:41:15,950 --> 01:41:19,600
So line 1 just gets input from the user, stores it in a variable called before.
2226
01:41:19,600 --> 01:41:22,510
Line 2 literally just prints "After" but doesn't
2227
01:41:22,510 --> 01:41:25,300
move the cursor to the next line.
2228
01:41:25,300 --> 01:41:27,015
What it, then, does is this.
2229
01:41:27,015 --> 01:41:29,875
And, in C, this was a little more annoying.
2230
01:41:29,875 --> 01:41:31,450
You needed a for loop with i.
2231
01:41:31,450 --> 01:41:34,690
You needed array notation with the square brackets.
2232
01:41:34,690 --> 01:41:39,850
But, Python, if you say for variable in string--
2233
01:41:39,850 --> 01:41:42,670
so for c, for character, in string, Python
2234
01:41:42,670 --> 01:41:46,060
is going to automatically assign c to the first letter
2235
01:41:46,060 --> 01:41:47,110
that the user types in.
2236
01:41:47,110 --> 01:41:49,120
Then, on the next iteration, the second letter, the third letter,
2237
01:41:49,120 --> 01:41:49,745
and the fourth.
2238
01:41:49,745 --> 01:41:52,360
So you don't need any square bracket notation, you just use c,
2239
01:41:52,360 --> 01:41:55,180
and Python will do it for you and just hand you back,
2240
01:41:55,180 --> 01:41:59,000
one at a time, each of the letters that the user has typed in.
2241
01:41:59,000 --> 01:42:04,720
So if I go back over here and I run, for instance, python of uppercase.py
2242
01:42:04,720 --> 01:42:09,760
and I'll type in, how about, "david" in all lowercase and hit Enter,
2243
01:42:09,760 --> 01:42:13,630
you'll now see that it's all uppercase instead by iterating over it,
2244
01:42:13,630 --> 01:42:15,372
indeed, one character at a time.
2245
01:42:15,372 --> 01:42:17,830
But we already know, thanks to object-oriented programming,
2246
01:42:17,830 --> 01:42:20,027
strings themselves have the functionality built
2247
01:42:20,027 --> 01:42:24,100
in to not just uppercase single characters, but the whole string.
2248
01:42:24,100 --> 01:42:26,530
So, honestly, this was a bit of a silly exercise.
2249
01:42:26,530 --> 01:42:31,360
I don't need to use a loop anymore, like in C. And so, some of the habits
2250
01:42:31,360 --> 01:42:34,720
you've only just developed in recent weeks, it's time to start breaking them
2251
01:42:34,720 --> 01:42:36,130
when they're not necessary.
2252
01:42:36,130 --> 01:42:40,470
I can create a variable called after, set it equal to before.upper--
2253
01:42:40,470 --> 01:42:43,600
which, indeed, exists, just like dot lower exists.
2254
01:42:43,600 --> 01:42:47,490
And then, what I can go ahead and print out is, for instance--
2255
01:42:47,490 --> 01:42:49,990
let's get rid of this print line here and do it at the end--
2256
01:42:49,990 --> 01:42:53,900
"After" and print the value of that variable.
2257
01:42:53,900 --> 01:42:58,005
So now, if I rerun uppercase.py, type in "david" in all lowercase,
2258
01:42:58,005 --> 01:43:03,400
I can just uppercase the whole thing all at once because, again, in Python,
2259
01:43:03,400 --> 01:43:07,000
you don't have to operate on characters individually.
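The two versions of uppercase.py, sketched as functions so they can be compared side by side. The function names are made up here, and the real program reads its string with input or get_string.

```python
def upper_by_character(before):
    """Uppercase a string one character at a time, as in the first version."""
    after = ""
    for c in before:
        # The loop hands back each character in turn; no indexing needed.
        after += c.upper()
    return after


def upper_whole(before):
    """Uppercase the whole string at once with the built-in str method."""
    return before.upper()


print("After: " + upper_whole("david"))  # prints After: DAVID
```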
2260
01:43:07,000 --> 01:43:13,310
Questions on any of these tricks up until now?
2261
01:43:13,310 --> 01:43:13,810
No?
2262
01:43:13,810 --> 01:43:14,290
All right.
2263
01:43:14,290 --> 01:43:17,290
How about a few other techniques that we saw in C that we'll bring back,
2264
01:43:17,290 --> 01:43:18,145
now, in Python.
2265
01:43:18,145 --> 01:43:22,860
So it turns out, in Python, there are other libraries you can use, too,
2266
01:43:22,860 --> 01:43:24,360
that unlock even more functionality.
2267
01:43:24,360 --> 01:43:27,040
So, in C, if you wanted command line arguments,
2268
01:43:27,040 --> 01:43:32,410
you just change the signature for main to be, instead of void,
2269
01:43:32,410 --> 01:43:38,515
int argc comma string argv, open brackets for an array or char star,
2270
01:43:38,515 --> 01:43:39,130
eventually.
2271
01:43:39,130 --> 01:43:41,770
Well, it turns out, in Python, that, if you want to access command line
2272
01:43:41,770 --> 01:43:44,770
arguments, it's a little simpler, but they're tucked away in a library--
2273
01:43:44,770 --> 01:43:46,990
otherwise known as a module--
2274
01:43:46,990 --> 01:43:49,552
called sys, the system module.
2275
01:43:49,552 --> 01:43:51,760
Now, this is similar, in spirit, to the CS50 library,
2276
01:43:51,760 --> 01:43:53,802
and that's got a bunch of functionality built in.
2277
01:43:53,802 --> 01:43:55,725
But this one comes with Python itself.
2278
01:43:55,725 --> 01:43:59,710
So if I want to create a program like greet.py, in VS Code,
2279
01:43:59,710 --> 01:44:01,510
here, let me go ahead and do this.
2280
01:44:01,510 --> 01:44:05,785
From the sys library, let's import argv.
2281
01:44:05,785 --> 01:44:07,850
And that's just a thing that exists.
2282
01:44:07,850 --> 01:44:10,660
It's not built into main because there is no main, per se, anymore.
2283
01:44:10,660 --> 01:44:12,590
So it's tucked away in that library.
2284
01:44:12,590 --> 01:44:14,330
And now, I can do something like this.
2285
01:44:14,330 --> 01:44:16,925
If the length of argv equals equals 2, well,
2286
01:44:16,925 --> 01:44:19,090
let's go ahead and print out something friendly,
2287
01:44:19,090 --> 01:44:24,955
like hello comma argv bracket 1, and then, close quotes.
2288
01:44:24,955 --> 01:44:28,360
Else, if the length of argv is not equal to 2,
2289
01:44:28,360 --> 01:44:30,400
let's just go ahead and print out hello, world.
2290
01:44:30,400 --> 01:44:32,525
Now, at a glance, this might look a little cryptic,
2291
01:44:32,525 --> 01:44:35,050
but it's identical to what we did a few weeks ago.
2292
01:44:35,050 --> 01:44:39,570
When I run this, python of greet.py, with no arguments,
2293
01:44:39,570 --> 01:44:40,950
it just says "hello, world."
2294
01:44:40,950 --> 01:44:46,180
But if I, instead, add a command line argument, like my first name and hit
2295
01:44:46,180 --> 01:44:49,825
Enter, now, the length of argv is no longer 1.
2296
01:44:49,825 --> 01:44:51,700
It's going to be 2.
2297
01:44:51,700 --> 01:44:54,680
And so, it prints out "Hello, David" instead.
2298
01:44:54,680 --> 01:44:57,880
So the takeaway here is that, whereas in C,
2299
01:44:57,880 --> 01:45:03,955
argv technically contained the name of your program, like ./hello or ./greet,
2300
01:45:03,955 --> 01:45:05,455
and then everything the human typed.
2301
01:45:05,455 --> 01:45:08,410
Python's a little different in that, because we're
2302
01:45:08,410 --> 01:45:10,150
using the interpreter in this way--
2303
01:45:10,150 --> 01:45:16,090
technically, when you run python of greet.py, the length of argv is only 1.
2304
01:45:16,090 --> 01:45:18,760
It contains only greet.py, so the name of the file.
2305
01:45:18,760 --> 01:45:21,670
It does not unnecessarily contain Python itself
2306
01:45:21,670 --> 01:45:24,460
because what's the point of that being there, omnipresently?
2307
01:45:24,460 --> 01:45:28,760
It does contain the number of words that the human typed after Python itself.
2308
01:45:28,760 --> 01:45:32,230
So argv is length 1 here. argv is length 2 here.
2309
01:45:32,230 --> 01:45:35,350
And that's why, when it did equal 2, I saw "Hello, David" instead
2310
01:45:35,350 --> 01:45:37,240
of the default "Hello, world."
2311
01:45:37,240 --> 01:45:41,440
So same ability to access command line arguments, add these kinds of inputs
2312
01:45:41,440 --> 01:45:43,570
to your functions, but you have to unlock it
2313
01:45:43,570 --> 01:45:47,830
by way of using argv instead, in this way.
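Roughly, greet.py looks like the following. The logic is wrapped in a hypothetical function called `greeting` so the two cases can be checked without rerunning the program. `sys.argv` itself is real and ships with Python.

```python
from sys import argv


def greeting(args):
    """Build the greeting from an argv-style list of words."""
    # args[0] is the script name, so one extra word means length 2.
    if len(args) == 2:
        return f"hello, {args[1]}"
    else:
        return "hello, world"


print(greeting(argv))
```

Running python greet.py gives a list of length 1, and python greet.py David gives a list of length 2, which is what the condition checks.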
2314
01:45:47,830 --> 01:45:51,910
If you want to see all of the words, you could do something like this.
2315
01:45:51,910 --> 01:45:57,760
Just as-- if we combine ideas, here-- for i in range of, how about, length
2316
01:45:57,760 --> 01:45:59,610
of argv.
2317
01:45:59,610 --> 01:46:02,260
Then, I can do this-- print argv bracket i.
2318
01:46:02,260 --> 01:46:02,860
All right.
2319
01:46:02,860 --> 01:46:06,385
A little cryptic, but line 3 is just a for loop iterating
2320
01:46:06,385 --> 01:46:08,410
over the range of length of argv.
2321
01:46:08,410 --> 01:46:12,640
So if the human types in two words, the length of argv will be 2.
2322
01:46:12,640 --> 01:46:16,885
So this is just a way of saying iterate over all of the words in argv,
2323
01:46:16,885 --> 01:46:18,380
printing them one at a time.
2324
01:46:18,380 --> 01:46:22,810
So python of greet.py, Enter just prints out the name of the program.
2325
01:46:22,810 --> 01:46:27,340
python of greet.py with David prints out greet.py and, then, David.
2326
01:46:27,340 --> 01:46:29,470
I can keep running it though with more words,
2327
01:46:29,470 --> 01:46:32,650
and they'll each get printed one at a time.
2328
01:46:32,650 --> 01:46:35,440
But what's nice, too, about Python--
2329
01:46:35,440 --> 01:46:38,920
and this is the point of this exercise-- honestly, this looks pretty cryptic.
2330
01:46:38,920 --> 01:46:40,720
This is not very pleasant to look at.
2331
01:46:40,720 --> 01:46:46,150
If you just want to iterate over every word in a list, which argv is,
2332
01:46:46,150 --> 01:46:47,680
watch what I can do.
2333
01:46:47,680 --> 01:46:52,090
I can do for arg or any variable name in argv.
2334
01:46:52,090 --> 01:46:54,147
Let me just, now, print out that argument.
2335
01:46:54,147 --> 01:46:56,980
I could keep calling it i, but i seems weird when it's not a number.
2336
01:46:56,980 --> 01:46:59,710
So I'm changing to arg as a word, instead.
2337
01:46:59,710 --> 01:47:03,970
If I now do python of greet.py, it does this.
2338
01:47:03,970 --> 01:47:06,460
If I do python of greet.py, David, it does that again.
2339
01:47:06,460 --> 01:47:08,690
David Malan, it does that again.
2340
01:47:08,690 --> 01:47:10,898
So this is, again, why Python is just very appealing.
2341
01:47:10,898 --> 01:47:13,482
You want to do something this many times, iterate over a list?
2342
01:47:13,482 --> 01:47:15,820
Just say it, and it reads a little more like English.
2343
01:47:15,820 --> 01:47:18,130
And there's even other fanciness, too, if I may.
2344
01:47:18,130 --> 01:47:21,820
It's a little stupid that I keep seeing the name of the program, greet.py,
2345
01:47:21,820 --> 01:47:24,640
so it'd be nice if I could remove that.
2346
01:47:24,640 --> 01:47:28,960
Python also supports what are called slices of arrays--
2347
01:47:28,960 --> 01:47:30,340
sorry, slices of lists.
2348
01:47:30,340 --> 01:47:32,050
Even I get the terminology confused.
2349
01:47:32,050 --> 01:47:36,400
If argv is a list, then it's going to print out everything in it.
2350
01:47:36,400 --> 01:47:41,950
But if I want a slice of it that starts at location 1 all the way to the end,
2351
01:47:41,950 --> 01:47:45,500
you can use this funky syntax in between the square brackets, which
2352
01:47:45,500 --> 01:47:48,700
we've not seen yet, that's going to start at item 1
2353
01:47:48,700 --> 01:47:50,220
and go all the way to the end.
2354
01:47:50,220 --> 01:47:53,830
And so, this is a nice, clever way of slicing off,
2355
01:47:53,830 --> 01:47:56,170
if you will, the very first element because now,
2356
01:47:56,170 --> 01:48:01,900
when I run greet.py, David Malan, I should only see David and Malan.
2357
01:48:01,900 --> 01:48:04,940
If I only want one element, I could do 1 to 2.
2358
01:48:04,940 --> 01:48:08,260
If I want all of them, I could do 0 onward.
2359
01:48:08,260 --> 01:48:10,900
I could give myself just one of them in this way.
2360
01:48:10,900 --> 01:48:14,380
So you can play with the start value and the end value in this way,
2361
01:48:14,380 --> 01:48:17,020
to slice and dice these lists in different ways.
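The slices mentioned here can be tried on a hard-coded list. This is only a sketch: the list below is a stand-in for what sys.argv would hold after running python greet.py David Malan.

```python
# Stand-in for sys.argv after: python greet.py David Malan
argv = ["greet.py", "David", "Malan"]

print(argv[1:])   # start at item 1, go to the end
print(argv[1:2])  # start at item 1, stop before item 2
print(argv[0:])   # start at item 0, go to the end (everything)
```

The end value is exclusive, which is why 1 to 2 yields a single element.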
2362
01:48:17,020 --> 01:48:20,620
That would have been a pain in C, just because we didn't really
2363
01:48:20,620 --> 01:48:26,840
have the built-in support for manipulating arrays as cleanly as this.
2364
01:48:26,840 --> 01:48:27,340
All right.
2365
01:48:27,340 --> 01:48:31,440
Just so you've seen it, too-- though, this one is less exciting to see live--
2366
01:48:31,440 --> 01:48:33,940
if I go ahead and create a quick program here, it turns out,
2367
01:48:33,940 --> 01:48:37,630
there's something else in the sys library, the ability to exit programs--
2368
01:48:37,630 --> 01:48:41,590
either exiting with status code 1 or 0, as we've been doing any time something
2369
01:48:41,590 --> 01:48:42,673
goes right or wrong.
2370
01:48:42,673 --> 01:48:45,340
So, for instance, let me whip up a quick program that just says,
2371
01:48:45,340 --> 01:48:52,300
if the length of sys.argv does not equal 2, then let's yell at the user
2372
01:48:52,300 --> 01:48:54,970
and say you're missing a command line argument.
2373
01:48:54,970 --> 01:48:57,380
Otherwise, command-line argument.
2374
01:48:57,380 --> 01:49:01,360
And let's, then, return sys.exit(1).
2375
01:49:01,360 --> 01:49:05,590
Else, let's go ahead and, logically, just say print a formatted string that
2376
01:49:05,590 --> 01:49:07,450
says hello-- as before--
2377
01:49:07,450 --> 01:49:09,640
sys.argv 1.
2378
01:49:09,640 --> 01:49:11,770
Now, things look different all of a sudden,
2379
01:49:11,770 --> 01:49:13,312
but I'm doing something deliberately.
2380
01:49:13,312 --> 01:49:14,870
First, let's see what this does.
2381
01:49:14,870 --> 01:49:18,730
So, on line 1, I'm importing not argv, specifically.
2382
01:49:18,730 --> 01:49:22,150
I'm importing the whole sys library, and we'll see why in a second.
2383
01:49:22,150 --> 01:49:27,220
Well, it turns out that the sys library has not only the argv list,
2384
01:49:27,220 --> 01:49:30,580
it also has a function called exit, which I'd like to be able to use,
2385
01:49:30,580 --> 01:49:31,370
as well.
2386
01:49:31,370 --> 01:49:35,200
So it turns out that, if you import a whole library in this way, that's fine.
2387
01:49:35,200 --> 01:49:37,840
But you have to refer to the things inside of it
2388
01:49:37,840 --> 01:49:42,980
by using that same library's name and a dot to namespace it, so to speak.
2389
01:49:42,980 --> 01:49:47,002
So here, I'm just saying, if the user does not type in two words,
2390
01:49:47,002 --> 01:49:49,960
yell at them with missing command line argument, and then, exit with 1.
2391
01:49:49,960 --> 01:49:52,975
Just like in C, when you do exit 1, just means something went wrong.
2392
01:49:52,975 --> 01:49:54,785
Otherwise, print out hello to this.
2393
01:49:54,785 --> 01:49:57,910
And this is starting to look cryptic, but it's just a combination of ideas.
2394
01:49:57,910 --> 01:50:02,080
The curly braces means interpolate this value, plug it in here.
2395
01:50:02,080 --> 01:50:05,740
sys.argv is just the verbose way of saying go into the sys library
2396
01:50:05,740 --> 01:50:09,010
and get the argv variable therein.
2397
01:50:09,010 --> 01:50:11,860
And bracket 1, of course, just like arrays in C,
2398
01:50:11,860 --> 01:50:15,440
is just the second element at the prompt.
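A sketch of the same logic follows, with one deliberate difference from what is typed in the lecture: the check is wrapped in a hypothetical function named main that returns the status code, so it can be tried on hard-coded lists without ending the program. The wording of the error message is also an assumption.

```python
def main(argv):
    # Expect exactly two words: the program's name and one argument
    if len(argv) != 2:
        print("Missing command-line argument")
        return 1  # non-zero status: something went wrong
    print(f"hello, {argv[1]}")
    return 0  # zero status: all is well

# Hard-coded lists stand in for the real sys.argv
print(main(["exit.py"]))
print(main(["exit.py", "David"]))

# In an actual script, after "import sys", the last line would instead be:
# sys.exit(main(sys.argv))
```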
2399
01:50:15,440 --> 01:50:18,700
So when I run this version, now-- python of exit.py--
2400
01:50:18,700 --> 01:50:21,340
with no arguments, I get yelled at in this way.
2401
01:50:21,340 --> 01:50:24,640
If, however, I type in two arguments total--
2402
01:50:24,640 --> 01:50:26,950
the name of the file and my own name--
2403
01:50:26,950 --> 01:50:29,050
now, I get greeted with hello, David.
2404
01:50:29,050 --> 01:50:30,310
And it's the same idea before.
2405
01:50:30,310 --> 01:50:33,160
This was a very low-level technique, but same thing here.
2406
01:50:33,160 --> 01:50:36,310
If you do echo dollar sign question mark Enter,
2407
01:50:36,310 --> 01:50:39,170
you'll see the exit code of your program.
2408
01:50:39,170 --> 01:50:41,270
So if I do this incorrectly again--
2409
01:50:41,270 --> 01:50:43,953
let me rerun it without my name, Enter--
2410
01:50:43,953 --> 01:50:44,620
I get yelled at.
2411
01:50:44,620 --> 01:50:47,320
But if I do echo dollar sign question mark,
2412
01:50:47,320 --> 01:50:50,170
there's the secret one that's returned.
2413
01:50:50,170 --> 01:50:54,160
Again, just to show you parity with C, in this case.
2414
01:50:54,160 --> 01:50:56,320
Questions, now, on any of these techniques, here?
2415
01:50:56,320 --> 01:50:58,900
2416
01:50:58,900 --> 01:50:59,400
No.
2417
01:50:59,400 --> 01:51:00,030
All right.
2418
01:51:00,030 --> 01:51:02,580
How about something that's a little more powerful, too?
2419
01:51:02,580 --> 01:51:05,880
We spend so much time in week 0 and 1 doing searching
2420
01:51:05,880 --> 01:51:07,830
and, then, eventually, sorting in week 3.
2421
01:51:07,830 --> 01:51:10,288
Well, it turns out, Python can help with some of this, too.
2422
01:51:10,288 --> 01:51:12,720
Let me go ahead and create a program called names.py
2423
01:51:12,720 --> 01:51:15,053
that's just going to be an opportunity to, maybe, search
2424
01:51:15,053 --> 01:51:16,650
over a whole bunch of names.
2425
01:51:16,650 --> 01:51:21,060
Let me go ahead and import sys, just so I have access to exit.
2426
01:51:21,060 --> 01:51:22,920
And let me go ahead and create a variable
2427
01:51:22,920 --> 01:51:26,756
called names that's going to be a list with a whole bunch of names.
2428
01:51:26,756 --> 01:51:27,660
How about here?
2429
01:51:27,660 --> 01:51:34,740
Charlie and Fred and George and Ginny and Percy and, lastly, Ron.
2430
01:51:34,740 --> 01:51:36,290
So a whole bunch of names here.
2431
01:51:36,290 --> 01:51:38,040
And it'd be a little annoying to implement
2432
01:51:38,040 --> 01:51:42,540
code that iterates over that, from left to right, in C, searching for one
2433
01:51:42,540 --> 01:51:43,165
of those names.
2434
01:51:43,165 --> 01:51:43,957
In fact, what name?
2435
01:51:43,957 --> 01:51:46,290
Well, let's go ahead and ask the user to input the name
2436
01:51:46,290 --> 01:51:48,498
that they want to search for so that we can tell them
2437
01:51:48,498 --> 01:51:50,460
if the name is there or not.
2438
01:51:50,460 --> 01:51:54,670
And we could do this, similar to C, in Python, doing something like this.
2439
01:51:54,670 --> 01:52:00,600
So for n in names, where n is just a variable to iterate over each name--
2440
01:52:00,600 --> 01:52:05,595
if the name I'm looking for equals the current name in the list--
2441
01:52:05,595 --> 01:52:09,060
AKA n-- well, let's print out something friendly, like "Found."
2442
01:52:09,060 --> 01:52:14,250
And then, let's do sys.exit 0 to indicate that we found whoever that is.
2443
01:52:14,250 --> 01:52:17,460
Otherwise, if we get all the way to the bottom here, outside of this loop,
2444
01:52:17,460 --> 01:52:20,340
let's just print "Not found" because we haven't exited yet.
2445
01:52:20,340 --> 01:52:22,800
And then, let's just exit with 1.
2446
01:52:22,800 --> 01:52:25,980
Just to be clear, I can continue importing all of sys,
2447
01:52:25,980 --> 01:52:31,920
or I could do from sys import exit, and then, I could get rid of sys dot
2448
01:52:31,920 --> 01:52:33,240
everywhere else.
2449
01:52:33,240 --> 01:52:36,540
But sometimes, it's helpful to know exactly where functions came from.
2450
01:52:36,540 --> 01:52:39,675
So this, too, is just a matter of style, in this case.
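Here is a sketch of that linear search. As assumptions for the sake of a self-contained example, the name to look for is passed to a hypothetical function called search rather than read with input, and the function returns 0 or 1 where the lecture's version calls sys.exit.

```python
names = ["Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]

def search(name):
    # Linear search: compare against each name, left to right
    for n in names:
        if name == n:
            print("Found")
            return 0
    # Only reached if the loop finished without a match
    print("Not found")
    return 1

search("Ron")
search("Hermione")
```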
2451
01:52:39,675 --> 01:52:40,230
All right.
2452
01:52:40,230 --> 01:52:41,522
So let's go ahead and run this.
2453
01:52:41,522 --> 01:52:46,540
python of names.py, and let's look for Ron, all the way at the end.
2454
01:52:46,540 --> 01:52:47,040
All right.
2455
01:52:47,040 --> 01:52:47,910
He's found.
2456
01:52:47,910 --> 01:52:51,570
And let's search for someone outside of the family here, like Hermione.
2457
01:52:51,570 --> 01:52:52,700
Not found.
2458
01:52:52,700 --> 01:52:53,200
OK.
2459
01:52:53,200 --> 01:52:54,783
So it seems to be working in this way.
2460
01:52:54,783 --> 01:52:58,548
But I've essentially implemented what algorithm?
2461
01:52:58,548 --> 01:53:05,247
What algorithm would this seem to be, per line 7 and 8 to 9 and 10?
2462
01:53:05,247 --> 01:53:05,955
AUDIENCE: Linear.
2463
01:53:05,955 --> 01:53:06,450
DAVID MALAN: Yeah.
2464
01:53:06,450 --> 01:53:07,350
So it's just linear search.
2465
01:53:07,350 --> 01:53:10,185
It's a loop, even though the syntax is a little more succinct today,
2466
01:53:10,185 --> 01:53:12,060
and it's just iterating over the whole thing.
2467
01:53:12,060 --> 01:53:15,240
Well, honestly, we've seen an even more terse way to do this in Python.
2468
01:53:15,240 --> 01:53:19,230
And this, again, is what makes it a more pleasant language, sometimes.
2469
01:53:19,230 --> 01:53:20,630
Why don't I just do this?
2470
01:53:20,630 --> 01:53:24,790
Instead of iterating one at a time, why don't I just say this?
2471
01:53:24,790 --> 01:53:27,840
Let me go ahead and change my condition to just
2472
01:53:27,840 --> 01:53:33,270
be-- how about if the name we're looking for is in the names list, we're done.
2473
01:53:33,270 --> 01:53:33,960
We found it.
2474
01:53:33,960 --> 01:53:36,570
Use the in preposition that we've seen a couple of times,
2475
01:53:36,570 --> 01:53:40,710
now, that itself asks the question, is something in something else?
2476
01:53:40,710 --> 01:53:44,050
And Python will take care of linear search for us.
2477
01:53:44,050 --> 01:53:46,080
And it's going to work exactly the same if I
2478
01:53:46,080 --> 01:53:48,030
do python of names.py, search for Ron.
2479
01:53:48,030 --> 01:53:50,077
It's still going to find him and it's still
2480
01:53:50,077 --> 01:53:51,660
going to do it linearly, in this case.
2481
01:53:51,660 --> 01:53:58,060
But I don't have to write all of the lower-level code myself, in this case.
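The shorter version might be sketched like this, again with a hypothetical search function and return values standing in for input and sys.exit.

```python
names = ["Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]

def search(name):
    # "in" asks whether name appears anywhere in the list;
    # for a list, Python still checks the elements one by one
    if name in names:
        print("Found")
        return 0
    print("Not found")
    return 1

search("Ron")
search("Hermione")
```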
2482
01:53:58,060 --> 01:54:02,430
Questions, now, on any of this?
2483
01:54:02,430 --> 01:54:05,380
The code's just getting shorter and shorter.
2484
01:54:05,380 --> 01:54:05,880
No?
2485
01:54:05,880 --> 01:54:07,740
What about-- let's see.
2486
01:54:07,740 --> 01:54:09,250
What else might we have here?
2487
01:54:09,250 --> 01:54:10,770
How about this?
2488
01:54:10,770 --> 01:54:12,780
Let's go ahead and implement that phonebook
2489
01:54:12,780 --> 01:54:15,690
that we started, metaphorically, with in the beginning of the course.
2490
01:54:15,690 --> 01:54:17,940
Let's code up a program called phonebook.py.
2491
01:54:17,940 --> 01:54:22,440
And in this case, let's go ahead and let's create a dictionary this time.
2492
01:54:22,440 --> 01:54:25,470
Recall that a dictionary is a little something that
2493
01:54:25,470 --> 01:54:27,060
implements something like this--
2494
01:54:27,060 --> 01:54:31,140
a two-column table that's got keys and values, words
2495
01:54:31,140 --> 01:54:33,240
and definitions, names and numbers.
2496
01:54:33,240 --> 01:54:36,367
And let's focus on the last of those, names and numbers, in this case.
2497
01:54:36,367 --> 01:54:38,700
Well, I claimed earlier that Python has built-in support
2498
01:54:38,700 --> 01:54:42,780
for dictionaries-- dict objects-- that you can create with one line.
2499
01:54:42,780 --> 01:54:45,120
I didn't need it for speller because a set is sufficient
2500
01:54:45,120 --> 01:54:47,610
when you only want one of the keys or the values, not both.
2501
01:54:47,610 --> 01:54:49,680
But now, I want some names and numbers.
2502
01:54:49,680 --> 01:54:53,220
So it turns out, in Python, you can create an empty dictionary
2503
01:54:53,220 --> 01:54:55,680
by saying dict open parenthesis, closed.
2504
01:54:55,680 --> 01:54:58,080
And that just gives you, essentially, a chart that
2505
01:54:58,080 --> 01:54:59,640
looks like this, with nothing in it.
2506
01:54:59,640 --> 01:55:01,725
Or there's more succinct syntax.
2507
01:55:01,725 --> 01:55:06,858
You can, alternatively, do this, with two curly braces, instead.
2508
01:55:06,858 --> 01:55:09,150
And, in fact, I've been using a shortcut all this time.
2509
01:55:09,150 --> 01:55:15,885
When I had a list, earlier, where my variable was called scores,
2510
01:55:15,885 --> 01:55:19,860
and I did this, that was actually the shorthand version of this--
2511
01:55:19,860 --> 01:55:21,637
hey, Python, give me an empty list.
2512
01:55:21,637 --> 01:55:23,970
So there's different syntax for achieving the same goal.
2513
01:55:23,970 --> 01:55:27,540
In this case, if I want a dictionary for people,
2514
01:55:27,540 --> 01:55:32,530
I can either do this or, more commonly, just two curly braces, like that.
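To make the two spellings concrete, here is a small sketch showing that each pair produces the same empty object. The variable names with suffixes are only for illustration.

```python
# Two ways to get an empty dictionary
people_long = dict()
people_short = {}

# Two ways to get an empty list
scores_long = list()
scores_short = []

print(people_long == people_short)  # True
print(scores_long == scores_short)  # True
```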
2515
01:55:32,530 --> 01:55:33,030
All right.
2516
01:55:33,030 --> 01:55:34,360
Well, what do I want to put in this?
2517
01:55:34,360 --> 01:55:36,360
Well, let me actually put some things in this.
2518
01:55:36,360 --> 01:55:39,360
And I'm going to just move my closed curly brace to a new line.
2519
01:55:39,360 --> 01:55:42,580
If I want to implement this idea of keys and values,
2520
01:55:42,580 --> 01:55:47,220
the way you do this in Python is key colon value comma.
2521
01:55:47,220 --> 01:55:48,230
Key colon value.
2522
01:55:48,230 --> 01:55:50,410
So you'd implement it more in code.
2523
01:55:50,410 --> 01:55:54,270
So, for instance, if I want Carter to be the first key in my phone book and I
2524
01:55:54,270 --> 01:56:00,135
want his number to be +1-617-495-1000, I can put that as the corresponding
2525
01:56:00,135 --> 01:56:00,960
value.
2526
01:56:00,960 --> 01:56:02,010
The colon is in between.
2527
01:56:02,010 --> 01:56:05,970
Both are strings, or strs, so I've quoted both deliberately.
2528
01:56:05,970 --> 01:56:07,762
If I want to add myself, I can put a comma.
2529
01:56:07,762 --> 01:56:10,970
And then, just to keep things pretty, I'm moving the cursor to the next line.
2530
01:56:10,970 --> 01:56:12,990
But that's not strictly required, aesthetically.
2531
01:56:12,990 --> 01:56:13,865
It's just good style.
2532
01:56:13,865 --> 01:56:19,500
And here, I might do +1-949-468-2750.
2533
01:56:19,500 --> 01:56:24,270
And now, I have a dictionary that, essentially, has two rows, here--
2534
01:56:24,270 --> 01:56:27,322
Carter and his number and David and his number, as well.
2535
01:56:27,322 --> 01:56:30,405
And if I kept adding to this, this chart would just get longer and longer.
2536
01:56:30,405 --> 01:56:32,430
Suppose I want to search for one of our numbers.
2537
01:56:32,430 --> 01:56:34,950
Well, let's prompt the user for the name,
2538
01:56:34,950 --> 01:56:37,470
for whose number you want to search by getting string.
2539
01:56:37,470 --> 01:56:38,560
Or you know what?
2540
01:56:38,560 --> 01:56:39,893
We don't need this CS50 library.
2541
01:56:39,893 --> 01:56:43,090
Let's just use input and prompt the user for a name.
2542
01:56:43,090 --> 01:56:49,230
And now, we can use this super terse syntax and just say if name in people,
2543
01:56:49,230 --> 01:56:53,700
print the formatted string number colon and--
2544
01:56:53,700 --> 01:56:57,160
here, we can do this-- people bracket name.
2545
01:56:57,160 --> 01:56:57,930
OK.
2546
01:56:57,930 --> 01:57:01,800
So this is getting cool quickly, confusingly.
2547
01:57:01,800 --> 01:57:02,805
So let me run this.
2548
01:57:02,805 --> 01:57:06,810
python of phonebook.py. Let's type in Carter.
2549
01:57:06,810 --> 01:57:08,910
And, indeed, I see his number.
2550
01:57:08,910 --> 01:57:12,910
Let's run it again with David, and I see my number here.
2551
01:57:12,910 --> 01:57:14,590
So what's going on?
2552
01:57:14,590 --> 01:57:19,320
Well, it turns out that a dictionary is very similar, in spirit, to a list.
2553
01:57:19,320 --> 01:57:22,350
It's actually very similar, in spirit, to an array in C.
2554
01:57:22,350 --> 01:57:27,150
But instead of being limited to keys that are numbers, like bracket 0,
2555
01:57:27,150 --> 01:57:30,690
bracket 1, bracket 2, you can actually use words.
2556
01:57:30,690 --> 01:57:33,060
And that's all I'm doing here on line 8.
2557
01:57:33,060 --> 01:57:36,765
If I want to check for the name Carter, which is currently
2558
01:57:36,765 --> 01:57:39,555
in this variable called name, I can index
2559
01:57:39,555 --> 01:57:42,660
into my people dictionary using not a number,
2560
01:57:42,660 --> 01:57:44,830
but using, literally, a string--
2561
01:57:44,830 --> 01:57:48,000
the name Carter or David or anything else.
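A sketch of the phone book follows. As an assumption to keep it runnable without typing anything, the lookup lives in a hypothetical function called lookup, where the lecture's version prompts with input and prints inside an if.

```python
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}

def lookup(name):
    # "in" checks the dictionary's keys
    if name in people:
        # Index into the dictionary with a string instead of a number
        return people[name]
    return None

print(f"Number: {lookup('Carter')}")
print(f"Number: {lookup('David')}")
```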
2562
01:57:48,000 --> 01:57:50,640
To make this clearer, too, notice that I'm, at the moment,
2563
01:57:50,640 --> 01:57:54,095
using this format string, which is adding some undue complexity.
2564
01:57:54,095 --> 01:57:56,220
But I could clarify this, perhaps, further as this.
2565
01:57:56,220 --> 01:57:58,080
I could give myself another variable called
2566
01:57:58,080 --> 01:58:01,320
number, set it equal to the people dictionary,
2567
01:58:01,320 --> 01:58:03,875
indexing into it using the current name.
2568
01:58:03,875 --> 01:58:07,230
And now, I can shorten this to make it clearer that all I'm doing
2569
01:58:07,230 --> 01:58:09,910
is printing the value of that.
2570
01:58:09,910 --> 01:58:12,930
And, in fact, I can do this even more cryptically.
2571
01:58:12,930 --> 01:58:16,710
This would be weird to do, but if I only ever want to show David's phone number
2572
01:58:16,710 --> 01:58:21,150
and never Carter's, I can literally, quote unquote, "index into" the people
2573
01:58:21,150 --> 01:58:24,930
dictionary because, now, when I run this, even if I type Carter,
2574
01:58:24,930 --> 01:58:27,020
I'm going to get back my number instead.
2575
01:58:27,020 --> 01:58:31,080
But that's all that's happening if I undo that, because that's now a bug.
2576
01:58:31,080 --> 01:58:35,250
But I index into it using the value of name.
2577
01:58:35,250 --> 01:58:37,230
Dictionaries are just so wonderfully convenient
2578
01:58:37,230 --> 01:58:39,688
because, now, you can associate anything with anything else
2579
01:58:39,688 --> 01:58:43,420
but not using numbers, but entire key words, instead.
2580
01:58:43,420 --> 01:58:46,770
So here's how, if, in speller, we gave you not just words,
2581
01:58:46,770 --> 01:58:50,340
but hundreds of thousands of definitions, as well,
2582
01:58:50,340 --> 01:58:52,385
you could essentially store them as this.
2583
01:58:52,385 --> 01:58:55,680
And then, when the human wants to look up a definition in a proper dictionary,
2584
01:58:55,680 --> 01:58:57,750
not just for spell checking, you could index
2585
01:58:57,750 --> 01:59:00,290
into the dictionary using square brackets
2586
01:59:00,290 --> 01:59:04,240
and get back the definition in English, as well.
2587
01:59:04,240 --> 01:59:06,770
Questions on this?
2588
01:59:06,770 --> 01:59:07,280
Yeah?
2589
01:59:07,280 --> 01:59:09,760
AUDIENCE: Is the way this code does, as presented,
2590
01:59:09,760 --> 01:59:11,744
saying that Python has [INAUDIBLE]?
2591
01:59:11,744 --> 01:59:21,390
2592
01:59:21,390 --> 01:59:22,890
DAVID MALAN: A really good question.
2593
01:59:22,890 --> 01:59:27,330
So, to summarize, how is Python finding that name within that dictionary?
2594
01:59:27,330 --> 01:59:31,110
This is where, honestly, speller in p-set 5 is what Python's all about.
2595
01:59:31,110 --> 01:59:34,215
So you have struggled, are struggling with implementing your own spell
2596
01:59:34,215 --> 01:59:36,090
checker and implementing your own hash table.
2597
01:59:36,090 --> 01:59:39,210
And recall that, per last week, the goal of a hash table is to,
2598
01:59:39,210 --> 01:59:41,190
ideally, get constant time access.
2599
01:59:41,190 --> 01:59:45,435
Not something linear, which is slow and even better than something logarithmic,
2600
01:59:45,435 --> 01:59:47,400
like log base 2 of n.
2601
01:59:47,400 --> 01:59:50,130
So Python and the really smart people who invented it,
2602
01:59:50,130 --> 01:59:53,310
they have written the code that does its best to give you
2603
01:59:53,310 --> 01:59:55,853
constant time searches of dictionaries.
2604
01:59:55,853 --> 01:59:58,020
And they're not always going to succeed, just as you
2605
01:59:58,020 --> 01:59:59,430
and your own problem set are probably going
2606
01:59:59,430 --> 02:00:01,805
to have some collisions once in a while and start to have
2607
02:00:01,805 --> 02:00:03,440
chains of length lists of words.
2608
02:00:03,440 --> 02:00:05,940
But this is where, again, you defer to someone else, someone
2609
02:00:05,940 --> 02:00:07,800
smarter than you, someone with more time than you
2610
02:00:07,800 --> 02:00:09,270
to solve these problems for you.
2611
02:00:09,270 --> 02:00:11,490
And if you read Python's documentation, you'll
2612
02:00:11,490 --> 02:00:13,650
see that it doesn't guarantee constant time,
2613
02:00:13,650 --> 02:00:15,990
but it's going to, ideally, optimize the data structure
2614
02:00:15,990 --> 02:00:19,320
for you to get as fast as possible.
2615
02:00:19,320 --> 02:00:22,690
And of all of the data structures like a dictionary,
2616
02:00:22,690 --> 02:00:25,380
a hash table is, really, like the Swiss army knife of computing
2617
02:00:25,380 --> 02:00:28,260
because it just lets you associate something with something else.
2618
02:00:28,260 --> 02:00:30,510
And even though we keep focusing on names and numbers,
2619
02:00:30,510 --> 02:00:32,400
that's a really powerful thing because it's
2620
02:00:32,400 --> 02:00:34,230
more powerful than lists and arrays, which
2621
02:00:34,230 --> 02:00:35,910
are only numbers and something else.
2622
02:00:35,910 --> 02:00:38,690
Now, you can have any sorts of relationships, instead.
2623
02:00:38,690 --> 02:00:39,270
All right.
2624
02:00:39,270 --> 02:00:41,178
Let me show a few other examples before we
2625
02:00:41,178 --> 02:00:43,470
culminate with some more powerful techniques in Python,
2626
02:00:43,470 --> 02:00:45,000
thanks to libraries.
2627
02:00:45,000 --> 02:00:49,480
How about this problem we encountered in week 4, which was this.
2628
02:00:49,480 --> 02:00:54,120
Let me code up a program called, again, compare.py here but, this time,
2629
02:00:54,120 --> 02:00:56,770
compare two strings and not numbers.
2630
02:00:56,770 --> 02:01:01,230
So let me, for instance, get one string from the user called s.
2631
02:01:01,230 --> 02:01:04,890
Just for the sake of discussion, let me get another string from the user
2632
02:01:04,890 --> 02:01:07,830
called t so that we can actually do some comparison here.
2633
02:01:07,830 --> 02:01:12,780
And if s equals equals t, let's go ahead and print out that they're the same.
2634
02:01:12,780 --> 02:01:15,640
Else, let's go ahead and print out that they're different.
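As a sketch of that comparison, with hard-coded strings in place of the two calls to input and a hypothetical function named compare:

```python
def compare(s, t):
    # == compares the characters of the two strings
    if s == t:
        return "Same"
    else:
        return "Different"

print(compare("cat", "cat"))
print(compare("dog", "dog"))
print(compare("cat", "dog"))
print(compare("David", "David"))
```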
2635
02:01:15,640 --> 02:01:17,910
So this is very similar to what we did in week 4.
2636
02:01:17,910 --> 02:01:20,580
But in week 4, recall we did this specifically
2637
02:01:20,580 --> 02:01:23,800
because we had encountered a problem.
2638
02:01:23,800 --> 02:01:28,680
For instance, if I run-- whoops.
2639
02:01:28,680 --> 02:01:34,970
If I run-- what's going on?
2640
02:01:34,970 --> 02:01:40,396
[INAUDIBLE] Come on.
2641
02:01:40,396 --> 02:01:41,390
Oh.
2642
02:01:41,390 --> 02:01:41,890
OK.
2643
02:01:41,890 --> 02:01:43,240
Wow, OK.
2644
02:01:43,240 --> 02:01:43,840
Long day.
2645
02:01:43,840 --> 02:01:44,380
All right.
2646
02:01:44,380 --> 02:01:48,670
If I run the proper command, python of compare.py, then let's go ahead
2647
02:01:48,670 --> 02:01:53,785
and type in something like "cat" in all lowercase, "cat" in all lowercase.
2648
02:01:53,785 --> 02:01:56,110
And they're the same.
2649
02:01:56,110 --> 02:01:59,565
If, though, I do this again with "dog" and "dog," they're the same.
2650
02:01:59,565 --> 02:02:01,690
And, of course, "cat" and "dog," they're different.
2651
02:02:01,690 --> 02:02:06,430
But does anyone recall, from two weeks ago, when I typed in my name twice,
2652
02:02:06,430 --> 02:02:08,680
both identically capitalized.
2653
02:02:08,680 --> 02:02:10,360
What did it say?
2654
02:02:10,360 --> 02:02:13,390
That they were, in fact, different.
2655
02:02:13,390 --> 02:02:14,110
And why was that?
2656
02:02:14,110 --> 02:02:16,660
Why were two strings in C different, even though I typed literally
2657
02:02:16,660 --> 02:02:17,410
the same thing?
2658
02:02:17,410 --> 02:02:20,040
2659
02:02:20,040 --> 02:02:21,540
Two different places in memory.
2660
02:02:21,540 --> 02:02:24,560
So each string might look the same, aesthetically, but, of course,
2661
02:02:24,560 --> 02:02:25,852
was stored elsewhere in memory.
2662
02:02:25,852 --> 02:02:29,970
And yet, Python appears to be using the equality operator--
2663
02:02:29,970 --> 02:02:33,510
equals equals-- like you and I would expect, as humans-- actually
2664
02:02:33,510 --> 02:02:38,510
comparing for us char by char in each of those strings for actual equality.
2665
02:02:38,510 --> 02:02:41,610
So this is a feature of Python, in that it's just easier to do.
2666
02:02:41,610 --> 02:02:42,210
And why?
2667
02:02:42,210 --> 02:02:44,627
Well, this derives from the reality that, in Python, there
2668
02:02:44,627 --> 02:02:45,630
are no pointers anymore.
2669
02:02:45,630 --> 02:02:47,297
There's no underlying memory management.
2670
02:02:47,297 --> 02:02:50,400
It's not up to you, now, to worry about those lower-level details.
2671
02:02:50,400 --> 02:02:52,960
The language itself takes care of that for you.
2672
02:02:52,960 --> 02:02:55,050
And so, similarly, if I do this and don't
2673
02:02:55,050 --> 02:02:57,510
ask the user for two strings, but just one,
2674
02:02:57,510 --> 02:02:59,370
and then, I do something like this.
2675
02:02:59,370 --> 02:03:05,550
How about give myself a second variable t, set it equal to s.capitalize, which,
2676
02:03:05,550 --> 02:03:08,040
note, is not the same as upper; capitalize, by design,
2677
02:03:08,040 --> 02:03:12,270
per Python's documentation, will only capitalize the first letter for you--
2678
02:03:12,270 --> 02:03:15,240
I can now print out, say, two f-strings here--
2679
02:03:15,240 --> 02:03:18,240
what the value of s is and, then, let me print out,
2680
02:03:18,240 --> 02:03:20,340
with another f-string, what the value of t is.
2681
02:03:20,340 --> 02:03:22,995
And recall that, in C, this was a problem
2682
02:03:22,995 --> 02:03:26,820
because if you capitalize s and store it in t,
2683
02:03:26,820 --> 02:03:29,670
we accidentally capitalized both s and t.
2684
02:03:29,670 --> 02:03:33,510
But in this case, in Python, when I actually run this and type in "cat"
2685
02:03:33,510 --> 02:03:37,770
in all lowercase, the original s is unchanged
2686
02:03:37,770 --> 02:03:42,780
because, when I use capitalize on line 3, this is, indeed, capitalizing s.
2687
02:03:42,780 --> 02:03:47,550
But it's returning a copy of the result. It cannot change s itself
2688
02:03:47,550 --> 02:03:50,385
because, again, for that technical term, s is immutable.
2689
02:03:50,385 --> 02:03:53,265
Strings, once they exist, cannot be changed themselves.
2690
02:03:53,265 --> 02:03:58,590
But you can return copies and modify mutated copies of those same strings.
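A short sketch of that behavior, with "cat" hard-coded where the lecture reads a string from the user:

```python
s = "cat"

# capitalize returns a new string; s itself is not modified
t = s.capitalize()

print(f"s: {s}")
print(f"t: {t}")
```

One detail worth knowing: besides capitalizing the first letter, capitalize also lowercases the remaining letters, so "cAT" would come back as "Cat".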
2691
02:03:58,590 --> 02:04:02,040
So, in short, all of those headaches we encountered in week 4
2692
02:04:02,040 --> 02:04:05,070
are now solved, really, in the way you might expect.
2693
02:04:05,070 --> 02:04:07,500
And here's another one that we dwelled on in week 4,
2694
02:04:07,500 --> 02:04:09,660
with the colored liquid in glasses.
2695
02:04:09,660 --> 02:04:12,150
Let me code up a program called swap.py.
2696
02:04:12,150 --> 02:04:16,690
And in swap.py, let me set x equal to 1, y equal to 2.
2697
02:04:16,690 --> 02:04:18,690
And then, let me just print out an f-string here.
2698
02:04:18,690 --> 02:04:24,360
So how about x is this comma y is that.
2699
02:04:24,360 --> 02:04:27,735
And then, let me do that twice, just for the sake of demonstration.
2700
02:04:27,735 --> 02:04:31,005
And in here, recall that we had to create a swap function.
2701
02:04:31,005 --> 02:04:33,630
But then, we had to pass it in by reference with the ampersand.
2702
02:04:33,630 --> 02:04:38,460
And oh my god, that was peak complexity in C. Well,
2703
02:04:38,460 --> 02:04:41,100
if you want to swap x and y in Python, you
2704
02:04:41,100 --> 02:04:43,830
could do x comma y equals y comma x.
2705
02:04:43,830 --> 02:04:49,020
And now, python of swap.py.
2706
02:04:49,020 --> 02:04:50,130
And there we go.
2707
02:04:50,130 --> 02:04:51,840
All of that's handled for you.
2708
02:04:51,840 --> 02:04:56,350
It's like a shell game without even a temporary variable in mind.
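The whole of swap.py, as described, might be sketched like this:

```python
x = 1
y = 2

print(f"x is {x}, y is {y}")

# Swap the two values in one line, with no temporary variable
x, y = y, x

print(f"x is {x}, y is {y}")
```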
2709
02:04:56,350 --> 02:04:58,290
So what more can we do here?
2710
02:04:58,290 --> 02:05:00,870
How about a few final building blocks?
2711
02:05:00,870 --> 02:05:03,330
And these relate, now, to files from that week 4.
2712
02:05:03,330 --> 02:05:07,710
Suppose that I want to save some names and numbers in a CSV file--
2713
02:05:07,710 --> 02:05:11,080
Comma Separated Values, which is like a very lightweight spreadsheet.
2714
02:05:11,080 --> 02:05:15,300
Well, first, let me create a phonebook.csv file
2715
02:05:15,300 --> 02:05:19,458
that just has name comma number as the first row there.
2716
02:05:19,458 --> 02:05:21,750
But after that, I'm going to go ahead, now, and code up
2717
02:05:21,750 --> 02:05:25,170
a phonebook.py program that actually allows
2718
02:05:25,170 --> 02:05:27,040
me to add things to this phonebook.
2719
02:05:27,040 --> 02:05:31,020
So let me split my screen here so that we can see the old and the new.
2720
02:05:31,020 --> 02:05:34,050
And down here, in my code for phonebook.py,
2721
02:05:34,050 --> 02:05:36,360
in this new and improved version, I'm going
2722
02:05:36,360 --> 02:05:40,020
to actually import a whole other library, this one called csv.
2723
02:05:40,020 --> 02:05:42,885
And here, too, especially for people in data science and the like,
2724
02:05:42,885 --> 02:05:46,500
really like being able to manipulate files and data that might very well be
2725
02:05:46,500 --> 02:05:48,060
stored in spreadsheets or CSVs--
2726
02:05:48,060 --> 02:05:51,510
Comma Separated Values, which we saw briefly in week 4.
2727
02:05:51,510 --> 02:05:53,670
In phonebook.py, then, it suffices to just
2728
02:05:53,670 --> 02:05:57,348
import csv after reading the documentation therefor
2729
02:05:57,348 --> 02:05:59,265
because this is going to give me functionality
2730
02:05:59,265 --> 02:06:02,150
in code related to CSV files.
2731
02:06:02,150 --> 02:06:04,950
So here's how I might open a file in Python.
2732
02:06:04,950 --> 02:06:08,340
I literally call open-- it's not fopen now; it's just open--
2733
02:06:08,340 --> 02:06:10,860
and I open this file called phonebook.csv.
2734
02:06:10,860 --> 02:06:13,470
And just as in C, I'm going to open it in append mode--
2735
02:06:13,470 --> 02:06:15,930
not write, where it would change the whole thing.
2736
02:06:15,930 --> 02:06:18,660
I want to append a new line at a time.
2737
02:06:18,660 --> 02:06:21,750
After this, I want to get, maybe, a name from the user.
2738
02:06:21,750 --> 02:06:25,350
So let's prompt the user for some input for their name.
2739
02:06:25,350 --> 02:06:27,255
And then, let's prompt the user for a number,
2740
02:06:27,255 --> 02:06:31,060
as well, using input prompting for number.
2741
02:06:31,060 --> 02:06:31,560
All right.
2742
02:06:31,560 --> 02:06:33,602
And now, this is a little cryptic, and you'd only
2743
02:06:33,602 --> 02:06:35,050
know this from the documentation.
2744
02:06:35,050 --> 02:06:38,370
But if you want to write rows to a CSV file
2745
02:06:38,370 --> 02:06:41,850
that you can, then, view in Excel or the like, you can do this--
2746
02:06:41,850 --> 02:06:45,060
give me a variable called writer-- but I could call it anything I want.
2747
02:06:45,060 --> 02:06:50,760
Let me use a csv.writer function that comes with this CSV library,
2748
02:06:50,760 --> 02:06:51,885
passing in the file.
2749
02:06:51,885 --> 02:06:56,070
This is like saying, hey, Python, treat this open file as a CSV file
2750
02:06:56,070 --> 02:06:59,340
so that things are separated with commas and nicely formatted
2751
02:06:59,340 --> 02:07:00,515
in rows and columns.
2752
02:07:00,515 --> 02:07:02,100
Now, I'm going to do this--
2753
02:07:02,100 --> 02:07:04,030
use that writer to write a row.
2754
02:07:04,030 --> 02:07:05,280
Well, what do I want to write?
2755
02:07:05,280 --> 02:07:07,380
I want to write a short list--
2756
02:07:07,380 --> 02:07:10,200
namely, the current name and the current number--
2757
02:07:10,200 --> 02:07:14,790
to that file, but I don't want to use fprintf and %s and all of that stuff
2758
02:07:14,790 --> 02:07:16,440
that we might have had in the past.
2759
02:07:16,440 --> 02:07:19,030
And now, I just want to close the file.
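A minimal sketch of the program described so far, wrapped in a function so it can be reused. The function name `add_entry` and the `newline=""` argument are assumptions not stated in the lecture; the latter is what the `csv` documentation recommends for files handed to `csv.writer`.

```python
import csv


def add_entry(path, name, number):
    """Append one name/number row to the CSV file at path."""
    # "a" opens the file in append mode, so existing rows are kept
    file = open(path, "a", newline="")
    # Treat the open file as a CSV file
    writer = csv.writer(file)
    # Write a short list: the current name and the current number
    writer.writerow([name, number])
    file.close()
```

In the lecture the two values come from the user, for example `add_entry("phonebook.csv", input("Name: "), input("Number: "))`, where the prompt strings are illustrative.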
2760
02:07:19,030 --> 02:07:20,410
Let me reopen my terminal.
2761
02:07:20,410 --> 02:07:26,102
Let me run python of phonebook.py, and let me type in David and then
2762
02:07:26,102 --> 02:07:30,190
+1-949-468-2750 and, crossing my fingers,
2763
02:07:30,190 --> 02:07:33,430
watching the actual CSV at top-left.
2764
02:07:33,430 --> 02:07:35,737
My code has just added me to the file.
2765
02:07:35,737 --> 02:07:37,570
And if I were to run it again, for instance,
2766
02:07:37,570 --> 02:07:41,770
with Carter and +1-617-495-1000, crossing my fingers again--
2767
02:07:41,770 --> 02:07:42,820
we've updated the file.
2768
02:07:42,820 --> 02:07:46,150
And it turns out, there's code now, via which I can even read that file.
2769
02:07:46,150 --> 02:07:48,850
But I can, first, tighten this up, just so you've seen it.
2770
02:07:48,850 --> 02:07:52,720
It turns out, in Python, it's so common to open files and close them.
2771
02:07:52,720 --> 02:07:54,610
Humans make mistakes, and they often forget
2772
02:07:54,610 --> 02:07:58,477
to close files, which might, then, end up using more memory than you intend.
2773
02:07:58,477 --> 02:08:00,310
So you can, alternatively, do this in Python
2774
02:08:00,310 --> 02:08:03,310
so that you don't have to worry about closing files.
2775
02:08:03,310 --> 02:08:05,920
You can use this keyword instead.
2776
02:08:05,920 --> 02:08:09,100
You can say with the opening of this file
2777
02:08:09,100 --> 02:08:13,420
as a variable called file do all of the following underneath.
2778
02:08:13,420 --> 02:08:15,470
So I'm indenting most of my code.
2779
02:08:15,470 --> 02:08:18,430
I'm using this new, Python-specific keyword called with.
2780
02:08:18,430 --> 02:08:22,330
And this is just a matter of saying, with the following opening of the file,
2781
02:08:22,330 --> 02:08:26,120
do those next four lines of code, and then, automatically close it for me
2782
02:08:26,120 --> 02:08:27,370
at the end of the indentation.
2783
02:08:27,370 --> 02:08:31,480
It's a minor optimization, but this, again, is the pythonic way
2784
02:08:31,480 --> 02:08:33,250
to do things, instead.
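The same idea using `with`, as a hedged sketch. The function name `add_entry` and `newline=""` are assumptions; the lecture's version prompts for the name and number inside the indented block.

```python
import csv


def add_entry(path, name, number):
    """Append one row; the file is closed automatically."""
    # The file is closed when the indented block ends,
    # even if writerow raises an exception.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
```

There is no explicit call to `close` here; leaving the indented block does that work.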
2785
02:08:33,250 --> 02:08:34,720
How else might I do this, too?
2786
02:08:34,720 --> 02:08:38,860
Well, it turns out that the code I've written here-- on line 9,
2787
02:08:38,860 --> 02:08:40,630
especially-- is a little fragile.
2788
02:08:40,630 --> 02:08:44,350
If any human opens this spreadsheet-- the CSV file in Excel,
2789
02:08:44,350 --> 02:08:46,000
Google Spreadsheets, Apple Numbers--
2790
02:08:46,000 --> 02:08:49,390
and maybe moves the columns around just because, maybe, they're fussing.
2791
02:08:49,390 --> 02:08:52,790
They saved it, and they don't realize they've, now, changed my assumptions.
2792
02:08:52,790 --> 02:08:55,120
I don't want to, necessarily, write name and number
2793
02:08:55,120 --> 02:08:58,360
always in that order because what if someone screws up and flips those two
2794
02:08:58,360 --> 02:09:01,040
columns by literally dragging and dropping?
2795
02:09:01,040 --> 02:09:03,640
So it turns out that, instead of using a list here,
2796
02:09:03,640 --> 02:09:06,890
we can use another feature of this library, as follows.
2797
02:09:06,890 --> 02:09:09,520
Instead of using a writer, there's something
2798
02:09:09,520 --> 02:09:11,530
called a dictionary writer or dict writer
2799
02:09:11,530 --> 02:09:14,140
that takes the same argument as input--
2800
02:09:14,140 --> 02:09:15,580
the file that's opened.
2801
02:09:15,580 --> 02:09:18,070
But now, the one difference here is that you
2802
02:09:18,070 --> 02:09:25,030
need to tell this dictionary writer that your field names are name and number.
2803
02:09:25,030 --> 02:09:27,370
And let me close the CSV here.
2804
02:09:27,370 --> 02:09:32,140
Name and number are the names of the fields, the columns in this CSV file.
2805
02:09:32,140 --> 02:09:34,450
And when it comes time to write a new row,
2806
02:09:34,450 --> 02:09:37,750
the syntax here is going to be a little uglier, but it's just a dictionary.
2807
02:09:37,750 --> 02:09:40,120
The name I want to write to the dictionary
2808
02:09:40,120 --> 02:09:42,310
is going to be whatever name the human typed in.
2809
02:09:42,310 --> 02:09:45,790
The number that I want to write to the CSV file
2810
02:09:45,790 --> 02:09:48,550
is going to be whatever the number the human typed in.
2811
02:09:48,550 --> 02:09:51,010
But what's different, now, about this code is,
2812
02:09:51,010 --> 02:09:55,960
by simply using a dictionary writer here instead of the generic writer,
2813
02:09:55,960 --> 02:10:00,640
now, the columns can be in this order or this order or any order.
2814
02:10:00,640 --> 02:10:03,010
And the dictionary writer is going to figure out,
2815
02:10:03,010 --> 02:10:06,557
based on the first line of text in that CSV, where to put name,
2816
02:10:06,557 --> 02:10:07,390
where to put number.
2817
02:10:07,390 --> 02:10:08,883
So if you flip them, no big deal.
2818
02:10:08,883 --> 02:10:11,050
It's going to notice, oh, wait, the columns changed.
2819
02:10:11,050 --> 02:10:14,330
And it's going to insert the columns correctly.
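A sketch of the dictionary-writer version. One caveat: `csv.DictWriter` arranges each row according to the `fieldnames` list it is given, so to follow a file whose columns were reordered, this sketch first reads the header row and passes that along. The `read_header` helper and the function names are assumptions, not part of the lecture's code.

```python
import csv


def read_header(path):
    """Return the column names from the first line of the CSV file."""
    with open(path, newline="") as file:
        return next(csv.reader(file))


def add_entry(path, name, number):
    """Append a row, placing each value under its matching column."""
    # Use the file's current column order, whatever it is
    fieldnames = read_header(path)
    with open(path, "a", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        # Keys are column names; DictWriter puts each value in the right place
        writer.writerow({"name": name, "number": number})
```

With a hard-coded `fieldnames=["name", "number"]`, as typed in the lecture, rows are always written in that order regardless of the file's header.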
2820
02:10:14,330 --> 02:10:18,970
So just, again, another more powerful feature that lets you
2821
02:10:18,970 --> 02:10:22,750
focus on real work, as opposed to actually getting
2822
02:10:22,750 --> 02:10:27,250
tied up in the weeds of writing code like this, otherwise.
2823
02:10:27,250 --> 02:10:30,440
Questions on this one, as well?
2824
02:10:30,440 --> 02:10:33,520
But what we will do, now, is come full circle
2825
02:10:33,520 --> 02:10:37,180
to some of the more sophisticated examples with which we began,
2826
02:10:37,180 --> 02:10:40,855
and I'm going to go back over to my own Mac laptop
2827
02:10:40,855 --> 02:10:43,743
here, where I've got my own terminal window up and running,
2828
02:10:43,743 --> 02:10:46,285
and I was just going to introduce a couple of final libraries
2829
02:10:46,285 --> 02:10:49,788
that really speak to just how powerful Python can be
2830
02:10:49,788 --> 02:10:51,580
and how quickly you can get up and running.
2831
02:10:51,580 --> 02:10:54,330
To be fair, can't necessarily do all of these things in the cloud,
2832
02:10:54,330 --> 02:10:57,337
like in Codespaces, because you need access to your own speakers
2833
02:10:57,337 --> 02:10:58,420
or microphone or the like.
2834
02:10:58,420 --> 02:11:01,090
So that's why I'm doing it on my own Mac, here.
2835
02:11:01,090 --> 02:11:05,680
But let me go ahead and open up a program called speech.py.
2836
02:11:05,680 --> 02:11:07,300
And I'm not using VS Code here.
2837
02:11:07,300 --> 02:11:10,150
I'm using a program called VI that's entirely terminal window based.
2838
02:11:10,150 --> 02:11:13,105
But it's going to allow me, for instance, to import the Python
2839
02:11:13,105 --> 02:11:16,120
text to speech version 3 library.
2840
02:11:16,120 --> 02:11:18,790
I'm going to give myself a variable called engine that's
2841
02:11:18,790 --> 02:11:21,610
going to be set equal to the Python text to speech
2842
02:11:21,610 --> 02:11:26,350
3 library's init method, which is just going to initialize this library that
2843
02:11:26,350 --> 02:11:28,090
relates to text to speech.
2844
02:11:28,090 --> 02:11:32,410
I'm going to, then, use the engine's say function to say something
2845
02:11:32,410 --> 02:11:35,260
like, how about, hello comma world.
2846
02:11:35,260 --> 02:11:39,850
And then, as my last line, I'm going to say engine.runAndWait, capitalized
2847
02:11:39,850 --> 02:11:44,690
as such, to tell my program, now, to run that speech and wait until it's done.
2848
02:11:44,690 --> 02:11:45,190
All right.
2849
02:11:45,190 --> 02:11:46,540
I'm going to save this file.
2850
02:11:46,540 --> 02:11:49,110
I'm going to run python of speech.py.
2851
02:11:49,110 --> 02:11:52,357
And I'm going to cross my fingers, as always, and--
2852
02:11:52,357 --> 02:11:53,440
INTERPRETER: Hello, world.
2853
02:11:53,440 --> 02:11:54,398
DAVID MALAN: All right.
2854
02:11:54,398 --> 02:11:57,130
So now, I have a program that's actually synthesizing speech
2855
02:11:57,130 --> 02:11:58,570
using a library like this.
2856
02:11:58,570 --> 02:12:01,285
How can I, now, modify this to be a little more interesting?
2857
02:12:01,285 --> 02:12:02,690
Well, how about this?
2858
02:12:02,690 --> 02:12:05,050
Let me go ahead and prompt the user for their name,
2859
02:12:05,050 --> 02:12:08,680
like we've done several times here, using Python's built-in input function.
2860
02:12:08,680 --> 02:12:11,665
And now, let me go ahead and use a format string in conjunction
2861
02:12:11,665 --> 02:12:14,980
with this library, interpolating the value of name there.
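A rough sketch of speech.py as described, assuming the third-party `pyttsx3` package is installed and the machine has working audio output. The helper names `greeting` and `speak` are illustrative; the lecture writes these lines at the top level of the file.

```python
def greeting(name):
    """Build the phrase to speak, interpolating name with a format string."""
    return f"hello, {name}"


def speak(text):
    """Say text aloud through the default text-to-speech engine."""
    # Imported here so greeting() still works where pyttsx3 is not installed
    import pyttsx3

    engine = pyttsx3.init()  # initialize the text-to-speech library
    engine.say(text)         # queue the phrase
    engine.runAndWait()      # run the speech and wait until it's done
```

Calling `speak(greeting(input("Name: ")))` would then prompt for a name and say it; the prompt string is an assumption.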
2862
02:12:14,980 --> 02:12:18,460
And-- at least, if my name is somewhat phonetically pronounceable--
2863
02:12:18,460 --> 02:12:23,587
let's go ahead and run python of speech.py, type in my name, and--
2864
02:12:23,587 --> 02:12:24,670
INTERPRETER: Hello, David.
2865
02:12:24,670 --> 02:12:25,445
DAVID MALAN: OK.
2866
02:12:25,445 --> 02:12:27,640
It's a weird choice of inflection, but we're
2867
02:12:27,640 --> 02:12:30,475
starting to synthesize voice, not unlike Siri or Google Assistant
2868
02:12:30,475 --> 02:12:32,050
or Alexa or the like.
2869
02:12:32,050 --> 02:12:36,130
Now, we can, maybe, do something a little more advanced, too.
2870
02:12:36,130 --> 02:12:39,310
In addition to synthesizing speech in this way,
2871
02:12:39,310 --> 02:12:43,270
we could synthesize, for instance, an actual graphic.
2872
02:12:43,270 --> 02:12:45,740
Let me go ahead, now, and do something like this.
2873
02:12:45,740 --> 02:12:48,760
Let me create a program called qr.py.
2874
02:12:48,760 --> 02:12:50,890
I'm going to go ahead and import a library called
2875
02:12:50,890 --> 02:12:54,860
os, which gives you access to operating system related functionality in Python.
2876
02:12:54,860 --> 02:12:56,860
I'm going to import a library I've pre-installed
2877
02:12:56,860 --> 02:12:59,830
called qrcode, which is a two-dimensional barcode that you
2878
02:12:59,830 --> 02:13:01,300
might have seen in the real world.
2879
02:13:01,300 --> 02:13:03,715
I'm going to go ahead and create an image variable using
2880
02:13:03,715 --> 02:13:08,260
this qrcode library's make function, which, per its documentation,
2881
02:13:08,260 --> 02:13:10,365
takes a URL, like one of CS50's own videos.
2882
02:13:10,365 --> 02:13:23,003
So we'll do this with youtu.be/xvF2joSPgG0.
2883
02:13:23,003 --> 02:13:24,670
So, hopefully, that's the right lecture.
2884
02:13:24,670 --> 02:13:27,160
And now, we've got img.save, which is going to allow
2885
02:13:27,160 --> 02:13:30,130
me to create a file called qr.png.
2886
02:13:30,130 --> 02:13:33,460
Think back, now, on problem set 4 and how painful it was to save files.
2887
02:13:33,460 --> 02:13:36,940
We'll just use the save function, now, in Python and save this as a PNG file--
2888
02:13:36,940 --> 02:13:38,260
Portable Network Graphic.
2889
02:13:38,260 --> 02:13:43,420
And then, lastly, let's just go ahead and open with the command open qr.png
2890
02:13:43,420 --> 02:13:46,120
on my Mac so that, hopefully, this just automatically opens.
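A hedged sketch of qr.py, assuming the third-party `qrcode` package (with its image support) has been pre-installed, as in the lecture. The function name `make_qr` and its default file name are illustrative.

```python
def make_qr(url, path="qr.png"):
    """Encode url as a QR code and save it as a PNG at path."""
    # Imported here so that defining the function does not require the package
    import qrcode

    img = qrcode.make(url)  # build the two-dimensional barcode
    img.save(path)          # write it out as a PNG file
    return path
```

On a Mac, something like `make_qr("https://youtu.be/xvF2joSPgG0")` followed by `os.system("open qr.png")` (after `import os`) would then open the image. The `https://` prefix is an assumption, and the `open` command is macOS-specific.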
2891
02:13:46,120 --> 02:13:46,660
All right.
2892
02:13:46,660 --> 02:13:49,300
I'm going to go ahead and just double-check my syntax here
2893
02:13:49,300 --> 02:13:51,280
so that I haven't made any mistakes.
2894
02:13:51,280 --> 02:13:54,235
I'm going to go ahead and run python of qr.py.
2895
02:13:54,235 --> 02:13:55,810
Enter.
2896
02:13:55,810 --> 02:13:57,223
That opens up this.
2897
02:13:57,223 --> 02:13:58,390
Let me go ahead and zoom in.
2898
02:13:58,390 --> 02:14:03,750
If you've got a phone handy and you'd like to scan this code here,
2899
02:14:03,750 --> 02:14:07,131
whether in person or online--
2900
02:14:07,131 --> 02:14:08,095
I apologize.
2901
02:14:08,095 --> 02:14:09,130
You won't appreciate it.
2902
02:14:09,130 --> 02:14:11,640
2903
02:14:11,640 --> 02:14:12,140
Amazing!
2904
02:14:12,140 --> 02:14:13,600
OK.
2905
02:14:13,600 --> 02:14:17,230
And, lastly, let me go back into our speech example
2906
02:14:17,230 --> 02:14:21,400
here, create a final ending here in our final moments.
2907
02:14:21,400 --> 02:14:26,060
And how about we just say something like "This was CS50," like this.
2908
02:14:26,060 --> 02:14:27,087
Let's go ahead, here.
2909
02:14:27,087 --> 02:14:28,795
Fix my capitalization, just for tidiness.
2910
02:14:28,795 --> 02:14:29,878
Let's get rid of the name.
2911
02:14:29,878 --> 02:14:33,840
And now, with our final flourish and your introduction to Python equipped--
2912
02:14:33,840 --> 02:14:35,230
here we go--
2913
02:14:35,230 --> 02:14:36,535
INTERPRETER: This was CS50.
2914
02:14:36,535 --> 02:14:37,000
DAVID MALAN: All right.
2915
02:14:37,000 --> 02:14:38,000
We'll see you next time.
2916
02:14:38,000 --> 02:14:39,460
[APPLAUSE]
2917
02:14:39,460 --> 02:14:41,860
2918
02:14:41,860 --> 02:14:45,210
[MUSIC PLAYING]
2919
02:14:45,210 --> 02:15:18,000