All language subtitles for 2022_lecture6_720p_sdr-en

[MUSIC PLAYING]
DAVID MALAN: All right. This is CS50, and this is week six, wherein we finally transition from Scratch to C to, now, Python. And, indeed, this is going to be somewhat of a unique experience in that, just like a few weeks past-- perhaps, for the first time-- and now, today, you're going to learn a new language. But the goal isn't just to throw another fire hose of content and syntax and whatnot at you, but rather, to really equip you all to actually teach yourself new languages in the future. And so, indeed, what we'll do today, what we'll do this coming week is prepare you to stand on your own. And once Python is passé and the world has moved on to some other language in some number of years, you'll be well equipped to figure out how to wrap your mind around some new syntax, some new language and solve problems, as well.
Now, you recall, in week zero, this is where we started-- just saying hello to the world. And that quickly escalated just a week later in C to be something much, much more cryptic.
And if you've still struggled with some of the syntax, find yourself checking your notes or your previous code, that's totally normal. And that's one of the reasons why there are languages besides C out there-- among them, this language called Python. Humans over the decades have realized, gee, that wasn't necessarily the best design decision, or humans have realized, wow, you know what? Now that computers have gotten faster with more memory and faster CPUs, we can actually do more with our programming languages. So just as human languages evolve, so do actual programming languages. And even within a programming language, there's typically different versions. We, for instance, have been using version C11 of C, which was updated in 2011. But Python itself continues to evolve, and it's now up to version 3-plus. And so there, too, these things will evolve in the coming days.
Thankfully, what you're about to see is "Hello, World!" for the third time, but it's going to be literally this. None of the crazy syntax above or below, fewer semicolons, if any, fewer curly braces. And, really, a lot of the distractions get out of the way. So to get there, let's consider exactly how we've been programming up until now.
So you write a program in C and you've got, hopefully, no syntax error, so you're ready to build it-- that is, compile it. And so, you've run make, and then, you've run the program, like ./hello. Or if you think back to week two, where we took a peek underneath the hood of what make is doing, it's really running the actual compiler-- something called clang-- maybe with some command-line arguments creating a program called hello. And then, you could do ./hello.
So, today, you're going to start doing something similar in spirit, but fewer steps. No longer will you have to compile your code and then run it, and then, maybe, fix or change it, and then compile your code and run it, and then repeat, repeat. The process of running your code is going to be distilled into just a single step. And the way to think of this, for now, is that, whereas C is frequently used as, indeed, a compiled language whereby you convert it first to 0s and 1s, Python's going to let you speed things up whereby you, the human programmer, don't have to compile it.
You're just going to run what's called an interpreter-- which, by design, is named the exact same thing as the language itself-- and by running this program installed in VS Code or, eventually, on your own Macs or PCs, this is just going to tell your computer to interpret this code and figure out how to get down to that lower level of 0s and 1s. But you don't have to compile the code yourself anymore.
So with that said, let's consider what the code is going to look like, side by side. In fact, let's look back at some Scratch blocks, just like we did with C in week one, and do some side by sides. Because even though some of the syntax this week and beyond is going to be different, the ideas are truly going to be the same. There's not all that much intellectually new just yet. So whereas, in week zero, we might have said hello to the world with this purple puzzle piece, today, of course-- or, rather, in week one, it looked like this in C. But today, moving forward, it's going to, quite simply, look like this instead. And if we go back and forth for just a moment, here, again, is the version in C, noticing the very C-like characteristics. And just at a glance here, in Python, I claim it's now this.
What do you apparently need not worry about anymore? What's gone? So semicolon is gone. And, indeed, you don't need those to finish most of your thoughts anymore. Anything else?
AUDIENCE: Backslash n.
DAVID MALAN: So the backslash n is absent. And that's curious because we're still going to get a new line, but we'll see that it's become the default. And this one's a little more subtle, but now, the function is called print instead of printf. So it's a little more familiar in that sense.
All right. So when it comes to using libraries-- that is, code that other people have written-- in the past, we've done things like #include cs50.h to use CS50's own header file or standard I/O or standard lib or string or any number of other header files you have all used. Moving forward, we're going to give you, for this first week, a similar CS50 library-- just very short-term training wheels that we'll quickly take off because, in reality, it's a lot easier to do things in Python, as we'll see. But the syntax for this, now, is going to be to import the CS50 library in this way.
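The slide being pointed at is not part of the transcript. In Python a library is brought in with an import statement, so the line is presumably `import cs50`. A minimal sketch, assuming the cs50 package may or may not be installed on the machine running it; the try/except fallback is an addition for illustration so the snippet runs either way:

```python
# Import CS50's library for Python, the counterpart of #include <cs50.h> in C.
try:
    import cs50
except ImportError:
    # Assumption for illustration: the package is not installed here,
    # so note that instead of stopping with an error.
    cs50 = None

print("cs50 library available:", cs50 is not None)
```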
And when we have, now, this ability, we can actually start writing some code right away. In fact, let me switch over to VS Code here. And just as in the past, I'll create a new file. But instead of creating something called .c, I'm going to go ahead and create my first program called hello.py, using code hello.py. That, of course, gives me this new tab. And let me actually, quite simply, do what I proposed-- print, quote unquote, "Hello, world" without the \n, without the semicolon, without the f in print.
And now, let me go down to my terminal window. And I don't have to compile it. I don't have to do dot slash. I, instead, run a program called python, whose purpose in life is, now, to interpret my code top to bottom, left to right. And if I run python of hello.py, crossing my fingers, as always-- voila. Now I have printed out "hello, world." So we seem to have gotten the new line for free, in the sense where it's automatically happening. The dollar sign isn't weirdly on the same line, like it once was in week one. But that's just a minor detail here.
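For reference, the program typed above fits on one line. A minimal sketch of hello.py; the second call is an addition for illustration that spells out the default line ending print already uses:

```python
# hello.py
print("hello, world")            # print appends a newline by default
print("hello, world", end="\n")  # the same call with that default written out
```

Saved as hello.py, it runs with `python hello.py`; there is no separate compile step.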
If we switch back to, now, some other capabilities-- well, indeed, with the CS50 library, you can also not just import the library itself, but specific functions. And you'll see that, temporarily, we're going to give you a helper function called get_string, just like in C, that just makes it work exactly the same way as in C. And we'll see a couple of other functions that will just make life easier, initially. But, quickly, we will take those training wheels off so that nothing is, indeed, CS50-specific.
All right. Well, how about functions, more generally, in Python? Let's do a whirlwind tour, if you will, much like we did in that first week of C, comparing one to the other. So back in our world of Scratch, one of the first programs we wrote was this one here, whereby we ask the human their name. We then used the return value that was automatically stored in this answer variable as a second argument to join so that we could say "Hello, David" or "Hello, Carter." So this was back in week zero. In week one, we converted it to this. And here is a perfect example of things like escalating quickly. And, again, this is why we start in Scratch.
There's just so much distraction here to achieve the same idea. But even today, we're going to chip away at some of that syntax. So, in C, we had to declare the variable as a string, here. We, of course, had the semicolon and more. Well, in Python, the comparable code, now, is going to look, more simply, like this. So semicolon is, again, gone on both lines, for that matter. So that's good. What else appears to have changed or disappeared? Yeah.
AUDIENCE: [? Do you have ?] the same type of variable?
DAVID MALAN: Yeah. So I didn't have to specifically say that answer is now a string. And, indeed, Python is dynamically typed. And, in fact, it will infer from context exactly what it is you are storing in that variable. Other details that seem a little bit different? A little bit different. What else jumps out at you here? I'll go back. This was the C version. And maybe focus, now, on the second line because we've rather exhausted the first. Here's, now, the Python version. What's different here? Yeah?
AUDIENCE: You don't need to worry about %s or percent anything. You just have the variable after [? them. ?]
DAVID MALAN: Yeah. There's no %s anymore. There's no second argument, at the moment, per se, to print. Now, it is still a little weird. It's as though I've deployed some addition here, arithmetically. But that's not the case. Some of you have programmed before. And plus, some of you might know, means what in this context? So to combine or, more technically-- anyone know the buzzword? Yeah.
AUDIENCE: Concatenate.
DAVID MALAN: To concatenate. So to concatenate is the fancy way of what Scratch calls joining, which is to take one string on the left, one string on the right and to join them together. To glue them together, if you will. So this is not addition. It would be if it were numbers involved instead. But because we've got a string-- Hello comma-- and another string implicitly in this variable based on what the human typed in in response to this get_string function, that's going to concatenate Hello comma space, and then, David or Carter or whatever the human has typed in.
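The concatenation just described can be sketched as follows. To keep the snippet runnable without the CS50 library, answer is hard-coded here as a stand-in for what get_string would return; that substitution, and the names greeting and total, are assumptions for illustration:

```python
answer = "David"  # stand-in for get_string("What's your name? ")

# With a str on both sides, + concatenates. The space after the comma
# is part of the first string, so we have to supply it ourselves.
greeting = "hello, " + answer
print(greeting)  # hello, David

# With ints on both sides, the same operator is ordinary addition.
total = 1 + 2
print(total)  # 3
```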
But it turns out, there's going to be different ways to do this in Python. And we'll show you a few different ones. And here, too, try not to get too hung up on or frustrated by all of the different ways you can solve problems. Odds are, you're going to be picking up tips and techniques for years to come if you continue programming. So let's just give you a few of the possible ways.
So here's a second way you could print out hello comma David or hello comma Carter. But what has changed? In the previous version, I used concatenation explicitly. And the space here is important, grammatically, just so we get that in the final phrase. Now, I'm proposing to get rid of that space, to add a comma outside of the double quotes, as well. But if you think back to C, this probably just means that print, similar in spirit to printf, can take not just one argument, but even two. And in fact, because of this comma in the middle that's outside of the double quotes, it's hello comma, and then, it will be automatically concatenated with-- even without using the plus, to whatever the value of answer is.
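The second version described above passes two arguments to print instead of building one string. A minimal sketch, again with answer hard-coded as an assumption in place of the get_string call:

```python
answer = "David"  # stand-in for the value get_string would return

# Two arguments: print writes them in order with a single space between them,
# so the string no longer needs a trailing space of its own.
print("hello,", answer)  # hello, David
```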
And by default, just for grammatical prettiness, the print function always gives you a space for free in between each of the multiple arguments you pass in. We'll see how you can override that down the line. But, for now, that's just another way to do it.
Now, perhaps the better, if slightly cryptic way to do this-- or just the increasingly common way-- is, probably, the third version, which looks a little weird, too. And, probably, the weirdness jumps out. We've suddenly introduced these curly braces, which I promised were mostly gone. And they are. But inside of this string here, I've done a curly brace, which might mean what? Just intuitively. And here is an example of how you learn a new language. Just infer, from context, how Python probably works. What might this mean? Yeah?
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Yeah. So this is an indication, because the curly braces-- because this was the way Python was designed-- that we want to plug in the value of answer, not literally A-N-S-W-E-R.
And the fancy word here is that the answer variable will be interpolated-- that is, substituted with its actual value. But, but, but-- and this is actually weird-looking; this was introduced a few years ago to Python. What else did I have to change to make these curly braces work, apparently? Yeah?
AUDIENCE: Drop the f before the--
DAVID MALAN: Yeah. There's this weird f. And so, it's like part of printf. But now, it's inside the parentheses there. This is just the way Python designed this. So a few years ago, when they introduced what are called format strings or f-strings, you literally prefix your quoted string with the letter f. And then, you can use trickery like this, like putting curly braces so that the value will be substituted automatically. If you forget the f, you're going to literally see hello comma curly brace answer closed curly brace. If you add the f, it's, indeed, interpolated. The value is plugged in.
All right. Questions on how we can just say hello to the world via Python, in this case. Yeah?
AUDIENCE: If you do this without the f, what would happen?
DAVID MALAN: If you do this without the--
AUDIENCE: [? The f. ?]
DAVID MALAN: Without the f? If you omit the f, you will literally see H-E-L-L-O comma curly brace A-N-S-W-E-R closed curly brace. So, in fact, let's do this. Let me go back to VS Code here, quickly. I've still got my file called hello.py open. And let me go ahead and change this ever so slightly. So I'm going to go ahead and-- let's say from cs50 import get_string. And that's just the new syntax I propose using to import a function from someone else's library. I'm going to now go ahead and ask the question-- let's go ahead and use get_string, storing the result in answer. So get_string, quote unquote, "What's your name?" And then, on this line, I'm going to deliberately make a mistake here, exactly to your question. Let me just say hello comma answer, and just this. Now, even though answer is a variable, Python's not going to be so presumptuous as to just plug in the value of a variable called answer. What it's going to do, of course, is-- if I type in my name-- whoops. I typed too fast.
Let me go ahead and rerun that again. If I run python with hello.py, type in my name and hit Enter, I get hello comma answer. Well, let me do one better. Let me apply these curly braces as before. Let me rerun python of hello.py. What's your name? D-A-V-I-D. And here's, again, the answer to your question. Now, we get, literally, the curly braces. So the fix here, ultimately, is just going to be to add the f there, rerun my program again with David. And now, hello comma David.
So this is, admittedly, a little more cryptic than the ones with the plus or the comma, but this is just increasingly common. Why? Because you can read it left to right. It's nice and convenient. It's less cryptic than the %s's. So it's a new and improved version, if you will, of printf in C, based on decades of experience of programmers doing things like this. Questions on printing in this way? We're now on our way to programming in Python. Anything? All right. Well, what more can we do with this language, here?
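The three runs above can be reproduced side by side. A minimal sketch; answer is hard-coded as an assumption in place of the get_string call, and the variable names on the left are only for illustration:

```python
answer = "David"  # stand-in for get_string("What's your name? ")

no_braces = "hello, answer"     # plain text: the word "answer" is not a variable here
no_prefix = "hello, {answer}"   # still a plain string, so the braces are literal
f_string = f"hello, {answer}"   # f prefix: {answer} is replaced by the variable's value

print(no_braces)  # hello, answer
print(no_prefix)  # hello, {answer}
print(f_string)   # hello, David
```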
Well, let me propose that we consider that we have, for instance, a few other features that we can add to the mix, as well-- namely, let's say some data types, as well. So let me flip over here, back to the slides. And there's different data types in Python, as we'll soon see. But they're not as explicit. As we already saw, by using a string from get_string, you don't have to explicitly state what it is. But you saw-- recall, in C-- all of these various data types. And then, in Python, nicely enough, this list is about to get shorter. And so, here is our list in C. Here is an abbreviated list in Python.
So we're still going to have strings, but they're going to be more succinctly called strs now, S-T-R. We're still going to have ints for integers. We're still going to have floats for floating point values. We're even going to have bools for true and false. But what's missing, now, from the list is long and floats. And why is that? Or rather, long and double. Well, recall that, in C, those used more bits. Well, in Python, the smaller data types, previously-- int and float, themselves-- just used more bits for you. And so, you don't need to distinguish between small and large.
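One way to see this for int: Python's int is not a fixed-width type, so a value too large for a 64-bit C long is still just an int. A small sketch; the name big is only for illustration:

```python
big = 2 ** 64     # one more than the largest value an unsigned 64-bit integer can hold

print(big)        # 18446744073709551616
print(type(big))  # <class 'int'>
```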
You just use one data type, and the language gives you a bigger range than before. It turns out, though, there's going to be some other features, as well, of Python, and these data types-- one of which will be called range, another of which will be list. So gone will be arrays. We'll actually use something literally called a list. Tuples-- sort of x, y pairs for coordinates and things like that. Dicts for dictionaries-- so we'll have built-in capabilities for storing keys and values, we'll see, and even a set. Mathematically, a set is a collection of values, but it automatically gets rid of duplicates for you.
So all of these things, we could absolutely implement in C if we wanted. And, indeed, in problem set five, you've been implementing your very own spell checker using some form of hash table. Well, it turns out that, in Python, you can solve those same problems, but perhaps a little more readily. In fact, let me go back over here to VS Code, and let me propose that I do the following. Let me go ahead and create a file called dictionary.py.
378 00:17:06,210 --> 00:17:09,510 Let me propose that I try to implement, say-- problem set five-- 379 00:17:09,510 --> 00:17:14,220 our spell checker in Python instead of C and achieve, ultimately, 380 00:17:14,220 --> 00:17:17,443 the same kind of behavior whereby I'll be 381 00:17:17,443 --> 00:17:19,235 able to spell check a whole bunch of words. 382 00:17:19,235 --> 00:17:21,480 So this is jumping the gun a little bit because you're 383 00:17:21,480 --> 00:17:23,897 about to see syntax we'll revisit over the course of today. 384 00:17:23,897 --> 00:17:26,580 But, for now, I've got a new file called dictionary.py. 385 00:17:26,580 --> 00:17:30,810 And let me begin to create some placeholders for functions. 386 00:17:30,810 --> 00:17:34,710 We'll see in just a bit that, in Python, you can define a function called check, 387 00:17:34,710 --> 00:17:38,000 and that check function can take a word as its input. 388 00:17:38,000 --> 00:17:40,292 And I'll come back to this in just a moment. 389 00:17:40,292 --> 00:17:42,000 In Python, I can define a second function 390 00:17:42,000 --> 00:17:44,865 like load, which itself will take a whole dictionary, 391 00:17:44,865 --> 00:17:47,010 just like in problem set five. 392 00:17:47,010 --> 00:17:51,010 And I'll go ahead and come back to the implementation of this. 393 00:17:51,010 --> 00:17:53,130 Meanwhile, we might similarly implement a function 394 00:17:53,130 --> 00:17:57,090 called size, which takes no arguments but, ultimately, is going to return 395 00:17:57,090 --> 00:17:59,100 the size of my dictionary of words. 396 00:17:59,100 --> 00:18:02,370 And then, lastly, for consistency with problem set five, 397 00:18:02,370 --> 00:18:05,130 we might define an unload function, whose purpose in life 398 00:18:05,130 --> 00:18:07,770 is to free any memory that you've been using, just 399 00:18:07,770 --> 00:18:09,390 to give it back to the computer.
400 00:18:09,390 --> 00:18:11,790 Now, odds are, whether you're still working on speller 401 00:18:11,790 --> 00:18:15,660 or have finished speller, you wrote a decent amount of lines of code. 402 00:18:15,660 --> 00:18:18,550 And indeed, it's been, by design, a challenge. 403 00:18:18,550 --> 00:18:22,620 But one of the reasons for these higher-level languages like Python 404 00:18:22,620 --> 00:18:25,680 is that you can stand on the shoulders of programmers before you 405 00:18:25,680 --> 00:18:28,703 and solve very common problems much more quickly. 406 00:18:28,703 --> 00:18:31,620 So that you can focus on building your new app or your web application 407 00:18:31,620 --> 00:18:34,690 or your own project to solve problems of interest to you. 408 00:18:34,690 --> 00:18:38,490 So at the risk of crushing some spirits, let 409 00:18:38,490 --> 00:18:42,540 me propose that, in Python if you want a dictionary for something like a spell 410 00:18:42,540 --> 00:18:44,070 checker, well, that's fine. 411 00:18:44,070 --> 00:18:48,030 Go ahead and give yourself a variable, like words, to store all of those words 412 00:18:48,030 --> 00:18:52,410 and just assign it equal to a dictionary-- or dict, for short, 413 00:18:52,410 --> 00:18:53,220 in Python. 414 00:18:53,220 --> 00:18:55,140 That will give you a hash table. 415 00:18:55,140 --> 00:18:57,690 Now, it turns out, in speller recall, you 416 00:18:57,690 --> 00:18:59,720 don't need to worry about words and definitions. 417 00:18:59,720 --> 00:19:01,763 It's just about spell-checking the words. 418 00:19:01,763 --> 00:19:03,930 So strictly speaking, we don't need keys and values. 419 00:19:03,930 --> 00:19:05,610 We just need keys. 420 00:19:05,610 --> 00:19:07,980 So I'm going to save myself a few more keystrokes 421 00:19:07,980 --> 00:19:11,055 by just saying that, technically, in Python, using a set suffices. 422 00:19:11,055 --> 00:19:13,770 Again, a set is just a collection of values with no duplicates. 
423 00:19:13,770 --> 00:19:16,400 But they don't necessarily have keys and values. 424 00:19:16,400 --> 00:19:18,250 It's just one or the other. 425 00:19:18,250 --> 00:19:21,420 But now that I have-- on line one, I claim the equivalent, in Python, 426 00:19:21,420 --> 00:19:25,720 of a hash table, I can actually do something like this. 427 00:19:25,720 --> 00:19:28,890 Here's how I might implement the check function in Python. 428 00:19:28,890 --> 00:19:33,840 If the word passed into this function is in my variable called words, 429 00:19:33,840 --> 00:19:35,390 well, return True. 430 00:19:35,390 --> 00:19:39,360 Else, go ahead and return False. 431 00:19:39,360 --> 00:19:40,030 Done. 432 00:19:40,030 --> 00:19:40,530 No, wait. 433 00:19:40,530 --> 00:19:42,990 You're thinking, if anything at all, maybe 434 00:19:42,990 --> 00:19:46,507 we want to handle lowercase instead of just uppercase and lowercase. 435 00:19:46,507 --> 00:19:47,340 Well, you know what? 436 00:19:47,340 --> 00:19:49,725 In Python, if you want to force a whole word to lowercase, 437 00:19:49,725 --> 00:19:51,360 you don't have to iterate over it with a loop. 438 00:19:51,360 --> 00:19:54,190 You don't have to use any of those ctype functions or anything. 439 00:19:54,190 --> 00:19:56,947 Just say word.lower, and that will convert the whole thing 440 00:19:56,947 --> 00:19:58,780 to lowercase for parity with the dictionary. 441 00:19:58,780 --> 00:19:59,440 All right. 442 00:19:59,440 --> 00:20:02,185 How about something like the load function in Python? 443 00:20:02,185 --> 00:20:06,130 Well, in Python, you can open files just like in C. For instance, in Python, I 444 00:20:06,130 --> 00:20:09,940 might do open, the dictionary argument in read mode, 445 00:20:09,940 --> 00:20:11,798 just like fopen in C. 446 00:20:11,798 --> 00:20:13,090 I might do something like this.
447 00:20:13,090 --> 00:20:20,230 For each line in that file, let me go ahead and add, to my words variable, 448 00:20:20,230 --> 00:20:21,430 that line. 449 00:20:21,430 --> 00:20:24,790 And then, let me go ahead and close that file. 450 00:20:24,790 --> 00:20:26,320 And I think I'm done. 451 00:20:26,320 --> 00:20:28,457 I'm just going to go ahead and return True, 452 00:20:28,457 --> 00:20:30,040 just because I think I'm already done. 453 00:20:30,040 --> 00:20:32,350 Now, here, too, I could nitpick a little bit. 454 00:20:32,350 --> 00:20:35,680 Technically, if I'm reading in every line from the file, 455 00:20:35,680 --> 00:20:38,620 every line in the dictionary ends with, technically, a backslash n. 456 00:20:38,620 --> 00:20:41,140 But there's an easy way to get rid of that, 457 00:20:41,140 --> 00:20:43,360 just like you might see with an alternative syntax. 458 00:20:43,360 --> 00:20:45,060 What I'm actually going to do is this. 459 00:20:45,060 --> 00:20:49,060 Let me grab from the current line, the current word, 460 00:20:49,060 --> 00:20:51,940 by stripping off with reverse strip-- 461 00:20:51,940 --> 00:20:53,935 rstrip; a function we'll, again, see-- 462 00:20:53,935 --> 00:20:55,810 that just gets rid of the trailing newline-- 463 00:20:55,810 --> 00:20:58,000 the backslash n at the end of that line. 464 00:20:58,000 --> 00:21:01,900 And what I really want to do, then, is add this word to that dictionary. 465 00:21:01,900 --> 00:21:05,780 Meanwhile, if I want to figure out what the size is of my dictionary, well-- 466 00:21:05,780 --> 00:21:08,890 in C, you're probably writing code to iterate over all of those lines, 467 00:21:08,890 --> 00:21:12,040 and you're just going to count them up using a variable. 468 00:21:12,040 --> 00:21:13,060 Not so in Python. 469 00:21:13,060 --> 00:21:15,460 You can just return the length of those words. 470 00:21:15,460 --> 00:21:19,360 And better still, in Python, you don't have to manage your own memory.
471 00:21:19,360 --> 00:21:20,500 No more malloc. 472 00:21:20,500 --> 00:21:21,700 No more free. 473 00:21:21,700 --> 00:21:24,370 No more manual thinking about memory. 474 00:21:24,370 --> 00:21:27,310 The language just deals with all of that for you. 475 00:21:27,310 --> 00:21:28,030 So you know what? 476 00:21:28,030 --> 00:21:30,760 It suffices for me to just return True and claim 477 00:21:30,760 --> 00:21:33,640 that unloading is done for me. 478 00:21:33,640 --> 00:21:35,170 And that's it. 479 00:21:35,170 --> 00:21:37,840 Again, whether you're in the middle of it or already finished, 480 00:21:37,840 --> 00:21:39,960 this might, perhaps, elicit some frustration, 481 00:21:39,960 --> 00:21:45,700 but also, enlightenment in that this is why higher-level languages exist. 482 00:21:45,700 --> 00:21:47,605 You can build on top of the same principles, 483 00:21:47,605 --> 00:21:50,170 the same ideas, with which you've been dealing, 484 00:21:50,170 --> 00:21:51,820 struggling even this past week. 485 00:21:51,820 --> 00:21:55,090 But you can now express yourself all the more succinctly. 486 00:21:55,090 --> 00:21:59,590 This one line implements a hash table for you, and all of this, now, 487 00:21:59,590 --> 00:22:03,250 just uses that hash table in a simpler way. 488 00:22:03,250 --> 00:22:05,980 Any questions, now, on this, keeping in mind 489 00:22:05,980 --> 00:22:08,830 that the point, nonetheless, of speller in p-set 5 490 00:22:08,830 --> 00:22:12,160 is to understand what's really going on underneath the hood 491 00:22:12,160 --> 00:22:14,860 and, better still, to notice this. 492 00:22:14,860 --> 00:22:18,010 This might seem all rather amazing, but let me go ahead and do this. 493 00:22:18,010 --> 00:22:21,100 I've actually got a couple of versions of speller written here, 494 00:22:21,100 --> 00:22:24,800 and I've got a version written in C that I won't show the source code for.
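Pulling the spoken walkthrough of dictionary.py together, a sketch along these lines would match what is described above. It is reconstructed from the narration rather than copied from the screen, so details such as exact layout are assumptions.

```python
# A set serves as the hash table of known words (keys only, no values).
words = set()


def check(word):
    """Return True if word, lowercased, is in the dictionary."""
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    """Read one word per line from the file at path `dictionary`."""
    file = open(dictionary, "r")
    for line in file:
        # rstrip() with no arguments removes trailing whitespace,
        # which includes the newline at the end of each line.
        word = line.rstrip()
        words.add(word)
    file.close()
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free: Python manages memory itself."""
    return True
```

The body of check could be shortened to a single `return word.lower() in words`, since the `in` test already evaluates to True or False.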
495 00:22:24,800 --> 00:22:28,990 But I'm going to go ahead and make that version of speller in C. 496 00:22:28,990 --> 00:22:32,470 And I'm going to go ahead here and, let's say, split 497 00:22:32,470 --> 00:22:34,270 my window here for just a moment. 498 00:22:34,270 --> 00:22:37,030 And I'm going to go into a Python version of speller, 499 00:22:37,030 --> 00:22:38,470 really, that I just wrote. 500 00:22:38,470 --> 00:22:42,820 And on the left-hand side here, let me go ahead and run speller-- 501 00:22:42,820 --> 00:22:44,740 the version I compiled in C-- 502 00:22:44,740 --> 00:22:47,890 using a big text like the Sherlock Holmes text, 503 00:22:47,890 --> 00:22:50,030 which has a whole lot of words in it. 504 00:22:50,030 --> 00:22:52,180 And on the right-hand side, let me run python 505 00:22:52,180 --> 00:22:55,510 of speller.py, which is a separate file I wrote in advance, 506 00:22:55,510 --> 00:22:57,430 just like we give you speller.c. 507 00:22:57,430 --> 00:23:00,790 And I'll, similarly, run this on the Sherlock Holmes text. 508 00:23:00,790 --> 00:23:05,020 And I'm going to do my best to hit Enter on the left and the right of my screen 509 00:23:05,020 --> 00:23:06,100 at the same time. 510 00:23:06,100 --> 00:23:08,770 But we should see, hopefully, the same list of misspelled words 511 00:23:08,770 --> 00:23:10,390 and the timings thereof. 512 00:23:10,390 --> 00:23:12,380 So here we go on the right. 513 00:23:12,380 --> 00:23:15,136 Here we go on the left. 514 00:23:15,136 --> 00:23:16,730 All right. 515 00:23:16,730 --> 00:23:18,680 A race to see which one wins here. 516 00:23:18,680 --> 00:23:19,820 C is on the left. 517 00:23:19,820 --> 00:23:21,680 Python is on the right. 518 00:23:21,680 --> 00:23:23,270 OK. 519 00:23:23,270 --> 00:23:25,530 Interesting. 520 00:23:25,530 --> 00:23:28,200 Hopefully, Python's close behind. 521 00:23:28,200 --> 00:23:30,330 Note that some of this is internet delay. 
522 00:23:30,330 --> 00:23:33,360 And so, it might not necessarily be a crazy number of seconds. 523 00:23:33,360 --> 00:23:37,050 But if we measure, at a low level, 524 00:23:37,050 --> 00:23:39,630 how much time the CPU spent executing my code, 525 00:23:39,630 --> 00:23:41,653 C took a total of 1.64 seconds. 526 00:23:41,653 --> 00:23:44,820 That was pretty fast, even though it took a moment more for all of the bytes 527 00:23:44,820 --> 00:23:46,590 to come over the internet. 528 00:23:46,590 --> 00:23:49,050 The Python version, though, took what? 529 00:23:49,050 --> 00:23:50,605 2.44 seconds. 530 00:23:50,605 --> 00:23:53,100 So what might the inference be? 531 00:23:53,100 --> 00:23:55,590 One, maybe I'm just better at programming in C 532 00:23:55,590 --> 00:23:59,400 than I am in Python, which is probably not true. 533 00:23:59,400 --> 00:24:03,210 But what else might you infer from this example? 534 00:24:03,210 --> 00:24:07,541 535 00:24:07,541 --> 00:24:11,176 Should we, maybe, give up on Python, stick with C? 536 00:24:11,176 --> 00:24:12,070 No? 537 00:24:12,070 --> 00:24:14,410 So what might be going on here? 538 00:24:14,410 --> 00:24:16,870 Why is the Python version, that I claim is correct-- 539 00:24:16,870 --> 00:24:20,620 and I think the numbers all line up, just not the times. 540 00:24:20,620 --> 00:24:21,820 Where is the trade-off here? 541 00:24:21,820 --> 00:24:23,915 Well, here, again, is this design trade-off. 542 00:24:23,915 --> 00:24:24,415 Yeah? 543 00:24:24,415 --> 00:24:29,310 AUDIENCE: In order to save the programmer time, [INAUDIBLE]. 544 00:24:29,310 --> 00:24:30,690 DAVID MALAN: Yeah, exactly.
545 00:24:30,690 --> 00:24:32,910 In order to save the human programmer time, 546 00:24:32,910 --> 00:24:35,700 there's a lot more features built into Python-- more functions, 547 00:24:35,700 --> 00:24:38,920 more automatic management of memory and so forth-- 548 00:24:38,920 --> 00:24:40,530 and you have to pay a price. 549 00:24:40,530 --> 00:24:43,193 Someone else's code is doing all of that work for you. 550 00:24:43,193 --> 00:24:45,360 But if they've written some number of lines of code, 551 00:24:45,360 --> 00:24:47,152 those are just more lines of code that need 552 00:24:47,152 --> 00:24:50,730 to be executed for you, whereas here, the computer is 553 00:24:50,730 --> 00:24:54,615 at the risk of oversimplifying only running my lines of code. 554 00:24:54,615 --> 00:24:55,865 So there's just less overhead. 555 00:24:55,865 --> 00:24:57,448 And so, this is a perpetual trade-off. 556 00:24:57,448 --> 00:25:00,990 Typically, when using a more user-friendly and more modern language, 557 00:25:00,990 --> 00:25:02,983 one of the prices you might pay is performance. 558 00:25:02,983 --> 00:25:06,150 Now, there's a lot of smart computer scientists in the world, though, trying 559 00:25:06,150 --> 00:25:08,440 to push back on those same trade-offs. 560 00:25:08,440 --> 00:25:11,220 And so, these interpreters, like the command I wrote, 561 00:25:11,220 --> 00:25:15,390 Python technically can-- especially if you run a program again and again-- 562 00:25:15,390 --> 00:25:19,350 actually, secretly, behind the scenes, compile your code for you, 563 00:25:19,350 --> 00:25:20,610 down to 0s and 1s. 564 00:25:20,610 --> 00:25:23,640 And then, the second, the third, the fourth time you run that program, 565 00:25:23,640 --> 00:25:25,010 it might very well be faster. 566 00:25:25,010 --> 00:25:27,150 So this is a bit of a head fake here, in that 567 00:25:27,150 --> 00:25:29,490 I'm running them once and only once. 
568 00:25:29,490 --> 00:25:32,070 But we could get benefit over time if we kept 569 00:25:32,070 --> 00:25:34,183 running the Python version again and again 570 00:25:34,183 --> 00:25:35,850 and, perhaps, fine-tune the performance. 571 00:25:35,850 --> 00:25:38,017 But, in general, there's going to be this trade-off. 572 00:25:38,017 --> 00:25:40,560 Now, would you rather spend the 60 seconds 573 00:25:40,560 --> 00:25:43,620 I wrote implementing a spell checker or this 6 hours, 574 00:25:43,620 --> 00:25:47,910 16 hours you might be or have spent implementing the same in C? 575 00:25:47,910 --> 00:25:48,720 Probably not. 576 00:25:48,720 --> 00:25:52,650 For productivity's sake, this is why we have these additional languages. 577 00:25:52,650 --> 00:25:57,300 Just for fun, let me flip over to another screen here and open up 578 00:25:57,300 --> 00:26:00,540 a version of Python that's actually-- in just a second-- 579 00:26:00,540 --> 00:26:04,230 on my own Mac instead of the cloud so that 580 00:26:04,230 --> 00:26:06,490 I can actually do something with graphics. 581 00:26:06,490 --> 00:26:09,930 So, here, I just have a black and white terminal window on my very own Mac. 582 00:26:09,930 --> 00:26:12,450 And I've pre-installed Python, just like we've done so 583 00:26:12,450 --> 00:26:14,370 for VS Code in the cloud for you. 584 00:26:14,370 --> 00:26:19,320 Notice that I've got this photo of, perhaps, one of your favorite TV 585 00:26:19,320 --> 00:26:21,090 shows here, with the cast of The Office. 586 00:26:21,090 --> 00:26:24,630 Notice all of the faces in this image here. 587 00:26:24,630 --> 00:26:30,210 And let me propose that we try to find one face in the crowd, CSI-style, 588 00:26:30,210 --> 00:26:33,660 whereby we want to find, perhaps, the Scranton Strangler, so to speak. 589 00:26:33,660 --> 00:26:37,080 And so, here is an example of this guy's face. 590 00:26:37,080 --> 00:26:40,385 Now, how do we go about finding this specific face in the crowd? 
591 00:26:40,385 --> 00:26:42,510 Well, our human eyes, obviously, can pluck him out, 592 00:26:42,510 --> 00:26:44,370 especially if you're familiar with the show. 593 00:26:44,370 --> 00:26:46,605 But let me go ahead and do this instead. 594 00:26:46,605 --> 00:26:50,730 Let me go ahead and propose that we run code 595 00:26:50,730 --> 00:26:52,800 that I already wrote in advance here. 596 00:26:52,800 --> 00:26:55,085 This is a Python program with more lines of code 597 00:26:55,085 --> 00:26:56,460 that we won't dwell on for today. 598 00:26:56,460 --> 00:26:58,800 But it's meant to motivate what we can do. 599 00:26:58,800 --> 00:27:03,150 From a Pillow library, implying a Python image library, 600 00:27:03,150 --> 00:27:07,033 I want to import some type of information, 601 00:27:07,033 --> 00:27:09,450 some feature called image so that I can manipulate images, 602 00:27:09,450 --> 00:27:12,150 not unlike our own problem set four. 603 00:27:12,150 --> 00:27:13,330 And this is powerful. 604 00:27:13,330 --> 00:27:13,830 In 605 00:27:13,830 --> 00:27:14,330 Python, 606 00:27:14,330 --> 00:27:18,450 you can just [MIMICS EXPLOSION] import face recognition as a library 607 00:27:18,450 --> 00:27:19,950 that someone else wrote. 608 00:27:19,950 --> 00:27:22,770 From there, I'm going to create a variable called image. 609 00:27:22,770 --> 00:27:25,050 I'm going to use this face_recognition library's 610 00:27:25,050 --> 00:27:27,330 load_image_file function. 611 00:27:27,330 --> 00:27:30,030 It's a little verbose, but it's similar in spirit to fopen. 612 00:27:30,030 --> 00:27:32,100 And I'm going to open office.jpeg. 613 00:27:32,100 --> 00:27:36,990 I'm going to, then, declare a second variable called face_locations, plural, 614 00:27:36,990 --> 00:27:40,620 because what I'm expecting to get back, per the documentation for this library, 615 00:27:40,620 --> 00:27:44,650 is a list of all of the faces' locations that are detected.
616 00:27:44,650 --> 00:27:45,150 All right. 617 00:27:45,150 --> 00:27:48,660 Then, I'm going to iterate over each of those faces using a for loop, 618 00:27:48,660 --> 00:27:50,460 that we'll see in more detail. 619 00:27:50,460 --> 00:27:53,475 I'm going to, then, infer what the top, right, bottom, and left corners 620 00:27:53,475 --> 00:27:55,050 are of that face. 621 00:27:55,050 --> 00:28:00,300 And then, what I'm going to do here is show that face alone, 622 00:28:00,300 --> 00:28:03,040 if I've detected the face in question. 623 00:28:03,040 --> 00:28:08,760 So let me go ahead, here, and run detect.py. 624 00:28:08,760 --> 00:28:12,370 And we'll see not just the one face we're looking for. 625 00:28:12,370 --> 00:28:16,430 But if I run Python of detect.py, it's going to do all of the analysis. 626 00:28:16,430 --> 00:28:22,380 I'll see a big opening here, now, of all of the faces that 627 00:28:22,380 --> 00:28:24,870 were detected in this here program. 628 00:28:24,870 --> 00:28:26,870 [CHUCKLES] OK, some better than others, I guess, 629 00:28:26,870 --> 00:28:28,560 if you zoom in on catching someone. 630 00:28:28,560 --> 00:28:29,970 Typical Angela. 631 00:28:29,970 --> 00:28:32,700 If you want to, now, find that one face, I 632 00:28:32,700 --> 00:28:34,920 think we need to train the software a bit more. 633 00:28:34,920 --> 00:28:37,080 So let me actually open up a second program called 634 00:28:37,080 --> 00:28:39,270 recognize that's got more going on. 635 00:28:39,270 --> 00:28:41,370 But let me, with a wave of a hand, point out 636 00:28:41,370 --> 00:28:45,870 that I'm now loading not only the office.jpeg, but also toby.jpeg 637 00:28:45,870 --> 00:28:49,840 to train the algorithm to find that specific face. 
638 00:28:49,840 --> 00:28:53,580 And so, now, if I run this second version-- recognize.py-- 639 00:28:53,580 --> 00:28:56,310 with Python of recognize.py-- 640 00:28:56,310 --> 00:28:59,160 hold my breath for just a moment; it's analyzing, presumably, 641 00:28:59,160 --> 00:29:00,420 all of the faces-- 642 00:29:00,420 --> 00:29:02,070 you see the same, original photo. 643 00:29:02,070 --> 00:29:05,610 But do you see one such face highlighted here? 644 00:29:05,610 --> 00:29:09,420 This version of the code found Toby, highlighted him 645 00:29:09,420 --> 00:29:12,110 with the screen and, voila, we have face recognition. 646 00:29:12,110 --> 00:29:14,318 So for better or for worse, this is what's happening, 647 00:29:14,318 --> 00:29:15,967 increasingly societally, nowadays. 648 00:29:15,967 --> 00:29:18,300 And honestly, even though I didn't write the code live-- 649 00:29:18,300 --> 00:29:21,420 because it's a good dozen or more lines of code-- it's not terribly many. 650 00:29:21,420 --> 00:29:24,450 And literally, all the authorities-- all we have to do 651 00:29:24,450 --> 00:29:27,960 is import face recognition and, voila, you have access. 652 00:29:27,960 --> 00:29:29,890 These technologies are here already. 653 00:29:29,890 --> 00:29:31,690 But let's consider, for just a moment-- 654 00:29:31,690 --> 00:29:33,820 how did we find Toby? 655 00:29:33,820 --> 00:29:35,150 How might that library-- 656 00:29:35,150 --> 00:29:37,900 even though we're not going to look at its implementation details, 657 00:29:37,900 --> 00:29:40,000 how does it find Toby and distinguish him 658 00:29:40,000 --> 00:29:43,960 from all of these other faces in the crowd? 659 00:29:43,960 --> 00:29:47,180 What might it be doing, intuitively. 660 00:29:47,180 --> 00:29:50,570 Think back even to p-set four, what you, yourselves, have access to, data-wise. 661 00:29:50,570 --> 00:29:51,083 Yeah? 662 00:29:51,083 --> 00:29:53,750 AUDIENCE: [? Since ?] 
we gave it an image of Toby's face before, 663 00:29:53,750 --> 00:29:59,010 it probably looks at are the pixels in one area the same as in another area 664 00:29:59,010 --> 00:30:00,720 and allots it to the same-- 665 00:30:00,720 --> 00:30:02,998 from that reference image to this image. 666 00:30:02,998 --> 00:30:06,870 And then, it's going to say, hey, a lot of the similar consult ranges 667 00:30:06,870 --> 00:30:09,292 are here and here, so we can safely guess 668 00:30:09,292 --> 00:30:10,750 that this is the same [? person. ?] 669 00:30:10,750 --> 00:30:11,875 DAVID MALAN: Yeah, exactly. 670 00:30:11,875 --> 00:30:15,610 And to summarize for the camera here, we have trained the software, if you will, 671 00:30:15,610 --> 00:30:17,560 by giving it a photo of Toby's face. 672 00:30:17,560 --> 00:30:20,218 So, by looking for the same or, really, similar pixels-- 673 00:30:20,218 --> 00:30:22,510 especially if it's a slightly different image of Toby-- 674 00:30:22,510 --> 00:30:24,970 we can, perhaps, identify him in the crowd. 675 00:30:24,970 --> 00:30:26,412 And what really is a human face? 676 00:30:26,412 --> 00:30:28,120 Well, at the end of the day, the computer 677 00:30:28,120 --> 00:30:30,340 only knows it as a pattern of bits or, really, 678 00:30:30,340 --> 00:30:32,110 at a higher level, a pattern of pixels. 
679 00:30:32,110 --> 00:30:35,170 So maybe a human face is, perhaps, best defined, in general, 680 00:30:35,170 --> 00:30:39,295 as two eyes and a nose and a mouth that, even though all of us look similar, 681 00:30:39,295 --> 00:30:43,268 structurally, odds are, the measurement between the eyes and the nose 682 00:30:43,268 --> 00:30:45,310 and the width of the mouth, the skin tone and all 683 00:30:45,310 --> 00:30:47,920 of these other physical characteristics are patterns 684 00:30:47,920 --> 00:30:51,280 that software could, perhaps, detect and then look, statistically, 685 00:30:51,280 --> 00:30:53,920 through the image, looking for the closest possible match 686 00:30:53,920 --> 00:30:57,422 to these various measurement shapes, colors and sizes and the like. 687 00:30:57,422 --> 00:30:59,130 And, indeed, that might be the intuition. 688 00:30:59,130 --> 00:31:03,520 But what's powerful here, again, is just how easy and readily available 689 00:31:03,520 --> 00:31:06,280 this technology now is. 690 00:31:06,280 --> 00:31:06,820 All right. 691 00:31:06,820 --> 00:31:10,605 So with that said, let's propose to consider what more we 692 00:31:10,605 --> 00:31:13,480 can do with Python itself, get back to the fundamentals, so that you, 693 00:31:13,480 --> 00:31:16,990 yourselves can start to implement something along those same lines. 694 00:31:16,990 --> 00:31:21,820 So besides having access to things like a get_string function, 695 00:31:21,820 --> 00:31:26,080 the CS50 library provides a few other things, as well-- namely, in C, 696 00:31:26,080 --> 00:31:27,040 we had these. 697 00:31:27,040 --> 00:31:29,052 But in Python, we're going to have fewer. 698 00:31:29,052 --> 00:31:32,260 In Python, our library, short-term, is going to give you not only get_string, 699 00:31:32,260 --> 00:31:33,740 but also get_int and get_float. 700 00:31:33,740 --> 00:31:34,240 Why? 
701 00:31:34,240 --> 00:31:36,310 It's actually just annoying, as we'll soon 702 00:31:36,310 --> 00:31:39,190 see, to get back an integer or a float from a user 703 00:31:39,190 --> 00:31:44,890 and just make sure that it's an int and a float and not a word like cat or dog, 704 00:31:44,890 --> 00:31:47,170 or some string that's not actually a number. 705 00:31:47,170 --> 00:31:50,810 Well, we can import not just the specific function, get_string, 706 00:31:50,810 --> 00:31:53,920 but we can actually import all of these functions one at a time, 707 00:31:53,920 --> 00:31:55,840 like this, as we'll soon see. 708 00:31:55,840 --> 00:31:59,410 Or you can even, in Python, import specific functions from a file. 709 00:31:59,410 --> 00:32:04,300 One of you asked a while back, when you include something like CS50.h 710 00:32:04,300 --> 00:32:08,780 or standard I/O .h, you're actually getting all of the code in that file, 711 00:32:08,780 --> 00:32:12,010 which, potentially, can add bulk to your own program or time. 712 00:32:12,010 --> 00:32:15,040 In this case, when you import specific functions from Python, 713 00:32:15,040 --> 00:32:17,875 you can be a little more narrowly precise 714 00:32:17,875 --> 00:32:21,230 as to what it is you want to have access to. 715 00:32:21,230 --> 00:32:21,730 All right. 716 00:32:21,730 --> 00:32:23,890 So, with that said, let's go ahead and see 717 00:32:23,890 --> 00:32:25,900 what conditionals look like in Python. 718 00:32:25,900 --> 00:32:29,470 So in the left-hand side again, here, we'll see Scratch. 719 00:32:29,470 --> 00:32:33,190 So it's just a contrived example asking if x is less than y, then, 720 00:32:33,190 --> 00:32:35,350 say, x is less than y. 721 00:32:35,350 --> 00:32:37,540 In C, it looked like this. 722 00:32:37,540 --> 00:32:41,050 In Python, now, it's going to look like this instead. 723 00:32:41,050 --> 00:32:44,815 And here's before in C, and here's after. 
724 00:32:44,815 --> 00:32:47,320 And just to call out a few of the obvious differences, what 725 00:32:47,320 --> 00:32:49,810 has changed, in Python, for conditionals, it would seem? 726 00:32:49,810 --> 00:32:53,013 727 00:32:53,013 --> 00:32:53,930 What's the difference? 728 00:32:53,930 --> 00:32:54,230 Yeah. 729 00:32:54,230 --> 00:32:55,920 AUDIENCE: There's a lack of curly braces. 730 00:32:55,920 --> 00:32:56,380 DAVID MALAN: Yeah. 731 00:32:56,380 --> 00:32:57,760 So there's no more curly braces. 732 00:32:57,760 --> 00:32:59,170 And, indeed, you don't use those. 733 00:32:59,170 --> 00:33:04,138 What appears to be taking their place, if you might infer? 734 00:33:04,138 --> 00:33:05,680 What seems to have taken their place? 735 00:33:05,680 --> 00:33:05,890 What do you think? 736 00:33:05,890 --> 00:33:06,765 AUDIENCE: [INAUDIBLE] 737 00:33:06,765 --> 00:33:09,560 DAVID MALAN: So the colon at the start of this line, here. 738 00:33:09,560 --> 00:33:13,760 But also even more important, now, is this indentation below it. 739 00:33:13,760 --> 00:33:16,160 So some of you, and we know this from office hours, 740 00:33:16,160 --> 00:33:19,380 have a habit of indenting everything on the left, right? 741 00:33:19,380 --> 00:33:21,200 And it's just this crazy mess to look at. 742 00:33:21,200 --> 00:33:23,000 Frustrating for you, surely. 743 00:33:23,000 --> 00:33:25,670 But C and Clang is pretty tolerant when it 744 00:33:25,670 --> 00:33:27,860 comes to things like white space in a program. 745 00:33:27,860 --> 00:33:29,030 Python, uh-uh. 746 00:33:29,030 --> 00:33:32,240 They realized, years ago, that-- let's help humans help themselves and just 747 00:33:32,240 --> 00:33:34,610 require standard indentation. 748 00:33:34,610 --> 00:33:36,620 So four spaces would be the norm here. 749 00:33:36,620 --> 00:33:38,870 But because it's indented below that colon, that, 750 00:33:38,870 --> 00:33:42,110 indeed, indicates that this, now, is part of that condition. 
751 00:33:42,110 --> 00:33:46,340 Something else has gone missing, versus C, in this conditional. 752 00:33:46,340 --> 00:33:47,855 What else is a little simplified? 753 00:33:47,855 --> 00:33:49,660 AUDIENCE: [INAUDIBLE] 754 00:33:49,660 --> 00:33:50,410 DAVID MALAN: Yeah. 755 00:33:50,410 --> 00:33:51,368 So no more parentheses. 756 00:33:51,368 --> 00:33:53,650 You can still use them, especially when you need to, 757 00:33:53,650 --> 00:33:56,112 logically, to do order of operations, like in math. 758 00:33:56,112 --> 00:33:57,820 But in this case, if you just want to ask 759 00:33:57,820 --> 00:34:01,162 a simple question, like if x less than y, you can just do it like that. 760 00:34:01,162 --> 00:34:02,620 How about when you have an if else? 761 00:34:02,620 --> 00:34:05,170 Well, this is almost the same, here, with these same changes. 762 00:34:05,170 --> 00:34:06,800 In C, this looked like this. 763 00:34:06,800 --> 00:34:08,800 And it's starting to get a bit bulky-- at least, 764 00:34:08,800 --> 00:34:10,659 if we use our curly braces in this way. 765 00:34:10,659 --> 00:34:13,060 In Python, we can tighten things up further, even though, 766 00:34:13,060 --> 00:34:15,727 strictly speaking, in C, you don't always need the curly braces. 767 00:34:15,727 --> 00:34:18,280 But here, gone are the parentheses, again. 768 00:34:18,280 --> 00:34:20,020 Gone are the curly braces. 769 00:34:20,020 --> 00:34:23,380 Indentation is consistent, and we've just added another keyword, 770 00:34:23,380 --> 00:34:24,580 else, with the colon. 771 00:34:24,580 --> 00:34:26,325 But no more semicolons, as well. 772 00:34:26,325 --> 00:34:30,010 How about something larger, like this, in if, else, if else? 773 00:34:30,010 --> 00:34:31,960 This one's a little curious. 774 00:34:31,960 --> 00:34:35,290 But in C, it looked like this-- if, else, if else. 775 00:34:35,290 --> 00:34:38,143 In Python, it now looks like this. 
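For reference, the if, else if, else comparison being described might be sketched in Python as follows. Wrapping it in a function named compare is a choice made for this example so the result can be reused; the slide itself is not reproduced here.

```python
def compare(x, y):
    """Describe how x relates to y, using Python's conditional syntax."""
    # No parentheses around the condition, no curly braces, no semicolons:
    # the colon and the indentation mark what belongs to each branch.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```

Each branch body is indented four spaces, the conventional width.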
776 00:34:38,143 --> 00:34:40,060 And there's, perhaps, one curiosity here that, 777 00:34:40,060 --> 00:34:41,977 honestly, all these years later, I still can't 778 00:34:41,977 --> 00:34:43,630 remember how to spell it half the time. 779 00:34:43,630 --> 00:34:46,900 What's weird about this? 780 00:34:46,900 --> 00:34:50,415 What do you spot as different? 781 00:34:50,415 --> 00:34:51,230 Yeah, over here. 782 00:34:51,230 --> 00:34:53,520 AUDIENCE: [INAUDIBLE] 783 00:34:53,520 --> 00:34:54,270 DAVID MALAN: Yeah. 784 00:34:54,270 --> 00:34:56,260 Instead of else if, it's elif. 785 00:34:56,260 --> 00:34:56,760 Why? 786 00:34:56,760 --> 00:34:59,340 [SIGHS] Apparently, else space if was just too many 787 00:34:59,340 --> 00:35:02,250 keystrokes for humans to type, so they condensed it into this way. 788 00:35:02,250 --> 00:35:05,100 Probably means it's a little more distinguishable, too, 789 00:35:05,100 --> 00:35:07,200 for the computer between the if and the else, too. 790 00:35:07,200 --> 00:35:08,700 But just something to remember, now. 791 00:35:08,700 --> 00:35:10,620 It's, indeed, elif and not else if. 792 00:35:10,620 --> 00:35:11,123 All right. 793 00:35:11,123 --> 00:35:12,540 So what about variables in Python? 794 00:35:12,540 --> 00:35:16,590 I've used a couple of them already, but let's 795 00:35:16,590 --> 00:35:19,533 distill exactly how you define and declare these things, as well. 796 00:35:19,533 --> 00:35:22,200 So, in Scratch, if we wanted to create a variable called counter 797 00:35:22,200 --> 00:35:25,185 and set it equal, initially, to 0, we would do something 798 00:35:25,185 --> 00:35:28,680 like this-- specify that it's an int, use the assignment operator, 799 00:35:28,680 --> 00:35:30,060 end the thought with a semicolon. 800 00:35:30,060 --> 00:35:32,310 In Python, it's just simpler. 
801 00:35:32,310 --> 00:35:34,680 You name the variable, use the assignment operator, 802 00:35:34,680 --> 00:35:37,755 as before, you set it equal to some value, and that's it. 803 00:35:37,755 --> 00:35:38,880 You don't mention the type. 804 00:35:38,880 --> 00:35:41,250 You don't mention the semicolon or anything more. 805 00:35:41,250 --> 00:35:44,250 What if you want to change a variable, like counter, 806 00:35:44,250 --> 00:35:46,320 by 1-- that is, incremented by 1? 807 00:35:46,320 --> 00:35:47,800 You have a few different ways here. 808 00:35:47,800 --> 00:35:51,990 In C, we saw syntax like this, where you can say counter equals counter plus 1, 809 00:35:51,990 --> 00:35:54,900 which, again, feels illogical. 810 00:35:54,900 --> 00:35:56,610 How can counter equal counter plus 1? 811 00:35:56,610 --> 00:36:01,890 But, again, we read this code, really, right to left, updating its value by 1. 812 00:36:01,890 --> 00:36:03,550 In Python, it's almost the same. 813 00:36:03,550 --> 00:36:04,535 You just get rid of the semicolon. 814 00:36:04,535 --> 00:36:05,580 So that logic is there. 815 00:36:05,580 --> 00:36:08,070 But recall, in C, we could do something slightly different 816 00:36:08,070 --> 00:36:09,840 that we can also do in Python. 817 00:36:09,840 --> 00:36:12,060 In Python, you can also, more succinctly, 818 00:36:12,060 --> 00:36:15,420 do this-- plus equals, and then, whatever number you want to add. 819 00:36:15,420 --> 00:36:17,790 Or you can even change it to subtract, if you prefer. 820 00:36:17,790 --> 00:36:21,495 Sadly, gone is something you've probably typed a whole lot. 821 00:36:21,495 --> 00:36:23,940 What was the other way you can add 1? 822 00:36:23,940 --> 00:36:24,773 AUDIENCE: Plus plus? 823 00:36:24,773 --> 00:36:26,940 DAVID MALAN: Plus plus is no more, sadly, in Python. 
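The variable syntax described above can be sketched as follows. The name counter is taken from the lecture; the compile() check at the end is only an illustration added here to show that the C-style increment is rejected.

```python
counter = 0              # no type and no semicolon
counter = counter + 1    # same logic as in C, minus the semicolon
counter += 1             # the more succinct form
counter -= 1             # subtraction works the same way
print(counter)           # prints 1

# The C-style increment is not valid Python; compiling it raises SyntaxError.
try:
    compile("counter++", "<example>", "exec")
    plus_plus_ok = True
except SyntaxError:
    plus_plus_ok = False
print(plus_plus_ok)      # prints False
```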
824 00:36:26,940 --> 00:36:29,600 Just too many ways to do the same thing, so they got rid of it 825 00:36:29,600 --> 00:36:31,705 in favor of just this syntax, here. 826 00:36:31,705 --> 00:36:33,140 So keep that in mind, as well. 827 00:36:33,140 --> 00:36:36,500 What about loops, when you want to do something in Python again and again. 828 00:36:36,500 --> 00:36:39,380 Well, in Scratch, in week zero, here's how we meowed three times, 829 00:36:39,380 --> 00:36:40,700 specifically. 830 00:36:40,700 --> 00:36:42,650 In C, we had a couple of ways of doing this. 831 00:36:42,650 --> 00:36:46,460 This was the more mechanical approach, where you create a variable called i. 832 00:36:46,460 --> 00:36:47,780 You set it equal to 0. 833 00:36:47,780 --> 00:36:51,230 You then do while i is less than 3, the following. 834 00:36:51,230 --> 00:36:54,530 And then, you, yourself increment i again and again. 835 00:36:54,530 --> 00:36:57,920 Mechanical in the sense that you have to implement all of these gears 836 00:36:57,920 --> 00:37:01,130 and make them turn yourself, but this was a correct way to do that. 837 00:37:01,130 --> 00:37:03,740 In Python, we can still achieve the same idea, 838 00:37:03,740 --> 00:37:05,945 but we don't need the int keyword. 839 00:37:05,945 --> 00:37:07,445 We don't need any of the semicolons. 840 00:37:07,445 --> 00:37:08,695 We don't need the parentheses. 841 00:37:08,695 --> 00:37:10,310 We don't need the curly braces. 842 00:37:10,310 --> 00:37:12,200 We can't use the plus plus, so maybe that's 843 00:37:12,200 --> 00:37:14,300 a minor step backwards if you're a fan. 844 00:37:14,300 --> 00:37:17,930 But otherwise, the code, the logic is exactly the same. 845 00:37:17,930 --> 00:37:20,390 But there's other ways to achieve this same idea. 846 00:37:20,390 --> 00:37:22,950 Recall that, in C, we could also do this. 847 00:37:22,950 --> 00:37:25,880 You could use a for loop, which does exactly the same thing. 
848 00:37:25,880 --> 00:37:26,893 Both are correct. 849 00:37:26,893 --> 00:37:28,310 Both are, arguably, well-designed. 850 00:37:28,310 --> 00:37:32,000 It's to each their own when it comes to choosing between these. 851 00:37:32,000 --> 00:37:35,930 In Python, though, we're going to have to think through how to do this. 852 00:37:35,930 --> 00:37:41,300 So you don't do the same for loop as in C. The closest I could come up with 853 00:37:41,300 --> 00:37:44,270 is this, where you say for i-- 854 00:37:44,270 --> 00:37:47,555 or whatever variable you want to do the counting-- in-- literally 855 00:37:47,555 --> 00:37:50,522 the preposition-- and then, you use square brackets here. 856 00:37:50,522 --> 00:37:52,730 And we've used square brackets before, in the context 857 00:37:52,730 --> 00:37:55,370 of arrays and things like that. 858 00:37:55,370 --> 00:38:00,080 And the 0, 1, 2 looks like an array, in some sense, even though we've also seen 859 00:38:00,080 --> 00:38:01,470 arrays with curly braces. 860 00:38:01,470 --> 00:38:03,950 But these square brackets, for now, denote a list. 861 00:38:03,950 --> 00:38:05,420 Python does not have arrays. 862 00:38:05,420 --> 00:38:08,600 An array is that contiguous chunk of memory, back to back to back, 863 00:38:08,600 --> 00:38:13,160 that you have to resize somehow by moving things around in memory, 864 00:38:13,160 --> 00:38:14,450 as per two weeks ago. 865 00:38:14,450 --> 00:38:19,175 In Python, though, you can just create a list like this using square brackets. 866 00:38:19,175 --> 00:38:22,700 And better still, as we'll see, you can add or even remove things 867 00:38:22,700 --> 00:38:24,920 from that list down the road. 868 00:38:24,920 --> 00:38:27,140 This, though, is not going to be very well-designed. 869 00:38:27,140 --> 00:38:28,610 This will work. 870 00:38:28,610 --> 00:38:32,030 This will iterate in Python three times. 
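The two loop forms discussed so far might look like the sketch below. It assumes the loop body simply prints "meow", following the Scratch example; the slides themselves are not in the transcript.

```python
# while loop: you create, test, and increment i yourself
i = 0
while i < 3:
    print("meow")
    i += 1

# for loop over a hard-coded list: i takes the values 0, 1, then 2
for i in [0, 1, 2]:
    print("meow")
```

Both loops print "meow" three times. The list version works, but every value has to be typed out by hand.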
871 00:38:32,030 --> 00:38:34,700 But what might rub you the wrong way about this design, 872 00:38:34,700 --> 00:38:36,860 even if you've never seen Python before? 873 00:38:36,860 --> 00:38:38,460 How does this example not end well? 874 00:38:38,460 --> 00:38:38,960 Yeah? 875 00:38:38,960 --> 00:38:41,810 AUDIENCE: Making a large list [INAUDIBLE]. 876 00:38:41,810 --> 00:38:42,560 DAVID MALAN: Yeah. 877 00:38:42,560 --> 00:38:45,830 If you're making a large list, you have to type out each one of these numbers, 878 00:38:45,830 --> 00:38:50,178 like comma 3, comma 4, comma 5, comma, dot, dot, dot, 50 comma, dot, dot, dot, 879 00:38:50,178 --> 00:38:50,678 500. 880 00:38:50,678 --> 00:38:52,640 Like, surely, that's not the best solution, 881 00:38:52,640 --> 00:38:55,760 to have all of these numbers on the screen, 882 00:38:55,760 --> 00:38:57,140 wrapping endlessly on the screen. 883 00:38:57,140 --> 00:39:01,100 So, in Python, another way to do this would be to use a function 884 00:39:01,100 --> 00:39:04,160 called range, which, technically, is a data type unto itself. 885 00:39:04,160 --> 00:39:08,080 And this returns to you as many values as you ask for it. 886 00:39:08,080 --> 00:39:09,830 range takes some other arguments, as well. 887 00:39:09,830 --> 00:39:14,540 But the simplest use case here is, if you want back the numbers 0, 1, and 2-- 888 00:39:14,540 --> 00:39:15,890 a total of three values-- 889 00:39:15,890 --> 00:39:19,070 you say, hey, Python, please give me a range of three values. 890 00:39:19,070 --> 00:39:21,260 And by default, they start at 0 on up. 891 00:39:21,260 --> 00:39:24,320 But this is more efficient than it would be 892 00:39:24,320 --> 00:39:26,390 to hard code the entire list at once. 893 00:39:26,390 --> 00:39:29,150 And the best metaphor I could come up with is something like this. 894 00:39:29,150 --> 00:39:30,775 Here, for instance, is a deck of cards.
895 00:39:30,775 --> 00:39:34,430 This is normal, human size, and there's presumably 52 cards here. 896 00:39:34,430 --> 00:39:38,728 So writing out 0 through 51 on code would be a little ridiculous 897 00:39:38,728 --> 00:39:39,770 for the reasons you know. 898 00:39:39,770 --> 00:39:44,510 And it would just be very unwieldy and ugly and wrapping in all of that. 899 00:39:44,510 --> 00:39:48,500 It would be the virtual equivalent of me handing you all of these cards at once 900 00:39:48,500 --> 00:39:49,430 to just deal with. 901 00:39:49,430 --> 00:39:52,760 And, right, they're not that big, but it's a lot of cards to hold on to. 902 00:39:52,760 --> 00:39:55,760 It requires a lot of memory or physical storage, if you will. 903 00:39:55,760 --> 00:39:59,840 What range does, metaphorically, is, if you ask me for three cards, 904 00:39:59,840 --> 00:40:04,910 I hand you them one at a time, like this, so that, at any point in time, 905 00:40:04,910 --> 00:40:08,150 you only have one number in the computer's memory 906 00:40:08,150 --> 00:40:09,760 until you're handed the next. 907 00:40:09,760 --> 00:40:11,840 The alternative-- the previous version would 908 00:40:11,840 --> 00:40:15,360 be to hand me all three cards at once, or all 52 cards at once. 909 00:40:15,360 --> 00:40:17,840 But in this case, range is just way more efficient. 910 00:40:17,840 --> 00:40:19,700 You can do range of 1,000. 911 00:40:19,700 --> 00:40:22,940 That's not going to give you a list of 1,000 values all at once. 912 00:40:22,940 --> 00:40:25,910 It's going to give you 1,000 values one at a time, 913 00:40:25,910 --> 00:40:30,800 reducing memory significantly in the computer itself. 914 00:40:30,800 --> 00:40:31,310 All right. 915 00:40:31,310 --> 00:40:34,745 So, besides this, what about doing something forever in Scratch? 916 00:40:34,745 --> 00:40:38,060 Well, we could do this, literally, with a forever block, which didn't quite 917 00:40:38,060 --> 00:40:42,590 exist in C. 
In C, we had to hack it together by saying while True-- 918 00:40:42,590 --> 00:40:46,000 because True is, by definition, T-R-U-E, always true. 919 00:40:46,000 --> 00:40:50,420 So this just deliberately induces an infinite loop for us. 920 00:40:50,420 --> 00:40:53,375 In Python, the logic's going to be almost the same. 921 00:40:53,375 --> 00:40:55,250 And infinite loops in Python tend to actually 922 00:40:55,250 --> 00:40:58,760 be even more common because you can always break out of them, as you could 923 00:40:58,760 --> 00:41:02,280 in C. In Python, it looks like this. 924 00:41:02,280 --> 00:41:05,960 And this is slightly more subtle, but gone are the curly braces. 925 00:41:05,960 --> 00:41:07,370 Gone are the parentheses. 926 00:41:07,370 --> 00:41:10,400 But ever so slight difference, too? 927 00:41:10,400 --> 00:41:13,187 A capital T for True and it's going to be a capital F for False. 928 00:41:13,187 --> 00:41:14,270 Stupid little differences. 929 00:41:14,270 --> 00:41:16,440 Eventually, you're going to mistype one or the other. 930 00:41:16,440 --> 00:41:18,607 But these are the kinds of things to keep an eye out 931 00:41:18,607 --> 00:41:21,770 and to start recognizing in your mind's eye when you read code. 932 00:41:21,770 --> 00:41:25,310 Questions, now, on any of these building blocks? 933 00:41:25,310 --> 00:41:26,075 Yeah? 934 00:41:26,075 --> 00:41:31,360 AUDIENCE: In the for loop, was i set to 0 once for [? every loop? ?] 935 00:41:31,360 --> 00:41:33,970 DAVID MALAN: In the for loop, was i-- 936 00:41:33,970 --> 00:41:37,090 it was set to 0 on the first iteration, then 1 on the next, 937 00:41:37,090 --> 00:41:38,530 then 2 on the third. 938 00:41:38,530 --> 00:41:39,985 And the same thing for range. 939 00:41:39,985 --> 00:41:44,050 It just doesn't use up as much memory all at once. 940 00:41:44,050 --> 00:41:49,860 Other questions, now, on any of these building blocks of Python? 941 00:41:49,860 --> 00:41:50,400 All right. 
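A rough sketch of range and of a deliberate infinite loop, as described above. The sys.getsizeof comparison is an addition made here to illustrate the memory point on CPython; the variable names are hypothetical.

```python
import sys

# range(3) hands out 0, 1, 2 one at a time
for i in range(3):
    print("meow")

print(list(range(3)))    # prints [0, 1, 2]

# A range describes its values without storing them all,
# so it takes less memory than the equivalent list.
lazy_size = sys.getsizeof(range(1000))
list_size = sys.getsizeof(list(range(1000)))
print(lazy_size < list_size)    # prints True

# Deliberate infinite loop: note the capital T in True
count = 0
while True:
    count += 1
    if count == 3:
        break                   # break is the way out
print(count)                    # prints 3
```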
942 00:41:50,400 --> 00:41:53,250 Well, let's go ahead and build something a little more than hello. 943 00:41:53,250 --> 00:41:56,415 Let me propose that, over here, we implement, maybe, 944 00:41:56,415 --> 00:41:58,200 the simplest of calculators here. 945 00:41:58,200 --> 00:42:02,145 So let me go back to VS Code here, open my terminal window 946 00:42:02,145 --> 00:42:06,885 and open up, say, a file called calculator.py. 947 00:42:06,885 --> 00:42:09,000 And in calculator.py, we'll have an opportunity 948 00:42:09,000 --> 00:42:11,340 to explore some of these building blocks, 949 00:42:11,340 --> 00:42:13,890 but we'll allow things to escalate pretty quickly 950 00:42:13,890 --> 00:42:17,225 to more interesting examples so that we can do the same thing, ultimately, 951 00:42:17,225 --> 00:42:17,760 as well. 952 00:42:17,760 --> 00:42:19,510 And, in fact, let me go ahead and do this. 953 00:42:19,510 --> 00:42:22,950 Moreover, I've brought some code with me in advance. 954 00:42:22,950 --> 00:42:25,725 For instance, something called calculator0.c, 955 00:42:25,725 --> 00:42:28,860 from the first week of C. And let me go ahead 956 00:42:28,860 --> 00:42:34,420 and split my window here, in fact, so that I can now do something like this. 957 00:42:34,420 --> 00:42:37,170 Let me move this over here, here. 958 00:42:37,170 --> 00:42:38,105 Calculator.py. 959 00:42:38,105 --> 00:42:40,920 So now, I have, on the left of my screen, calculator.c-- 960 00:42:40,920 --> 00:42:43,620 or calculator0.c because that's the first version I 961 00:42:43,620 --> 00:42:45,690 made-- and calculator.py on the right. 962 00:42:45,690 --> 00:42:48,290 Let me go ahead and implement, really, the same idea here. 963 00:42:48,290 --> 00:42:51,675 So on the right-hand side, the analog of including cs50.h 964 00:42:51,675 --> 00:42:56,390 would be from cs50 import get_int if I want to, indeed, use this function. 
965 00:42:56,390 --> 00:42:58,140 Now, I'm going to go ahead and give myself 966 00:42:58,140 --> 00:43:00,453 a variable x without defining its type. 967 00:43:00,453 --> 00:43:02,370 I'm going to use this get_int function and I'm 968 00:43:02,370 --> 00:43:05,302 going to prompt the user for x, just like in C. 969 00:43:05,302 --> 00:43:08,010 I'm, then, going to go ahead and prompt the user for another int, 970 00:43:08,010 --> 00:43:12,300 like y, here, just like in C. And at the very end, I'm going to go ahead 971 00:43:12,300 --> 00:43:14,640 and do print x plus y. 972 00:43:14,640 --> 00:43:15,690 And that's it. 973 00:43:15,690 --> 00:43:19,020 Now, granted, I have some comments in my C version of the code, 974 00:43:19,020 --> 00:43:21,090 just to remind you of what each line is doing. 975 00:43:21,090 --> 00:43:23,878 But I've still distilled this into six lines-- or, really, four 976 00:43:23,878 --> 00:43:25,170 if I get rid of the blank line. 977 00:43:25,170 --> 00:43:29,580 So it's already, perhaps, a bit tighter here. 978 00:43:29,580 --> 00:43:33,600 It's tighter because something really important, historically, is missing. 979 00:43:33,600 --> 00:43:38,240 What did I seem to omit altogether that we haven't really highlighted yet? 980 00:43:38,240 --> 00:43:39,136 Yeah? 981 00:43:39,136 --> 00:43:40,530 AUDIENCE: [INAUDIBLE] 982 00:43:40,530 --> 00:43:41,280 DAVID MALAN: Yeah. 983 00:43:41,280 --> 00:43:42,910 The main function is gone. 984 00:43:42,910 --> 00:43:45,330 And in fact, maybe you took for granted that it just 985 00:43:45,330 --> 00:43:47,580 worked a moment ago when I wrote hello, but I didn't 986 00:43:47,580 --> 00:43:49,273 have a main function in hello, either. 987 00:43:49,273 --> 00:43:52,440 And this, too, is a feature of Python and a lot of other languages, as well. 
988 00:43:52,440 --> 00:43:55,320 Instead of having to adhere to these long-standing traditions, 989 00:43:55,320 --> 00:43:57,400 if you just want to write code and get something done, fine. 990 00:43:57,400 --> 00:43:59,925 Just write code and get something done without, necessarily, 991 00:43:59,925 --> 00:44:01,185 all of this same boilerplate. 992 00:44:01,185 --> 00:44:04,380 So whatever is in your Python file-- 993 00:44:04,380 --> 00:44:06,510 left indented, if you will, by default-- 994 00:44:06,510 --> 00:44:10,180 is just going to be the code that the interpreter runs, top to bottom, 995 00:44:10,180 --> 00:44:10,850 left to right. 996 00:44:10,850 --> 00:44:14,300 Well, let me go ahead, now, and run code like this. 997 00:44:14,300 --> 00:44:17,470 Let me go ahead and open back up my terminal window, 998 00:44:17,470 --> 00:44:19,140 run python of calculator.py. 999 00:44:19,140 --> 00:44:21,570 And I'll do x is 1, y is 2. 1000 00:44:21,570 --> 00:44:23,460 And as you might expect, it gives me 3. 1001 00:44:23,460 --> 00:44:24,570 Slight aesthetic bug. 1002 00:44:24,570 --> 00:44:26,590 I put my space in the wrong place here. 1003 00:44:26,590 --> 00:44:27,810 So that's a newbie mistake. 1004 00:44:27,810 --> 00:44:29,220 Let me fix that, aesthetically. 1005 00:44:29,220 --> 00:44:31,050 Let me rerun python of calculator.py. 1006 00:44:31,050 --> 00:44:31,680 Type in 1. 1007 00:44:31,680 --> 00:44:32,250 Type in 2. 1008 00:44:32,250 --> 00:44:36,280 And, voila, there is now my same version again. 1009 00:44:36,280 --> 00:44:39,585 But let me propose, now, that we get rid of this training wheel. 1010 00:44:39,585 --> 00:44:41,460 We don't want to keep taking one step forward 1011 00:44:41,460 --> 00:44:43,793 and then two steps back by adding these training wheels, 1012 00:44:43,793 --> 00:44:45,330 so let me instead do this. 
1013 00:44:45,330 --> 00:44:49,590 In my version of calculator.py, suppose that we take away, already, 1014 00:44:49,590 --> 00:44:53,610 the training wheel that is the CS50 library here and let me, 1015 00:44:53,610 --> 00:44:56,910 instead, then, use just Python's built-in function called 1016 00:44:56,910 --> 00:44:59,020 input, which literally does just that. 1017 00:44:59,020 --> 00:45:03,600 It gets input from the user and it stores it, as before, in x and y. 1018 00:45:03,600 --> 00:45:04,950 So this is not CS50-specific. 1019 00:45:04,950 --> 00:45:07,155 This is real-world Python programming. 1020 00:45:07,155 --> 00:45:10,740 Well, let me go ahead and run, again, python of calculator.py. 1021 00:45:10,740 --> 00:45:16,530 And, of course, if x is 1 and y is 2, x plus y should, of course, still be 3. 1022 00:45:16,530 --> 00:45:19,306 1023 00:45:19,306 --> 00:45:24,285 It's apparently 12, according to Python, until CS50's library gets involved. 1024 00:45:24,285 --> 00:45:28,620 But does anyone want to infer what just went wrong? 1025 00:45:28,620 --> 00:45:29,160 Yeah? 1026 00:45:29,160 --> 00:45:32,925 AUDIENCE: We're always [INAUDIBLE]. 1027 00:45:32,925 --> 00:45:33,800 DAVID MALAN: Exactly. 1028 00:45:33,800 --> 00:45:37,660 The input function, by design, always returns a string of text. 1029 00:45:37,660 --> 00:45:39,410 After all, that's what the human typed in. 1030 00:45:39,410 --> 00:45:42,620 And even though, yes, I typed the number keys on the keyboard, 1031 00:45:42,620 --> 00:45:44,600 it's still coming back as all text. 1032 00:45:44,600 --> 00:45:47,090 Now, maybe we should use like a get_int function. 1033 00:45:47,090 --> 00:45:48,575 Well, that doesn't exist in Python. 1034 00:45:48,575 --> 00:45:52,340 All you can do is get textual input-- a string from the user. 1035 00:45:52,340 --> 00:45:54,415 But we can convert one to the other. 
1036 00:45:54,415 --> 00:45:58,610 And so, a fix for this so that we don't accidentally concatenate-- 1037 00:45:58,610 --> 00:46:02,760 that is, join x plus y together-- would be to do something like this. 1038 00:46:02,760 --> 00:46:04,595 Let me go back to my Python code, here. 1039 00:46:04,595 --> 00:46:08,870 And whereas, in C, we could previously do typecasting-- 1040 00:46:08,870 --> 00:46:11,060 we can convert one type to another-- 1041 00:46:11,060 --> 00:46:14,420 that generally wasn't the case when you were doing something complex, 1042 00:46:14,420 --> 00:46:15,470 like a string to an int. 1043 00:46:15,470 --> 00:46:18,450 You could do a char to an int and vice versa. 1044 00:46:18,450 --> 00:46:22,370 But for a string, recall, there was a special function in the C-type library 1045 00:46:22,370 --> 00:46:25,100 called atoi, like ASCII to integer. 1046 00:46:25,100 --> 00:46:27,880 That's the closest analog, here. 1047 00:46:27,880 --> 00:46:29,630 And, in fact, the way to do this in Python 1048 00:46:29,630 --> 00:46:32,740 would be to use a function called int, which, 1049 00:46:32,740 --> 00:46:34,490 indeed, is the name of the data type, too, 1050 00:46:34,490 --> 00:46:36,380 even though I have not yet had to type it. 1051 00:46:36,380 --> 00:46:40,340 And I can convert the output of the input function 1052 00:46:40,340 --> 00:46:44,600 automatically from a string immediately to an int. 1053 00:46:44,600 --> 00:46:48,620 And now, if I go back to my terminal window, rerun python of calculator.py 1054 00:46:48,620 --> 00:46:52,770 with 1 and 2 for x and y, now, I'm back in business. 1055 00:46:52,770 --> 00:46:55,400 So that, then, is, for instance, what the CS50 library 1056 00:46:55,400 --> 00:46:59,420 does, if temporarily this week, is it just deals with the conversion for you.
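The 12-versus-3 behavior can be reproduced with the sketch below. To keep it runnable without a keyboard, it assumes the strings "1" and "2" stand in for what input would return; in the lecture's calculator.py those values come from input itself.

```python
# Stand-ins for what input("x: ") and input("y: ") return when the user types 1 and 2.
x = "1"
y = "2"

print(x + y)              # prints 12, because + joins two strings
print(int(x) + int(y))    # prints 3, because int() converts each string first
```

In the interactive program the conversion wraps the call directly, in the form int(input("x: ")).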
1057 00:46:59,420 --> 00:47:03,500 And, in fact, bad things could happen if I type the wrong thing, 1058 00:47:03,500 --> 00:47:05,615 like dog or cat instead of a number. 1059 00:47:05,615 --> 00:47:08,400 But we'll cross that bridge in just a moment, as well. 1060 00:47:08,400 --> 00:47:08,900 All right. 1061 00:47:08,900 --> 00:47:11,990 What if we do something slightly different, now, with our calculator. 1062 00:47:11,990 --> 00:47:16,400 1063 00:47:16,400 --> 00:47:18,790 Instead of addition, let's do division instead. 1064 00:47:18,790 --> 00:47:23,990 So z equals x divided by y, thereby giving me a third variable z. 1065 00:47:23,990 --> 00:47:27,320 Let me go ahead and run python of calculator.py again. 1066 00:47:27,320 --> 00:47:29,120 I'll type in 1. 1067 00:47:29,120 --> 00:47:31,790 I'll type in 3 this time. 1068 00:47:31,790 --> 00:47:37,470 And what problem do you think we're about to see? 1069 00:47:37,470 --> 00:47:38,400 Or is it gone? 1070 00:47:38,400 --> 00:47:41,670 What happened when I did this in C, albeit with some slightly more 1071 00:47:41,670 --> 00:47:47,680 cryptic syntax, when I divided one number, like 1 divided by 3? 1072 00:47:47,680 --> 00:47:48,600 Anyone recall? 1073 00:47:48,600 --> 00:47:49,100 Yeah? 1074 00:47:49,100 --> 00:47:51,310 AUDIENCE: You would round to the nearest integer. 1075 00:47:51,310 --> 00:47:52,060 DAVID MALAN: Yeah. 1076 00:47:52,060 --> 00:47:55,030 So it would round down to the nearest integer, 1077 00:47:55,030 --> 00:47:57,560 whereby you experience truncation. 1078 00:47:57,560 --> 00:48:00,340 So if you take an integer like 1, you divide it 1079 00:48:00,340 --> 00:48:02,530 by another integer like 3, that technically 1080 00:48:02,530 --> 00:48:06,310 should be 0.33333, infinitely long. 1081 00:48:06,310 --> 00:48:10,297 But in C, recall, you truncate the value. 
1082 00:48:10,297 --> 00:48:12,130 If you divide an int by an int, you get back 1083 00:48:12,130 --> 00:48:14,965 an int, which means you get only the integer part, which was the 0. 1084 00:48:14,965 --> 00:48:18,805 Now, Python actually handles this for us and avoids the truncation. 1085 00:48:18,805 --> 00:48:23,650 But it leaves us, still, with one other problem here, which is going to be, 1086 00:48:23,650 --> 00:48:27,453 for instance, not necessarily visible at a glance. 1087 00:48:27,453 --> 00:48:28,245 This looks correct. 1088 00:48:28,245 --> 00:48:31,780 This has solved the problem in C. So truncation does not happen. 1089 00:48:31,780 --> 00:48:36,010 The integers are automatically converted to a float-- a floating point value. 1090 00:48:36,010 --> 00:48:41,970 But what other problem did we trip over, back in week one? 1091 00:48:41,970 --> 00:48:44,480 1092 00:48:44,480 --> 00:48:49,700 What else got a little dicey when dealing with simple arithmetic? 1093 00:48:49,700 --> 00:48:51,238 Anyone recall? 1094 00:48:51,238 --> 00:48:53,280 Well, the syntax in Python is a little different, 1095 00:48:53,280 --> 00:48:54,780 but let me go ahead and do this. 1096 00:48:54,780 --> 00:48:58,700 It turns out, in Python, if you want to see more significant digits than what 1097 00:48:58,700 --> 00:49:02,360 I'm seeing here by default, which is a dozen or so, let me go ahead 1098 00:49:02,360 --> 00:49:03,715 and print out z as follows. 1099 00:49:03,715 --> 00:49:07,310 Let me first print out a format string because I want to format z 1100 00:49:07,310 --> 00:49:08,780 in an interesting way. 1101 00:49:08,780 --> 00:49:11,330 And notice, this would have no effect on the difference. 1102 00:49:11,330 --> 00:49:14,630 This is just a format string that, for no compelling reason at the moment, 1103 00:49:14,630 --> 00:49:19,280 is interpolating z in those curly braces using an fstring or format string. 
1104 00:49:19,280 --> 00:49:23,390 If I run this again with 1 and 3, we'll see, indeed, the exact same thing. 1105 00:49:23,390 --> 00:49:25,700 But when you use an fstring, you, indeed, 1106 00:49:25,700 --> 00:49:28,460 have the ability to format that string more precisely. 1107 00:49:28,460 --> 00:49:32,930 Just like with %f in Python, you could start to fine-tune how many significant 1108 00:49:32,930 --> 00:49:35,720 digits you see-- 1109 00:49:35,720 --> 00:49:37,070 in C, rather. 1110 00:49:37,070 --> 00:49:40,190 In Python, you can do the same, but the syntax is a little different. 1111 00:49:40,190 --> 00:49:43,925 If you want the computer to interpolate z and show you 1112 00:49:43,925 --> 00:49:47,570 50 significant digits-- that is, 50 numbers 1113 00:49:47,570 --> 00:49:50,033 after the decimal point-- syntax is similar to C, 1114 00:49:50,033 --> 00:49:51,200 but it's a little different. 1115 00:49:51,200 --> 00:49:54,110 You literally put a colon after the variable's name. 1116 00:49:54,110 --> 00:49:59,090 dot 50 means show me the decimal point and, then, 50 digits to the right, 1117 00:49:59,090 --> 00:50:02,760 and the f just indicates please treat this as a floating point value. 1118 00:50:02,760 --> 00:50:05,540 So now, if I rerun python of calculator.py, 1119 00:50:05,540 --> 00:50:11,495 divide 1 by 3, unfortunately, Python has not solved all of the world's problems 1120 00:50:11,495 --> 00:50:12,710 for us. 1121 00:50:12,710 --> 00:50:15,545 This, again, was an example of floating point imprecision. 1122 00:50:15,545 --> 00:50:17,692 So that problem is still latent. 1123 00:50:17,692 --> 00:50:20,150 So just because the world has advanced, doesn't necessarily 1124 00:50:20,150 --> 00:50:22,317 mean that all of our problems from C have gone away. 1125 00:50:22,317 --> 00:50:26,418 There are solutions using third-party libraries for scientific calculations 1126 00:50:26,418 --> 00:50:26,960 and the like. 
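A small sketch of the division and formatting just described, assuming x and y are fixed at 1 and 3 rather than read from the user:

```python
x = 1
y = 3
z = x / y                 # dividing two ints with / gives a float, no truncation

print(z)
print(f"{z}")             # an f-string with no format spec prints the same thing

formatted = f"{z:.50f}"   # colon, then .50f: 50 digits after the decimal point
print(formatted)          # the digits stop being 3s partway through
```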
1127 00:50:26,960 --> 00:50:31,445 But out of the box, floating point imprecision is still an issue. 1128 00:50:31,445 --> 00:50:35,780 Meanwhile, there was one other problem in C 1129 00:50:35,780 --> 00:50:39,890 that we ran into involving numbers, and that was this-- integer overflow. 1130 00:50:39,890 --> 00:50:41,930 Recall that an integer in C only took up, 1131 00:50:41,930 --> 00:50:45,140 what, 32 bits typically, which meant you could count as high as 4 billion 1132 00:50:45,140 --> 00:50:48,140 or, maybe, if you're doing positive and negatives, as high as 2 billion, 1133 00:50:48,140 --> 00:50:50,030 after which, weird things would happen. 1134 00:50:50,030 --> 00:50:54,798 The number would go to 0 or negative or it would overflow or wrap back around. 1135 00:50:54,798 --> 00:50:56,840 Well, wonderfully, in Python, they did, at least, 1136 00:50:56,840 --> 00:51:00,800 address this, whereby you can count as high as you want. 1137 00:51:00,800 --> 00:51:03,830 And Python will just use more and more and more and more 1138 00:51:03,830 --> 00:51:08,000 bits and bytes to store really big numbers so integer overflow is not 1139 00:51:08,000 --> 00:51:09,020 a thing. 1140 00:51:09,020 --> 00:51:13,820 With that said, Python is limited to how many digits it will show you 1141 00:51:13,820 --> 00:51:15,410 on the screen at once as a string. 1142 00:51:15,410 --> 00:51:18,560 But, mathematically, your math will be correct now. 1143 00:51:18,560 --> 00:51:21,860 So we've taken a couple of steps forward, one step sideways. 1144 00:51:21,860 --> 00:51:25,530 But, indeed, we have solved some of our problems here. 1145 00:51:25,530 --> 00:51:26,030 All right. 1146 00:51:26,030 --> 00:51:32,230 Questions, now, on any of these examples thus far? 1147 00:51:32,230 --> 00:51:34,400 Question? 1148 00:51:34,400 --> 00:51:35,000 All right. 1149 00:51:35,000 --> 00:51:40,250 Well, how about another problem that we encountered in C. 
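The point about integer overflow can be checked with a short sketch. The starting value assumes the typical signed 32-bit int from C that the lecture mentions; the name big is hypothetical.

```python
big = 2 ** 31 - 1    # largest value of a signed 32-bit int in C
big += 1             # in C this would overflow; Python just uses more memory
print(big)           # prints 2147483648

huge = 2 ** 100      # far beyond 64 bits, still an ordinary int
print(huge)
```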
Let's 1150 00:51:40,250 --> 00:51:41,720 revisit it here in Python, as well. 1151 00:51:41,720 --> 00:51:43,595 So let me go ahead and, on the left-hand side 1152 00:51:43,595 --> 00:51:54,020 here, let me open up a file called, say, compare3.c on the left, 1153 00:51:54,020 --> 00:51:57,640 and let me go ahead and create a new file on the right called compare.py. 1154 00:51:57,640 --> 00:52:00,070 Because recall that bad things happened when 1155 00:52:00,070 --> 00:52:03,580 we needed to compare two values in C. So on the left, 1156 00:52:03,580 --> 00:52:06,550 here, is a reminder of what we once did in C, 1157 00:52:06,550 --> 00:52:11,230 whereby, if we want to compare values, we can get an int in C, store it in x. 1158 00:52:11,230 --> 00:52:13,450 A get_int in C, store it in y. 1159 00:52:13,450 --> 00:52:16,180 We then have our familiar, conditional logic here, 1160 00:52:16,180 --> 00:52:19,210 just printing out if x is less than y or not. 1161 00:52:19,210 --> 00:52:23,080 Well, we can certainly do the same thing, ultimately, in Python 1162 00:52:23,080 --> 00:52:25,720 by using some fairly familiar syntax. 1163 00:52:25,720 --> 00:52:27,640 And let's just demonstrate this one quickly. 1164 00:52:27,640 --> 00:52:29,500 Let me go over here, too. 1165 00:52:29,500 --> 00:52:34,690 I'll do from cs50 import get_int, even though I could do this, instead, 1166 00:52:34,690 --> 00:52:36,700 with the input function itself. 1167 00:52:36,700 --> 00:52:39,700 x equals get_int, and I'll prompt the user for that. 1168 00:52:39,700 --> 00:52:42,880 y equals get_int, and I'll prompt the user for that. 1169 00:52:42,880 --> 00:52:45,910 After that, recall that I can say, without parentheses, 1170 00:52:45,910 --> 00:52:52,010 if x is less than y, then print out, without the f, "x is less than y."
1171 00:52:52,010 --> 00:52:58,570 Then, I can go ahead and say else if x is greater than y, I can print out, 1172 00:52:58,570 --> 00:53:01,270 quote unquote, "x is greater than y." 1173 00:53:01,270 --> 00:53:05,320 If you'd like to interject now, what did I screw up? 1174 00:53:05,320 --> 00:53:05,820 Anyone? 1175 00:53:05,820 --> 00:53:06,150 Yeah? 1176 00:53:06,150 --> 00:53:06,915 AUDIENCE: Elif. 1177 00:53:06,915 --> 00:53:07,957 DAVID MALAN: Elif, right? 1178 00:53:07,957 --> 00:53:13,965 So elif x is greater than y, else-- this part's the same-- print 1179 00:53:13,965 --> 00:53:18,000 "x is equal to y." 1180 00:53:18,000 --> 00:53:19,805 There's no new logic going on here. 1181 00:53:19,805 --> 00:53:21,960 But, at least syntactically, it's a little cleaner. 1182 00:53:21,960 --> 00:53:25,500 Indeed, this program is only 11 lines long, albeit without any comments. 1183 00:53:25,500 --> 00:53:27,765 Let me go ahead and run python of compare.py. 1184 00:53:27,765 --> 00:53:28,350 Let's see. 1185 00:53:28,350 --> 00:53:30,235 Is 1 less than 2? 1186 00:53:30,235 --> 00:53:30,735 Indeed. 1187 00:53:30,735 --> 00:53:32,070 Let's run it again. 1188 00:53:32,070 --> 00:53:33,330 Is 2 less than 1? 1189 00:53:33,330 --> 00:53:34,890 No, it's greater than. 1190 00:53:34,890 --> 00:53:37,740 And let's, lastly, type in 1 and 1 twice. 1191 00:53:37,740 --> 00:53:38,910 x is equal to y. 1192 00:53:38,910 --> 00:53:42,030 So we've got a pretty side-by-side, one-to-one conversion here. 1193 00:53:42,030 --> 00:53:44,190 Let's do something a little more interesting, then. 1194 00:53:44,190 --> 00:53:48,270 In C, how about I open, instead, something where we actually 1195 00:53:48,270 --> 00:53:49,310 compared for a purpose? 1196 00:53:49,310 --> 00:53:54,150 So if I open up, from earlier in the course-- 1197 00:53:54,150 --> 00:54:00,320 how about agree.c, which prompts the user to agree to something or not?
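The compare.py logic described above can be sketched roughly as follows. To keep the sketch self-contained, the logic sits in a hypothetical helper named compare and the inputs are hard-coded. In the lecture, x and y come from get_int in the cs50 library and the messages are printed directly.

```python
def compare(x, y):
    """Return the message the lecture's compare.py would print."""
    # No parentheses around the condition, and elif rather than "else if".
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


# The three cases tried in the lecture: 1 and 2, 2 and 1, 1 and 1.
print(compare(1, 2))
print(compare(2, 1))
print(compare(1, 1))
```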
1198 00:54:00,320 --> 00:54:03,860 And let me code up a new version here, called agree.py. 1199 00:54:03,860 --> 00:54:06,720 And I'll do this on the right-hand side, with agree.py. 1200 00:54:06,720 --> 00:54:08,830 But on agree.c on the left-- 1201 00:54:08,830 --> 00:54:12,210 notice that this is how we did this yes-no thing in C-- 1202 00:54:12,210 --> 00:54:16,590 we compared c, a character, equal to single quotes 'Y' 1203 00:54:16,590 --> 00:54:18,840 or equal to single quotes little 'y.' 1204 00:54:18,840 --> 00:54:20,430 And then, the same thing for n. 1205 00:54:20,430 --> 00:54:22,470 Now, in Python, this one is actually going 1206 00:54:22,470 --> 00:54:23,960 to be a little bit different, here. 1207 00:54:23,960 --> 00:54:27,310 Let me go ahead and, in the Python version of this, 1208 00:54:27,310 --> 00:54:29,640 let me do something like this. 1209 00:54:29,640 --> 00:54:31,258 We'll use get_string. 1210 00:54:31,258 --> 00:54:31,800 Actually, no. 1211 00:54:31,800 --> 00:54:33,217 We'll just use input in this case. 1212 00:54:33,217 --> 00:54:36,780 So let's do s equals input. 1213 00:54:36,780 --> 00:54:38,940 And we'll ask the user the same thing-- 1214 00:54:38,940 --> 00:54:40,875 Do you agree, question mark. 1215 00:54:40,875 --> 00:54:46,110 Then, let's go ahead and say, if s equals equals-- 1216 00:54:46,110 --> 00:54:48,940 how about Y? 1217 00:54:48,940 --> 00:54:49,740 Huh. 1218 00:54:49,740 --> 00:54:50,758 How do I do this? 1219 00:54:50,758 --> 00:54:51,550 Well, a few things. 1220 00:54:51,550 --> 00:54:54,660 Turns out, I'm going to do this-- s equals equals little y. 1221 00:54:54,660 --> 00:54:57,210 Then, I'm going to go ahead and print out "Agreed." 1222 00:54:57,210 --> 00:55:03,390 And elif s equals equals capital N or s equals equals lowercase n, 1223 00:55:03,390 --> 00:55:05,520 I'm going to go ahead and print out "Not agreed." 
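A rough sketch of this first version of agree.py, assuming the comparisons are joined with the word or as described. The helper name agree is hypothetical and the input is passed in as an argument so the sketch runs without a prompt. In the lecture, s comes from input("Do you agree? ") and the results are printed.

```python
def agree(s):
    """Classify a one-letter answer the way the first agree.py does."""
    # Python spells logical OR as the word "or" rather than ||.
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    # Anything else falls through, just as the lecture version prints nothing.
    return None


print(agree("Y"))
print(agree("n"))
```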
1224 00:55:05,520 --> 00:55:08,820 And I claim, for the moment, that this is identical, now, 1225 00:55:08,820 --> 00:55:13,760 to the program on the left in C. But what's different? 1226 00:55:13,760 --> 00:55:17,280 So we're still doing the same kind of logic, these equal equals 1227 00:55:17,280 --> 00:55:18,780 for comparing for equality. 1228 00:55:18,780 --> 00:55:21,922 But notice that, nicely enough, Python got rid of the two vertical bars, 1229 00:55:21,922 --> 00:55:23,505 and it's just literally the word "or." 1230 00:55:23,505 --> 00:55:27,933 If you recall seeing ampersand ampersand to express a logical and in C, [GRUNTS] 1231 00:55:27,933 --> 00:55:29,850 you can just write, literally, the word "and." 1232 00:55:29,850 --> 00:55:33,390 And so, here's a hint of why Python tends to be pretty popular. 1233 00:55:33,390 --> 00:55:35,640 People just like that it's a little closer to English. 1234 00:55:35,640 --> 00:55:38,520 There's a little less of the cryptic syntax here. 1235 00:55:38,520 --> 00:55:41,850 Now, this is correct, as this code will now work. 1236 00:55:41,850 --> 00:55:45,750 But I've also used double quotes instead of single quotes, 1237 00:55:45,750 --> 00:55:48,780 and I also omitted, a few minutes ago, from my list of data 1238 00:55:48,780 --> 00:55:51,180 types in Python the word "char." 1239 00:55:51,180 --> 00:55:53,430 In Python, there are no chars. 1240 00:55:53,430 --> 00:55:55,320 There are no individual characters. 1241 00:55:55,320 --> 00:55:58,830 If you want to manipulate an individual character, you use a string-- 1242 00:55:58,830 --> 00:56:00,510 that is to say, a str-- 1243 00:56:00,510 --> 00:56:01,680 of size 1. 1244 00:56:01,680 --> 00:56:04,930 Now, in Python, you can use single quotes or double quotes. 1245 00:56:04,930 --> 00:56:06,930 I'm deliberately using double quotes everywhere, 1246 00:56:06,930 --> 00:56:09,715 just for consistency with how we treat strings in C. 
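A small sketch of the point about quotes and characters, using illustrative variable names that are not from the lecture. It assumes only built-in Python behavior: both quote styles produce the same kind of value, and a single character is just a str of length 1.

```python
single = 'y'
double = "y"

# Single and double quotes produce the same string.
print(single == double)

# There is no char type; one character is a str of size 1.
print(type(single).__name__)
print(len(single))

# Indexing into a longer string also hands back a str, not a char.
first = "yes"[0]
print(type(first).__name__)
```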
1247 00:56:09,715 --> 00:56:12,090 It's pretty common, though, to use single quotes instead, 1248 00:56:12,090 --> 00:56:14,190 if only because, on most keyboards, you don't 1249 00:56:14,190 --> 00:56:16,320 have to hold the Shift key anymore. 1250 00:56:16,320 --> 00:56:18,288 Humans have really started to optimize just how 1251 00:56:18,288 --> 00:56:19,830 quickly they want to be able to code. 1252 00:56:19,830 --> 00:56:22,110 So using a single quote tends to be pretty popular 1253 00:56:22,110 --> 00:56:24,270 in Python and other languages, as well. 1254 00:56:24,270 --> 00:56:29,520 They are fundamentally the same, single or double, unlike in C, 1255 00:56:29,520 --> 00:56:30,570 where they have meaning. 1256 00:56:30,570 --> 00:56:33,120 So this is correct, I claim. 1257 00:56:33,120 --> 00:56:34,830 And, in fact, let me run this real quick. 1258 00:56:34,830 --> 00:56:37,090 I'll open up my terminal window here. 1259 00:56:37,090 --> 00:56:40,230 Let me get rid of the version in C and run python of agree.py. 1260 00:56:40,230 --> 00:56:42,420 And I'll type in Y. OK. 1261 00:56:42,420 --> 00:56:44,220 I'll run it again and type in little y. 1262 00:56:44,220 --> 00:56:46,780 And I'll stipulate it's going to work for no, as well. 1263 00:56:46,780 --> 00:56:49,840 But this isn't necessarily the only way we can do this. 1264 00:56:49,840 --> 00:56:52,350 There are other ways to implement the same idea. 1265 00:56:52,350 --> 00:56:57,630 And in fact, I can go about doing this instead. 1266 00:56:57,630 --> 00:56:59,910 Let me go back up to my code here. 1267 00:56:59,910 --> 00:57:03,240 And we saw a hint of this earlier. 1268 00:57:03,240 --> 00:57:06,240 We know that lists exist in Python, and you can create them 1269 00:57:06,240 --> 00:57:08,040 just by using square brackets. 
1270 00:57:08,040 --> 00:57:10,380 So what if I simplify the code a little bit and just 1271 00:57:10,380 --> 00:57:14,940 say if s is in the following list of values-- 1272 00:57:14,940 --> 00:57:17,850 capital Y or lowercase y. 1273 00:57:17,850 --> 00:57:21,090 It's not all that different, logically, but it's a little tighter. 1274 00:57:21,090 --> 00:57:22,440 It's a little more compact. 1275 00:57:22,440 --> 00:57:29,040 So elif s is in capital N or lowercase n, I can express that same idea, too. 1276 00:57:29,040 --> 00:57:32,220 So here, again, it's just getting a little more pleasant to write code. 1277 00:57:32,220 --> 00:57:33,960 There's less hitting of the keyboard. 1278 00:57:33,960 --> 00:57:36,090 You can express yourself a little more succinctly. 1279 00:57:36,090 --> 00:57:40,020 And using the keyword in, Python will figure out 1280 00:57:40,020 --> 00:57:44,370 how to search the entire list for whatever the value of s is. 1281 00:57:44,370 --> 00:57:47,010 And if it finds it, it will return True automatically. 1282 00:57:47,010 --> 00:57:48,230 Else, it will return False. 1283 00:57:48,230 --> 00:57:54,960 So if I run agree.py again and type in capital Y or lowercase y, that still, 1284 00:57:54,960 --> 00:57:55,695 now, works. 1285 00:57:55,695 --> 00:58:00,330 Well, I can tighten this up further if I want to add more features. 1286 00:58:00,330 --> 00:58:04,710 Well, what if I want to support not just big Y and little y, 1287 00:58:04,710 --> 00:58:10,050 but how about "Yes" or "yes" or, in case the user 1288 00:58:10,050 --> 00:58:14,357 is yelling or someone who isn't good with CapsLock types in "YES?" 1289 00:58:14,357 --> 00:58:14,940 Wait a minute. 1290 00:58:14,940 --> 00:58:16,020 But it could be weird. 1291 00:58:16,020 --> 00:58:20,850 Do we want to support this or this? 
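The list-based version described above might look roughly like this. As before, agree is a hypothetical helper that takes the answer as an argument, whereas the lecture reads s with input and prints the result.

```python
def agree(s):
    """Classify an answer using membership tests on lists."""
    # "in" searches the list and evaluates to True or False.
    if s in ["Y", "y"]:
        return "Agreed."
    elif s in ["N", "n"]:
        return "Not agreed."
    return None


print(agree("y"))
print("Y" in ["Y", "y"])
print("YES" in ["Y", "y"])
```

Every extra spelling such as "Yes" or "YES" would need its own entry in the list, which is the tedium being pointed out here.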
1292 00:58:20,850 --> 00:58:23,480 This just gets really tedious, quickly, combinatorially, 1293 00:58:23,480 --> 00:58:25,710 if you consider all of these possible permutations. 1294 00:58:25,710 --> 00:58:27,990 What would be smarter than doing something 1295 00:58:27,990 --> 00:58:30,120 like this, if you want to just be able to tolerate 1296 00:58:30,120 --> 00:58:33,570 "yes" in any form of capitalization? 1297 00:58:33,570 --> 00:58:35,370 Logically, what would be nice? 1298 00:58:35,370 --> 00:58:38,232 AUDIENCE: Maybe, whatever the input is, you just transfer it over 1299 00:58:38,232 --> 00:58:40,357 to all lowercase or uppercase, and then redo it? 1300 00:58:40,357 --> 00:58:41,125 DAVID MALAN: Exactly. 1301 00:58:41,125 --> 00:58:42,042 Super common paradigm. 1302 00:58:42,042 --> 00:58:46,510 Why don't we just force the user's input to all lowercase or all uppercase-- 1303 00:58:46,510 --> 00:58:49,570 doesn't matter, so long as we're self-consistent-- and just compare 1304 00:58:49,570 --> 00:58:52,030 against all uppercase or all lowercase. 1305 00:58:52,030 --> 00:58:55,760 And that will get rid of all of the possible permutations, otherwise. 1306 00:58:55,760 --> 00:58:58,510 Now, in C, we might have done something like this. 1307 00:58:58,510 --> 00:59:01,820 We might have simplified this whole list and just said-- 1308 00:59:01,820 --> 00:59:04,940 let's say we'll do-- 1309 00:59:04,940 --> 00:59:06,220 how about lowercase? 1310 00:59:06,220 --> 00:59:10,490 So y or yes, and we'll just leave it at that. 1311 00:59:10,490 --> 00:59:12,370 But we need to force, now, s to lowercase. 1312 00:59:12,370 --> 00:59:15,970 Well, in C, we would have used the ctype library. 1313 00:59:15,970 --> 00:59:19,660 We would have done tolower and called that function, passing it in. 1314 00:59:19,660 --> 00:59:22,330 Although, not really because, in ctype, those 1315 00:59:22,330 --> 00:59:25,870 operate on individual characters or chars, not whole strings.
1316 00:59:25,870 --> 00:59:29,920 We actually didn't see a function that could convert a whole string in C 1317 00:59:29,920 --> 00:59:31,030 to lowercase. 1318 00:59:31,030 --> 00:59:34,910 But in Python, we're going to benefit from some other feature, as well. 1319 00:59:34,910 --> 00:59:39,330 It turns out that Python supports what's called object-oriented programming. 1320 00:59:39,330 --> 00:59:41,830 And we're only going to scratch the surface of this in CS50. 1321 00:59:41,830 --> 00:59:44,740 But if you take a higher-level course in programming or CS, 1322 00:59:44,740 --> 00:59:46,750 you explore this as a different paradigm. 1323 00:59:46,750 --> 00:59:49,930 Up until now, in C, we've been focusing on what's called, really, 1324 00:59:49,930 --> 00:59:51,025 procedural programming. 1325 00:59:51,025 --> 00:59:52,210 You write procedures. 1326 00:59:52,210 --> 00:59:55,250 You write functions, top to bottom, left to right. 1327 00:59:55,250 --> 00:59:57,790 And when you want to change some value, we 1328 00:59:57,790 --> 01:00:00,550 were in the habit of using a procedure-- that is, a function. 1329 01:00:00,550 --> 01:00:03,670 You would pass something, like a variable, into a function, 1330 01:00:03,670 --> 01:00:07,600 like toupper or tolower, and it would do its thing and hand you back a value. 1331 01:00:07,600 --> 01:00:12,610 Well, it turns out that it would be nicer, programming-wise, if some data 1332 01:00:12,610 --> 01:00:15,250 types just had built-in functionality. 1333 01:00:15,250 --> 01:00:18,220 Why do we have our variables over here and all of our helper functions, 1334 01:00:18,220 --> 01:00:21,010 like toupper and tolower over here, such that we constantly 1335 01:00:21,010 --> 01:00:22,660 have to pass one into the other?
1336 01:00:22,660 --> 01:00:27,590 It would be nice to bake into our data type some built-in functionality 1337 01:00:27,590 --> 01:00:33,267 so that you can change variables using their own, default built-in 1338 01:00:33,267 --> 01:00:33,850 functionality. 1339 01:00:33,850 --> 01:00:37,510 And so, Object-Oriented Programming, otherwise known as OOP, 1340 01:00:37,510 --> 01:00:41,635 is a technique whereby certain types of values, like a string-- 1341 01:00:41,635 --> 01:00:47,230 AKA str-- not only have properties inside of them-- 1342 01:00:47,230 --> 01:00:49,900 attributes, just like a struct in C-- 1343 01:00:49,900 --> 01:00:54,480 your data can also have functions built into them, as well. 1344 01:00:54,480 --> 01:00:57,955 So, whereas in C, which is not object-oriented, you have structs. 1345 01:00:57,955 --> 01:01:01,150 And structs can only store data, like a name and a number 1346 01:01:01,150 --> 01:01:02,620 when implementing a person. 1347 01:01:02,620 --> 01:01:07,210 In Python, you can, for instance, have not just a structure-- 1348 01:01:07,210 --> 01:01:09,010 otherwise known as a class-- 1349 01:01:09,010 --> 01:01:10,930 storing a name and a number. 1350 01:01:10,930 --> 01:01:15,460 You can have a function call that person or email that person 1351 01:01:15,460 --> 01:01:19,510 or actual verbs or actions associated with that piece of data. 1352 01:01:19,510 --> 01:01:21,910 Now, in the context of strings, it turns out 1353 01:01:21,910 --> 01:01:24,565 that strings come with a lot of useful functionality. 1354 01:01:24,565 --> 01:01:28,900 And in fact, at this URL here, which is in docs.python.org, 1355 01:01:28,900 --> 01:01:31,720 which is the official documentation for Python, 1356 01:01:31,720 --> 01:01:34,300 you'll see a whole list of methods-- 1357 01:01:34,300 --> 01:01:37,705 that is, functions-- that come with strings that you can actually 1358 01:01:37,705 --> 01:01:40,150 use to modify their values. 
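To make the idea of data with built-in functions concrete, here is a hypothetical sketch of a class that stores a name and a number and also carries a verb. The class Person, its call method, and the sample values are all invented for illustration and were not shown in the lecture. The last line uses the real str method lower for comparison.

```python
class Person:
    """A structure that holds data and also has behavior attached."""

    def __init__(self, name, number):
        # Attributes, much like the fields of a struct in C.
        self.name = name
        self.number = number

    def call(self):
        # A method: a function built into the piece of data itself.
        return f"Calling {self.name} at {self.number}"


person = Person("Alice", "555-0100")
print(person.name)
print(person.call())

# Strings work the same way: the function lives inside the value.
print("YES".lower())
```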
1359 01:01:40,150 --> 01:01:42,440 And what I mean by this is the following. 1360 01:01:42,440 --> 01:01:44,900 If we go through the documentation, poke around, 1361 01:01:44,900 --> 01:01:48,163 it turns out that strings come with a function called lower. 1362 01:01:48,163 --> 01:01:50,080 And if you want to use that function, you just 1363 01:01:50,080 --> 01:01:54,850 have to use slightly different syntax than in C. You do not do tolower, 1364 01:01:54,850 --> 01:01:59,140 and you do not say, as I just did, lower because this function is 1365 01:01:59,140 --> 01:02:01,150 built into s itself. 1366 01:02:01,150 --> 01:02:05,770 And just like in C, when you want to go inside of a variable, like a structure, 1367 01:02:05,770 --> 01:02:09,790 and access a piece of data inside of it, like name or number, 1368 01:02:09,790 --> 01:02:12,370 when you also have functions built into data types-- 1369 01:02:12,370 --> 01:02:17,530 AKA methods; a method is just a function that is built into a piece of data-- 1370 01:02:17,530 --> 01:02:23,480 you can do s dot lower open paren, closed paren in this case. 1371 01:02:23,480 --> 01:02:25,480 And I can do this down here, as well. 1372 01:02:25,480 --> 01:02:33,280 If s.lower in, quote unquote, "n" or "no", the whole thing, 1373 01:02:33,280 --> 01:02:35,455 I can force this whole thing to lowercase. 1374 01:02:35,455 --> 01:02:38,620 So the only difference here, now, in object-oriented programming, 1375 01:02:38,620 --> 01:02:41,840 instead of constantly passing a value into a function, 1376 01:02:41,840 --> 01:02:45,910 you just access a function that's inside of the value. 1377 01:02:45,910 --> 01:02:48,928 It just works because of how the language itself is defined. 1378 01:02:48,928 --> 01:02:51,220 And the only way you know whether these functions exist 1379 01:02:51,220 --> 01:02:55,495 is the documentation-- a class, a book, a website or the like. 1380 01:02:55,495 --> 01:03:00,490 Questions, now, on this technique?
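A rough sketch of the method-call version described above. The helper name agree is hypothetical and the answer is passed in as an argument, whereas the lecture reads s with input and prints the result. Note that the parentheses after lower are required to actually call the method.

```python
def agree(s):
    """Accept y/yes and n/no in any capitalization."""
    # s.lower() calls the method built into the string itself.
    if s.lower() in ["y", "yes"]:
        return "Agreed."
    elif s.lower() in ["n", "no"]:
        return "Not agreed."
    return None


print(agree("YES"))
print(agree("No"))
```

Calling lower twice like this works, though it repeats the same conversion on each branch.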
1381 01:03:00,490 --> 01:03:01,070 All right. 1382 01:03:01,070 --> 01:03:02,513 I claim this is correct. 1383 01:03:02,513 --> 01:03:05,180 Now, even though you've never programmed, most of you, in Python 1384 01:03:05,180 --> 01:03:07,655 before, not super well-designed. 1385 01:03:07,655 --> 01:03:12,140 There's a subtle inefficiency, now, on lines 3 and 5 together. 1386 01:03:12,140 --> 01:03:18,150 What's dumb about how I've used lower, might you think? 1387 01:03:18,150 --> 01:03:18,720 Yeah? 1388 01:03:18,720 --> 01:03:21,975 AUDIENCE: I feel like, using it twice, you'd just want another [? variable. ?] 1389 01:03:21,975 --> 01:03:22,440 DAVID MALAN: Yeah. 1390 01:03:22,440 --> 01:03:25,482 If you're going to use the same function twice and ask the same question, 1391 01:03:25,482 --> 01:03:29,248 expecting the same answer, why are you calling the function itself twice? 1392 01:03:29,248 --> 01:03:31,415 Maybe we should just store the result in a variable. 1393 01:03:31,415 --> 01:03:33,030 So we could do this in a couple of different ways. 1394 01:03:33,030 --> 01:03:36,360 We, for instance, could go up here and create another variable called t 1395 01:03:36,360 --> 01:03:38,040 and set that equal to s.lower. 1396 01:03:38,040 --> 01:03:41,330 And then, we could just change this to be t, here. 1397 01:03:41,330 --> 01:03:43,080 But honestly, I don't think we technically 1398 01:03:43,080 --> 01:03:45,480 need another variable altogether, here. 1399 01:03:45,480 --> 01:03:47,410 I could just do something like this. 1400 01:03:47,410 --> 01:03:52,360 Let's change the value of s to be the lowercase version thereof. 1401 01:03:52,360 --> 01:03:55,920 And so, now, I can quite simply refer to s again and again like this, 1402 01:03:55,920 --> 01:03:57,550 reusing that same value. 1403 01:03:57,550 --> 01:04:01,380 Now, to be sure, I have now just lost the user's original input.
1404 01:04:01,380 --> 01:04:05,430 And if I care about that-- if they typed in all caps, I have no idea anymore. 1405 01:04:05,430 --> 01:04:08,070 So maybe I do want to use a separate variable, altogether. 1406 01:04:08,070 --> 01:04:10,830 But a takeaway here, too, is that strings in Python 1407 01:04:10,830 --> 01:04:13,590 are technically what we'll call immutable-- 1408 01:04:13,590 --> 01:04:15,640 that is, they cannot be changed. 1409 01:04:15,640 --> 01:04:19,830 This was not true in C. Once we gave you arrays in week two 1410 01:04:19,830 --> 01:04:22,800 or memory in week four, you could go to town on a string 1411 01:04:22,800 --> 01:04:25,780 and change any of the characters you want-- uppercasing, lowercasing, 1412 01:04:25,780 --> 01:04:27,560 changing it, shortening it and so forth. 1413 01:04:27,560 --> 01:04:33,690 But in this case, this returns a copy of s, forced to lowercase. 1414 01:04:33,690 --> 01:04:35,790 It doesn't change the original string-- 1415 01:04:35,790 --> 01:04:38,700 that is, the bytes in the computer's memory. 1416 01:04:38,700 --> 01:04:41,580 When you assign it back to s, you're essentially 1417 01:04:41,580 --> 01:04:43,703 forgetting about the old version of s. 1418 01:04:43,703 --> 01:04:46,620 But because Python does memory management for you-- there's no malloc, 1419 01:04:46,620 --> 01:04:47,820 there's no free-- 1420 01:04:47,820 --> 01:04:52,200 Python automatically frees up the original bytes, like Y-E-S, 1421 01:04:52,200 --> 01:04:54,750 and hands them back to the operating system for you. 1422 01:04:54,750 --> 01:04:55,340 All right. 1423 01:04:55,340 --> 01:04:59,640 Questions, now, on this technique? 1424 01:04:59,640 --> 01:05:02,310 Questions on this? 1425 01:05:02,310 --> 01:05:05,145 In general, I'll call out-- the Python documentation 1426 01:05:05,145 --> 01:05:07,927 will start to be your friend because, in class, we'll only scratch 1427 01:05:07,927 --> 01:05:09,510 the surface with some of these things. 
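A brief sketch of what immutability means in practice, using a hard-coded string in place of user input. The variable names s and t follow the lecture. The try/except is added here only so the sketch keeps running after the failed assignment.

```python
s = "YES"

# lower() returns a new, lowercased copy; the original is untouched.
t = s.lower()
print(s)
print(t)

# Trying to change a character in place is not allowed for a str.
try:
    s[0] = "y"
except TypeError as error:
    print("Cannot modify a str in place:", error)

# Reassigning the name is fine: s now refers to the lowercased copy,
# and Python reclaims the old string's memory on its own.
s = s.lower()
print(s)
```

Keeping t as a separate variable preserves the user's original capitalization, whereas reassigning s discards it.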
1428 01:05:09,510 --> 01:05:12,210 But in docs.python.org, for instance, there's 1429 01:05:12,210 --> 01:05:15,630 a whole reference of all of the built-in functions that come with the language, 1430 01:05:15,630 --> 01:05:18,135 as well as, for instance, those with a string. 1431 01:05:18,135 --> 01:05:19,620 All right. 1432 01:05:19,620 --> 01:05:23,205 Before we take a break, let's go ahead and create something a little familiar 1433 01:05:23,205 --> 01:05:27,030 too based on our weeks here, in C. Let me 1434 01:05:27,030 --> 01:05:30,690 propose that we revisit those examples involving some meows. 1435 01:05:30,690 --> 01:05:34,260 So, for instance, when we had our cat meow back in the first week 1436 01:05:34,260 --> 01:05:37,650 and, then, second in C, we did something that was a little stupid at first 1437 01:05:37,650 --> 01:05:41,960 whereby we created a file, as I'll do here-- this time, called meow.py. 1438 01:05:41,960 --> 01:05:44,550 And if I want a cat to meow three times, I 1439 01:05:44,550 --> 01:05:47,190 could run it once, like this, a little copy-paste. 1440 01:05:47,190 --> 01:05:50,580 And now, python of meow.py, and I'm done. 1441 01:05:50,580 --> 01:05:53,100 Now, we've visited this example two times, at least, 1442 01:05:53,100 --> 01:05:54,690 now in Scratch and in C. 1443 01:05:54,690 --> 01:06:00,080 It's correct, I'll stipulate, but what's, obviously, poorly designed? 1444 01:06:00,080 --> 01:06:01,655 What's the fault here? 1445 01:06:01,655 --> 01:06:02,212 Yeah? 1446 01:06:02,212 --> 01:06:03,670 AUDIENCE: It should just be a loop. 1447 01:06:03,670 --> 01:06:04,990 DAVID MALAN: It should just be a loop, right? 1448 01:06:04,990 --> 01:06:05,990 Why type it three times? 1449 01:06:05,990 --> 01:06:08,560 Literally, copying and pasting is almost always a bad thing-- 1450 01:06:08,560 --> 01:06:11,440 except in C, when you have the function prototypes that you need to borrow. 
1451 01:06:11,440 --> 01:06:13,232 But in this case, this is just inefficient. 1452 01:06:13,232 --> 01:06:15,652 So what could we do better here, in Python? 1453 01:06:15,652 --> 01:06:18,610 Well, in Python, we could probably change this in a few different ways. 1454 01:06:18,610 --> 01:06:21,280 We could borrow some of the syntax we proposed in slide form 1455 01:06:21,280 --> 01:06:23,710 earlier, like give me a variable called i. 1456 01:06:23,710 --> 01:06:26,080 Set it to 0, no semicolon. 1457 01:06:26,080 --> 01:06:29,510 While i is less than 3-- if I want to do this three times-- 1458 01:06:29,510 --> 01:06:31,280 I can go ahead and print out "meow." 1459 01:06:31,280 --> 01:06:33,580 And then, I can do i plus equals 1. 1460 01:06:33,580 --> 01:06:35,080 And I think this would do the trick. 1461 01:06:35,080 --> 01:06:38,650 Python of meow.py, and we're back in business already. 1462 01:06:38,650 --> 01:06:41,463 Well, if I wanted to change this to a for loop, well, in Python, 1463 01:06:41,463 --> 01:06:44,380 it would be a little tighter, but this would not be the best approach. 1464 01:06:44,380 --> 01:06:52,510 So for i in 0, 1, 2, I could just do print "meow", like this. 1465 01:06:52,510 --> 01:06:54,250 And that, too, would get the job done. 1466 01:06:54,250 --> 01:06:58,390 But, to our discussion earlier, this would get stupid pretty quickly 1467 01:06:58,390 --> 01:07:00,970 if you had to keep enumerating all of these values. 1468 01:07:00,970 --> 01:07:03,880 What did we introduce instead? 1469 01:07:03,880 --> 01:07:04,940 The range function. 1470 01:07:04,940 --> 01:07:05,440 Exactly. 1471 01:07:05,440 --> 01:07:09,040 So that hands me back, way more efficiently, just the values I want, 1472 01:07:09,040 --> 01:07:10,635 indeed, one at a time. 1473 01:07:10,635 --> 01:07:14,745 So even this, if I run it a third or fourth time, we've got the same result. 
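The three looping approaches just described can be sketched side by side, assuming each one is meant to print "meow" three times as in the lecture.

```python
# 1. A while loop with a manually managed counter.
i = 0
while i < 3:
    print("meow")
    i += 1

# 2. A for loop over an explicit list of values; this works,
#    but it gets unwieldy as the count grows.
for i in [0, 1, 2]:
    print("meow")

# 3. A for loop over range, which hands back 0, 1, 2 one at a time.
for i in range(3):
    print("meow")
```

All three produce the same output; range avoids having to enumerate the values by hand.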
1474 01:07:14,745 --> 01:07:18,220 But now, let's transition to where we went with this back in the day. 1475 01:07:18,220 --> 01:07:20,650 How can we start to modularize this? 1476 01:07:20,650 --> 01:07:24,100 It would be nice, I claimed, if MIT had given us a meow function. 1477 01:07:24,100 --> 01:07:27,370 Wouldn't it be nice if Python had given us a meow function? 1478 01:07:27,370 --> 01:07:30,580 Maybe less compelling in Python, but how can I build my own function? 1479 01:07:30,580 --> 01:07:33,618 Well, I did this briefly with the spell checker earlier, 1480 01:07:33,618 --> 01:07:36,160 but let me go ahead and propose that we could implement, now, 1481 01:07:36,160 --> 01:07:40,280 our own version of this in Python as follows. 1482 01:07:40,280 --> 01:07:44,050 Let me go ahead and start fresh here and use the keyword def. 1483 01:07:44,050 --> 01:07:47,860 So this did not exist in C. You had the return value, the function 1484 01:07:47,860 --> 01:07:48,850 name, the arguments. 1485 01:07:48,850 --> 01:07:52,120 In Python, you literally say def to define a function. 1486 01:07:52,120 --> 01:07:54,757 You give it a name, like meow. 1487 01:07:54,757 --> 01:07:57,840 And now, I'm going to go ahead and, in this function, just print out meow. 1488 01:07:57,840 --> 01:08:01,460 And this lets me change it to anything else I want in the future. 1489 01:08:01,460 --> 01:08:03,400 But for now, it's an abstraction. 1490 01:08:03,400 --> 01:08:07,773 And in fact, I can move it out of sight, out of mind-- 1491 01:08:07,773 --> 01:08:09,940 just going to hit Enter a bunch of times to pretend, 1492 01:08:09,940 --> 01:08:13,382 now, it exists, but I don't care how it is implemented. 1493 01:08:13,382 --> 01:08:15,340 And up here, now, I can do something like this. 1494 01:08:15,340 --> 01:08:20,590 For i in range of 3, let me go ahead and not print "meow" anymore. 1495 01:08:20,590 --> 01:08:25,359 Let me just call meow and tightening up my code further. 
1496 01:08:25,359 --> 01:08:25,960 Let's see. 1497 01:08:25,960 --> 01:08:26,859 Python of meow.py. 1498 01:08:26,859 --> 01:08:31,240 This is, I think, going to be the first time it does not work correctly. 1499 01:08:31,240 --> 01:08:32,680 OK. 1500 01:08:32,680 --> 01:08:36,310 So here, we have, sadly, our first Python error. 1501 01:08:36,310 --> 01:08:37,569 And let's see. 1502 01:08:37,569 --> 01:08:40,300 The syntax is going to be different from C or Clang's output. 1503 01:08:40,300 --> 01:08:41,920 Traceback is the term of art here. 1504 01:08:41,920 --> 01:08:44,859 This is like a trace back of all of the lines of code 1505 01:08:44,859 --> 01:08:47,560 that were just executed or, really, functions you've called. 1506 01:08:47,560 --> 01:08:49,090 The file name is uninteresting. 1507 01:08:49,090 --> 01:08:52,149 This is my codespace, specifically, but the file name 1508 01:08:52,149 --> 01:08:53,890 is important here-- meow.py. 1509 01:08:53,890 --> 01:08:55,675 Our line 2 is the issue-- 1510 01:08:55,675 --> 01:08:58,060 OK, I didn't get very far before I screwed up-- 1511 01:08:58,060 --> 01:08:59,470 and then, there's a name error. 1512 01:08:59,470 --> 01:09:03,430 And you'll see, in Python, there's typically these capitalized keywords 1513 01:09:03,430 --> 01:09:05,350 that hint at what the issue is. 1514 01:09:05,350 --> 01:09:09,260 It's something related to names of variables. "meow" is not defined. 1515 01:09:09,260 --> 01:09:09,760 All right. 1516 01:09:09,760 --> 01:09:11,635 You're programming Python for the first time. 1517 01:09:11,635 --> 01:09:12,399 You've screwed up. 1518 01:09:12,399 --> 01:09:14,560 You're following some online tutorial. 1519 01:09:14,560 --> 01:09:16,149 You're seeing this. 1520 01:09:16,149 --> 01:09:18,010 Reason through it. 1521 01:09:18,010 --> 01:09:20,680 Why might "meow" not be defined? 1522 01:09:20,680 --> 01:09:24,779 What can we infer about Python?
1523 01:09:24,779 --> 01:09:27,240 How to troubleshoot, logically? 1524 01:09:27,240 --> 01:09:29,147 AUDIENCE: [INAUDIBLE] 1525 01:09:29,147 --> 01:09:29,939 DAVID MALAN: Maybe. 1526 01:09:29,939 --> 01:09:32,520 Is it because "meow" is defined after? 1527 01:09:32,520 --> 01:09:34,890 As smart as Python seems to be, vis-a-vis C, 1528 01:09:34,890 --> 01:09:37,055 they have some similar design characteristics. 1529 01:09:37,055 --> 01:09:37,920 So let's try that. 1530 01:09:37,920 --> 01:09:41,729 So let me scroll all the way back down to where I moved this earlier. 1531 01:09:41,729 --> 01:09:43,649 Let me get rid of it-- 1532 01:09:43,649 --> 01:09:44,279 way down there. 1533 01:09:44,279 --> 01:09:46,410 I'll copy it to my clipboard. 1534 01:09:46,410 --> 01:09:48,180 And let me just hack something together. 1535 01:09:48,180 --> 01:09:49,963 Let me just put it up here. 1536 01:09:49,963 --> 01:09:51,130 And let's see if this works. 1537 01:09:51,130 --> 01:09:54,120 So now, let me clear my terminal, run python of meow.py. 1538 01:09:54,120 --> 01:09:55,110 OK. 1539 01:09:55,110 --> 01:09:56,198 We're back in business. 1540 01:09:56,198 --> 01:09:57,990 So that was actually really good intuition. 1541 01:09:57,990 --> 01:10:00,180 Good debugging technique, just reason through it. 1542 01:10:00,180 --> 01:10:02,430 Now, this is contradicting what I claimed back 1543 01:10:02,430 --> 01:10:05,325 in week one, which was that the main part of your program, 1544 01:10:05,325 --> 01:10:07,470 ideally, should just be at the top of the file. 1545 01:10:07,470 --> 01:10:08,580 Don't make me look for it. 1546 01:10:08,580 --> 01:10:10,497 It's not a huge deal with a four-line program, 1547 01:10:10,497 --> 01:10:13,290 but if you've got 40 lines or 400 lines, you 1548 01:10:13,290 --> 01:10:15,480 don't want the juicy part of your program 1549 01:10:15,480 --> 01:10:18,455 to be way down here, and all of these functions way up here. 
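The failure and the quick fix described above can be reproduced in one small sketch. The try/except is an addition for illustration so that the sketch keeps running. In the lecture, the program simply stopped with a traceback.

```python
# Calling meow before Python has seen its definition fails.
try:
    for i in range(3):
        meow()
except NameError as error:
    message = str(error)
    print("NameError:", message)


# Once the definition has been executed, the same loop works.
def meow():
    print("meow")


for i in range(3):
    meow()
```

Python reads the file top to bottom, so a name only exists after the line that defines it has run.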
1550 01:10:18,455 --> 01:10:22,085 So it would be nice, maybe, if we actually have a main function. 1551 01:10:22,085 --> 01:10:25,260 And so, it actually turns out to be a convention in Python 1552 01:10:25,260 --> 01:10:27,460 to define a main function. 1553 01:10:27,460 --> 01:10:30,720 It's not a special function that's automatically called, like in C. 1554 01:10:30,720 --> 01:10:32,340 But humans realized, you know what? 1555 01:10:32,340 --> 01:10:34,120 That was a pretty useful feature. 1556 01:10:34,120 --> 01:10:36,540 Let me define a function called main. 1557 01:10:36,540 --> 01:10:39,000 Let me indent these lines underneath it. 1558 01:10:39,000 --> 01:10:41,070 Let me practice what I'm preaching, which is put 1559 01:10:41,070 --> 01:10:43,290 the main code at the top of the file. 1560 01:10:43,290 --> 01:10:47,730 And, wonderfully, in Python, now, you do not need prototypes. 1561 01:10:47,730 --> 01:10:49,920 There's none of that hackish copying and pasting 1562 01:10:49,920 --> 01:10:52,462 of the return type, the name and the arguments to a function, 1563 01:10:52,462 --> 01:10:58,485 like we needed in C. This is now OK instead, except for one, minor detail. 1564 01:10:58,485 --> 01:11:01,290 Let me go ahead and run python of meow.py. 1565 01:11:01,290 --> 01:11:05,940 Hopefully, now, I've solved this problem by having [GROANS] a main function. 1566 01:11:05,940 --> 01:11:08,170 But now, nothing has happened. 1567 01:11:08,170 --> 01:11:08,670 All right. 1568 01:11:08,670 --> 01:11:12,200 Even if you've never programmed in Python before, 1569 01:11:12,200 --> 01:11:17,855 what might explain this behavior, and how do I fix? 1570 01:11:17,855 --> 01:11:20,730 Again, when you're off in the real world, learning some new language, 1571 01:11:20,730 --> 01:11:23,790 all you have is deductive logic to debug. 1572 01:11:23,790 --> 01:11:24,300 Yeah? 1573 01:11:24,300 --> 01:11:28,656 AUDIENCE: I remember in C, even though we [INAUDIBLE].. 
1574 01:11:28,656 --> 01:11:31,708 1575 01:11:31,708 --> 01:11:32,500 DAVID MALAN: Right. 1576 01:11:32,500 --> 01:11:34,510 So the solution, to be clear, in C was that we 1577 01:11:34,510 --> 01:11:35,650 had to put the prototype up here. 1578 01:11:35,650 --> 01:11:36,790 Otherwise, we'd get an error message. 1579 01:11:36,790 --> 01:11:39,123 In this case, I'm actually not getting an error message. 1580 01:11:39,123 --> 01:11:42,610 And, indeed, I'll claim that you don't need the prototypes in Python. 1581 01:11:42,610 --> 01:11:46,910 Just not necessary because that was annoying, if nothing else. 1582 01:11:46,910 --> 01:11:48,820 But what else might explain? 1583 01:11:48,820 --> 01:11:49,570 Yeah, in the back? 1584 01:11:49,570 --> 01:11:51,030 AUDIENCE: [INAUDIBLE] 1585 01:11:51,030 --> 01:11:51,780 DAVID MALAN: Yeah. 1586 01:11:51,780 --> 01:11:53,880 Maybe you have to call main itself. 1587 01:11:53,880 --> 01:11:58,410 If main is not some special status in Python, maybe just because it exists 1588 01:11:58,410 --> 01:11:59,040 isn't enough. 1589 01:11:59,040 --> 01:12:02,580 And, indeed, if you want to call main, the new convention 1590 01:12:02,580 --> 01:12:05,460 is actually going to be-- as the very last line of your program, 1591 01:12:05,460 --> 01:12:07,350 typically-- to literally call main. 1592 01:12:07,350 --> 01:12:10,950 It's a little stupid-looking, but they made a design decision. 1593 01:12:10,950 --> 01:12:13,200 And this is how, now, we work around it. 1594 01:12:13,200 --> 01:12:14,610 Python of meow.py. 1595 01:12:14,610 --> 01:12:16,890 Now we're back in business. 1596 01:12:16,890 --> 01:12:19,560 But now, logically, why does this work the way it does? 1597 01:12:19,560 --> 01:12:22,320 Well, in this case-- top to bottom-- 1598 01:12:22,320 --> 01:12:25,350 line 1 is telling Python to define a function called main 1599 01:12:25,350 --> 01:12:27,660 and, then, define it as follows, lines 2 and 3. 
1600 01:12:27,660 --> 01:12:29,610 But it's not calling main yet. 1601 01:12:29,610 --> 01:12:33,210 Line 6 is telling Python how to define a function called meow, 1602 01:12:33,210 --> 01:12:35,580 but it's not calling these lines yet. 1603 01:12:35,580 --> 01:12:38,730 Now, on line 10, you're telling Python, call main. 1604 01:12:38,730 --> 01:12:41,310 And at that point, Python has been trained, if you will, 1605 01:12:41,310 --> 01:12:45,390 to know what main is on line 1, to know what meow is on line 6. 1606 01:12:45,390 --> 01:12:49,650 And so, it's now perfectly OK for main to be above meow 1607 01:12:49,650 --> 01:12:51,150 because you never called them yet. 1608 01:12:51,150 --> 01:12:54,340 You defined, defined, and then, you called. 1609 01:12:54,340 --> 01:12:56,380 And that's the logic behind this. 1610 01:12:56,380 --> 01:13:01,250 Any questions, now, on the structure of this technique, here? 1611 01:13:01,250 --> 01:13:03,000 Now, let's do one more, then. 1612 01:13:03,000 --> 01:13:07,740 Recall that the last thing we did in Scratch and in C was to, 1613 01:13:07,740 --> 01:13:10,940 actually, parameterize these same functions. 1614 01:13:10,940 --> 01:13:14,070 So suppose that you don't want main to be responsible for the loop here. 1615 01:13:14,070 --> 01:13:17,580 You instead want to, very simply, do something like "meow" three times 1616 01:13:17,580 --> 01:13:18,660 and be done with it. 1617 01:13:18,660 --> 01:13:21,427 Well, in Python, it's going to be similar in spirit to C. 1618 01:13:21,427 --> 01:13:23,760 But, again, we don't need to keep mentioning data types. 1619 01:13:23,760 --> 01:13:26,310 If you want "meow" to take some argument-- 1620 01:13:26,310 --> 01:13:27,930 like a number n-- 1621 01:13:27,930 --> 01:13:30,792 you can just specify n as the name of that argument. 1622 01:13:30,792 --> 01:13:33,250 Or you can call it anything else, of course, that you want. 
1623 01:13:33,250 --> 01:13:35,700 You don't have to specify int or anything else. 1624 01:13:35,700 --> 01:13:40,890 In your code, now, inside of meow, you can do something like for i in, 1625 01:13:40,890 --> 01:13:41,670 let's say-- 1626 01:13:41,670 --> 01:13:45,690 I definitely, now, can't do this because that would be weird, to start the list 1627 01:13:45,690 --> 01:13:46,590 and end it with n. 1628 01:13:46,590 --> 01:13:49,360 So, if I can come back over here, what's the solution? 1629 01:13:49,360 --> 01:13:51,270 How can I do something n times? 1630 01:13:51,270 --> 01:13:52,410 AUDIENCE: [INAUDIBLE] 1631 01:13:52,410 --> 01:13:53,160 DAVID MALAN: Yeah. 1632 01:13:53,160 --> 01:13:54,340 Using range. 1633 01:13:54,340 --> 01:13:58,140 So range is nice because I can pass in, now, this variable n. 1634 01:13:58,140 --> 01:13:59,940 And now, I can meow-- whoops. 1635 01:13:59,940 --> 01:14:03,195 Now I can print out, quote unquote, "meow." 1636 01:14:03,195 --> 01:14:05,820 So it's almost the same as in Scratch, almost the same as in C. 1637 01:14:05,820 --> 01:14:06,903 But it's a little simpler. 1638 01:14:06,903 --> 01:14:12,210 And if, now, I run meow.py, I'll have the ability, now, to do this here, 1639 01:14:12,210 --> 01:14:13,110 as well. 1640 01:14:13,110 --> 01:14:13,770 All right. 1641 01:14:13,770 --> 01:14:16,590 Questions on any of this? 1642 01:14:16,590 --> 01:14:19,800 Right now, we're taking this stroll through week one. 1643 01:14:19,800 --> 01:14:22,050 We're going to, momentarily, escalate things 1644 01:14:22,050 --> 01:14:24,840 to look not only at some of these basics, 1645 01:14:24,840 --> 01:14:27,390 but also, other features, like we saw with face recognition 1646 01:14:27,390 --> 01:14:28,920 with the speller or the like. 1647 01:14:28,920 --> 01:14:31,962 Because of how many of us are here, we have a huge amount of candy 1648 01:14:31,962 --> 01:14:32,670 out in the lobby.
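The define-then-call structure described above, with meow taking its count as a parameter, can be sketched roughly as follows. This is a reconstruction from the spoken description, not the exact file shown on screen, so its line numbers will not match the ones mentioned in the lecture.

```python
def main():
    # main sits at the top of the file; nothing runs until main is called
    meow(3)


def meow(n):
    # No type is declared for n; range(n) makes the loop body run n times
    for i in range(n):
        print("meow")


# Both functions have been defined by this point, so calling main is safe
main()
```

Removing the last line reproduces the "nothing has happened" behavior: the functions would be defined but never called.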
1649 01:14:32,670 --> 01:14:34,440 So why don't we go ahead and take a 10-minute break? 1650 01:14:34,440 --> 01:14:37,230 And when we come back, we'll do even fancier, more powerful things 1651 01:14:37,230 --> 01:14:38,595 with Python in 10. 1652 01:14:38,595 --> 01:14:40,020 All right. 1653 01:14:40,020 --> 01:14:41,730 So we are back. 1654 01:14:41,730 --> 01:14:44,280 Among our goals, now, are to introduce a few more building 1655 01:14:44,280 --> 01:14:47,880 blocks so that we can solve more interesting problems at the end, 1656 01:14:47,880 --> 01:14:49,560 much like those that we began with. 1657 01:14:49,560 --> 01:14:52,830 You'll recall, from a few weeks ago, we played with this two-dimensional Super 1658 01:14:52,830 --> 01:14:53,670 Mario world. 1659 01:14:53,670 --> 01:14:57,380 And we tried to print a vertical column of three or more bricks. 1660 01:14:57,380 --> 01:15:00,210 Well, let me propose that we use this as an opportunity to, now, 1661 01:15:00,210 --> 01:15:02,880 tinker with some of Python's more useful, more 1662 01:15:02,880 --> 01:15:04,470 user-friendly functionality, as well. 1663 01:15:04,470 --> 01:15:09,265 So let me code a file called mario.py, and let's just print out 1664 01:15:09,265 --> 01:15:10,890 the equivalent of that vertical column. 1665 01:15:10,890 --> 01:15:12,690 So it's of height 3. 1666 01:15:12,690 --> 01:15:16,740 Each one is a hash, so let's do for i in range of 3 initially, 1667 01:15:16,740 --> 01:15:18,600 and let's just print out a single hash. 1668 01:15:18,600 --> 01:15:21,790 And I think, now, python of mario.py-- 1669 01:15:21,790 --> 01:15:22,290 voila. 1670 01:15:22,290 --> 01:15:27,480 We're in business, printing out just that same column there. 1671 01:15:27,480 --> 01:15:31,110 What if, though, we want to print a column of some variable height 1672 01:15:31,110 --> 01:15:33,510 where the user tells us how tall they want it to be? 
1673 01:15:33,510 --> 01:15:39,600 Well, let me go up here, for instance and, instead, how about-- 1674 01:15:39,600 --> 01:15:40,920 let's do this. 1675 01:15:40,920 --> 01:15:45,210 How about from cs50 import? 1676 01:15:45,210 --> 01:15:47,620 How about the get_int function, as before? 1677 01:15:47,620 --> 01:15:50,430 So it will deal with making sure the user gives us an integer. 1678 01:15:50,430 --> 01:15:54,750 And now, in the past, whenever we wanted to get a number from a user, 1679 01:15:54,750 --> 01:15:56,780 we've actually followed a certain paradigm. 1680 01:15:56,780 --> 01:16:02,895 In fact, if I open up here, for instance, 1681 01:16:02,895 --> 01:16:06,630 how about mario1.c from a while back, you 1682 01:16:06,630 --> 01:16:11,430 might recall that we had code like this. 1683 01:16:11,430 --> 01:16:13,800 And we specifically use the do while loop in C 1684 01:16:13,800 --> 01:16:16,410 whenever we want to get something from the user, 1685 01:16:16,410 --> 01:16:18,858 maybe, again and again and again, until they cooperate. 1686 01:16:18,858 --> 01:16:20,900 At which point, we finally break out of the loop. 1687 01:16:20,900 --> 01:16:22,830 So it turns out, Python does have while loops, 1688 01:16:22,830 --> 01:16:25,698 does have for loops, does not have do while loops. 1689 01:16:25,698 --> 01:16:27,990 And yet, pretty much any time you've gotten user input, 1690 01:16:27,990 --> 01:16:30,100 you've probably used this paradigm. 1691 01:16:30,100 --> 01:16:33,930 So it turns out that the Python equivalent of this is to do, 1692 01:16:33,930 --> 01:16:36,450 similar in spirit, but using only a while loop. 
1693 01:16:36,450 --> 01:16:39,300 And a common paradigm in Python, as I alluded earlier, 1694 01:16:39,300 --> 01:16:43,440 is to actually deliberately induce an infinite loop while True-- 1695 01:16:43,440 --> 01:16:48,240 capital T-- and then, do what you want to do, like get an int from the user 1696 01:16:48,240 --> 01:16:51,690 and prompt them for the height, for instance, in question. 1697 01:16:51,690 --> 01:16:56,070 And then, if you're sure that the user has given you what you want-- 1698 01:16:56,070 --> 01:16:59,220 like n is greater than 0, which is what I want, in this case, 1699 01:16:59,220 --> 01:17:02,610 because I want a positive integer; otherwise, there's nothing to print-- 1700 01:17:02,610 --> 01:17:04,505 you literally just break out of the loop. 1701 01:17:04,505 --> 01:17:08,070 And so, we could actually use this technique in C. It's just not 1702 01:17:08,070 --> 01:17:10,260 really done in C. You could absolutely, in C, 1703 01:17:10,260 --> 01:17:13,590 have done a while True loop with the parentheses, lowercase true. 1704 01:17:13,590 --> 01:17:15,670 You could break out of it, and so forth. 1705 01:17:15,670 --> 01:17:18,312 But in Python, this is the Python way. 1706 01:17:18,312 --> 01:17:19,770 And this is actually a term of art. 1707 01:17:19,770 --> 01:17:24,017 This way in Python is pythonic. This is "the way everyone does it," 1708 01:17:24,017 --> 01:17:24,600 quote unquote. 1709 01:17:24,600 --> 01:17:28,830 Doesn't mean you have to, but that's the way the cool Python programmers would 1710 01:17:28,830 --> 01:17:31,980 implement an idea like this-- trying to do something again and again 1711 01:17:31,980 --> 01:17:34,607 and again until the user actually cooperates. 1712 01:17:34,607 --> 01:17:36,690 But all we've done is take away the do while loop. 1713 01:17:36,690 --> 01:17:39,790 But still, logically, we can implement the same idea.
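A minimal sketch of the while True idiom just described, standing in for C's do while loop. Two things here are assumptions rather than lecture code: int(input(...)) is used in place of the CS50 library's get_int, and the ask parameter exists only so the loop can be exercised without a keyboard.

```python
def print_column(ask=input):
    # Deliberately loop forever, then break once the answer is acceptable
    while True:
        n = int(ask("Height: "))
        if n > 0:
            break  # the user cooperated, so leave the loop

    # Print a vertical column of n hashes, one per line
    for i in range(n):
        print("#")
```

Calling print_column() with no argument prompts at the terminal and keeps asking until a positive integer is typed.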
1714 01:17:39,790 --> 01:17:44,580 Now, below this, let me go ahead and just print out, for i in range of n 1715 01:17:44,580 --> 01:17:47,370 this time-- because I want it to be variable and not 3. 1716 01:17:47,370 --> 01:17:49,920 I can go ahead and print out the hash-- 1717 01:17:49,920 --> 01:17:52,260 let me go ahead and get rid of the C version here-- 1718 01:17:52,260 --> 01:17:55,920 open my terminal window and I'll run, again, Python of mario.py. 1719 01:17:55,920 --> 01:17:58,530 I'll type in 3 and I get back those three hashes. 1720 01:17:58,530 --> 01:18:02,635 But if I, instead, type in 4, I now get four hashes instead. 1721 01:18:02,635 --> 01:18:04,640 So the takeaway here is, quite simply, that this 1722 01:18:04,640 --> 01:18:08,030 would be the way, for instance, to actually get back 1723 01:18:08,030 --> 01:18:11,615 a value in Python that is consistent with some parameter, 1724 01:18:11,615 --> 01:18:13,160 like greater than 0. 1725 01:18:13,160 --> 01:18:13,950 How about this? 1726 01:18:13,950 --> 01:18:17,810 Let's actually practice what we preached a moment ago with our meowing examples 1727 01:18:17,810 --> 01:18:19,830 and factoring all this out. 1728 01:18:19,830 --> 01:18:23,220 Let me go ahead and define a main function, as before. 1729 01:18:23,220 --> 01:18:25,190 Let me go ahead and assume, for the moment, 1730 01:18:25,190 --> 01:18:28,673 that a get_height function exists, which is not a thing in Python. 1731 01:18:28,673 --> 01:18:30,340 I'm going to invent it in just a moment. 1732 01:18:30,340 --> 01:18:33,620 And now, I'm going to go ahead and do something like this. for i 1733 01:18:33,620 --> 01:18:39,470 in the range of that height, well, let's go ahead and print out those hashes. 1734 01:18:39,470 --> 01:18:41,760 So I'm assuming that get_height exists. 1735 01:18:41,760 --> 01:18:44,725 Let me go ahead and implement that abstraction, so define a function, 1736 01:18:44,725 --> 01:18:46,100 now, called get_height. 
1737 01:18:46,100 --> 01:18:48,830 It's not going to take any arguments in this design. 1738 01:18:48,830 --> 01:18:52,820 While True, I can go ahead and do the same thing as before-- 1739 01:18:52,820 --> 01:18:55,880 assign a variable n, the return value of get_int 1740 01:18:55,880 --> 01:18:58,140 prompting the user for that height. 1741 01:18:58,140 --> 01:19:03,980 And then, if n is greater than 0, I can go ahead and break. 1742 01:19:03,980 --> 01:19:08,390 But if I break here, I, logically-- just like in C-- 1743 01:19:08,390 --> 01:19:11,360 end up executing below the loop in question. 1744 01:19:11,360 --> 01:19:12,690 But there's nothing there. 1745 01:19:12,690 --> 01:19:16,820 But if I want get_height to return the height, what should 1746 01:19:16,820 --> 01:19:18,650 I type here on line 14, logically? 1747 01:19:18,650 --> 01:19:21,580 1748 01:19:21,580 --> 01:19:23,380 What do I want to return, to be clear? 1749 01:19:23,380 --> 01:19:23,995 AUDIENCE: [INAUDIBLE] 1750 01:19:23,995 --> 01:19:24,745 DAVID MALAN: Yeah. 1751 01:19:24,745 --> 01:19:26,890 So I actually want to return n. 1752 01:19:26,890 --> 01:19:30,880 And here's another curiosity of Python, vis-a-vis C. 1753 01:19:30,880 --> 01:19:33,670 There doesn't seem to be an issue of scope anymore, right? 1754 01:19:33,670 --> 01:19:37,180 In C, it was super important to not only declare your variables with the data 1755 01:19:37,180 --> 01:19:39,550 types, you also had to be mindful of where they exist-- 1756 01:19:39,550 --> 01:19:41,200 inside of those curly braces. 1757 01:19:41,200 --> 01:19:45,238 In Python, it turns out you can be a little looser with things, for better 1758 01:19:45,238 --> 01:19:45,780 or for worse. 1759 01:19:45,780 --> 01:19:50,020 And so, on line 11, if I create a variable called n, 1760 01:19:50,020 --> 01:19:57,170 it exists on line 11, 12 and even 13, outside of the while loop. 
1761 01:19:57,170 --> 01:19:59,710 So to be clear, in C, with a while loop, we 1762 01:19:59,710 --> 01:20:03,040 would have ordinarily had not a colon. 1763 01:20:03,040 --> 01:20:05,920 We would have had the curly brace, like here and over here. 1764 01:20:05,920 --> 01:20:08,770 And a week ago, I would have claimed that, in C, n 1765 01:20:08,770 --> 01:20:12,130 does not exist outside of the while loop, by nature of those curly braces. 1766 01:20:12,130 --> 01:20:15,250 Even though the curly braces are gone, Python actually 1767 01:20:15,250 --> 01:20:20,685 allows you to use a variable any time after you have assigned it a value. 1768 01:20:20,685 --> 01:20:23,625 So slightly more powerful, as such. 1769 01:20:23,625 --> 01:20:26,830 However, I can tighten this up a little bit, logically. 1770 01:20:26,830 --> 01:20:30,700 And this is true in C. I don't really need to break out of the loop 1771 01:20:30,700 --> 01:20:32,020 by using break. 1772 01:20:32,020 --> 01:20:36,070 Recall that or know that I can actually-- once I'm ready to go, 1773 01:20:36,070 --> 01:20:40,030 I can just return the value I care about, even inside of the loop. 1774 01:20:40,030 --> 01:20:43,000 And that will have the side effect of breaking me out of the loop 1775 01:20:43,000 --> 01:20:46,590 and, also, breaking me out of and returning from the entire function. 1776 01:20:46,590 --> 01:20:50,470 So nothing too new here, in terms of C versus Python, except for this issue 1777 01:20:50,470 --> 01:20:51,490 with scope. 1778 01:20:51,490 --> 01:20:53,770 And I, indeed, returned n at the bottom there, 1779 01:20:53,770 --> 01:20:56,360 just to make clear that n would still exist. 1780 01:20:56,360 --> 01:20:58,170 So either of those are correct. 1781 01:20:58,170 --> 01:21:02,350 Now, I just have a Python program that I think 1782 01:21:02,350 --> 01:21:05,590 is going to allow me to implement this same Mario idea. 1783 01:21:05,590 --> 01:21:07,450 So let's run python of mario.py. 
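The two equivalent ways of handing n back from get_height, as discussed above, might look like this. The name get_height_with_break is hypothetical and used only to show both variants side by side; int(input(...)) again stands in for get_int, and ask is a testing convenience.

```python
def get_height_with_break(ask=input):
    while True:
        n = int(ask("Height: "))
        if n > 0:
            break
    # n was first assigned inside the loop, yet it is still usable here
    return n


def get_height(ask=input):
    while True:
        n = int(ask("Height: "))
        if n > 0:
            # Returning ends the loop and the function in one step
            return n
```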
1784 01:21:07,450 --> 01:21:09,820 And-- OK, so nothing happened. 1785 01:21:09,820 --> 01:21:13,390 Python of mario.py. 1786 01:21:13,390 --> 01:21:14,260 What did I do wrong? 1787 01:21:14,260 --> 01:21:14,965 AUDIENCE: [INAUDIBLE] 1788 01:21:14,965 --> 01:21:16,590 DAVID MALAN: Yeah, I have to call main. 1789 01:21:16,590 --> 01:21:19,720 So, at the bottom of my code, I have to call main here. 1790 01:21:19,720 --> 01:21:22,720 And this is a stylistic detail that's been subtle. 1791 01:21:22,720 --> 01:21:26,050 Generally speaking, when you are writing in Python, 1792 01:21:26,050 --> 01:21:28,360 there's not a CS50 style guide, per se. 1793 01:21:28,360 --> 01:21:33,700 There's actually a Python style guide that most people adhere to. 1794 01:21:33,700 --> 01:21:37,480 And in this case, double blank lines between functions is the norm. 1795 01:21:37,480 --> 01:21:41,890 I'm doing that deliberately, although it might, otherwise, not be obvious. 1796 01:21:41,890 --> 01:21:45,130 But now that I've called main on line 16, let's run mario.py once more. 1797 01:21:45,130 --> 01:21:46,690 Aha. 1798 01:21:46,690 --> 01:21:47,560 Now we see it. 1799 01:21:47,560 --> 01:21:51,730 Type in 3, and I'm back in business, printing out the values there. 1800 01:21:51,730 --> 01:21:52,330 Yeah? 1801 01:21:52,330 --> 01:21:54,146 AUDIENCE: Why do you [INAUDIBLE]? 1802 01:21:54,146 --> 01:21:56,120 Why can't [INAUDIBLE]? 1803 01:21:56,120 --> 01:21:56,870 DAVID MALAN: Sure. 1804 01:21:56,870 --> 01:21:58,453 Why do I need the if condition at all? 1805 01:21:58,453 --> 01:22:02,390 Why can't I just return n here as by doing return n. 1806 01:22:02,390 --> 01:22:06,890 Or if I really want to be succinct, I could technically just do this. 1807 01:22:06,890 --> 01:22:09,512 The only reason I added the if condition is 1808 01:22:09,512 --> 01:22:11,720 because, if the user types in negative 1, negative 2, 1809 01:22:11,720 --> 01:22:13,850 I wanted to prompt them again and again. 
1810 01:22:13,850 --> 01:22:14,390 That's all. 1811 01:22:14,390 --> 01:22:17,660 But that would be totally acceptable, too, if you were OK with that result 1812 01:22:17,660 --> 01:22:18,630 instead. 1813 01:22:18,630 --> 01:22:21,170 Well, let me do one other thing here to point out 1814 01:22:21,170 --> 01:22:23,870 why we are using get_int so frequently. 1815 01:22:23,870 --> 01:22:26,030 This new training wheel, albeit temporarily. 1816 01:22:26,030 --> 01:22:28,490 So let me go back to the way it was a moment ago 1817 01:22:28,490 --> 01:22:32,510 and let me propose, now, to take away get_int. 1818 01:22:32,510 --> 01:22:35,840 I claimed earlier that, if you're not using get_int, 1819 01:22:35,840 --> 01:22:40,400 you can just use the input function itself from Python. 1820 01:22:40,400 --> 01:22:43,250 But that always returns a string, or a str. 1821 01:22:43,250 --> 01:22:48,110 And so, recall that you have to pass the output of the input function to an int, 1822 01:22:48,110 --> 01:22:51,930 either on the same line or, if you prefer, on another line, instead. 1823 01:22:51,930 --> 01:22:54,110 But it turns out what I didn't do was show 1824 01:22:54,110 --> 01:22:59,250 you what happens if you don't cooperate with the program. 1825 01:22:59,250 --> 01:23:02,540 So if I run python of mario.py now, works great, even 1826 01:23:02,540 --> 01:23:04,252 without the get_int function. 1827 01:23:04,252 --> 01:23:05,210 And I can do it with 4. 1828 01:23:05,210 --> 01:23:06,575 Still works great. 1829 01:23:06,575 --> 01:23:09,122 But let me clear my terminal and be difficult, now, 1830 01:23:09,122 --> 01:23:11,330 as the user and type in "cat" for the height instead. 1831 01:23:11,330 --> 01:23:12,560 Enter. 1832 01:23:12,560 --> 01:23:14,540 Now, we see one of those tracebacks again. 1833 01:23:14,540 --> 01:23:15,900 This one is different. 1834 01:23:15,900 --> 01:23:18,780 This isn't a name error, but, apparently, a value error.
1835 01:23:18,780 --> 01:23:20,870 And if I ignore the stuff I don't understand, 1836 01:23:20,870 --> 01:23:24,440 I can see "invalid literal for int with base 10-- "cat."" 1837 01:23:24,440 --> 01:23:27,800 That's a super cryptic way of saying that C-A-T is not 1838 01:23:27,800 --> 01:23:29,640 a number in decimal notation. 1839 01:23:29,640 --> 01:23:32,600 And so, I would seem to have to, somehow, handle this case. 1840 01:23:32,600 --> 01:23:34,490 And if you want to be more curious, you'll 1841 01:23:34,490 --> 01:23:36,350 see that this is, indeed, a traceback. 1842 01:23:36,350 --> 01:23:40,100 And C tends to do this, too, or the debugger would do this for you, too. 1843 01:23:40,100 --> 01:23:41,960 You can see all of the functions that have 1844 01:23:41,960 --> 01:23:43,502 been called to get you to this point. 1845 01:23:43,502 --> 01:23:48,170 So apparently, my problem is, initially, in line 14. 1846 01:23:48,170 --> 01:23:50,375 But line 14, if I keep scrolling, is uninteresting. 1847 01:23:50,375 --> 01:23:51,410 It's main. 1848 01:23:51,410 --> 01:23:55,820 But line 14 leads me to execute line 2, which is, indeed, in main. 1849 01:23:55,820 --> 01:23:59,225 That leads me to execute line 9, which is in get_height. 1850 01:23:59,225 --> 01:24:00,880 And so, OK, here is the issue. 1851 01:24:00,880 --> 01:24:02,960 So the closest line number to the error message 1852 01:24:02,960 --> 01:24:05,360 is the one that probably reveals the most. 1853 01:24:05,360 --> 01:24:06,950 Line 9 is where my issue is. 1854 01:24:06,950 --> 01:24:10,940 So I can't just blindly ask the user for input and, then, convert it to an int 1855 01:24:10,940 --> 01:24:12,620 if they're not going to give me an int. 1856 01:24:12,620 --> 01:24:13,870 Now, how do we deal with this? 
1857 01:24:13,870 --> 01:24:16,010 Well, back in problem set two, you might recall 1858 01:24:16,010 --> 01:24:18,380 validating that the user typed in a number 1859 01:24:18,380 --> 01:24:19,862 and using a for loop and the like. 1860 01:24:19,862 --> 01:24:22,445 Well, it turns out, there's a better way to do this in Python, 1861 01:24:22,445 --> 01:24:24,830 and the semantics are there. 1862 01:24:24,830 --> 01:24:29,600 If you want to try to convert something to a number that might not actually 1863 01:24:29,600 --> 01:24:32,780 be a number, turns out, Python and certain other languages 1864 01:24:32,780 --> 01:24:35,060 literally have a keyword called try. 1865 01:24:35,060 --> 01:24:37,820 And if only this existed for the past few weeks, I know. 1866 01:24:37,820 --> 01:24:40,583 But you can try to do the following with your code. 1867 01:24:40,583 --> 01:24:41,750 What do I want to try to do? 1868 01:24:41,750 --> 01:24:46,980 Well, I want to try to execute those few lines, except if there's an error. 1869 01:24:46,980 --> 01:24:50,225 So I can say except if there's a value error-- specifically, 1870 01:24:50,225 --> 01:24:53,065 the one I screwed up and created a moment ago. 1871 01:24:53,065 --> 01:24:56,480 And if there is a value error, I can print out an informative message 1872 01:24:56,480 --> 01:25:00,920 to the user, like "not an integer" or anything else. 1873 01:25:00,920 --> 01:25:05,270 And what's happening here, now, is literally this operative word, try. 1874 01:25:05,270 --> 01:25:09,920 Python is going to try to get input and try to convert it to an int, 1875 01:25:09,920 --> 01:25:12,470 and it's going to try to check if it's greater than 0 1876 01:25:12,470 --> 01:25:14,750 and then try to return it. 1877 01:25:14,750 --> 01:25:15,467 Why? 
1878 01:25:15,467 --> 01:25:17,300 Three of those lines are inside of, indented 1879 01:25:17,300 --> 01:25:20,780 underneath the try block, except if something goes wrong-- 1880 01:25:20,780 --> 01:25:23,540 specifically, a value error happens. 1881 01:25:23,540 --> 01:25:24,560 Then, it prints this. 1882 01:25:24,560 --> 01:25:26,110 But it doesn't return anything. 1883 01:25:26,110 --> 01:25:30,335 And because I'm in a loop, that means it's going to do it again and again 1884 01:25:30,335 --> 01:25:33,980 and again until the human actually cooperates and gives me 1885 01:25:33,980 --> 01:25:35,360 an actual number. 1886 01:25:35,360 --> 01:25:38,210 And so, this, too, is what the world would call pythonic. 1887 01:25:38,210 --> 01:25:41,420 In Python, you don't, necessarily, rigorously try to validate 1888 01:25:41,420 --> 01:25:43,940 the user's input, make sure they haven't screwed up. 1889 01:25:43,940 --> 01:25:46,160 You honestly take a more lackadaisical approach 1890 01:25:46,160 --> 01:25:50,300 and just try to do something, but catch an error if it happens. 1891 01:25:50,300 --> 01:25:53,720 So catch is also a term of art, even though it's not a keyword here. 1892 01:25:53,720 --> 01:25:55,760 Except if something happens, you handle it. 1893 01:25:55,760 --> 01:25:57,470 So you try and you handle it. 1894 01:25:57,470 --> 01:25:59,480 You best-effort programming, if you will. 1895 01:25:59,480 --> 01:26:04,200 But this is baked into the mindset of the Python programming community. 1896 01:26:04,200 --> 01:26:08,630 So now, if I do python of mario.py and I cooperate, works great as before. 1897 01:26:08,630 --> 01:26:09,830 Try and succeed. 1898 01:26:09,830 --> 01:26:10,670 3 works. 1899 01:26:10,670 --> 01:26:11,345 4 works. 1900 01:26:11,345 --> 01:26:17,243 If, though, I try and fail by typing in "cat," it doesn't crash, per se. 1901 01:26:17,243 --> 01:26:18,410 It doesn't show me an error. 
1902 01:26:18,410 --> 01:26:20,695 It shows me something more user-friendly, like "not an integer." 1903 01:26:20,695 --> 01:26:22,610 And then, I can try again with "dog." 1904 01:26:22,610 --> 01:26:23,390 "Not an integer." 1905 01:26:23,390 --> 01:26:24,980 I can try again with 5. 1906 01:26:24,980 --> 01:26:26,240 And now, it works. 1907 01:26:26,240 --> 01:26:28,160 So we won't, generally, have you write much 1908 01:26:28,160 --> 01:26:30,500 in the way of these try-except blocks, only because they 1909 01:26:30,500 --> 01:26:33,080 get a little sophisticated quickly. 1910 01:26:33,080 --> 01:26:35,777 But that is to reveal what the get_int function is doing. 1911 01:26:35,777 --> 01:26:37,610 This is why we give you the training wheels, 1912 01:26:37,610 --> 01:26:39,420 so that, when you want to get an int, you 1913 01:26:39,420 --> 01:26:41,990 don't have to jump through all these annoying hoops to do so. 1914 01:26:41,990 --> 01:26:45,965 But that's all the library's really doing for you, is just try and except. 1915 01:26:45,965 --> 01:26:48,980 You won't be left with any training wheels, ultimately. 1916 01:26:48,980 --> 01:26:52,760 Questions, now, on getting input and trying in this way? 1917 01:26:52,760 --> 01:26:55,433 1918 01:26:55,433 --> 01:26:56,100 Anything at all? 1919 01:26:56,100 --> 01:26:56,610 Yeah? 1920 01:26:56,610 --> 01:27:03,643 AUDIENCE: I'm still [INAUDIBLE] try block. 1921 01:27:03,643 --> 01:27:06,560 DAVID MALAN: Oh, could you put the condition outside of the try block? 1922 01:27:06,560 --> 01:27:07,310 Short answer, yes. 1923 01:27:07,310 --> 01:27:09,227 And, in fact, I struggled with this last night 1924 01:27:09,227 --> 01:27:11,750 when tweaking this example to show the simplest version. 1925 01:27:11,750 --> 01:27:17,180 I will disclaim that, really, I should only be trying, literally, 1926 01:27:17,180 --> 01:27:18,470 to do the fragile part. 
1927 01:27:18,470 --> 01:27:21,710 And then, down here, I should be really doing 1928 01:27:21,710 --> 01:27:24,380 what you're proposing, which is do the condition out here. 1929 01:27:24,380 --> 01:27:27,380 The problem is, though, that, logically, this gets messy quickly, right? 1930 01:27:27,380 --> 01:27:31,205 Because except if there's a value error, I want to print out "not an integer." 1931 01:27:31,205 --> 01:27:33,920 I can't compare n against 0, then, because n doesn't 1932 01:27:33,920 --> 01:27:35,752 exist because there was an error. 1933 01:27:35,752 --> 01:27:37,460 So it turns out-- and I'll show you this; 1934 01:27:37,460 --> 01:27:39,350 this is now the advanced version of Python-- 1935 01:27:39,350 --> 01:27:42,620 there's actually an else keyword you can use in Python 1936 01:27:42,620 --> 01:27:44,570 that does not accompany if or elif. 1937 01:27:44,570 --> 01:27:48,680 It accompanies try and except, which I think is weirdly confusing. 1938 01:27:48,680 --> 01:27:50,640 A different word would have been better. 1939 01:27:50,640 --> 01:27:53,692 But if you'd really prefer, I could have done this, instead. 1940 01:27:53,692 --> 01:27:56,900 And this is one of these design things where reasonable people will disagree. 1941 01:27:56,900 --> 01:27:58,775 Generally speaking, you should only try to do 1942 01:27:58,775 --> 01:28:00,980 the one line that might very well fail. 1943 01:28:00,980 --> 01:28:02,420 But honestly, this looks stupid. 1944 01:28:02,420 --> 01:28:04,850 No, it's just unnecessarily complicated. 1945 01:28:04,850 --> 01:28:08,560 And so, my own preference was actually the original, which was-- yeah, 1946 01:28:08,560 --> 01:28:10,310 I'm trying a few extra lines that, really, 1947 01:28:10,310 --> 01:28:11,973 aren't going to fail, mathematically. 1948 01:28:11,973 --> 01:28:12,890 But it's just tighter. 1949 01:28:12,890 --> 01:28:14,030 It's cleaner this way. 
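Both designs debated above can be sketched as follows: the first wraps the whole attempt in try, the second tries only the fragile conversion and moves the comparison into an else clause. The name get_height_else is hypothetical, and the ask parameter is an assumption added so the functions can be run without typing.

```python
def get_height(ask=input):
    # Version preferred in the lecture: everything sits inside try
    while True:
        try:
            n = int(ask("Height: "))
            if n > 0:
                return n
        except ValueError:
            # int() could not parse the text; report it and loop again
            print("not an integer")


def get_height_else(ask=input):
    # Alternative: try only the line that can fail
    while True:
        try:
            n = int(ask("Height: "))
        except ValueError:
            print("not an integer")
        else:
            # Runs only when no exception occurred, so n is guaranteed to exist
            if n > 0:
                return n
```

The two behave the same for every input; the difference is purely how much code is placed under try.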
1950 01:28:14,030 --> 01:28:16,580 And here's, again, the sort of arguments you'll 1951 01:28:16,580 --> 01:28:18,530 start to make yourself as you get more comfortable with programming. 1952 01:28:18,530 --> 01:28:19,280 You'll have an opinion. 1953 01:28:19,280 --> 01:28:20,488 You'll disagree with someone. 1954 01:28:20,488 --> 01:28:25,200 And so long as you can back your argument up, it's pretty reasonable, probably. 1955 01:28:25,200 --> 01:28:25,700 All right. 1956 01:28:25,700 --> 01:28:30,222 So how about we, now, take away some piece of magic 1957 01:28:30,222 --> 01:28:31,430 that's been here for a while. 1958 01:28:31,430 --> 01:28:33,950 Let me go ahead and delete all of this here. 1959 01:28:33,950 --> 01:28:38,855 And let me propose that we revisit not that vertical column and the exceptions 1960 01:28:38,855 --> 01:28:42,110 that might result from getting input, but these horizontal question marks 1961 01:28:42,110 --> 01:28:43,130 that we saw a while ago. 1962 01:28:43,130 --> 01:28:45,980 So I want all of those question marks on the same line. 1963 01:28:45,980 --> 01:28:48,860 And yet, I worry we're about to see a challenge here because print, 1964 01:28:48,860 --> 01:28:51,830 up until now, has been putting new lines everywhere automatically, 1965 01:28:51,830 --> 01:28:53,570 even without those backslash n's. 1966 01:28:53,570 --> 01:28:56,360 Well, let me propose that we do this. 1967 01:28:56,360 --> 01:28:58,130 for i in the range of 4. 1968 01:28:58,130 --> 01:29:02,165 If I want four question marks, let me just print four question marks. 1969 01:29:02,165 --> 01:29:04,370 Unfortunately, I don't think this is correct yet. 1970 01:29:04,370 --> 01:29:06,530 Let me run python of mario.py. 1971 01:29:06,530 --> 01:29:11,510 And, of course, this gives me a column instead of the row of question marks 1972 01:29:11,510 --> 01:29:12,630 that I want. 1973 01:29:12,630 --> 01:29:13,550 So how do we do this?
1974 01:29:13,550 --> 01:29:17,785 Well, it turns out, if you read the documentation for the print function, 1975 01:29:17,785 --> 01:29:19,910 it turns out that print, not surprisingly, perhaps, 1976 01:29:19,910 --> 01:29:22,000 takes a lot of different arguments, as well. 1977 01:29:22,000 --> 01:29:24,590 And in fact, if you go to the documentation for it, 1978 01:29:24,590 --> 01:29:27,650 you'll see that it takes not just positional 1979 01:29:27,650 --> 01:29:30,685 arguments-- that is, from left to right, separated by commas. 1980 01:29:30,685 --> 01:29:32,810 It turns out, Python supports a fancier feature 1981 01:29:32,810 --> 01:29:36,860 with arguments where you can pass the names of arguments to functions, too. 1982 01:29:36,860 --> 01:29:38,470 So what do I mean by this? 1983 01:29:38,470 --> 01:29:43,430 If I go back to VS Code here and I've read the documentation, 1984 01:29:43,430 --> 01:29:48,995 it turns out that, yes, as before, you can pass multiple arguments to Python, 1985 01:29:48,995 --> 01:29:49,700 like this. 1986 01:29:49,700 --> 01:29:53,030 Hello comma David comma Malan, that will just automatically 1987 01:29:53,030 --> 01:29:56,553 concatenate all three of those positional arguments together. 1988 01:29:56,553 --> 01:29:59,720 They're positional in the sense that they literally flow from left to right, 1989 01:29:59,720 --> 01:30:01,238 separated by commas. 1990 01:30:01,238 --> 01:30:03,530 But if you don't want to just pass in values like that, 1991 01:30:03,530 --> 01:30:07,370 you want to actually print out, as I did before, a question mark. 1992 01:30:07,370 --> 01:30:11,240 But you want to override the default behavior of print 1993 01:30:11,240 --> 01:30:14,610 by changing the line ending, you can actually do this. 1994 01:30:14,610 --> 01:30:18,890 You can use the name of an argument that you know exists from the documentation 1995 01:30:18,890 --> 01:30:22,130 and set it equal to some alternative value. 
1996 01:30:22,130 --> 01:30:24,770 And in fact, even though this looks cryptic, 1997 01:30:24,770 --> 01:30:30,380 this is how I would override the end of each line, to be quote, unquote. 1998 01:30:30,380 --> 01:30:32,900 That is nothing because, if you read the documentation, 1999 01:30:32,900 --> 01:30:37,190 the default value for this end argument-- does someone want to guess-- 2000 01:30:37,190 --> 01:30:38,750 is-- 2001 01:30:38,750 --> 01:30:39,800 is backslash n. 2002 01:30:39,800 --> 01:30:41,690 So if you read the documentation, you'll see 2003 01:30:41,690 --> 01:30:46,550 that backslash n is the implied default for this end argument. 2004 01:30:46,550 --> 01:30:49,810 And so, if you want to change it, you just say end equals something else. 2005 01:30:49,810 --> 01:30:57,057 And so, here, I can change it to nothing and, now, rerun python of mario.py. 2006 01:30:57,057 --> 01:30:58,640 And now, they're all in the same line. 2007 01:30:58,640 --> 01:31:01,190 Now, it looks a little stupid because I made that week 2008 01:31:01,190 --> 01:31:04,190 one mistake where I still need to move the cursor to the next line. 2009 01:31:04,190 --> 01:31:05,570 That's just a different problem. 2010 01:31:05,570 --> 01:31:07,612 I'm just going to go over here and print nothing. 2011 01:31:07,612 --> 01:31:10,550 I don't even need to print backslash n because, if print automatically 2012 01:31:10,550 --> 01:31:13,970 gives you a backslash n, just call print with nothing, 2013 01:31:13,970 --> 01:31:15,420 and you'll get that for free. 2014 01:31:15,420 --> 01:31:16,940 So let me rerun python of mario.py. 2015 01:31:16,940 --> 01:31:19,895 And now, it looks a little prettier at the prompt. 2016 01:31:19,895 --> 01:31:21,770 And to be super clear as to what's going on-- 2017 01:31:21,770 --> 01:31:24,300 suppose I want to make an exclamation here. 
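The end override just described might look like the sketch below. Wrapping the loop in a function named `print_row` is an assumption made for illustration; the lecture writes the loop at the top level of mario.py.

```python
def print_row(n):
    """Print n question marks on a single line."""
    for i in range(n):
        # end is a named argument of print; its default is "\n".
        # Passing end="" keeps the cursor on the same line.
        print("?", end="")
    # A bare print() emits only the default newline.
    print()


print_row(4)
```

Replacing `end=""` with another string, such as `end="!"`, places that string after every question mark instead.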
2018 01:31:24,300 --> 01:31:27,320 I could change the backslash n default to an exclamation point, 2019 01:31:27,320 --> 01:31:28,680 just for kicks. 2020 01:31:28,680 --> 01:31:31,550 And if I run python of mario.py Again, now, I 2021 01:31:31,550 --> 01:31:36,662 get this exclamation with question marks and exclamation points, as well. 2022 01:31:36,662 --> 01:31:38,120 So that's all that's going on here. 2023 01:31:38,120 --> 01:31:40,670 And this is what's called a named argument. 2024 01:31:40,670 --> 01:31:43,670 It literally has a name that you can specify when calling it in. 2025 01:31:43,670 --> 01:31:47,787 And it's different from positional in that you're literally using the name. 2026 01:31:47,787 --> 01:31:49,370 Let me propose something else, though. 2027 01:31:49,370 --> 01:31:50,828 And this is why people like Python. 2028 01:31:50,828 --> 01:31:52,550 There's just cool ways to do things. 2029 01:31:52,550 --> 01:31:55,724 2030 01:31:55,724 --> 01:32:00,740 That's a three-line, verbose way of printing out four question marks. 2031 01:32:00,740 --> 01:32:04,002 I could certainly take the shortcut and just do this. 2032 01:32:04,002 --> 01:32:06,085 But that's not really that interesting for anyone, 2033 01:32:06,085 --> 01:32:08,720 especially if I want to do it a variable number of times. 2034 01:32:08,720 --> 01:32:10,390 But Python does let you do this. 2035 01:32:10,390 --> 01:32:15,110 If you want to multiply a character some number of times, 2036 01:32:15,110 --> 01:32:18,020 not only can you use plus for concatenation, 2037 01:32:18,020 --> 01:32:23,930 you can use star or an asterisk for multiplication, if you will-- that is, 2038 01:32:23,930 --> 01:32:26,250 concatenation again and again and again. 2039 01:32:26,250 --> 01:32:29,030 So if I just print out, quote unquote, "?" 
2040 01:32:29,030 --> 01:32:34,190 times 4, that's actually going to be the tightest way, the most succinct way 2041 01:32:34,190 --> 01:32:36,020 I can print four question marks instead. 2042 01:32:36,020 --> 01:32:39,095 And if I don't use 4, I use n, where I get n from the user. 2043 01:32:39,095 --> 01:32:39,830 Bang. 2044 01:32:39,830 --> 01:32:42,320 Now, I've gotten rid of the for loop entirely, 2045 01:32:42,320 --> 01:32:48,000 and I'm using the star operator to manipulate it instead. 2046 01:32:48,000 --> 01:32:50,120 And, to be super clear here, insofar as Python 2047 01:32:50,120 --> 01:32:54,440 does not have malloc or free or memory management that you have to do, 2048 01:32:54,440 --> 01:32:56,060 guess what Python also doesn't have. 2049 01:32:56,060 --> 01:32:59,760 2050 01:32:59,760 --> 01:33:03,110 Anything on your minds in the past couple of weeks? 2051 01:33:03,110 --> 01:33:03,875 Doesn't have-- 2052 01:33:03,875 --> 01:33:04,853 AUDIENCE: Pointers. 2053 01:33:04,853 --> 01:33:06,020 DAVID MALAN: Pointers, yeah. 2054 01:33:06,020 --> 01:33:09,295 So Python does not have pointers, which just means that all of that 2055 01:33:09,295 --> 01:33:11,420 happens for you automatically, underneath the hood, 2056 01:33:11,420 --> 01:33:14,150 again, by way of code that someone else wrote. 2057 01:33:14,150 --> 01:33:15,950 How about one more throwback with Mario? 2058 01:33:15,950 --> 01:33:20,450 We've talked about, in week one, this two-dimensional structure where 2059 01:33:20,450 --> 01:33:24,302 it's like I claim 3 by 3-- a grid of bricks, if you will. 2060 01:33:24,302 --> 01:33:25,760 Well, how can we do this in Python? 2061 01:33:25,760 --> 01:33:27,590 We can do this in a couple of ways, now. 
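The star operator on strings, mentioned a moment ago, can be sketched like this. The function name `row` is hypothetical; the lecture simply prints the expression directly.

```python
def row(n):
    """Build a row of n question marks without a loop."""
    # Multiplying a str by an int repeats it, so "?" * 4 is "????".
    return "?" * n


print(row(4))
```

Because `n` is an ordinary variable, the same line works for any count obtained from the user.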
2062 01:33:27,590 --> 01:33:32,810 Let me go back to my mario.py, and let me do something like for i in range 2063 01:33:32,810 --> 01:33:36,200 of-- we'll just do 3, even though I know, now, I could use get_int 2064 01:33:36,200 --> 01:33:38,453 or I could use input and int. 2065 01:33:38,453 --> 01:33:41,120 And if I want to do something two-dimensionally, just like in C, 2066 01:33:41,120 --> 01:33:42,590 you can nest your for loops. 2067 01:33:42,590 --> 01:33:45,980 So maybe I could do for j in range of 3. 2068 01:33:45,980 --> 01:33:50,690 And then, in here, I could print out a hash symbol. 2069 01:33:50,690 --> 01:33:53,210 And then, let's see if that gives me 9 total. 2070 01:33:53,210 --> 01:33:56,870 So if I've got a nested loop like this, python of mario.py 2071 01:33:56,870 --> 01:33:58,625 hopefully gives me a grid. 2072 01:33:58,625 --> 01:34:01,710 No, it gave me a column of 9. 2073 01:34:01,710 --> 01:34:09,280 Why, logically, even though I've got my row and my columns? 2074 01:34:09,280 --> 01:34:10,210 Yeah. 2075 01:34:10,210 --> 01:34:11,542 AUDIENCE: [INAUDIBLE] 2076 01:34:11,542 --> 01:34:13,000 DAVID MALAN: Yeah, the line ending. 2077 01:34:13,000 --> 01:34:17,380 So in my row, I can't let print just keep adding new line, adding new line. 2078 01:34:17,380 --> 01:34:20,740 So I just have to override this here and let me not screw up like before. 2079 01:34:20,740 --> 01:34:24,250 Let me print one at the end of the whole row, just to move the cursor down. 2080 01:34:24,250 --> 01:34:28,090 And I think, now, together, we've got our 3 by 3. 2081 01:34:28,090 --> 01:34:29,950 Of course, we could tighten this up further. 2082 01:34:29,950 --> 01:34:33,730 If I don't like the nested loop, I probably could go in here 2083 01:34:33,730 --> 01:34:37,975 and just print out, for instance, a brick times 3. 2084 01:34:37,975 --> 01:34:41,055 Or I could change the 3 to a variable if I've gotten it from the user. 
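Both versions of the grid described above can be sketched side by side. The function names `print_grid_nested` and `print_grid_repeat` and the `size` parameter are assumptions for illustration; the lecture hard-codes 3.

```python
def print_grid_nested(size):
    """Print a size-by-size grid using nested loops."""
    for i in range(size):
        for j in range(size):
            # Suppress the newline so the bricks stay on one row.
            print("#", end="")
        # One newline at the end of each row moves the cursor down.
        print()


def print_grid_repeat(size):
    """Print the same grid with string repetition replacing the inner loop."""
    for i in range(size):
        print("#" * size)


print_grid_nested(3)
print_grid_repeat(3)
```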
2085 01:34:41,055 --> 01:34:42,582 So I can tighten this up further. 2086 01:34:42,582 --> 01:34:45,790 So, again, just different ways to solve the same problem and, again, evidence 2087 01:34:45,790 --> 01:34:47,575 of why a lot of people like Python. 2088 01:34:47,575 --> 01:34:49,825 There's just some more pleasant ways to solve problems 2089 01:34:49,825 --> 01:34:52,330 without getting into the weeds, constantly, of doing things, 2090 01:34:52,330 --> 01:34:56,845 like with for loops and while loops endlessly. 2091 01:34:56,845 --> 01:34:57,430 All right. 2092 01:34:57,430 --> 01:34:59,222 Well, how about some other building blocks? 2093 01:34:59,222 --> 01:35:02,983 Lists are going to be so incredibly useful in Python, just as arrays 2094 01:35:02,983 --> 01:35:04,900 were in C. But arrays are annoying because you 2095 01:35:04,900 --> 01:35:06,410 have to manage the memory yourself. 2096 01:35:06,410 --> 01:35:08,327 You have to know in advance how big they are or you 2097 01:35:08,327 --> 01:35:11,440 have to use pointers and malloc or realloc to resize them. 2098 01:35:11,440 --> 01:35:12,100 Oh my god. 2099 01:35:12,100 --> 01:35:14,267 The past two weeks have been painful, in that sense. 2100 01:35:14,267 --> 01:35:17,298 But Python does this all for free for you. 2101 01:35:17,298 --> 01:35:19,090 In fact, there's a whole bunch of functions 2102 01:35:19,090 --> 01:35:22,030 that come with Python that involve lists, 2103 01:35:22,030 --> 01:35:29,678 and they'll allow us, ultimately, to do things again and again and again 2104 01:35:29,678 --> 01:35:30,970 within the same data structure. 2105 01:35:30,970 --> 01:35:33,220 And, for instance, we'll be able to get the length of a list. 2106 01:35:33,220 --> 01:35:35,560 You don't have to remember it yourself in a variable. 2107 01:35:35,560 --> 01:35:39,085 You can just ask Python how many elements are in this list. 2108 01:35:39,085 --> 01:35:42,850 And with this, I think we can solve some old problems, too. 
2109 01:35:42,850 --> 01:35:45,250 So let me go back here, to VS Code. 2110 01:35:45,250 --> 01:35:50,890 Let me close mario and give us a new program called scores.py. 2111 01:35:50,890 --> 01:35:54,535 And rather than show the C and the Python now, let's just focus on Python. 2112 01:35:54,535 --> 01:35:59,390 And in scores.c way back when, we just averaged three test scores or something 2113 01:35:59,390 --> 01:35:59,890 like that-- 2114 01:35:59,890 --> 01:36:01,900 72, 73, and 33-- 2115 01:36:01,900 --> 01:36:03,230 a few weeks ago. 2116 01:36:03,230 --> 01:36:07,450 So if I want to create a list in this Python version of 72, 73, 33, 2117 01:36:07,450 --> 01:36:09,220 I just use my square bracket notation. 2118 01:36:09,220 --> 01:36:12,640 C let you use curly braces if you know the values in advance, 2119 01:36:12,640 --> 01:36:14,170 but Python's just this. 2120 01:36:14,170 --> 01:36:16,855 And now, if I want to compute the average-- 2121 01:36:16,855 --> 01:36:19,360 in C, recall, I did something with a loop. 2122 01:36:19,360 --> 01:36:21,140 I added all the values together. 2123 01:36:21,140 --> 01:36:23,230 I, then, divide it by the total number of values 2124 01:36:23,230 --> 01:36:26,110 just like you would in grade school, and that gave me the average. 2125 01:36:26,110 --> 01:36:29,085 Well, Python comes with a lot of super handy functions-- 2126 01:36:29,085 --> 01:36:31,395 not just length, but others, as well. 2127 01:36:31,395 --> 01:36:34,150 And so, in fact, if you want to compute the average, 2128 01:36:34,150 --> 01:36:36,970 you can take the sum of all of those scores 2129 01:36:36,970 --> 01:36:40,010 and divide it by the length of all of those scores. 2130 01:36:40,010 --> 01:36:42,490 So Python comes with length, comes with sum. 2131 01:36:42,490 --> 01:36:45,310 You can just pass in a whole list of any size 2132 01:36:45,310 --> 01:36:47,590 and let it deal with that problem for you. 
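The average just described comes down to two built-in functions, `sum` and `len`. A minimal sketch using the three scores from the lecture:

```python
# A list literal uses square brackets.
scores = [72, 73, 33]

# sum adds every element; len counts the elements.
average = sum(scores) / len(scores)

print(f"Average: {average}")
```

The `/` operator always produces a float in Python 3, so no cast is needed to avoid truncation as it was in C.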
2133 01:36:47,590 --> 01:36:49,900 So if I want to, now, print out this average, 2134 01:36:49,900 --> 01:36:51,760 I can print out Average colon-- 2135 01:36:51,760 --> 01:36:55,570 and then, I'll plug in my average variable for interpolation. 2136 01:36:55,570 --> 01:36:58,900 Let me make this an fstring so that it gets formatted, 2137 01:36:58,900 --> 01:37:01,530 and let me just run python of scores.py. 2138 01:37:01,530 --> 01:37:02,800 And there is my average. 2139 01:37:02,800 --> 01:37:05,890 It's rounding weird because we're still vulnerable to some floating point 2140 01:37:05,890 --> 01:37:09,340 imprecision, but at least I didn't need loops 2141 01:37:09,340 --> 01:37:11,575 and I didn't have to write all this darn code just 2142 01:37:11,575 --> 01:37:15,130 to do something that Excel and Google Spreadsheets can just do like that. 2143 01:37:15,130 --> 01:37:17,950 Well, Python is closer to those kinds of tools, 2144 01:37:17,950 --> 01:37:21,790 but more powerful in that you can manipulate the data yourself. 2145 01:37:21,790 --> 01:37:25,510 How about, though, if I want to get a bunch of scores manually from the user 2146 01:37:25,510 --> 01:37:27,280 and, then, sum them together. 2147 01:37:27,280 --> 01:37:28,920 Well, let's combine a few ideas here. 2148 01:37:28,920 --> 01:37:29,830 How about this? 2149 01:37:29,830 --> 01:37:36,070 First, let me go ahead and import the get_int function from the CS50 library, 2150 01:37:36,070 --> 01:37:39,340 just so we don't have to deal with try and except or all of that. 2151 01:37:39,340 --> 01:37:42,340 And let me go ahead and give myself an empty list. 2152 01:37:42,340 --> 01:37:44,410 And this is powerful. 2153 01:37:44,410 --> 01:37:48,068 In C, [SIGHS] there's no point to an empty array 2154 01:37:48,068 --> 01:37:50,860 because, if you create an empty array with square bracket notation, 2155 01:37:50,860 --> 01:37:52,600 it's not useful for anything. 
2156 01:37:52,600 --> 01:37:55,780 But in Python, you can create it empty because Python 2157 01:37:55,780 --> 01:37:59,590 will grow and shrink the list for you automatically, as you add things to it. 2158 01:37:59,590 --> 01:38:01,600 So if I want to get three scores from the user, 2159 01:38:01,600 --> 01:38:04,840 I could do something like this-- for i in range of 3. 2160 01:38:04,840 --> 01:38:08,680 And then, I can grab a variable called "score" or anything. 2161 01:38:08,680 --> 01:38:11,467 I could call get_int, prompt the human for the score 2162 01:38:11,467 --> 01:38:12,550 that they want to type in. 2163 01:38:12,550 --> 01:38:15,060 And then, once they do, I can do this. 2164 01:38:15,060 --> 01:38:19,450 Thinking back to our object-oriented programming capability now, 2165 01:38:19,450 --> 01:38:24,358 I could do scores.append, and I can append that score to it. 2166 01:38:24,358 --> 01:38:27,400 And you would only know this from having read the documentation, heard it 2167 01:38:27,400 --> 01:38:30,040 in class, in a book or whatnot, but it turns out 2168 01:38:30,040 --> 01:38:33,880 that, just like strings have functions like lower built into them, 2169 01:38:33,880 --> 01:38:37,735 lists have functions like append built into them that just literally appends 2170 01:38:37,735 --> 01:38:40,165 to the end of the list for you, and Python 2171 01:38:40,165 --> 01:38:42,250 will grow or shrink it as needed. 2172 01:38:42,250 --> 01:38:44,760 No more malloc or realloc or the like. 2173 01:38:44,760 --> 01:38:49,120 So this just appends to the scores list. 2174 01:38:49,120 --> 01:38:51,740 That score, and then again and again and again. 2175 01:38:51,740 --> 01:38:52,990 So the array starts at-- 2176 01:38:52,990 --> 01:38:57,640 sorry, the list starts at size 0, then grows to 1 then 2 then 3 2177 01:38:57,640 --> 01:38:59,320 without you having to do anything else. 
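Growing a list from empty with append might be sketched as below. The `typed` list is an assumption standing in for what `get_int` would collect from the keyboard, so that the example runs unattended.

```python
typed = ["72", "73", "33"]  # stand-in for three values a user would type

scores = []  # an empty list; Python resizes it as items are added
for text in typed:
    score = int(text)
    # append is a method built into every list.
    scores.append(score)
    print(len(scores))  # the length grows by one on each pass

average = sum(scores) / len(scores)
print(f"Average: {average}")
```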
2178 01:38:59,320 --> 01:39:02,845 And so, now, down here, I can compute an average 2179 01:39:02,845 --> 01:39:05,620 with the sum of those scores divided by the length 2180 01:39:05,620 --> 01:39:07,455 of the total number of scores. 2181 01:39:07,455 --> 01:39:11,830 And to be clear, length is the total number of elements in the list. 2182 01:39:11,830 --> 01:39:14,200 Doesn't matter how big the values themselves are. 2183 01:39:14,200 --> 01:39:18,160 Now I can go ahead and print out an fstring with something 2184 01:39:18,160 --> 01:39:22,100 like Average colon average in curly braces. 2185 01:39:22,100 --> 01:39:24,680 And if I run python of scores.py-- 2186 01:39:24,680 --> 01:39:27,505 I'll type in, just for the sake of discussion, the three values, 2187 01:39:27,505 --> 01:39:29,440 I still get the same answer. 2188 01:39:29,440 --> 01:39:31,390 But that would have been painful to do in C 2189 01:39:31,390 --> 01:39:35,770 unless you committed, in advance, to a fixed size array-- which we already 2190 01:39:35,770 --> 01:39:41,830 decided, weeks ago, was annoying-- or you grew it dynamically 2191 01:39:41,830 --> 01:39:44,740 using malloc or realloc or the like. 2192 01:39:44,740 --> 01:39:45,400 All right. 2193 01:39:45,400 --> 01:39:46,240 What else can I do? 2194 01:39:46,240 --> 01:39:49,990 Well, there's some nice things you might as well know exist. 2195 01:39:49,990 --> 01:39:54,340 Instead of scores.append, you can do slight fanciness like this. 2196 01:39:54,340 --> 01:39:57,290 If you want to append something to a list, 2197 01:39:57,290 --> 01:40:00,100 you can actually do plus equals, and then 2198 01:40:00,100 --> 01:40:03,620 put that thing in a temporary list of its own 2199 01:40:03,620 --> 01:40:05,740 and just use what is essentially concatenation-- 2200 01:40:05,740 --> 01:40:09,410 but not concatenation of strings, but concatenation of lists. 
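The plus-equals form of appending can be sketched as follows, again with hard-coded values in place of user input as an assumption for illustration.

```python
scores = []
for score in [72, 73, 33]:
    # Wrap the value in a one-element list, then concatenate.
    # The effect matches scores.append(score).
    scores += [score]

print(scores)
```

The square brackets around `score` matter: `scores += score` with a bare integer raises a TypeError, because the right-hand side must itself be iterable.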
2201 01:40:09,410 --> 01:40:13,480 So this new line 6 appends to the score's list-- 2202 01:40:13,480 --> 01:40:15,640 this tiny, little list I'm temporarily creating 2203 01:40:15,640 --> 01:40:17,670 with just the current new score. 2204 01:40:17,670 --> 01:40:20,260 So just another piece of syntax that's worth seeing that 2205 01:40:20,260 --> 01:40:23,290 allows you to do something like that, as well. 2206 01:40:23,290 --> 01:40:23,890 All right. 2207 01:40:23,890 --> 01:40:26,093 Well, how about we go back to strings for a moment? 2208 01:40:26,093 --> 01:40:29,260 And all of these examples, as always, are on the course's website afterward. 2209 01:40:29,260 --> 01:40:32,860 Suppose we want to do something like converting characters to uppercase. 2210 01:40:32,860 --> 01:40:35,170 Well, to be clear, I could do something like this. 2211 01:40:35,170 --> 01:40:38,080 Let me create a program called uppercase.py. 2212 01:40:38,080 --> 01:40:42,280 Let me prompt the user for a before string as by using the input function 2213 01:40:42,280 --> 01:40:44,510 or get_string, which is almost the same. 2214 01:40:44,510 --> 01:40:47,110 And I'll prompt the user for a string beforehand. 2215 01:40:47,110 --> 01:40:52,750 Then, let me go ahead and print out, how about, the keyword "After," 2216 01:40:52,750 --> 01:40:56,650 and then end the new line with nothing, just so 2217 01:40:56,650 --> 01:41:00,010 that I can see "Before" on one line and "After" on the next line. 2218 01:41:00,010 --> 01:41:01,240 And then, let me do this-- 2219 01:41:01,240 --> 01:41:04,450 and here's where Python gets pleasant, too, with loops-- 2220 01:41:04,450 --> 01:41:07,270 for c in before-- 2221 01:41:07,270 --> 01:41:11,110 print c.upper end equals quote, unquote. 2222 01:41:11,110 --> 01:41:12,580 And then, I'll print this here. 2223 01:41:12,580 --> 01:41:13,120 All right. 2224 01:41:13,120 --> 01:41:15,950 That was fast, but let's try to infer what's going on. 
2225 01:41:15,950 --> 01:41:19,600 So line 1 just gets input from the user, stores it in a variable called before. 2226 01:41:19,600 --> 01:41:22,510 Line two literally just prints "After" but doesn't 2227 01:41:22,510 --> 01:41:25,300 move the cursor to the next line. 2228 01:41:25,300 --> 01:41:27,015 What it, then, does is this. 2229 01:41:27,015 --> 01:41:29,875 And, in C, this was a little more annoying. 2230 01:41:29,875 --> 01:41:31,450 You needed a for loop with i. 2231 01:41:31,450 --> 01:41:34,690 You needed array notation with the square brackets. 2232 01:41:34,690 --> 01:41:39,850 But, Python, if you say for variable in string-- 2233 01:41:39,850 --> 01:41:42,670 so for c, for character, in string, Python 2234 01:41:42,670 --> 01:41:46,060 is going to automatically assign c to the first letter 2235 01:41:46,060 --> 01:41:47,110 that the user types in. 2236 01:41:47,110 --> 01:41:49,120 Then, on the next iteration, the second letter, the third letter, 2237 01:41:49,120 --> 01:41:49,745 and the fourth. 2238 01:41:49,745 --> 01:41:52,360 So you don't need any square bracket notation, you just use c, 2239 01:41:52,360 --> 01:41:55,180 and Python will do it for you and just hand you back, 2240 01:41:55,180 --> 01:41:59,000 one at a time, each of the letters that the user has typed in. 2241 01:41:59,000 --> 01:42:04,720 So if I go back over here and I run, for instance, python of uppercase.py 2242 01:42:04,720 --> 01:42:09,760 and I'll type in, how about, "david" in all lowercase and hit Enter, 2243 01:42:09,760 --> 01:42:13,630 you'll now see that it's all uppercase instead by iterating over it, 2244 01:42:13,630 --> 01:42:15,372 indeed, one character at a time. 2245 01:42:15,372 --> 01:42:17,830 But we already know, thanks to object-oriented programming, 2246 01:42:17,830 --> 01:42:20,027 strings themselves have the functionality built 2247 01:42:20,027 --> 01:42:24,100 in to not just uppercase single characters, but the whole string. 
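Both approaches to uppercasing, the character loop and the whole-string method, can be sketched together. The fixed value of `before` is an assumption replacing the call to `input`.

```python
before = "david"  # stand-in for input("Before: ")

# Iterating over a string hands back one character at a time.
looped = ""
for c in before:
    looped += c.upper()

# upper also works on the whole string at once, so no loop is required.
after = before.upper()

print(f"After:  {after}")
```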
2248 01:42:24,100 --> 01:42:26,530 So, honestly, this was a bit of a silly exercise. 2249 01:42:26,530 --> 01:42:31,360 I don't need to use a loop anymore, like in C. And so, some of the habits 2250 01:42:31,360 --> 01:42:34,720 you've only just developed in recent weeks, it's time to start breaking them 2251 01:42:34,720 --> 01:42:36,130 when they're not necessary. 2252 01:42:36,130 --> 01:42:40,470 I can create a variable called after, set it equal to before.upper-- 2253 01:42:40,470 --> 01:42:43,600 which, indeed, exists, just like dot lower exists. 2254 01:42:43,600 --> 01:42:47,490 And then, what I can go ahead and print out is, for instance-- 2255 01:42:47,490 --> 01:42:49,990 let's get rid of this print line here and do it at the end-- 2256 01:42:49,990 --> 01:42:53,900 "After" and print the value of that variable. 2257 01:42:53,900 --> 01:42:58,005 So now, if I rerun uppercase.py, type in "david" in all lowercase, 2258 01:42:58,005 --> 01:43:03,400 I can just uppercase the whole thing all at once because, again, in Python, 2259 01:43:03,400 --> 01:43:07,000 you don't have to operate on characters individually. 2260 01:43:07,000 --> 01:43:13,310 Questions on any of these tricks up until now? 2261 01:43:13,310 --> 01:43:13,810 No? 2262 01:43:13,810 --> 01:43:14,290 All right. 2263 01:43:14,290 --> 01:43:17,290 How about a few other techniques that we saw in C that we'll bring back, 2264 01:43:17,290 --> 01:43:18,145 now, in Python. 2265 01:43:18,145 --> 01:43:22,860 So it turns out, in Python, there are other libraries you can use, too, 2266 01:43:22,860 --> 01:43:24,360 that unlock even more functionality. 2267 01:43:24,360 --> 01:43:27,040 So, in C, if you wanted command line arguments, 2268 01:43:27,040 --> 01:43:32,410 you just change the signature for main to be, instead of void, 2269 01:43:32,410 --> 01:43:38,515 int argc comma string argv, open brackets for an array or char star, 2270 01:43:38,515 --> 01:43:39,130 eventually. 
2271 01:43:39,130 --> 01:43:41,770 Well, it turns out, in Python, that, if you want to access command line 2272 01:43:41,770 --> 01:43:44,770 arguments, it's a little simpler, but they're tucked away in a library-- 2273 01:43:44,770 --> 01:43:46,990 otherwise known as a module-- 2274 01:43:46,990 --> 01:43:49,552 called sys, the system module. 2275 01:43:49,552 --> 01:43:51,760 Now, this is similar, in spirit, to the CS50 library, 2276 01:43:51,760 --> 01:43:53,802 and that's got a bunch of functionality built in. 2277 01:43:53,802 --> 01:43:55,725 But this one comes with Python itself. 2278 01:43:55,725 --> 01:43:59,710 So if I want to create a program like greet.py, in VS Code, 2279 01:43:59,710 --> 01:44:01,510 here, let me go ahead and do this. 2280 01:44:01,510 --> 01:44:05,785 From the sys library, let's import argv. 2281 01:44:05,785 --> 01:44:07,850 And that's just a thing that exists. 2282 01:44:07,850 --> 01:44:10,660 It's not built into main because there is no main, per se, anymore. 2283 01:44:10,660 --> 01:44:12,590 So it's tucked away in that library. 2284 01:44:12,590 --> 01:44:14,330 And now, I can do something like this. 2285 01:44:14,330 --> 01:44:16,925 If the length of argv equals equals 2, well, 2286 01:44:16,925 --> 01:44:19,090 let's go ahead and print out something friendly, 2287 01:44:19,090 --> 01:44:24,955 like hello comma argv bracket 1, and then, close quotes. 2288 01:44:24,955 --> 01:44:28,360 Else, if the length of argv is not equal to 2, 2289 01:44:28,360 --> 01:44:30,400 Let's just go ahead and print out hello, world. 2290 01:44:30,400 --> 01:44:32,525 Now, at a glance, this might look a little cryptic, 2291 01:44:32,525 --> 01:44:35,050 but it's identical to what we did a few weeks ago. 2292 01:44:35,050 --> 01:44:39,570 When I run this, python of greet.py, with no arguments, 2293 01:44:39,570 --> 01:44:40,950 it just says "hello, world." 
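The greet.py logic described here can be sketched as a small function. The name `greeting` and the choice to pass the argument list in as a parameter are assumptions, made so the logic can be exercised without launching the script several times.

```python
import sys


def greeting(argv):
    """Return the greeting for a given argument list."""
    # argv[0] is the script name, so one extra word means len(argv) == 2.
    if len(argv) == 2:
        return f"hello, {argv[1]}"
    return "hello, world"


# sys.argv holds the script name followed by anything typed after it.
print(greeting(sys.argv))
```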
2294 01:44:40,950 --> 01:44:46,180 But if I, instead, add a command line argument, like my first name and hit 2295 01:44:46,180 --> 01:44:49,825 Enter, now, the length of argv is no longer 1. 2296 01:44:49,825 --> 01:44:51,700 It's going to be 2. 2297 01:44:51,700 --> 01:44:54,680 And so, it prints out "Hello, David" instead. 2298 01:44:54,680 --> 01:44:57,880 So the takeaway here is that, whereas in C, 2299 01:44:57,880 --> 01:45:03,955 argv technically contained the name of your program, like ./hello or ./greet, 2300 01:45:03,955 --> 01:45:05,455 and then everything the human typed. 2301 01:45:05,455 --> 01:45:08,410 Python's a little different in that, because we're 2302 01:45:08,410 --> 01:45:10,150 using the interpreter in this way-- 2303 01:45:10,150 --> 01:45:16,090 technically, when you run python of greet.py, the length of argv is only 1. 2304 01:45:16,090 --> 01:45:18,760 It contains only greet.py, so the name of the file. 2305 01:45:18,760 --> 01:45:21,670 It does not unnecessarily contain Python itself 2306 01:45:21,670 --> 01:45:24,460 because what's the point of that being there, omnipresently? 2307 01:45:24,460 --> 01:45:28,760 It does contain the number of words that the human typed after Python itself. 2308 01:45:28,760 --> 01:45:32,230 So argv is length 1 here. argv is length 2 here. 2309 01:45:32,230 --> 01:45:35,350 And that's why, when it did equal 2, I saw "Hello, David" instead 2310 01:45:35,350 --> 01:45:37,240 of the default "Hello, world." 2311 01:45:37,240 --> 01:45:41,440 So same ability to access command line arguments, add these kinds of inputs 2312 01:45:41,440 --> 01:45:43,570 to your functions, but you have to unlock it 2313 01:45:43,570 --> 01:45:47,830 by way of using argv instead, in this way. 2314 01:45:47,830 --> 01:45:51,910 If you want to see all of the words, you could do something like this. 
2315 01:45:51,910 --> 01:45:57,760 Just as-- if we combine ideas, here-- for i in range of, how about, length 2316 01:45:57,760 --> 01:45:59,610 of argv. 2317 01:45:59,610 --> 01:46:02,260 Then, I can do this-- print argv bracket i. 2318 01:46:02,260 --> 01:46:02,860 All right. 2319 01:46:02,860 --> 01:46:06,385 A little cryptic, but line 3 is just a for loop iterating 2320 01:46:06,385 --> 01:46:08,410 over the range of length of argv. 2321 01:46:08,410 --> 01:46:12,640 So if the human types in two words, the length of argv will be 2. 2322 01:46:12,640 --> 01:46:16,885 So this is just a way of saying iterate over all of the words in argv, 2323 01:46:16,885 --> 01:46:18,380 printing them one at a time. 2324 01:46:18,380 --> 01:46:22,810 So python of greet.py, Enter just prints out the name of the program. 2325 01:46:22,810 --> 01:46:27,340 python of greet.py with David prints out greet.py and, then, David. 2326 01:46:27,340 --> 01:46:29,470 I can keep running it though with more words, 2327 01:46:29,470 --> 01:46:32,650 and they'll each get printed one at a time. 2328 01:46:32,650 --> 01:46:35,440 But what's nice, too, about Python-- 2329 01:46:35,440 --> 01:46:38,920 and this is the point of this exercise-- honestly, this looks pretty cryptic. 2330 01:46:38,920 --> 01:46:40,720 This is not very pleasant to look at. 2331 01:46:40,720 --> 01:46:46,150 If you just want to iterate over every word in a list, which argv is, 2332 01:46:46,150 --> 01:46:47,680 watch what I can do. 2333 01:46:47,680 --> 01:46:52,090 I can do for arg or any variable name in argv. 2334 01:46:52,090 --> 01:46:54,147 Let me just, now, print out that argument. 2335 01:46:54,147 --> 01:46:56,980 I could keep calling it i, but i seems weird when it's not a number. 2336 01:46:56,980 --> 01:46:59,710 So I'm changing to arg as a word, instead. 2337 01:46:59,710 --> 01:47:03,970 If I now do python of greet.py, it does this. 
2338 01:47:03,970 --> 01:47:06,460 If I do python of greet.py, David, it does that again. 2339 01:47:06,460 --> 01:47:08,690 David Malan, it does that again. 2340 01:47:08,690 --> 01:47:10,898 So this is, again, why Python is just very appealing. 2341 01:47:10,898 --> 01:47:13,482 You want to do something this many times, iterate over a list? 2342 01:47:13,482 --> 01:47:15,820 Just say it, and it reads a little more like English. 2343 01:47:15,820 --> 01:47:18,130 And there's even other fanciness, too, if I may. 2344 01:47:18,130 --> 01:47:21,820 It's a little stupid that I keep seeing the name of the program, greet.py, 2345 01:47:21,820 --> 01:47:24,640 so it'd be nice if I could remove that. 2346 01:47:24,640 --> 01:47:28,960 Python also supports what are called slices of arrays-- 2347 01:47:28,960 --> 01:47:30,340 sorry, slices of lists. 2348 01:47:30,340 --> 01:47:32,050 Even I get the terminology confused. 2349 01:47:32,050 --> 01:47:36,400 If argv is a list, then it's going to print out everything in it. 2350 01:47:36,400 --> 01:47:41,950 But if I want a slice of it that starts at location 1 all the way to the end, 2351 01:47:41,950 --> 01:47:45,500 you can use this funky syntax in between the square brackets, which 2352 01:47:45,500 --> 01:47:48,700 we've not seen yet, that's going to start at item 1 2353 01:47:48,700 --> 01:47:50,220 and go all the way to the end. 2354 01:47:50,220 --> 01:47:53,830 And so, this is a nice, clever way of slicing off, 2355 01:47:53,830 --> 01:47:56,170 if you will, the very first element because now, 2356 01:47:56,170 --> 01:48:01,900 when I run greet.py, David Malan, I should only see David and Malan. 2357 01:48:01,900 --> 01:48:04,940 If I only want one element, I could do 1 to 2. 2358 01:48:04,940 --> 01:48:08,260 If I want all of them, I could do 0 onward. 2359 01:48:08,260 --> 01:48:10,900 I could give myself just one of them in this way. 
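Iterating over the list directly and slicing it can be sketched as follows. The hard-coded `argv` list is an assumption standing in for `sys.argv` after running the script with two extra words.

```python
argv = ["greet.py", "David", "Malan"]  # stand-in for sys.argv

# Loop over the elements themselves instead of over indices.
for arg in argv:
    print(arg)

# A slice [start:end] includes start and stops before end.
# Omitting end means "through the last element".
for arg in argv[1:]:
    print(arg)

print(argv[1:2])  # a list holding only the element at index 1
print(argv[0:])   # a copy of the whole list
```

A slice always produces a new list, even when it holds a single element, so `argv[1:2]` is a one-item list while `argv[1]` is the string itself.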
2360 01:48:10,900 --> 01:48:14,380 So you can play with the start value and the end value in this way, 2361 01:48:14,380 --> 01:48:17,020 to slice and dice these lists in different ways. 2362 01:48:17,020 --> 01:48:20,620 That would have been a pain in C, just because we didn't really 2363 01:48:20,620 --> 01:48:26,840 have the built-in support for manipulating arrays as cleanly as this. 2364 01:48:26,840 --> 01:48:27,340 All right. 2365 01:48:27,340 --> 01:48:31,440 Just so you've seen it, too-- though, this one is less exciting to see live-- 2366 01:48:31,440 --> 01:48:33,940 if I go ahead and create a quick program here, it turns out, 2367 01:48:33,940 --> 01:48:37,630 there's something else in the sys library, the ability to exit programs-- 2368 01:48:37,630 --> 01:48:41,590 either exiting with status code 1 or 0, as we've been doing any time something 2369 01:48:41,590 --> 01:48:42,673 goes right or wrong. 2370 01:48:42,673 --> 01:48:45,340 So, for instance, let me whip up a quick program that just says, 2371 01:48:45,340 --> 01:48:52,300 if the length of sys.argv does not equal 2, then let's yell at the user 2372 01:48:52,300 --> 01:48:54,970 and say you're missing a command line argument. 2373 01:48:54,970 --> 01:48:57,380 Otherwise, command-line argument. 2374 01:48:57,380 --> 01:49:01,360 And let's, then, return sys.exit(1). 2375 01:49:01,360 --> 01:49:05,590 Else, let's go ahead and, logically, just say print a formatted string that 2376 01:49:05,590 --> 01:49:07,450 says hello-- as before-- 2377 01:49:07,450 --> 01:49:09,640 sys.argv 1. 2378 01:49:09,640 --> 01:49:11,770 Now, things look different all of a sudden, 2379 01:49:11,770 --> 01:49:13,312 but I'm doing something deliberately. 2380 01:49:13,312 --> 01:49:14,870 First, let's see what this does. 2381 01:49:14,870 --> 01:49:18,730 So, on line 1, I'm importing not argv, specifically. 2382 01:49:18,730 --> 01:49:22,150 I'm importing the whole sys library, and we'll see why in a second. 
2383 01:49:22,150 --> 01:49:27,220 Well, it turns out that the sys library has not only the argv list, 2384 01:49:27,220 --> 01:49:30,580 it also has a function called exit, which I'd like to be able to use, 2385 01:49:30,580 --> 01:49:31,370 as well. 2386 01:49:31,370 --> 01:49:35,200 So it turns out that, if you import a whole library in this way, that's fine. 2387 01:49:35,200 --> 01:49:37,840 But you have to refer to the things inside of it 2388 01:49:37,840 --> 01:49:42,980 by using that same library's name and a dot to namespace it, so to speak. 2389 01:49:42,980 --> 01:49:47,002 So here, I'm just saying, if the user does not type in two words, 2390 01:49:47,002 --> 01:49:49,960 yell at them with missing command line argument, and then, exit with 1. 2391 01:49:49,960 --> 01:49:52,975 Just like in C, when you do exit 1, just means something went wrong. 2392 01:49:52,975 --> 01:49:54,785 Otherwise, print out hello to this. 2393 01:49:54,785 --> 01:49:57,910 And this is starting to look cryptic, but it's just a combination of ideas. 2394 01:49:57,910 --> 01:50:02,080 The curly braces means interpolate this value, plug it in here. 2395 01:50:02,080 --> 01:50:05,740 sys.argv is just the verbose way of saying go into the sys library 2396 01:50:05,740 --> 01:50:09,010 and get the argv variable therein. 2397 01:50:09,010 --> 01:50:11,860 And bracket 1, of course, just like arrays in C, 2398 01:50:11,860 --> 01:50:15,440 is just the second element at the prompt. 2399 01:50:15,440 --> 01:50:18,700 So when I run this version, now-- python of exit.py-- 2400 01:50:18,700 --> 01:50:21,340 with no arguments, I get yelled at in this way. 2401 01:50:21,340 --> 01:50:24,640 If, however, I type in two arguments total-- 2402 01:50:24,640 --> 01:50:26,950 the name of the file and my own name-- 2403 01:50:26,950 --> 01:50:29,050 now, I get greeted with hello, David. 2404 01:50:29,050 --> 01:50:30,310 And it's the same idea before. 
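The exit.py program walked through above could be sketched roughly as follows. The function name `greet` and the hard-coded argument list are assumptions made so the sketch runs the same way anywhere; in the lecture's version the check is made directly against `sys.argv`.

```python
import sys


def greet(argv):
    """Greet the name in argv[1], or exit with status 1 if it is missing."""
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # a nonzero status signals that something went wrong
    else:
        print(f"hello, {argv[1]}")


# In a real script this call would be greet(sys.argv); a fixed list is used
# here purely for illustration.
greet(["exit.py", "David"])
```

Because the whole library is imported with `import sys`, both `sys.argv` and `sys.exit` have to be written with the `sys.` prefix.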
2405 01:50:30,310 --> 01:50:33,160 This was a very low-level technique, but same thing here. 2406 01:50:33,160 --> 01:50:36,310 If you do echo dollar sign question mark Enter, 2407 01:50:36,310 --> 01:50:39,170 you'll see the exit code of your program. 2408 01:50:39,170 --> 01:50:41,270 So if I do this incorrectly again-- 2409 01:50:41,270 --> 01:50:43,953 let me rerun it without my name, Enter-- 2410 01:50:43,953 --> 01:50:44,620 I get yelled at. 2411 01:50:44,620 --> 01:50:47,320 But if I do echo dollar sign question mark, 2412 01:50:47,320 --> 01:50:50,170 there's the secret one that's returned. 2413 01:50:50,170 --> 01:50:54,160 Again, just to show you parity with C, in this case. 2414 01:50:54,160 --> 01:50:56,320 Questions, now, on any of these techniques, here? 2415 01:50:56,320 --> 01:50:58,900 2416 01:50:58,900 --> 01:50:59,400 No. 2417 01:50:59,400 --> 01:51:00,030 All right. 2418 01:51:00,030 --> 01:51:02,580 How about something that's a little more powerful, too? 2419 01:51:02,580 --> 01:51:05,880 We spend so much time in week 0 and 1 doing searching 2420 01:51:05,880 --> 01:51:07,830 and, then, eventually, sorting in week 3. 2421 01:51:07,830 --> 01:51:10,288 Well, it turns out, Python can help with some of this, too. 2422 01:51:10,288 --> 01:51:12,720 Let me go ahead and create a program called names.py 2423 01:51:12,720 --> 01:51:15,053 that's just going to be an opportunity to, maybe, search 2424 01:51:15,053 --> 01:51:16,650 over a whole bunch of names. 2425 01:51:16,650 --> 01:51:21,060 Let me go ahead and import sys, just so I have access to exit. 2426 01:51:21,060 --> 01:51:22,920 And let me go ahead and create a variable 2427 01:51:22,920 --> 01:51:26,756 called names that's going to be a list with a whole bunch of names. 2428 01:51:26,756 --> 01:51:27,660 How about here? 2429 01:51:27,660 --> 01:51:34,740 Charlie and Fred and George and Ginny and Percy and, lastly, Ron. 2430 01:51:34,740 --> 01:51:36,290 So a whole bunch of names here. 
2431 01:51:36,290 --> 01:51:38,040 And it'd be a little annoying to implement 2432 01:51:38,040 --> 01:51:42,540 code that iterates over that, from left to right, in C, searching for one 2433 01:51:42,540 --> 01:51:43,165 of those names. 2434 01:51:43,165 --> 01:51:43,957 In fact, what name? 2435 01:51:43,957 --> 01:51:46,290 Well, let's go ahead and ask the user to input the name 2436 01:51:46,290 --> 01:51:48,498 that they want to search for so that we can tell them 2437 01:51:48,498 --> 01:51:50,460 if the name is there or not. 2438 01:51:50,460 --> 01:51:54,670 And we could do this, similar to C, in Python, doing something like this. 2439 01:51:54,670 --> 01:52:00,600 So for n in names, where n is just a variable to iterate over each name-- 2440 01:52:00,600 --> 01:52:05,595 if the name I'm looking for equals the current name in the list-- 2441 01:52:05,595 --> 01:52:09,060 AKA n-- well, let's print out something friendly, like "Found." 2442 01:52:09,060 --> 01:52:14,250 And then, let's do sys.exit 0 to indicate that we found whoever that is. 2443 01:52:14,250 --> 01:52:17,460 Otherwise, if we get all the way to the bottom here, outside of this loop, 2444 01:52:17,460 --> 01:52:20,340 let's just print "Not found" because if we haven't exited yet. 2445 01:52:20,340 --> 01:52:22,800 And then, let's just exit with 1. 2446 01:52:22,800 --> 01:52:25,980 Just to be clear, I can continue importing all of sys, 2447 01:52:25,980 --> 01:52:31,920 or I could do from sys import exit, and then, I could get rid of sys dot 2448 01:52:31,920 --> 01:52:33,240 everywhere else. 2449 01:52:33,240 --> 01:52:36,540 But sometimes, it's helpful to know exactly where functions came from. 2450 01:52:36,540 --> 01:52:39,675 So this, too, is just a matter of style, in this case. 2451 01:52:39,675 --> 01:52:40,230 All right. 2452 01:52:40,230 --> 01:52:41,522 So let's go ahead and run this. 
2453 01:52:41,522 --> 01:52:46,540 python of names.py, and let's look for Ron, all the way at the end. 2454 01:52:46,540 --> 01:52:47,040 All right. 2455 01:52:47,040 --> 01:52:47,910 He's found. 2456 01:52:47,910 --> 01:52:51,570 And let's search for someone outside of the family here, like Hermione. 2457 01:52:51,570 --> 01:52:52,700 Not found. 2458 01:52:52,700 --> 01:52:53,200 OK. 2459 01:52:53,200 --> 01:52:54,783 So it seems to be working in this way. 2460 01:52:54,783 --> 01:52:58,548 But I've essentially implemented what algorithm? 2461 01:52:58,548 --> 01:53:05,247 What algorithm would this seem to be, per line 7 and 8 to 9 and 10? 2462 01:53:05,247 --> 01:53:05,955 AUDIENCE: Linear. 2463 01:53:05,955 --> 01:53:06,450 DAVID MALAN: Yeah. 2464 01:53:06,450 --> 01:53:07,350 So it's just linear search. 2465 01:53:07,350 --> 01:53:10,185 It's a loop, even though the syntax is a little more succinct today, 2466 01:53:10,185 --> 01:53:12,060 and it's just iterating over the whole thing. 2467 01:53:12,060 --> 01:53:15,240 Well, honestly, we've seen an even more terse way to do this in Python. 2468 01:53:15,240 --> 01:53:19,230 And this, again, is what makes it a more pleasant language, sometimes. 2469 01:53:19,230 --> 01:53:20,630 Why don't I just do this? 2470 01:53:20,630 --> 01:53:24,790 Instead of iterating one at a time, why don't I just say this? 2471 01:53:24,790 --> 01:53:27,840 Let me go ahead and change my condition to just 2472 01:53:27,840 --> 01:53:33,270 be-- how about if the name we're looking for is in the names list, we're done. 2473 01:53:33,270 --> 01:53:33,960 We found it. 2474 01:53:33,960 --> 01:53:36,570 Use the in preposition that we've seen a couple of times, 2475 01:53:36,570 --> 01:53:40,710 now, that itself asks the question, is something in something else? 2476 01:53:40,710 --> 01:53:44,050 And Python will take care of linear search for us.
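The two versions of the search can be compared directly. This is a minimal sketch, not the lecture's names.py verbatim: the helper names `found_by_loop` and `found_by_in` are hypothetical, and they return a boolean instead of printing and calling `sys.exit`, which makes the two approaches easier to compare.

```python
names = ["Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def found_by_loop(name):
    """Linear search written out by hand, one element at a time."""
    for n in names:
        if name == n:
            return True
    return False


def found_by_in(name):
    """The same question, asked with the in operator."""
    return name in names


print(found_by_loop("Ron"), found_by_in("Ron"))
print(found_by_loop("Hermione"), found_by_in("Hermione"))
```

On a list, `in` still checks elements one after another, so the work is the same; only the amount of code changes.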
2477 01:53:44,050 --> 01:53:46,080 And it's going to work exactly the same if I 2478 01:53:46,080 --> 01:53:48,030 do python of names.py, search for Ron. 2479 01:53:48,030 --> 01:53:50,077 It's still going to find him and it's still 2480 01:53:50,077 --> 01:53:51,660 going to do it linearly, in this case. 2481 01:53:51,660 --> 01:53:58,060 But I don't have to write all of the lower-level code myself, in this case. 2482 01:53:58,060 --> 01:54:02,430 Questions, now, on any of this? 2483 01:54:02,430 --> 01:54:05,380 The code's just getting shorter and shorter. 2484 01:54:05,380 --> 01:54:05,880 No? 2485 01:54:05,880 --> 01:54:07,740 What about-- let's see. 2486 01:54:07,740 --> 01:54:09,250 What else might we have here? 2487 01:54:09,250 --> 01:54:10,770 How about this? 2488 01:54:10,770 --> 01:54:12,780 Let's go ahead and implement that phonebook 2489 01:54:12,780 --> 01:54:15,690 that we started, metaphorically, with in the beginning of the course. 2490 01:54:15,690 --> 01:54:17,940 Let's code up a program called phonebook.py. 2491 01:54:17,940 --> 01:54:22,440 And in this case, let's go ahead and let's create a dictionary this time. 2492 01:54:22,440 --> 01:54:25,470 Recall that a dictionary is a little something that 2493 01:54:25,470 --> 01:54:27,060 implements something like this-- 2494 01:54:27,060 --> 01:54:31,140 a two-column table that's got keys and values, words 2495 01:54:31,140 --> 01:54:33,240 and definitions, names and numbers. 2496 01:54:33,240 --> 01:54:36,367 And let's focus on the last of those, names and numbers, in this case. 2497 01:54:36,367 --> 01:54:38,700 Well, I claimed earlier that Python has built-in support 2498 01:54:38,700 --> 01:54:42,780 for dictionaries-- dict objects-- that you can create with one line. 2499 01:54:42,780 --> 01:54:45,120 I didn't need it for speller because a set is sufficient 2500 01:54:45,120 --> 01:54:47,610 when you only want one of the keys or the values, not both. 
2501 01:54:47,610 --> 01:54:49,680 But now, I want some names and numbers. 2502 01:54:49,680 --> 01:54:53,220 So it turns out, in Python, you can create an empty dictionary 2503 01:54:53,220 --> 01:54:55,680 by saying dict open parenthesis, closed. 2504 01:54:55,680 --> 01:54:58,080 And that just gives you, essentially, a chart that 2505 01:54:58,080 --> 01:54:59,640 looks like this, with nothing in it. 2506 01:54:59,640 --> 01:55:01,725 Or there's more succinct syntax. 2507 01:55:01,725 --> 01:55:06,858 You can, alternatively, do this, with two curly braces, instead. 2508 01:55:06,858 --> 01:55:09,150 And, in fact, I've been using a shortcut all this time. 2509 01:55:09,150 --> 01:55:15,885 When I had a list, earlier, where my variable was called scores, 2510 01:55:15,885 --> 01:55:19,860 and I did this, that was actually the shorthand version of this-- 2511 01:55:19,860 --> 01:55:21,637 hey, Python, give me an empty list. 2512 01:55:21,637 --> 01:55:23,970 So there's different syntax for achieving the same goal. 2513 01:55:23,970 --> 01:55:27,540 In this case, if I want a dictionary for people, 2514 01:55:27,540 --> 01:55:32,530 I can either do this or, more commonly, just two curly braces, like that. 2515 01:55:32,530 --> 01:55:33,030 All right. 2516 01:55:33,030 --> 01:55:34,360 Well, what do I want to put in this? 2517 01:55:34,360 --> 01:55:36,360 Well, let me actually put some things in this. 2518 01:55:36,360 --> 01:55:39,360 And I'm going to just move my closed curly brace to a new line. 2519 01:55:39,360 --> 01:55:42,580 If I want to implement this idea of keys and values, 2520 01:55:42,580 --> 01:55:47,220 the way you do this in Python is key colon value comma. 2521 01:55:47,220 --> 01:55:48,230 Key colon value. 2522 01:55:48,230 --> 01:55:50,410 So you'd implement it more in code. 
2523 01:55:50,410 --> 01:55:54,270 So, for instance, if I want Carter to be the first key in my phone book and I 2524 01:55:54,270 --> 01:56:00,135 want his number to be +1-617-495-1000, I can put that as the corresponding 2525 01:56:00,135 --> 01:56:00,960 value. 2526 01:56:00,960 --> 01:56:02,010 The colon is in between. 2527 01:56:02,010 --> 01:56:05,970 Both are strings, or strs, so I've quoted both deliberately. 2528 01:56:05,970 --> 01:56:07,762 If I want to add myself, I can put a comma. 2529 01:56:07,762 --> 01:56:10,970 And then, just to keep things pretty, I'm moving the cursor to the next line. 2530 01:56:10,970 --> 01:56:12,990 But that's not strictly required, aesthetically. 2531 01:56:12,990 --> 01:56:13,865 It's just good style. 2532 01:56:13,865 --> 01:56:19,500 And here, I might do +1-949-468-2750. 2533 01:56:19,500 --> 01:56:24,270 And now, I have a dictionary that, essentially, has two rows, here-- 2534 01:56:24,270 --> 01:56:27,322 Carter and his number and David and his number, as well. 2535 01:56:27,322 --> 01:56:30,405 And if I kept adding to this, this chart would just get longer and longer. 2536 01:56:30,405 --> 01:56:32,430 Suppose I want to search for one of our numbers. 2537 01:56:32,430 --> 01:56:34,950 Well, let's prompt the user for the name, 2538 01:56:34,950 --> 01:56:37,470 for whose number you want to search by getting string. 2539 01:56:37,470 --> 01:56:38,560 Or you know what? 2540 01:56:38,560 --> 01:56:39,893 We don't need this CS50 library. 2541 01:56:39,893 --> 01:56:43,090 Let's just use input and prompt the user for a name. 2542 01:56:43,090 --> 01:56:49,230 And now, we can use this super terse syntax and just say if name in people, 2543 01:56:49,230 --> 01:56:53,700 print the formatted string number colon and-- 2544 01:56:53,700 --> 01:56:57,160 here, we can do this-- people bracket name. 2545 01:56:57,160 --> 01:56:57,930 OK. 2546 01:56:57,930 --> 01:57:01,800 So this is getting cool quickly, confusingly. 
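The phonebook just described might be sketched like this. The dictionary contents come from the lecture; the helper `lookup` is a hypothetical wrapper added here so the lookup can be shown without prompting for input.

```python
# Keys are names, values are numbers; both are strings, so both are quoted.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    """Return the number stored under name, or None if the name is absent."""
    if name in people:
        return people[name]  # index into the dict with a string key
    return None


print(f"Number: {lookup('Carter')}")
```

In the lecture's version the name comes from `input`, and the f-string indexes into `people` directly.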
2547 01:57:01,800 --> 01:57:02,805 So let me run this. 2548 01:57:02,805 --> 01:57:06,810 python of phonebook.py Let's type in Carter. 2549 01:57:06,810 --> 01:57:08,910 And, indeed, I see his number. 2550 01:57:08,910 --> 01:57:12,910 Let's run it again with David, and I see my number here. 2551 01:57:12,910 --> 01:57:14,590 So what's going on? 2552 01:57:14,590 --> 01:57:19,320 Well, it turns out that a dictionary is very similar, in spirit, to a list. 2553 01:57:19,320 --> 01:57:22,350 It's actually very similar, in spirit, to an array in C. 2554 01:57:22,350 --> 01:57:27,150 But instead of being limited to keys that are numbers, like bracket 0, 2555 01:57:27,150 --> 01:57:30,690 bracket 1, bracket 2, you can actually use words. 2556 01:57:30,690 --> 01:57:33,060 And that's all I'm doing here on line 8. 2557 01:57:33,060 --> 01:57:36,765 If I want to check for the name Carter, which is currently 2558 01:57:36,765 --> 01:57:39,555 in this variable called name, I can index 2559 01:57:39,555 --> 01:57:42,660 into my people dictionary using not a number, 2560 01:57:42,660 --> 01:57:44,830 but using, literally, a string-- 2561 01:57:44,830 --> 01:57:48,000 the name Carter or David or anything else. 2562 01:57:48,000 --> 01:57:50,640 To make this clearer, too, notice that I'm, at the moment, 2563 01:57:50,640 --> 01:57:54,095 using this format string, which is adding some undue complexity. 2564 01:57:54,095 --> 01:57:56,220 But I could clarify this, perhaps, further as this. 2565 01:57:56,220 --> 01:57:58,080 I could give myself another variable called 2566 01:57:58,080 --> 01:58:01,320 number, set it equal to the people dictionary, 2567 01:58:01,320 --> 01:58:03,875 indexing into it using the current name. 2568 01:58:03,875 --> 01:58:07,230 And now, I can shorten this to make it clearer that all I'm doing 2569 01:58:07,230 --> 01:58:09,910 is printing the value of that. 2570 01:58:09,910 --> 01:58:12,930 And, in fact, I can do this even more cryptically. 
2571 01:58:12,930 --> 01:58:16,710 This would be weird to do, but if I only ever want to show David's phone number 2572 01:58:16,710 --> 01:58:21,150 and never Carter's, I can literally, quote unquote, "index into" the people 2573 01:58:21,150 --> 01:58:24,930 dictionary because, now, when I run this, even if I type Carter, 2574 01:58:24,930 --> 01:58:27,020 I'm going to get back my number instead. 2575 01:58:27,020 --> 01:58:31,080 But that's all that's happening if I undo that, because that's now a bug. 2576 01:58:31,080 --> 01:58:35,250 But I index into it using the value of name. 2577 01:58:35,250 --> 01:58:37,230 Dictionaries are just so wonderfully convenient 2578 01:58:37,230 --> 01:58:39,688 because, now, you can associate anything with anything else 2579 01:58:39,688 --> 01:58:43,420 but not using numbers, but entire key words, instead. 2580 01:58:43,420 --> 01:58:46,770 So here's how, if, in speller, we gave you not just words, 2581 01:58:46,770 --> 01:58:50,340 but hundreds of thousands of definitions, as well, 2582 01:58:50,340 --> 01:58:52,385 you could essentially store them as this. 2583 01:58:52,385 --> 01:58:55,680 And then, when the human wants to look up a definition in a proper dictionary, 2584 01:58:55,680 --> 01:58:57,750 not just for spell checking, you could index 2585 01:58:57,750 --> 01:59:00,290 into the dictionary using square brackets 2586 01:59:00,290 --> 01:59:04,240 and get back the definition in English, as well. 2587 01:59:04,240 --> 01:59:06,770 Questions on this? 2588 01:59:06,770 --> 01:59:07,280 Yeah? 2589 01:59:07,280 --> 01:59:09,760 AUDIENCE: Is the way this code does, as presented, 2590 01:59:09,760 --> 01:59:11,744 saying that Python has [INAUDIBLE]? 2591 01:59:11,744 --> 01:59:21,390 2592 01:59:21,390 --> 01:59:22,890 DAVID MALAN: A really good question. 2593 01:59:22,890 --> 01:59:27,330 So, to summarize, how is Python finding that name within that dictionary? 
2594 01:59:27,330 --> 01:59:31,110 This is where, honestly, speller in p-set 5 is what Python's all about. 2595 01:59:31,110 --> 01:59:34,215 So you have struggled, are struggling with implementing your own spell 2596 01:59:34,215 --> 01:59:36,090 checker and implementing your own hash table. 2597 01:59:36,090 --> 01:59:39,210 And recall that, per last week, the goal of a hash table is to, 2598 01:59:39,210 --> 01:59:41,190 ideally, get constant time access. 2599 01:59:41,190 --> 01:59:45,435 Not something linear, which is slow and even better than something logarithmic, 2600 01:59:45,435 --> 01:59:47,400 like log base 2 of n. 2601 01:59:47,400 --> 01:59:50,130 So Python and the really smart people who invented it, 2602 01:59:50,130 --> 01:59:53,310 they have written the code that does its best to give you 2603 01:59:53,310 --> 01:59:55,853 constant time searches of dictionaries. 2604 01:59:55,853 --> 01:59:58,020 And they're not always going to succeed, just as you 2605 01:59:58,020 --> 01:59:59,430 in your own problem set are probably going 2606 01:59:59,430 --> 02:00:01,805 to have some collisions once in a while and start to have 2607 02:00:01,805 --> 02:00:03,440 chains or linked lists of words. 2608 02:00:03,440 --> 02:00:05,940 But this is where, again, you defer to someone else, someone 2609 02:00:05,940 --> 02:00:07,800 smarter than you, someone with more time than you 2610 02:00:07,800 --> 02:00:09,270 to solve these problems for you. 2611 02:00:09,270 --> 02:00:11,490 And if you read Python's documentation, you'll 2612 02:00:11,490 --> 02:00:13,650 see that it doesn't guarantee constant time, 2613 02:00:13,650 --> 02:00:15,990 but it's going to, ideally, optimize the data structure 2614 02:00:15,990 --> 02:00:19,320 for you to get as fast as possible.
2615 02:00:19,320 --> 02:00:22,690 And of all of the data structures like a dictionary, 2616 02:00:22,690 --> 02:00:25,380 a hash table is, really, like the Swiss army knife of computing 2617 02:00:25,380 --> 02:00:28,260 because it just lets you associate something with something else. 2618 02:00:28,260 --> 02:00:30,510 And even though we keep focusing on names and numbers, 2619 02:00:30,510 --> 02:00:32,400 that's a really powerful thing because it's 2620 02:00:32,400 --> 02:00:34,230 more powerful than lists and arrays, which 2621 02:00:34,230 --> 02:00:35,910 are only numbers and something else. 2622 02:00:35,910 --> 02:00:38,690 Now, you can have any sorts of relationships, instead. 2623 02:00:38,690 --> 02:00:39,270 All right. 2624 02:00:39,270 --> 02:00:41,178 Let me show a few other examples before we 2625 02:00:41,178 --> 02:00:43,470 culminate with some more powerful techniques in Python, 2626 02:00:43,470 --> 02:00:45,000 thanks to libraries. 2627 02:00:45,000 --> 02:00:49,480 How about this problem we encountered in week 4, which was this. 2628 02:00:49,480 --> 02:00:54,120 Let me code up a program called, again, compare.py here but, this time, 2629 02:00:54,120 --> 02:00:56,770 compare two strings and not numbers. 2630 02:00:56,770 --> 02:01:01,230 So let me, for instance, get one string from the user called s. 2631 02:01:01,230 --> 02:01:04,890 Just for the sake of discussion, let me get another string from the user 2632 02:01:04,890 --> 02:01:07,830 called t so that we can actually do some comparison here. 2633 02:01:07,830 --> 02:01:12,780 And if s equals equals t, let's go ahead and print out that they're the same. 2634 02:01:12,780 --> 02:01:15,640 Else, let's go ahead and print out that they're different. 2635 02:01:15,640 --> 02:01:17,910 So this is very similar to what we did in week 4. 2636 02:01:17,910 --> 02:01:20,580 But in week 4, recall we did this specifically 2637 02:01:20,580 --> 02:01:23,800 because we had encountered a problem.
2638 02:01:23,800 --> 02:01:28,680 For instance, if I run-- whoops. 2639 02:01:28,680 --> 02:01:34,970 If I run-- what's going on? 2640 02:01:34,970 --> 02:01:40,396 [INAUDIBLE] Come on. 2641 02:01:40,396 --> 02:01:41,390 Oh. 2642 02:01:41,390 --> 02:01:41,890 OK. 2643 02:01:41,890 --> 02:01:43,240 Wow, OK. 2644 02:01:43,240 --> 02:01:43,840 Long day. 2645 02:01:43,840 --> 02:01:44,380 All right. 2646 02:01:44,380 --> 02:01:48,670 If I run the proper command, python of compare.py, then let's go ahead 2647 02:01:48,670 --> 02:01:53,785 and type in something like "cat" in all lowercase, "cat" in all lowercase. 2648 02:01:53,785 --> 02:01:56,110 And they're the same. 2649 02:01:56,110 --> 02:01:59,565 If, though, I do this again with "dog" and "dog," they're the same. 2650 02:01:59,565 --> 02:02:01,690 And, of course, "cat" and "dog," they're different. 2651 02:02:01,690 --> 02:02:06,430 But does anyone recall, from two weeks ago, when I typed in my name twice, 2652 02:02:06,430 --> 02:02:08,680 both identically capitalized. 2653 02:02:08,680 --> 02:02:10,360 What did it say? 2654 02:02:10,360 --> 02:02:13,390 That they were, in fact, different. 2655 02:02:13,390 --> 02:02:14,110 And why was that? 2656 02:02:14,110 --> 02:02:16,660 Why were two strings in C different, even though I typed literally 2657 02:02:16,660 --> 02:02:17,410 the same thing? 2658 02:02:17,410 --> 02:02:20,040 2659 02:02:20,040 --> 02:02:21,540 Two different places in memory. 2660 02:02:21,540 --> 02:02:24,560 So each string might look the same, aesthetically, but, of course, 2661 02:02:24,560 --> 02:02:25,852 was stored elsewhere in memory. 2662 02:02:25,852 --> 02:02:29,970 And yet, Python appears to be using the equality operator-- 2663 02:02:29,970 --> 02:02:33,510 equals equals-- like you and I would expect, as humans-- actually 2664 02:02:33,510 --> 02:02:38,510 comparing for us char by char in each of those strings for actual equality.
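The comparison just described could be sketched as follows. The helper name `compare` is hypothetical, and it takes its two strings as arguments rather than prompting with `input`, so the behavior is easy to check.

```python
def compare(s, t):
    """Report whether two strings hold the same characters."""
    if s == t:  # == compares the contents of the strings
        return "Same"
    else:
        return "Different"


print(compare("cat", "cat"))
print(compare("cat", "dog"))
print(compare("David", "David"))
```

The third call is the case that surprised us in C: two separately typed, identically capitalized strings.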
2665 02:02:38,510 --> 02:02:41,610 So this is a feature of Python, in that it's just easier to do. 2666 02:02:41,610 --> 02:02:42,210 And why? 2667 02:02:42,210 --> 02:02:44,627 Well, this derives from the reality that, in Python, there 2668 02:02:44,627 --> 02:02:45,630 are no pointers anymore. 2669 02:02:45,630 --> 02:02:47,297 There's no underlying memory management. 2670 02:02:47,297 --> 02:02:50,400 It's not up to you, now, to worry about those lower-level details. 2671 02:02:50,400 --> 02:02:52,960 The language itself takes care of that for you. 2672 02:02:52,960 --> 02:02:55,050 And so, similarly, if I do this and don't 2673 02:02:55,050 --> 02:02:57,510 ask the user for two strings, but just one, 2674 02:02:57,510 --> 02:02:59,370 and then, I do something like this. 2675 02:02:59,370 --> 02:03:05,550 How about give myself a second variable t, set it equal to s.capitalize, which, 2676 02:03:05,550 --> 02:03:08,040 note, is not the same as upper; capitalize, by design, 2677 02:03:08,040 --> 02:03:12,270 per Python's documentation, will only capitalize the first letter for you-- 2678 02:03:12,270 --> 02:03:15,240 I can now print out, say, two fstrings here-- 2679 02:03:15,240 --> 02:03:18,240 what the value of s is and, then, let me print out, 2680 02:03:18,240 --> 02:03:20,340 with another fstring, what the value of t is. 2681 02:03:20,340 --> 02:03:22,995 And recall that, in C, this was a problem 2682 02:03:22,995 --> 02:03:26,820 because if you capitalize s and store it in t, 2683 02:03:26,820 --> 02:03:29,670 we accidentally capitalized both s and t. 2684 02:03:29,670 --> 02:03:33,510 But in this case, in Python, when I actually run this and type in "cat" 2685 02:03:33,510 --> 02:03:37,770 in all lowercase, the original s is unchanged 2686 02:03:37,770 --> 02:03:42,780 because, when I use capitalize on line 3, this is, indeed, capitalizing s. 2687 02:03:42,780 --> 02:03:47,550 But it's returning a copy of the result.
It cannot change s itself 2688 02:03:47,550 --> 02:03:50,385 because, again, for that technical term, s is immutable. 2689 02:03:50,385 --> 02:03:53,265 Strings, once they exist, cannot be changed themselves. 2690 02:03:53,265 --> 02:03:58,590 But you can return copies and modify mutated copies of those same strings. 2691 02:03:58,590 --> 02:04:02,040 So, in short, all of those headaches we encountered in week 4 2692 02:04:02,040 --> 02:04:05,070 are now solved, really, in the way you might expect. 2693 02:04:05,070 --> 02:04:07,500 And here's another one that we dwelled on in week 4, 2694 02:04:07,500 --> 02:04:09,660 with the colored liquid in glasses. 2695 02:04:09,660 --> 02:04:12,150 Let me code up a program called swap.py. 2696 02:04:12,150 --> 02:04:16,690 And in swap.py, let me set x equal to 1, y equal to 2. 2697 02:04:16,690 --> 02:04:18,690 And then, let me just print out an fstring here. 2698 02:04:18,690 --> 02:04:24,360 So how about x is this comma y is that. 2699 02:04:24,360 --> 02:04:27,735 And then, let me do that twice, just for the sake of demonstration. 2700 02:04:27,735 --> 02:04:31,005 And in here, recall that we had to create a swap function. 2701 02:04:31,005 --> 02:04:33,630 But then, we had to pass it in by reference with the ampersand. 2702 02:04:33,630 --> 02:04:38,460 And oh my god, that was peak complexity in C. Well, 2703 02:04:38,460 --> 02:04:41,100 if you want to swap x and y in Python, you 2704 02:04:41,100 --> 02:04:43,830 could do x comma y equals y comma x. 2705 02:04:43,830 --> 02:04:49,020 And now, python of swap.py. 2706 02:04:49,020 --> 02:04:50,130 And there we go. 2707 02:04:50,130 --> 02:04:51,840 All of that's handled for you. 2708 02:04:51,840 --> 02:04:56,350 It's like a shell game without even a temporary variable in mind. 2709 02:04:56,350 --> 02:04:58,290 So what more can we do here? 2710 02:04:58,290 --> 02:05:00,870 How about a few final building blocks? 
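Both ideas above, immutable strings and swapping without a temporary variable, fit in a few lines. This is a sketch that hard-codes `"cat"` where the lecture prompts the user for input.

```python
s = "cat"
t = s.capitalize()  # returns a new string; s itself is left untouched

print(f"s: {s}")
print(f"t: {t}")

x = 1
y = 2
print(f"x is {x}, y is {y}")
x, y = y, x  # the right-hand side is evaluated first, then assigned
print(f"x is {x}, y is {y}")
```

The swap works because Python evaluates `y, x` into a pair before assigning either name.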
2711 02:05:00,870 --> 02:05:03,330 And these related, now, to files from that week 4. 2712 02:05:03,330 --> 02:05:07,710 Suppose that I want to save some names and numbers in a CSV file-- 2713 02:05:07,710 --> 02:05:11,080 Comma Separated Values, which is like a very lightweight spreadsheet. 2714 02:05:11,080 --> 02:05:15,300 Well, first, let me create a phonebook.csv file 2715 02:05:15,300 --> 02:05:19,458 that just has name comma number as the first row there. 2716 02:05:19,458 --> 02:05:21,750 But after that, I'm going to go ahead, now, and code up 2717 02:05:21,750 --> 02:05:25,170 a phonebook.py program that actually allows 2718 02:05:25,170 --> 02:05:27,040 me to add things to this phonebook. 2719 02:05:27,040 --> 02:05:31,020 So let me split my screen here so that we can see the old and the new. 2720 02:05:31,020 --> 02:05:34,050 And down here, in my code for phonebook.py, 2721 02:05:34,050 --> 02:05:36,360 in this new and improved version, I'm going 2722 02:05:36,360 --> 02:05:40,020 to actually import a whole other library, this one called CSV. 2723 02:05:40,020 --> 02:05:42,885 And here, too, especially for people in data science and the like, 2724 02:05:42,885 --> 02:05:46,500 really like being able to manipulate files and data that might very well be 2725 02:05:46,500 --> 02:05:48,060 stored in spreadsheets or CSVs-- 2726 02:05:48,060 --> 02:05:51,510 Comma Separated Values, which we saw briefly in week 4. 2727 02:05:51,510 --> 02:05:53,670 In phonebook.py, then, it suffices to just 2728 02:05:53,670 --> 02:05:57,348 import CSV after reading the documentation therefore 2729 02:05:57,348 --> 02:05:59,265 because this is going to give me functionality 2730 02:05:59,265 --> 02:06:02,150 in code related to CSV files. 2731 02:06:02,150 --> 02:06:04,950 So here's how I might open a file in Python. 
2732 02:06:04,950 --> 02:06:08,340 I literally call open-- it's not fopen now; it's just open-- 2733 02:06:08,340 --> 02:06:10,860 and I open this file called phonebook.csv. 2734 02:06:10,860 --> 02:06:13,470 And just as in C, I'm going to open it in append mode-- 2735 02:06:13,470 --> 02:06:15,930 not write, where it would change the whole thing. 2736 02:06:15,930 --> 02:06:18,660 I want to append a new line at a time. 2737 02:06:18,660 --> 02:06:21,750 After this, I want to get, maybe, a name from the user. 2738 02:06:21,750 --> 02:06:25,350 So let's prompt the user for some input for their name. 2739 02:06:25,350 --> 02:06:27,255 And then, let's prompt the user for a number, 2740 02:06:27,255 --> 02:06:31,060 as well, using input prompting for number. 2741 02:06:31,060 --> 02:06:31,560 All right. 2742 02:06:31,560 --> 02:06:33,602 And now, this is a little cryptic, and you'd only 2743 02:06:33,602 --> 02:06:35,050 know this from the documentation. 2744 02:06:35,050 --> 02:06:38,370 But if you want to write rows to a CSV file 2745 02:06:38,370 --> 02:06:41,850 that you can, then, view in Excel or the like, you can do this-- 2746 02:06:41,850 --> 02:06:45,060 give me a variable called writer-- but I could call it anything I want. 2747 02:06:45,060 --> 02:06:50,760 Let me use a csv.writer function that comes with this CSV library, 2748 02:06:50,760 --> 02:06:51,885 passing in the file. 2749 02:06:51,885 --> 02:06:56,070 This is like saying, hey, Python, treat this open file as a CSV file 2750 02:06:56,070 --> 02:06:59,340 so that things are separated with commas and nicely formatted 2751 02:06:59,340 --> 02:07:00,515 in rows and columns. 2752 02:07:00,515 --> 02:07:02,100 Now, I'm going to do this-- 2753 02:07:02,100 --> 02:07:04,030 use that writer to write a row. 2754 02:07:04,030 --> 02:07:05,280 Well, what do I want to write?
2755 02:07:05,280 --> 02:07:07,380 I want to write a short list-- 2756 02:07:07,380 --> 02:07:10,200 namely, the current name and the current number-- 2757 02:07:10,200 --> 02:07:14,790 to that file, but I don't want to use fprintf and %s and all of that stuff 2758 02:07:14,790 --> 02:07:16,440 that we might have had in the past. 2759 02:07:16,440 --> 02:07:19,030 And now, I just want to close the file. 2760 02:07:19,030 --> 02:07:20,410 Let me reopen my terminal. 2761 02:07:20,410 --> 02:07:26,102 Let me run python of phonebook.py, and let me type in David and then 2762 02:07:26,102 --> 02:07:30,190 +1-949-468-2750 and, crossing my fingers, 2763 02:07:30,190 --> 02:07:33,430 watching the actual CSV at top-left. 2764 02:07:33,430 --> 02:07:35,737 My code has just added me to the file. 2765 02:07:35,737 --> 02:07:37,570 And if I were to run it again, for instance, 2766 02:07:37,570 --> 02:07:41,770 with Carter and +1-617-495-1000, crossing my fingers again-- 2767 02:07:41,770 --> 02:07:42,820 we've updated the file. 2768 02:07:42,820 --> 02:07:46,150 And it turns out, there's code now, via which I can even read that file. 2769 02:07:46,150 --> 02:07:48,850 But I can, first, tighten this up, just so you've seen it. 2770 02:07:48,850 --> 02:07:52,720 It turns out, in Python, it's so common to open files and close them. 2771 02:07:52,720 --> 02:07:54,610 Humans make mistakes, and they often forget 2772 02:07:54,610 --> 02:07:58,477 to close files, which might, then, end up using more memory than you intend. 2773 02:07:58,477 --> 02:08:00,310 So you can, alternatively, do this in Python 2774 02:08:00,310 --> 02:08:03,310 so that you don't have to worry about closing files. 2775 02:08:03,310 --> 02:08:05,920 You can use this keyword instead. 2776 02:08:05,920 --> 02:08:09,100 You can say with the opening of this file 2777 02:08:09,100 --> 02:08:13,420 as a variable called file do all of the following underneath. 
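The first version of phonebook.py described above (open in append mode, wrap the file in csv.writer, write one row, close) can be sketched roughly as follows. This is a minimal sketch, not the code shown on screen: the function name add_entry and its parameters are assumptions made here so the example can run without prompting, whereas in the lecture the name and number come from input.

```python
import csv


def add_entry(path, name, number):
    """Append one [name, number] row to the CSV file at path (hypothetical helper)."""
    # "a" is append mode: existing rows are kept and new rows go at the end.
    # newline="" is what the csv module's documentation recommends for files it writes.
    file = open(path, "a", newline="")

    # Treat the open file as a CSV file.
    writer = csv.writer(file)

    # Write a short list as one row; no fprintf or %s required.
    writer.writerow([name, number])

    # Close the file explicitly, as in this first version.
    file.close()
```

With input, a call might look like add_entry("phonebook.csv", input("Name: "), input("Number: ")), where the prompt strings are also just placeholders.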
2778 02:08:13,420 --> 02:08:15,470 So I'm indenting most of my code. 2779 02:08:15,470 --> 02:08:18,430 I'm using this new, Python-specific keyword called with. 2780 02:08:18,430 --> 02:08:22,330 And this is just a matter of saying, with the following opening of the file, 2781 02:08:22,330 --> 02:08:26,120 do those next four lines of code, and then, automatically close it for me 2782 02:08:26,120 --> 02:08:27,370 at the end of the indentation. 2783 02:08:27,370 --> 02:08:31,480 It's a minor optimization, but this, again, is the pythonic way 2784 02:08:31,480 --> 02:08:33,250 to do things, instead. 2785 02:08:33,250 --> 02:08:34,720 How else might I do this, too? 2786 02:08:34,720 --> 02:08:38,860 Well, it turns out that the code I've written here-- on line 9, 2787 02:08:38,860 --> 02:08:40,630 especially-- is a little fragile. 2788 02:08:40,630 --> 02:08:44,350 If any human opens this spreadsheet-- the CSV file in Excel, 2789 02:08:44,350 --> 02:08:46,000 Google Spreadsheets, Apple Numbers-- 2790 02:08:46,000 --> 02:08:49,390 and maybe moves the columns around just because, maybe, they're fussing. 2791 02:08:49,390 --> 02:08:52,790 They saved it, and they don't realize they've, now, changed my assumptions. 2792 02:08:52,790 --> 02:08:55,120 I don't want to, necessarily, write name and number 2793 02:08:55,120 --> 02:08:58,360 always in that order because what if someone screws up and flips those two 2794 02:08:58,360 --> 02:09:01,040 columns by literally dragging and dropping? 2795 02:09:01,040 --> 02:09:03,640 So it turns out that, instead of using a list here, 2796 02:09:03,640 --> 02:09:06,890 we can use another feature of this library, as follows. 2797 02:09:06,890 --> 02:09:09,520 Instead of using a writer, there's something 2798 02:09:09,520 --> 02:09:11,530 called a dictionary writer or dict writer 2799 02:09:11,530 --> 02:09:14,140 that takes the same argument as input-- 2800 02:09:14,140 --> 02:09:15,580 the file that's opened.
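The with version described above might look roughly like this. Again a sketch under assumptions: add_entry and its parameters are hypothetical names chosen here so the example runs on its own, and the lecture's input prompts are left to the caller.

```python
import csv


def add_entry(path, name, number):
    """Append one row; the file is closed automatically (hypothetical helper)."""
    # with ... as file: opens the file and guarantees it is closed
    # when the indented block ends, even if an error occurs inside it.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
    # No file.close() needed here; file.closed is already True.
```

The behavior is the same as the explicit open and close pair; the difference is only that forgetting to close the file is no longer possible.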
2801 02:09:15,580 --> 02:09:18,070 But now, the one difference here is that you 2802 02:09:18,070 --> 02:09:25,030 need to tell this dictionary writer that your field names are name and number. 2803 02:09:25,030 --> 02:09:27,370 And let me close the CSV here. 2804 02:09:27,370 --> 02:09:32,140 Name and number are the names of the fields, the columns in this CSV file. 2805 02:09:32,140 --> 02:09:34,450 And when it comes time to write a new row, 2806 02:09:34,450 --> 02:09:37,750 the syntax here is going to be a little uglier, but it's just a dictionary. 2807 02:09:37,750 --> 02:09:40,120 The name I want to write to the dictionary 2808 02:09:40,120 --> 02:09:42,310 is going to be whatever name the human typed in. 2809 02:09:42,310 --> 02:09:45,790 The number that I want to write to the CSV file 2810 02:09:45,790 --> 02:09:48,550 is going to be whatever the number the human typed in. 2811 02:09:48,550 --> 02:09:51,010 But what's different, now, about this code is, 2812 02:09:51,010 --> 02:09:55,960 by simply using a dictionary writer here instead of the generic writer, 2813 02:09:55,960 --> 02:10:00,640 now, the columns can be in this order or this order or any order. 2814 02:10:00,640 --> 02:10:03,010 And the dictionary writer is going to figure out, 2815 02:10:03,010 --> 02:10:06,557 based on the first line of text in that CSV, where to put name, 2816 02:10:06,557 --> 02:10:07,390 where to put number. 2817 02:10:07,390 --> 02:10:08,883 So if you flip them, no big deal. 2818 02:10:08,883 --> 02:10:11,050 It's going to notice, oh, wait, the columns changed. 2819 02:10:11,050 --> 02:10:14,330 And it's going to insert the columns correctly. 2820 02:10:14,330 --> 02:10:18,970 So just, again, another more powerful feature that lets you 2821 02:10:18,970 --> 02:10:22,750 focus on real work, as opposed to actually getting 2822 02:10:22,750 --> 02:10:27,250 tied up in the weeds of writing code like this, otherwise. 
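A sketch of the dictionary-writer idea follows. One caveat, stated as an assumption rather than as what the lecture shows: as far as the csv documentation describes it, csv.DictWriter arranges values according to the fieldnames list it is given and does not itself inspect the file's first line. So the hypothetical helper below reads the header row first and passes that order along; with a hard-coded fieldnames=["name", "number"], rows would always be written in that fixed order.

```python
import csv


def add_entry(path, name, number):
    """Append a row whose column order follows the file's header (hypothetical helper)."""
    # Step added for this sketch: read the current header row, e.g.
    # ["name", "number"] or ["number", "name"] if someone swapped the columns.
    with open(path, newline="") as file:
        fieldnames = next(csv.reader(file))

    with open(path, "a", newline="") as file:
        # DictWriter maps dictionary keys to columns using fieldnames.
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        writer.writerow({"name": name, "number": number})
```

The sketch assumes the file already exists, has a header row containing name and number, and ends with a newline; a key missing from the header would raise a ValueError under DictWriter's default settings.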
2823 02:10:27,250 --> 02:10:30,440 Questions on this one, as well? 2824 02:10:30,440 --> 02:10:33,520 But what we will do, now, is come full circle 2825 02:10:33,520 --> 02:10:37,180 to some of the more sophisticated examples with which we began, 2826 02:10:37,180 --> 02:10:40,855 and I'm going to go back over to my own Mac laptop 2827 02:10:40,855 --> 02:10:43,743 here, where I've got my own terminal window up and running, 2828 02:10:43,743 --> 02:10:46,285 and I was just going to introduce a couple of final libraries 2829 02:10:46,285 --> 02:10:49,788 that really speak to just how powerful Python can be 2830 02:10:49,788 --> 02:10:51,580 and how quickly you can get up and running. 2831 02:10:51,580 --> 02:10:54,330 To be fair, you can't necessarily do all of these things in the cloud, 2832 02:10:54,330 --> 02:10:57,337 like in Codespaces, because you need access to your own speakers 2833 02:10:57,337 --> 02:10:58,420 or microphone or the like. 2834 02:10:58,420 --> 02:11:01,090 So that's why I'm doing it on my own Mac, here. 2835 02:11:01,090 --> 02:11:05,680 But let me go ahead and open up a program called speech.py. 2836 02:11:05,680 --> 02:11:07,300 And I'm not using VS Code here. 2837 02:11:07,300 --> 02:11:10,150 I'm using a program called vi that's entirely terminal window based. 2838 02:11:10,150 --> 02:11:13,105 But it's going to allow me, for instance, to import the Python 2839 02:11:13,105 --> 02:11:16,120 text to speech version 3 library. 2840 02:11:16,120 --> 02:11:18,790 I'm going to give myself a variable called engine that's 2841 02:11:18,790 --> 02:11:21,610 going to be set equal to the Python text to speech 2842 02:11:21,610 --> 02:11:26,350 3 library's init method, which is just going to initialize this library that 2843 02:11:26,350 --> 02:11:28,090 relates to text to speech. 2844 02:11:28,090 --> 02:11:32,410 I'm going to, then, use the engine's say function to say something 2845 02:11:32,410 --> 02:11:35,260 like, how about, hello comma world.
2846 02:11:35,260 --> 02:11:39,850 And then, as my last line, I'm going to say engine.runAndWait, capitalized 2847 02:11:39,850 --> 02:11:44,690 as such, to tell my program, now, to run that speech and wait until it's done. 2848 02:11:44,690 --> 02:11:45,190 All right. 2849 02:11:45,190 --> 02:11:46,540 I'm going to save this file. 2850 02:11:46,540 --> 02:11:49,110 I'm going to run python of speech.py. 2851 02:11:49,110 --> 02:11:52,357 And I'm going to cross my fingers, as always, and-- 2852 02:11:52,357 --> 02:11:53,440 INTERPRETER: Hello, world. 2853 02:11:53,440 --> 02:11:54,398 DAVID MALAN: All right. 2854 02:11:54,398 --> 02:11:57,130 So now, I have a program that's actually synthesizing speech 2855 02:11:57,130 --> 02:11:58,570 using a library like this. 2856 02:11:58,570 --> 02:12:01,285 How can I, now, modify this to be a little more interesting? 2857 02:12:01,285 --> 02:12:02,690 Well, how about this? 2858 02:12:02,690 --> 02:12:05,050 Let me go ahead and prompt the user for their name, 2859 02:12:05,050 --> 02:12:08,680 like we've done several times here, using Python's built-in input function. 2860 02:12:08,680 --> 02:12:11,665 And now, let me go ahead and use a format string in conjunction 2861 02:12:11,665 --> 02:12:14,980 with this library, interpolating the value of name there. 2862 02:12:14,980 --> 02:12:18,460 And-- at least, if my name is somewhat phonetically pronounceable-- 2863 02:12:18,460 --> 02:12:23,587 let's go ahead and run python of speech.py, type in my name, and-- 2864 02:12:23,587 --> 02:12:24,670 INTERPRETER: Hello, David. 2865 02:12:24,670 --> 02:12:25,445 DAVID MALAN: OK. 2866 02:12:25,445 --> 02:12:27,640 It's a weird choice of inflection, but we're 2867 02:12:27,640 --> 02:12:30,475 starting to synthesize voice, not unlike Siri or Google Assistant 2868 02:12:30,475 --> 02:12:32,050 or Alexa or the like. 2869 02:12:32,050 --> 02:12:36,130 Now, we can, maybe, do something a little more advanced, too.
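The speech.py steps described above might be sketched as follows, assuming the "Python text to speech version 3" library is the third-party pyttsx3 package and that it is already installed (for example with pip). The helper names greeting and speak are hypothetical, and the import is placed inside speak so that the rest of the file can be loaded on a machine without the library or without speakers.

```python
def greeting(name):
    """Build the text to speak using a format string (hypothetical helper)."""
    return f"hello, {name}"


def speak(text):
    """Speak text aloud; needs pyttsx3 and a working audio device."""
    import pyttsx3  # third-party library, assumed to be installed

    engine = pyttsx3.init()  # initialize the text-to-speech engine
    engine.say(text)         # queue the text to be spoken
    engine.runAndWait()      # run the speech and wait until it's done
```

To reproduce the demo, one might call speak(greeting(input("What's your name? "))), where the prompt text is only a placeholder.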
2870 02:12:36,130 --> 02:12:39,310 In addition to synthesizing speech in this way, 2871 02:12:39,310 --> 02:12:43,270 we could synthesize, for instance, an actual graphic. 2872 02:12:43,270 --> 02:12:45,740 Let me go ahead, now, and do something like this. 2873 02:12:45,740 --> 02:12:48,760 Let me create a program called qr.py. 2874 02:12:48,760 --> 02:12:50,890 I'm going to go ahead and import a library called 2875 02:12:50,890 --> 02:12:54,860 os, which gives you access to operating system related functionality in Python. 2876 02:12:54,860 --> 02:12:56,860 I'm going to import a library I've pre-installed 2877 02:12:56,860 --> 02:12:59,830 called qrcode, which is a two-dimensional barcode that you 2878 02:12:59,830 --> 02:13:01,300 might have seen in the real world. 2879 02:13:01,300 --> 02:13:03,715 I'm going to go ahead and create an image variable using 2880 02:13:03,715 --> 02:13:08,260 this qrcode library's make function, which, per its documentation, 2881 02:13:08,260 --> 02:13:10,365 takes a URL, like one of CS50's own videos. 2882 02:13:10,365 --> 02:13:23,003 So we'll do this with youtu.be/xvF2joSPgG0. 2883 02:13:23,003 --> 02:13:24,670 So, hopefully, that's the right lecture. 2884 02:13:24,670 --> 02:13:27,160 And now, we've got img.save, which is going to allow 2885 02:13:27,160 --> 02:13:30,130 me to create a file called qr.png. 2886 02:13:30,130 --> 02:13:33,460 Think back, now, on problem set 4 and how painful it was to save files. 2887 02:13:33,460 --> 02:13:36,940 We'll just use the save function, now, in Python and save this as a PNG file-- 2888 02:13:36,940 --> 02:13:38,260 Portable Network Graphics. 2889 02:13:38,260 --> 02:13:43,420 And then, lastly, let's just go ahead and open with the command open qr.png 2890 02:13:43,420 --> 02:13:46,120 on my Mac so that, hopefully, this just automatically opens. 2891 02:13:46,120 --> 02:13:46,660 All right.
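The qr.py steps described above might look roughly like this. It is a sketch under assumptions: the third-party qrcode package (plus an image backend such as Pillow) is taken to be installed, the helper names make_qr and show are hypothetical, and the open command is specific to macOS, as in the demo.

```python
import os


def make_qr(url, path="qr.png"):
    """Encode url as a QR code and save it as an image file at path."""
    import qrcode  # third-party library, assumed to be installed

    img = qrcode.make(url)  # build the QR code image for this URL
    img.save(path)          # write it to disk; no manual byte handling
    return path


def show(path):
    """Open the saved image in the default viewer (macOS only)."""
    os.system(f"open {path}")
```

Reproducing the demo would then be something like show(make_qr("https://youtu.be/xvF2joSPgG0")); on Linux or Windows a different command than open would be needed.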
2892 02:13:46,660 --> 02:13:49,300 I'm going to go ahead and just double-check my syntax here 2893 02:13:49,300 --> 02:13:51,280 so that I haven't made any mistakes. 2894 02:13:51,280 --> 02:13:54,235 I'm going to go ahead and run python of qr.py. 2895 02:13:54,235 --> 02:13:55,810 Enter. 2896 02:13:55,810 --> 02:13:57,223 That opens up this. 2897 02:13:57,223 --> 02:13:58,390 Let me go ahead and zoom in. 2898 02:13:58,390 --> 02:14:03,750 If you've got a phone handy and you'd like to scan this code here, 2899 02:14:03,750 --> 02:14:07,131 whether in person or online-- 2900 02:14:07,131 --> 02:14:08,095 I apologize. 2901 02:14:08,095 --> 02:14:09,130 You won't appreciate it. 2902 02:14:09,130 --> 02:14:11,640 2903 02:14:11,640 --> 02:14:12,140 Amazing! 2904 02:14:12,140 --> 02:14:13,600 OK. 2905 02:14:13,600 --> 02:14:17,230 And, lastly, let me go back into our speech example 2906 02:14:17,230 --> 02:14:21,400 here, create a final ending here in our final moments. 2907 02:14:21,400 --> 02:14:26,060 And how about we just say something like "This was CS50," like this. 2908 02:14:26,060 --> 02:14:27,087 Let's go ahead, here. 2909 02:14:27,087 --> 02:14:28,795 Fix my capitalization, just for tidiness. 2910 02:14:28,795 --> 02:14:29,878 Let's get rid of the name. 2911 02:14:29,878 --> 02:14:33,840 And now, with our final flourish and your introduction to Python equipped-- 2912 02:14:33,840 --> 02:14:35,230 here we go-- 2913 02:14:35,230 --> 02:14:36,535 INTERPRETER: This was CS50. 2914 02:14:36,535 --> 02:14:37,000 DAVID MALAN: All right. 2915 02:14:37,000 --> 02:14:38,000 We'll see you next time. 2916 02:14:38,000 --> 02:14:39,460 [APPLAUSE] 2917 02:14:39,460 --> 02:14:41,860 2918 02:14:41,860 --> 02:14:45,210 [MUSIC PLAYING] 2919 02:14:45,210 --> 02:15:18,000
